From: AMEET M. PARANJAPE <aparanja@redhat.com> Date: Thu, 19 Feb 2009 13:14:06 -0600 Subject: [ppc64] cell spufs: update to the upstream for RHEL-5.4 Message-id: 499DAF7E.1020806@REDHAT.COM O-Subject: Re: [PATCH RHEL5.4 BZ475620] Update spufs for Cell in the kernel of RHEL5.4 to the upstream version Bugzilla: 475620 RH-Acked-by: David Howells <dhowells@redhat.com> RHBZ#: ====== https://bugzilla.redhat.com/show_bug.cgi?id=475620 Description: =========== The spufs for Cell was updated upstream after the code cutoff for RHEL5.3. Now we would like to update the spufs in RHEL5.4 to the most current level to incorporate bug fixes, etc. from upstream. Please see the Upstream Status section (below) for a more detailed description on each of the fixes. RHEL Version Found: ================ RHEL 5.3 kABI Status: ============ No symbols were harmed. Brew: ===== Built on all platforms. http://brewweb.devel.redhat.com/brew/taskinfo?taskID=1671290 Upstream Status: ================ commit 8d5636fbca202f61fdb808fc9e20c0142291d802 Author: Jeremy Kerr <jk@ozlabs.org> Date: Thu Aug 14 14:59:12 2008 +1000 powerpc/spufs: reference context while dropping state mutex in scheduler Based on an original patch from Christoph Hellwig <hch@lst.de>. Currently, there is a possible reference-after-free in the spusched code - contexts may be freed after we have released their state_mutex in spusched_tick and find_victim. This change takes a reference to the context before releasing the mutex, so that the context doesn't get destroyed. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit 9f43e3914dceb0f8191875b3cdf4325b48d0d70a Author: Jeremy Kerr <jk@ozlabs.org> Date: Tue Sep 2 11:57:09 2008 +1000 powerpc/spufs: Fix multiple get_spu_context() Commit 8d5636fbca202f61fdb808fc9e20c0142291d802 introduced a reference count on SPU contexts during find_victim, but this may cause a leak in the reference count if we later find a better contender for a context to unschedule. Change the reference to after we've found our victim context, so we don't do the extra get_spu_context(). Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit b2e601d14deb2083e2a537b47869ab3895d23a28 Author: Andre Detsch <adetsch@br.ibm.com> Date: Thu Sep 4 21:16:27 2008 +0000 powerpc/spufs: Fix possible scheduling of a context to multiple SPEs We currently have a race when scheduling a context to a SPE - after we have found a runnable context in spusched_tick, the same context may have been scheduled by spu_activate(). This may result in a panic if we try to unschedule a context that has been freed in the meantime. This change exits spu_schedule() if the context has already been scheduled, so we don't end up scheduling it twice. Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit f776d12dd16da1b0cd55a1240002c1b31f315d5d Author: Matthew Wilcox <matthew@wil.cx> Date: Thu Dec 6 11:15:50 2007 -0500 Add fatal_signal_pending Like signal_pending, but it's only true for signals which are fatal to this process Signed-off-by: Matthew Wilcox <willy@linux.intel.com> commit 13f09b95a82c46ed608d057b22e0dd18ebfff22a Author: Trond Myklebust <Trond.Myklebust@netapp.com> Date: Thu Jan 31 20:40:29 2008 -0500 Ensure that we export __fatal_signal_pending() It may be used by the modules nfs.ko and sunrpc.ko Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ Made it a regular export rather than GPL-only - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> commit 606572634c3faa5b32a8fc430266e6e9d78d2179 Author: Jeremy Kerr <jk@ozlabs.org> Date: Tue Nov 11 10:22:22 2008 +1100 powerpc/spufs: Fix spinning in spufs_ps_fault on signal Currently, we can end up in an infinite loop if we get a signal while the kernel has faulted in spufs_ps_fault. Eg: alarm(1); write(fd, some_spu_psmap_register_address, 4); - the write's copy_from_user will fault on the ps mapping, and signal_pending will be non-zero. Because returning from the fault handler will never clear TIF_SIGPENDING, so we'll just keep faulting, resulting in an unkillable process using 100% of CPU. This change returns VM_FAULT_SIGBUS if there's a fatal signal pending, letting us escape the loop. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit 9c8b4aff18b59cd0c2d9a77b3df1f9d7077df90c Author: Alexey Dobriyan <adobriyan@gmail.com> Date: Sun Nov 2 10:21:57 2008 +0000 powerpc/cell: Fix compile error in ras.c This fixes this error on Cell when CONFIG_KEXEC = n: arch/powerpc/platforms/cell/ras.c:299: error: implicit declaration of function 'crash_shutdown_register' We have to include <asm/kexec.h> because it contains the dummy definition of crash_shutdown_register that is used when CONFIG_KEXEC=n, but <linux/kexec.h> doesn't include <asm/kexec.h> in that case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org> commit 6747c2ee8abf749e63fee8cd01a9ee293e6a4247 Author: Kou Ishizaki <kou.ishizaki@toshiba.co.jp> Date: Thu Oct 9 10:45:49 2008 +1100 powerpc/spufs: add a missing mutex_unlock A mutex_unlock(&gang->aff_mutex) in spufs_create_context() is missing in case spufs_context_open() fails. As a result, spu_create syscall and spu_get_idle() may block. This patch adds the mutex_unlock. Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Andre Detsch <adetsch@br.ibm.com> commit cb9808d3d0cb0ed97197decadcf0431140b9e231 Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Date: Tue Aug 19 08:48:57 2008 +0300 powerpc/spufs: Remove invalid semicolon after if statement Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit 23d893f51cde7013e4c29094da2237cce4f20035 Author: Jeremy Kerr <jk@ozlabs.org> Date: Mon Jun 30 12:17:28 2008 +1000 powerpc/spufs: allow spufs files to specify sizes Currently, spufs never specifies the i_size for the files in context directories, so stat() always reports 0-byte files. This change adds allows the spufs_dir_(nosched_)contents arrays to specify a file size. This allows stat() to report correct file sizes, and makes SEEK_END work. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit 6f7dde812defe5bc49cf463ac1579ffeda5cbfd4 Author: Jeremy Kerr <jk@ozlabs.org> Date: Mon Jun 30 14:38:37 2008 +1000 powerpc/spufs: add sizes for context files Populate the size member of a few context files. Leave out files that have different semantics with read vs mmap, or contain a variable-length hex string. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit e2ed6e4daa6f16f088600d98568cb5730b5238a6 Author: Jeremy Kerr <jk@ozlabs.org> Date: Tue Oct 7 18:26:55 2008 +1100 powerpc/spufs: set nlink count for spufs root correctly Currently, an empty spufs root inode has nlink count of 1. However, the directory has two links; / -> spu and /spu/ -> . This change increments the link count of the root inode in spufs. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> commit 926dc3d76ec7561c72d466d2a0e26cc1a69fc372 Author: Jeremy Kerr <jk@ozlabs.org> Date: Tue Feb 17 11:44:14 2009 +1100 powerpc/spufs: Constify context contents and coredump callback constants The spufs context directory contents definitions are not changed after initialization, so we can declare them as const. We can do the same with the spu coredump reader callbacks too. diff --git a/arch/powerpc/platforms/cell/ras.c b/arch/powerpc/platforms/cell/ras.c index 2dbaba7..6d916b3 100644 --- a/arch/powerpc/platforms/cell/ras.c +++ b/arch/powerpc/platforms/cell/ras.c @@ -5,6 +5,7 @@ #include <linux/smp.h> #include <linux/reboot.h> +#include <asm/kexec.h> #include <asm/reg.h> #include <asm/io.h> #include <asm/prom.h> diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c index 9bde808..8b1087e 100644 --- a/arch/powerpc/platforms/cell/spufs/file.c +++ b/arch/powerpc/platforms/cell/spufs/file.c @@ -39,6 +39,7 @@ #include "spufs.h" #define SPUFS_MMAP_4K (PAGE_SIZE == 0x1000) +#define SPUFS_PS_MAP_SIZE 0x20000 /* Simple attribute files */ struct spufs_attr { @@ -370,6 +371,9 @@ static unsigned long spufs_ps_nopfn(struct vm_area_struct *vma, */ get_spu_context(ctx); + if (fatal_signal_pending(current)) + return VM_FAULT_SIGBUS; + /* * We have to wait for context to be loaded before we have * pages to hand out to the user, but we don't want to wait @@ -2583,22 +2587,22 @@ void spu_switch_log_notify(struct spu *spu, struct spu_context *ctx, wake_up(&ctx->switch_log->wait); } -struct tree_descr spufs_dir_contents[] = { +const struct spufs_tree_descr spufs_dir_contents[] = { { "capabilities", &spufs_caps_fops, 0444, }, - { "mem", &spufs_mem_fops, 0666, }, - { "regs", &spufs_regs_fops, 0666, }, + { "mem", &spufs_mem_fops, 0666, LS_SIZE, }, + { "regs", &spufs_regs_fops, 0666, sizeof(struct spu_reg128[128]), }, { "mbox", &spufs_mbox_fops, 0444, }, { "ibox", &spufs_ibox_fops, 0444, }, { "wbox", &spufs_wbox_fops, 0222, }, - { "mbox_stat", &spufs_mbox_stat_fops, 0444, }, - { "ibox_stat", &spufs_ibox_stat_fops, 0444, }, - { "wbox_stat", &spufs_wbox_stat_fops, 0444, }, + { "mbox_stat", &spufs_mbox_stat_fops, 0444, sizeof(u32), }, + { "ibox_stat", &spufs_ibox_stat_fops, 0444, sizeof(u32), }, + { "wbox_stat", &spufs_wbox_stat_fops, 0444, sizeof(u32), }, { "signal1", &spufs_signal1_fops, 0666, }, { "signal2", &spufs_signal2_fops, 0666, }, { "signal1_type", &spufs_signal1_type, 0666, }, { "signal2_type", &spufs_signal2_type, 0666, }, { "cntl", &spufs_cntl_fops, 0666, }, - { "fpcr", &spufs_fpcr_fops, 0666, }, + { "fpcr", &spufs_fpcr_fops, 0666, sizeof(struct spu_reg128), }, { "lslr", &spufs_lslr_ops, 0444, }, { "mfc", &spufs_mfc_fops, 0666, }, { "mss", &spufs_mss_fops, 0666, }, @@ -2608,29 +2612,31 @@ struct tree_descr spufs_dir_contents[] = { { "decr_status", &spufs_decr_status_ops, 0666, }, { "event_mask", &spufs_event_mask_ops, 0666, }, { "event_status", &spufs_event_status_ops, 0444, }, - { "psmap", &spufs_psmap_fops, 0666, }, + { "psmap", &spufs_psmap_fops, 0666, SPUFS_PS_MAP_SIZE, }, { "phys-id", &spufs_id_ops, 0666, }, { "object-id", &spufs_object_id_ops, 0666, }, - { "mbox_info", &spufs_mbox_info_fops, 0444, }, - { "ibox_info", &spufs_ibox_info_fops, 0444, }, - { "wbox_info", &spufs_wbox_info_fops, 0444, }, - { "dma_info", &spufs_dma_info_fops, 0444, }, - { "proxydma_info", &spufs_proxydma_info_fops, 0444, }, + { "mbox_info", &spufs_mbox_info_fops, 0444, sizeof(u32), }, + { "ibox_info", &spufs_ibox_info_fops, 0444, sizeof(u32), }, + { "wbox_info", &spufs_wbox_info_fops, 0444, sizeof(u32), }, + { "dma_info", &spufs_dma_info_fops, 0444, + sizeof(struct spu_dma_info), }, + { "proxydma_info", &spufs_proxydma_info_fops, 0444, + sizeof(struct spu_proxydma_info)}, { "tid", &spufs_tid_fops, 0444, }, { "stat", &spufs_stat_fops, 0444, }, { "switch_log", &spufs_switch_log_fops, 0444 }, {}, }; -struct tree_descr spufs_dir_nosched_contents[] = { +const struct spufs_tree_descr spufs_dir_nosched_contents[] = { { "capabilities", &spufs_caps_fops, 0444, }, - { "mem", &spufs_mem_fops, 0666, }, + { "mem", &spufs_mem_fops, 0666, LS_SIZE, }, { "mbox", &spufs_mbox_fops, 0444, }, { "ibox", &spufs_ibox_fops, 0444, }, { "wbox", &spufs_wbox_fops, 0222, }, - { "mbox_stat", &spufs_mbox_stat_fops, 0444, }, - { "ibox_stat", &spufs_ibox_stat_fops, 0444, }, - { "wbox_stat", &spufs_wbox_stat_fops, 0444, }, + { "mbox_stat", &spufs_mbox_stat_fops, 0444, sizeof(u32), }, + { "ibox_stat", &spufs_ibox_stat_fops, 0444, sizeof(u32), }, + { "wbox_stat", &spufs_wbox_stat_fops, 0444, sizeof(u32), }, { "signal1", &spufs_signal1_fops, 0666, }, { "signal2", &spufs_signal2_fops, 0666, }, { "signal1_type", &spufs_signal1_type, 0666, }, @@ -2639,7 +2645,7 @@ struct tree_descr spufs_dir_nosched_contents[] = { { "mfc", &spufs_mfc_fops, 0666, }, { "cntl", &spufs_cntl_fops, 0666, }, { "npc", &spufs_npc_ops, 0666, }, - { "psmap", &spufs_psmap_fops, 0666, }, + { "psmap", &spufs_psmap_fops, 0666, SPUFS_PS_MAP_SIZE, }, { "phys-id", &spufs_id_ops, 0666, }, { "object-id", &spufs_object_id_ops, 0666, }, { "tid", &spufs_tid_fops, 0444, }, @@ -2647,7 +2653,7 @@ struct tree_descr spufs_dir_nosched_contents[] = { {}, }; -struct spufs_coredump_reader spufs_coredump_read[] = { +const struct spufs_coredump_reader spufs_coredump_read[] = { { "regs", __spufs_regs_read, NULL, sizeof(struct spu_reg128[128])}, { "fpcr", __spufs_fpcr_read, NULL, sizeof(struct spu_reg128) }, { "lslr", NULL, spufs_lslr_get, 19 }, diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c index 857272d..4f04be8 100644 --- a/arch/powerpc/platforms/cell/spufs/inode.c +++ b/arch/powerpc/platforms/cell/spufs/inode.c @@ -108,7 +108,7 @@ spufs_setattr(struct dentry *dentry, struct iattr *attr) static int spufs_new_file(struct super_block *sb, struct dentry *dentry, const struct file_operations *fops, int mode, - struct spu_context *ctx) + size_t size, struct spu_context *ctx) { static struct inode_operations spufs_file_iops = { .setattr = spufs_setattr, @@ -124,6 +124,7 @@ spufs_new_file(struct super_block *sb, struct dentry *dentry, ret = 0; inode->i_op = &spufs_file_iops; inode->i_fop = fops; + inode->i_size = size; inode->i_private = SPUFS_I(inode)->i_ctx = get_spu_context(ctx); d_add(dentry, inode); out: @@ -176,8 +177,9 @@ static int spufs_rmdir(struct inode *parent, struct dentry *dir) return simple_rmdir(parent, dir); } -static int spufs_fill_dir(struct dentry *dir, struct tree_descr *files, - int mode, struct spu_context *ctx) +static int spufs_fill_dir(struct dentry *dir, + const struct spufs_tree_descr *files, int mode, + struct spu_context *ctx) { struct dentry *dentry, *tmp; int ret; @@ -188,7 +190,7 @@ static int spufs_fill_dir(struct dentry *dir, struct tree_descr *files, if (!dentry) goto out; ret = spufs_new_file(dir->d_sb, dentry, files->ops, - files->mode & mode, ctx); + files->mode & mode, files->size, ctx); if (ret) goto out; files++; @@ -478,6 +480,8 @@ spufs_create_context(struct inode *inode, struct dentry *dentry, ret = spufs_context_open(dget(dentry), mntget(mnt)); if (ret < 0) { WARN_ON(spufs_rmdir(inode, dentry)); + if (affinity) + mutex_unlock(&gang->aff_mutex); mutex_unlock(&inode->i_mutex); spu_forget(SPUFS_I(dentry->d_inode)->i_ctx); goto out; @@ -729,6 +733,7 @@ spufs_create_root(struct super_block *sb, void *data) inode->i_op = &simple_dir_inode_operations; inode->i_fop = &simple_dir_operations; SPUFS_I(inode)->i_ctx = NULL; + inc_nlink(inode); ret = -EINVAL; if (!spufs_parse_options(data, inode)) diff --git a/arch/powerpc/platforms/cell/spufs/sched.c b/arch/powerpc/platforms/cell/spufs/sched.c index d601739..8ca9df2 100644 --- a/arch/powerpc/platforms/cell/spufs/sched.c +++ b/arch/powerpc/platforms/cell/spufs/sched.c @@ -632,9 +632,12 @@ static struct spu *find_victim(struct spu_context *ctx) if (tmp && tmp->prio > ctx->prio && !(tmp->flags & SPU_CREATE_NOSCHED) && - (!victim || tmp->prio > victim->prio)) + (!victim || tmp->prio > victim->prio)) { victim = spu->ctx; + } } + if (victim) + get_spu_context(victim); mutex_unlock(&cbe_spu_info[node].list_mutex); if (victim) { @@ -649,6 +652,7 @@ static struct spu *find_victim(struct spu_context *ctx) * look at another context or give up after X retries. */ if (!mutex_trylock(&victim->state_mutex)) { + put_spu_context(victim); victim = NULL; goto restart; } @@ -661,6 +665,7 @@ static struct spu *find_victim(struct spu_context *ctx) * restart the search. */ mutex_unlock(&victim->state_mutex); + put_spu_context(victim); victim = NULL; goto restart; } @@ -676,6 +681,7 @@ static struct spu *find_victim(struct spu_context *ctx) spu_add_to_rq(victim); mutex_unlock(&victim->state_mutex); + put_spu_context(victim); return spu; } @@ -711,7 +717,8 @@ static void spu_schedule(struct spu *spu, struct spu_context *ctx) /* not a candidate for interruptible because it's called either from the scheduler thread or from spu_deactivate */ mutex_lock(&ctx->state_mutex); - __spu_schedule(spu, ctx); + if (ctx->state == SPU_STATE_SAVED) + __spu_schedule(spu, ctx); spu_release(ctx); } @@ -982,9 +989,11 @@ static int spusched_thread(void *unused) struct spu_context *ctx = spu->ctx; if (ctx) { + get_spu_context(ctx); mutex_unlock(mtx); spusched_tick(ctx); mutex_lock(mtx); + put_spu_context(ctx); } } mutex_unlock(mtx); @@ -1027,7 +1036,7 @@ void spuctx_switch_state(struct spu_context *ctx, node = spu->node; if (old_state == SPU_UTIL_USER) atomic_dec(&cbe_spu_info[node].busy_spus); - if (new_state == SPU_UTIL_USER); + if (new_state == SPU_UTIL_USER) atomic_inc(&cbe_spu_info[node].busy_spus); } } diff --git a/arch/powerpc/platforms/cell/spufs/spufs.h b/arch/powerpc/platforms/cell/spufs/spufs.h index 14a61ec..caed499 100644 --- a/arch/powerpc/platforms/cell/spufs/spufs.h +++ b/arch/powerpc/platforms/cell/spufs/spufs.h @@ -227,8 +227,15 @@ struct spufs_inode_info { #define SPUFS_I(inode) \ container_of(inode, struct spufs_inode_info, vfs_inode) -extern struct tree_descr spufs_dir_contents[]; -extern struct tree_descr spufs_dir_nosched_contents[]; +struct spufs_tree_descr { + const char *name; + const struct file_operations *ops; + int mode; + size_t size; +}; + +extern const struct spufs_tree_descr spufs_dir_contents[]; +extern const struct spufs_tree_descr spufs_dir_nosched_contents[]; /* system call implementation */ extern struct spufs_calls spufs_calls; @@ -343,7 +350,7 @@ struct spufs_coredump_reader { u64 (*get)(struct spu_context *ctx); size_t size; }; -extern struct spufs_coredump_reader spufs_coredump_read[]; +extern const struct spufs_coredump_reader spufs_coredump_read[]; extern int spufs_coredump_num_notes; extern int spu_init_csa(struct spu_state *csa); diff --git a/include/linux/sched.h b/include/linux/sched.h index ce42dbb..ce00fe8 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1538,7 +1538,14 @@ static inline int signal_pending(struct task_struct *p) { return unlikely(test_tsk_thread_flag(p,TIF_SIGPENDING)); } - + +extern int FASTCALL(__fatal_signal_pending(struct task_struct *p)); + +static inline int fatal_signal_pending(struct task_struct *p) +{ + return signal_pending(p) && __fatal_signal_pending(p); +} + static inline int need_resched(void) { return unlikely(test_thread_flag(TIF_NEED_RESCHED)); diff --git a/kernel/signal.c b/kernel/signal.c index e1296e6..44391c7 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -954,6 +954,12 @@ void zap_other_threads(struct task_struct *p) } } +int fastcall __fatal_signal_pending(struct task_struct *tsk) +{ + return sigismember(&tsk->pending.signal, SIGKILL); +} +EXPORT_SYMBOL(__fatal_signal_pending); + /* * Must be called under rcu_read_lock() or with tasklist_lock read-held. */