SUNRPC: Close a race in __rpc_wait_for_completion_task()
Trond Myklebust authored

Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race that exists because rpciod calls
rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then calls rpc_put_task(),
without ensuring that the two steps are done atomically. A woken waiter
can therefore call rpc_put_task() before rpciod has dropped its own
reference, in which case the waiter's put is not the last one.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we skip taking the spin lock and
free the rpc_task immediately.
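
The ordering described above can be illustrated with a minimal user-space
sketch. This is not the kernel code: struct task, task_wait(),
task_complete_and_put() and run_demo() are hypothetical names, a pthread
mutex and condition variable stand in for the waitqueue spin lock and the
wait queue, and only the "reference count is 1" half of the fast path is
modelled. The point it tries to show is that the completing side drops its
reference while still holding the lock, so a waiter that wakes up always
performs the final put.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an rpc_task; not the kernel structure. */
struct task {
	pthread_mutex_t lock;	/* plays the role of the waitqueue spin lock */
	pthread_cond_t done;	/* waiters sleep here until completion */
	bool active;		/* protected by lock */
	atomic_int refcount;
	bool released;		/* set once the release callback has run */
};

static void task_init(struct task *t, int refs)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->done, NULL);
	t->active = true;
	t->released = false;
	atomic_init(&t->refcount, refs);
}

/* Stand-in for the release callback. */
static void task_release(struct task *t)
{
	t->released = true;
}

/* Plain put: run the release callback when the last reference goes away. */
static void task_put(struct task *t)
{
	int old = atomic_fetch_sub(&t->refcount, 1);

	assert(old > 0);
	if (old == 1)
		task_release(t);
}

/* Completing side: mark the task done and drop our reference as one step. */
static void task_complete_and_put(struct task *t)
{
	bool last;

	/* Fast path: we hold the only reference, so nobody can be waiting. */
	if (atomic_load(&t->refcount) == 1) {
		t->active = false;
		task_put(t);
		return;
	}

	pthread_mutex_lock(&t->lock);
	t->active = false;
	last = atomic_fetch_sub(&t->refcount, 1) == 1;
	pthread_cond_broadcast(&t->done);
	pthread_mutex_unlock(&t->lock);

	/* Past this point t may only be touched if we held the last reference. */
	if (last)
		task_release(t);
}

/* Waiting side: sleep until the completing side has cleared 'active'. */
static void task_wait(struct task *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->active)
		pthread_cond_wait(&t->done, &t->lock);
	pthread_mutex_unlock(&t->lock);
}

static void *worker(void *arg)
{
	task_complete_and_put(arg);
	return NULL;
}

/* Returns 1 if the waiter's put ran the release callback, -1 on error. */
int run_demo(void)
{
	struct task t;
	pthread_t thr;
	int released_by_waiter;

	task_init(&t, 2);	/* one reference each for waiter and worker */
	if (pthread_create(&thr, NULL, worker, &t) != 0)
		return -1;

	task_wait(&t);
	task_put(&t);		/* expected to be the final put */
	released_by_waiter = t.released;

	pthread_join(thr, NULL);
	pthread_cond_destroy(&t.done);
	pthread_mutex_destroy(&t.lock);
	return released_by_waiter;
}
```

In this sketch the waiter cannot observe active == false until the worker
has both decremented the reference count and unlocked, so the task_put()
that follows task_wait() always takes the count from 1 to 0. If the
decrement were moved after the unlock, the waiter could put first and
return before the release callback had run, which is the race described
above.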

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
bf294b41
Name Last commit Last update
debug Merge branch 'master' into for-next
gcov llseek: automatically add .llseek fop
irq genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now
power Merge branch 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
time clockevents: Prevent oneshot mode when broadcast device is periodic
trace blktrace: Remove blk_fill_rwbs_rq.
.gitignore Update kernel/.gitignore with new auto-generated files
Kconfig.freezer container freezer: implement freezer cgroup subsystem
Kconfig.hz sched: fix SCHED_HRTICK dependency
Kconfig.locks mutex: Better control mutex adaptive spinning config
Kconfig.preempt rcu: provide RCU options on non-preempt architectures too
Makefile kernel: clean up USE_GENERIC_SMP_HELPERS
acct.c pass a struct path to vfs_statfs
async.c async: use workqueue for worker pool
audit.c audit: error message typo correction
audit.h audit: make functions static
audit_tree.c in untag_chunk() we need to do alloc_chunk() a bit earlier
audit_watch.c audit: make functions static
auditfilter.c Audit: add support to match lsm labels on user audit messages
auditsc.c audit mmap
backtracetest.c backtrace: replace timer with tasklet + completions
bounds.c kbuild: move bounds.h to include/generated
capability.c security: add cred argument to security_capable()
cgroup.c Merge branch 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin
cgroup_freezer.c cgroup_freezer: update_freezer_state() does incorrect state transitions
compat.c
configs.c
cpu.c
cpuset.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c
extable.c
fork.c
freezer.c
futex.c
futex_compat.c
groups.c
hrtimer.c
hung_task.c
hw_breakpoint.c
irq_work.c
itimer.c
jump_label.c
kallsyms.c
kexec.c
kfifo.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
padata.c
panic.c
params.c
perf_event.c
pid.c
pid_namespace.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c
printk.c
profile.c
ptrace.c
range.c
rcupdate.c
rcutiny.c
rcutiny_plugin.h
rcutorture.c
rcutree.c
rcutree.h
rcutree_plugin.h
rcutree_trace.c
relay.c
res_counter.c
resource.c
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c
sched_autogroup.c
sched_autogroup.h
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c
sched_fair.c
fs/nfs/Makefile kernel: clean up USE_GENERIC_SMP_HELPERS
fs/nfs/callback.c
fs/nfs/callback.h
fs/nfs/callback_proc.c
fs/nfs/callback_xdr.c
fs/nfs/delegation.c
fs/nfs/delegation.h
fs/nfs/dir.c
fs/nfs/direct.c
fs/nfs/file.c
fs/nfs/idmap.c
fs/nfs/inode.c
fs/nfs/iostat.h
fs/nfs/mount_clnt.c
fs/nfs/namespace.c
fs/nfs/nfs2xdr.c
fs/nfs/nfs3acl.c
fs/nfs/nfs3proc.c
fs/nfs/nfs3xdr.c
fs/nfs/nfs4_fs.h
fs/nfs/nfs4proc.c
fs/nfs/nfs4renewd.c
fs/nfs/nfs4state.c
fs/nfs/nfs4xdr.c
fs/nfs/nfsroot.c
fs/nfs/pagelist.c
fs/nfs/proc.c
fs/nfs/read.c
fs/nfs/symlink.c
fs/nfs/sysctl.c
fs/nfs/unlink.c
fs/nfs/write.c