nohz: Fix printk_needs_cpu() return value on offline cpus
Heiko Carstens authored

This patch fixes a hang observed with 2.6.32 kernels where timers got enqueued
on offline cpus.

printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
offlined it schedules the idle process which, before killing its own cpu, will
call tick_nohz_stop_sched_tick(). That function in turn will call
printk_needs_cpu() in order to check if the local tick can be disabled. On
offline cpus this function should naturally return 0, since the cpu will be dead
shortly after regardless of whether the tick gets disabled. That is besides the
fact that __cpu_disable() should already have made sure that no interrupts on
the offlined cpu will be delivered anyway.

In this case the return value of 1 prevents tick_nohz_stop_sched_tick() from
calling select_nohz_load_balancer(). No idea if that really is a problem. However what
made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
used within __mod_timer() to select a cpu on which a timer gets enqueued. If
printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
updated when a cpu gets offlined. It may contain the cpu number of an offline
cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
they never expire and cause system hangs.
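
The failure mode described above can be sketched as a small userspace model.
This is only an illustration under simplifying assumptions: the arrays and the
helpers stop_sched_tick(), pick_timer_cpu() and offline_and_pick() are
hypothetical stand-ins, not the kernel's code, and the real
select_nohz_load_balancer() logic is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical userspace stand-ins for kernel state. */
static bool cpu_online[NR_CPUS] = { true, true, true, true };
static int  printk_pending[NR_CPUS];
static int  nohz_load_balancer = -1;	/* -1: no cpu owns the role */

/* Unpatched behaviour: only report the per-cpu flag. */
static int printk_needs_cpu(int cpu)
{
	return printk_pending[cpu];
}

/* Simplified idle path: exits early if something still needs the tick. */
static void stop_sched_tick(int cpu)
{
	if (printk_needs_cpu(cpu))
		return;			/* early exit, update below is skipped */
	if (!cpu_online[cpu] && nohz_load_balancer == cpu)
		nohz_load_balancer = -1;	/* offline cpu gives up the role */
}

/* Simplified timer placement: prefer the load balancer cpu if there is one. */
static int pick_timer_cpu(int this_cpu)
{
	return nohz_load_balancer >= 0 ? nohz_load_balancer : this_cpu;
}

/* Offline 'cpu' while it owns the role; return where cpu 0 would put a timer. */
static int offline_and_pick(int cpu, int pending)
{
	nohz_load_balancer = cpu;
	printk_pending[cpu] = pending;
	cpu_online[cpu] = false;
	stop_sched_tick(cpu);
	return pick_timer_cpu(0);
}
```

In this model, offlining a cpu with a pending printk leaves its number in
nohz_load_balancer, so the next timer is placed on the dead cpu. Without a
pending printk the role is released and the timer stays on the calling cpu.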

This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
get_nohz_timer_target() which doesn't have that problem. However there might be
other problems because of the too early exit from tick_nohz_stop_sched_tick()
in case a cpu goes offline.

The easiest way to fix this is to test in printk_needs_cpu() if the current cpu
is offline and, if so, call printk_tick() directly, which clears the condition.
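
A minimal userspace sketch of that approach follows. It assumes the pending
condition is a per-cpu flag; the arrays, current_cpu and the bodies of
cpu_is_offline() and printk_tick() are simplified stand-ins so the example
compiles on its own, and it is not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical userspace stand-ins for kernel state. */
static bool cpu_online[NR_CPUS] = { true, true, true, true };
static int  printk_pending[NR_CPUS];
static int  current_cpu;	/* stand-in for "the cpu we are running on" */

static bool cpu_is_offline(int cpu)
{
	return !cpu_online[cpu];
}

/* Stand-in for printk_tick(): clears the pending flag of the local cpu. */
static void printk_tick(void)
{
	printk_pending[current_cpu] = 0;
}

/* On an offline cpu, clear the condition first so that 0 is returned. */
static int printk_needs_cpu(int cpu)
{
	if (cpu_is_offline(cpu))
		printk_tick();
	return printk_pending[cpu];
}
```

Because the check sits inside printk_needs_cpu() itself, the flag is cleared at
the moment it is read, so a printk() issued just before the call cannot leave
it set on an offline cpu.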

Alternatively I tried a cpu hotplug notifier which would clear the condition,
however between calling the notifier function and printk_needs_cpu() something
could have called printk() again and the problem would be back. This seems to
be the safest fix.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
LKML-Reference: <20101126120235.406766476@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
61ab2544