From 0e37739b1c96d65e6433998454985de994383019 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Thu, 13 Jun 2013 21:04:56 +1000 Subject: powerpc: Fix stack overflow crash in resume_kernel when ftracing It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] CC: [v3.4+] Signed-off-by: Michael Ellerman Signed-off-by: Benjamin Herrenschmidt --- arch/powerpc/include/asm/exception-64s.h | 2 +- arch/powerpc/kernel/process.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) (limited to 'arch/powerpc') diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 8e5fae8beaf..46793b58a76 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -513,7 +513,7 @@ label##_common: \ */ #define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr) \ EXCEPTION_COMMON(trap, label, hdlr, ret_from_except_lite, \ - FINISH_NAP;RUNLATCH_ON;DISABLE_INTS) + FINISH_NAP;DISABLE_INTS;RUNLATCH_ON) /* * When the idle code in power4_idle puts the CPU into NAP mode, diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index b0f3e3f77e7..076d1242507 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1369,7 +1369,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack) #ifdef CONFIG_PPC64 /* Called with hard IRQs off */ -void __ppc64_runlatch_on(void) +void notrace __ppc64_runlatch_on(void) { struct thread_info *ti = current_thread_info(); unsigned long ctrl; @@ -1382,7 +1382,7 @@ void __ppc64_runlatch_on(void) } /* Called with hard IRQs off */ -void __ppc64_runlatch_off(void) +void notrace __ppc64_runlatch_off(void) { struct thread_info *ti = current_thread_info(); unsigned long ctrl; -- cgit v1.2.3 From bf593907f7236e95698a76b7c7a2bbf8b1165327 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 14 Jun 2013 20:07:41 +1000 Subject: powerpc: Fix emulation of illegal instructions on PowerNV platform Normally, the kernel emulates a few instructions that are unimplemented on some processors (e.g. the old dcba instruction), or privileged (e.g. mfpvr). The emulation of unimplemented instructions is currently not working on the PowerNV platform. The reason is that on these machines, unimplemented and illegal instructions cause a hypervisor emulation assist interrupt, rather than a program interrupt as on older CPUs. Our vector for the emulation assist interrupt just calls program_check_exception() directly, without setting the bit in SRR1 that indicates an illegal instruction interrupt. This fixes it by making the emulation assist interrupt set that bit before calling program_check_interrupt(). With this, old programs that use no-longer implemented instructions such as dcba now work again. CC: Signed-off-by: Paul Mackerras Signed-off-by: Benjamin Herrenschmidt --- arch/powerpc/kernel/exceptions-64s.S | 2 +- arch/powerpc/kernel/traps.c | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) (limited to 'arch/powerpc') diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index e783453f910..40e4a17c8ba 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -683,7 +683,7 @@ machine_check_common: STD_EXCEPTION_COMMON(0xb00, trap_0b, .unknown_exception) STD_EXCEPTION_COMMON(0xd00, single_step, .single_step_exception) STD_EXCEPTION_COMMON(0xe00, trap_0e, .unknown_exception) - STD_EXCEPTION_COMMON(0xe40, emulation_assist, .program_check_exception) + STD_EXCEPTION_COMMON(0xe40, emulation_assist, .emulation_assist_interrupt) STD_EXCEPTION_COMMON(0xe60, hmi_exception, .unknown_exception) #ifdef CONFIG_PPC_DOORBELL STD_EXCEPTION_COMMON_ASYNC(0xe80, h_doorbell, .doorbell_exception) diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index f18c79c324e..c0e5caf8ccc 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -1165,6 +1165,16 @@ bail: exception_exit(prev_state); } +/* + * This occurs when running in hypervisor mode on POWER6 or later + * and an illegal instruction is encountered. + */ +void __kprobes emulation_assist_interrupt(struct pt_regs *regs) +{ + regs->msr |= REASON_ILLEGAL; + program_check_exception(regs); +} + void alignment_exception(struct pt_regs *regs) { enum ctx_state prev_state = exception_enter(); -- cgit v1.2.3 From 230b3034793247f61e6a0b08c44cf415f6d92981 Mon Sep 17 00:00:00 2001 From: Benjamin Herrenschmidt Date: Sat, 15 Jun 2013 12:13:40 +1000 Subject: powerpc: Fix missing/delayed calls to irq_work When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. CC: [v3.4+] Signed-off-by: Benjamin Herrenschmidt Tested-by: Steven Rostedt --- arch/powerpc/kernel/irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'arch/powerpc') diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 5cbcf4d5a80..ea185e0b3ca 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void) * in case we also had a rollover while hard disabled */ local_paca->irq_happened &= ~PACA_IRQ_DEC; - if (decrementer_check_overflow()) + if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow()) return 0x900; /* Finally check if an external interrupt happened */ -- cgit v1.2.3