aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2014-06-18arm64: fix pud_huge() for 2-level pagetablesMark Salter
commit 4797ec2dc83a43be35bad56037d1b53db9e2b5d5 upstream. The following happens when trying to run a kvm guest on a kernel configured for 64k pages. This doesn't happen with 4k pages: BUG: failure at include/linux/mm.h:297/put_page_testzero()! Kernel panic - not syncing: BUG! CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1 Call trace: [<fffffe0000096034>] dump_backtrace+0x0/0x16c [<fffffe00000961b4>] show_stack+0x14/0x1c [<fffffe000066e648>] dump_stack+0x84/0xb0 [<fffffe0000668678>] panic+0xf4/0x220 [<fffffe000018ec78>] free_reserved_area+0x0/0x110 [<fffffe000018edd8>] free_pages+0x50/0x88 [<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40 [<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44 [<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184 [<fffffe00000a1938>] kvm_vm_release+0x10/0x1c [<fffffe00001edc1c>] __fput+0xb0/0x288 [<fffffe00001ede4c>] ____fput+0xc/0x14 [<fffffe00000d5a2c>] task_work_run+0xa8/0x11c [<fffffe0000095c14>] do_notify_resume+0x54/0x58 In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page() on the stage2 pgd which leads to the BUG in put_page_testzero(). This happens because a pud_huge() test in unmap_range() returns true when it should always be false with 2-level pages tables used by 64k pages. This patch removes support for huge puds if 2-level pagetables are being used. Signed-off-by: Mark Salter <msalter@redhat.com> [catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18x86, mm, hugetlb: Add missing TLB page invalidation for hugetlb_cow()Anthony Iliopoulos
commit 9844f5462392b53824e8b86726e7c33b5ecbb676 upstream. The invalidation is required in order to maintain proper semantics under CoW conditions. In scenarios where a process clones several threads, a thread operating on a core whose DTLB entry for a particular hugepage has not been invalidated, will be reading from the hugepage that belongs to the forked child process, even after hugetlb_cow(). The thread will not see the updated page as long as the stale DTLB entry remains cached, the thread attempts to write into the page, the child process exits, or the thread gets migrated to a different processor. Signed-off-by: Anthony Iliopoulos <anthony.iliopoulos@huawei.com> Link: http://lkml.kernel.org/r/20140514092948.GA17391@server-36.huawei.corp Suggested-by: Shay Goikhman <shay.goikhman@huawei.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18genirq: Allow forcing cpu affinity of interruptsThomas Gleixner
commit 01f8fa4f01d8362358eb90e412bd7ae18a3ec1ad upstream. The current implementation of irq_set_affinity() refuses rightfully to route an interrupt to an offline cpu. But there is a special case, where this is actually desired. Some of the ARM SoCs have per cpu timers which require setting the affinity during cpu startup where the cpu is not yet in the online mask. If we can't do that, then the local timer interrupt for the about to become online cpu is routed to some random online cpu. The developers of the affected machines tried to work around that issue, but that results in a massive mess in that timer code. We have a yet unused argument in the set_affinity callbacks of the irq chips, which I added back then for a similar reason. It was never required so it got not used. But I'm happy that I never removed it. That allows us to implement a sane handling of the above scenario. So the affected SoC drivers can add the required force handling to their interrupt chip, switch the timer code to irq_force_affinity() and things just work. This does not affect any existing user of irq_set_affinity(). Tagged for stable to allow a simple fix of the affected SoC clock event drivers. Reported-and-tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Tomasz Figa <t.figa@samsung.com>, Cc: Daniel Lezcano <daniel.lezcano@linaro.org>, Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: linux-arm-kernel@lists.infradead.org, Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18mips: dts: Fix missing device_type="memory" property in memory nodesLeif Lindholm
commit dfc44f8030653b345fc6fb337558c3a07536823f upstream. A few platforms lack a 'device_type = "memory"' for their memory nodes, relying on an old ppc quirk in order to discover its memory. Add the missing data so that all parsing code can find memory nodes correctly. Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: John Crispin <blogic@openwrt.org> Signed-off-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18MIPS/loongson2_cpufreq: Fix CPU clock rate settingAaro Koskinen
commit 8e8acb32960f42c81b1d50deac56a2c07bb6a18a upstream. Loongson2 has been using (incorrectly) kHz for cpu_clk rate. This has been unnoticed, as loongson2_cpufreq was the only place where the rate was set/get. After commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 (cpufreq: introduce cpufreq_generic_get() routine) things however broke, and now loops_per_jiffy adjustments are incorrect (1000 times too long). The patch fixes this by changing cpu_clk rate to Hz. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: cpufreq@vger.kernel.org Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Patchwork: https://patchwork.linux-mips.org/patch/6678/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18arm64: Add big.LITTLE switcher stubMark Brown
The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even without IKS support since it implements support for clusters with shared clocks (a common big.LITTLE configuration). In order to allow it to be built provide the non-IKS stubs for arm64, enabling cpufreq with all the cores available. It may make sense to make an asm-generic version of these stubs instead but given that there's only likely to be these two architectures using the code and asm-generic stubs also need per architecture updates it's probably more trouble than it's worth. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit LSK 1eb3e5188edfffca2e5ed9e28e77e4ea4036ea45) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: Enable OPPMark Brown
Upstream OPP has been converted into a selectable symbol by commit 049d595a4db3b (PM / OPP: Make OPP invisible to users in Kconfig) however for v3.10 it is less invasive to follow the practice in v3.10. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 4c9563dfc768f176de0c9cdf398ec8a108f374d0) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/Kconfig
2014-06-18arm64: Enable HMP for ARMv8Mark Hambleton
Signed-off-by: Mark Hambleton <mahamble@broadcom.com> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit 8ecd48091c61213762407f5ce02d3fa4ede66402) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/kernel/topology.c
2014-06-18arm64: Add scheduler multicore and SMT Kconfig optionsMark Brown
Enable additional use of additional scheduler features with the topology information. Reported-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit d474d10a1a827e0b3402943a62a97d95c0eabf8d) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: ftrace: (bugfix) synced with ftcace interface changeAKASHI Takahiro
ftrace_init() failed since ftrace_dyn_arch_init() doesn't initialize the argument to null. This bug comes in only if arm64 ftrace is back-ported as ftrace_dyn_arch_init() interface has been changed in the merge window of 3.15. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 0cc5286fc3ca12ec0d388e4d8c08a66b40d52233) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: Add __ASSEMBLY__ guards to insn.hMark Brown
Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit d10fbc12d0c4c6e872de2a342b88b896e430cea3) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: ftrace: Add CALLER_ADDRx macrosAKASHI Takahiro
CALLER_ADDRx returns caller's address at specified level in call stacks. They are used for several tracers like irqsoff and preemptoff. Strange to say, however, they are refered even without FTRACE. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit ea76465b82b819230ab16ab8ab1b460c09ed5b3b) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/kernel/Makefile
2014-06-18arm64: ftrace: Add dynamic ftrace supportAKASHI Takahiro
This patch allows "dynamic ftrace" if CONFIG_DYNAMIC_FTRACE is enabled. Here we can turn on and off tracing dynamically per-function base. On arm64, this is done by patching single branch instruction to _mcount() inserted by gcc -pg option. The branch is replaced to NOP initially at kernel start up, and later on, NOP to branch to ftrace_caller() when enabled or branch to NOP when disabled. Please note that ftrace_caller() is a counterpart of _mcount() in case of 'static' ftrace. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 95d7e5533398803e3ced340771cb3461c9443407) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/Kconfig
2014-06-18arm64: Add ftrace supportAKASHI Takahiro
This patch implements arm64 specific part to support function tracers, such as function (CONFIG_FUNCTION_TRACER), function_graph (CONFIG_FUNCTION_GRAPH_TRACER) and function profiler (CONFIG_FUNCTION_PROFILER). With 'function' tracer, all the functions in the kernel are traced with timestamps in ${sysfs}/tracing/trace. If function_graph tracer is specified, call graph is generated. The kernel must be compiled with -pg option so that _mcount() is inserted at the beginning of functions. This function is called on every function's entry as long as tracing is enabled. In addition, function_graph tracer also needs to be able to probe function's exit. ftrace_graph_caller() & return_to_handler do this by faking link register's value to intercept function's return path. More details on architecture specific requirements are described in Documentation/trace/ftrace-design.txt. Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit fc0c93936ae80bdffac7499a820d7e5105ef44d5) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18ftrace: Add arm64 support to recordmcountAKASHI Takahiro
Recordmcount utility under scripts is run, after compiling each object, to find out all the locations of calling _mcount() and put them into specific seciton named __mcount_loc. Then linker collects all such information into a table in the kernel image (between __start_mcount_loc and __stop_mcount_loc) for later use by ftrace. This patch adds arm64 specific definitions to identify such locations. There are two types of implementation, C and Perl. On arm64, only C version is used to build the kernel now that CONFIG_HAVE_C_RECORDMCOUNT is on. But Perl version is also maintained. This patch also contains a workaround just in case where a header file, elf.h, on host machine doesn't have definitions of EM_AARCH64 nor R_AARCH64_ABS64. Without them, compiling C version of recordmcount will fail. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 6470430d6d73aa46e00cc28ee732b292538419c9) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/Kconfig
2014-06-18arm64: Add 'notrace' attribute to unwind_frame() for ftraceAKASHI Takahiro
walk_stackframe() calls unwind_frame(), and if walk_stackframe() is "notrace", unwind_frame() should be also "notrace". Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 8370369c5c33b9d573575249f080a905323d7eb4) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18ftrace: make CALLER_ADDRx macros more genericAKASHI Takahiro
Most archs with HAVE_ARCH_CALLER_ADDR have the almost same definitions of CALLER_ADDRx(n), and so put them into linux/ftrace.h. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit f2f646bdd599cb551bfeb8f1934a9e14ccfeac84) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/xtensa/include/asm/ftrace.h
2014-06-18arm64: is_compat_task is defined both in asm/compat.h and linux/compat.hAKASHI Takahiro
Some kernel files may include both linux/compat.h and asm/compat.h directly or indirectly. Since both header files contain is_compat_task() under !CONFIG_COMPAT, compiling them with !CONFIG_COMPAT will eventually fail. Such files include kernel/auditsc.c, kernel/seccomp.c and init/do_mountfs.c (do_mountfs.c may read asm/compat.h via asm/ftrace.h once ftrace is implemented). So this patch proactively 1) removes is_compat_task() under !CONFIG_COMPAT from asm/compat.h 2) replaces asm/compat.h to linux/compat.h in kernel/*.c, but asm/compat.h is still necessary in ptrace.c and process.c because they use is_compat_thread(). Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit fd92d4a54a069953b4679958121317f2a25389cd) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: Add regs_return_value() in syscall.hAKASHI Takahiro
This macro, regs_return_value, is used mainly for audit to record system call's results, but may also be used in test_kprobes.c. Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit d34a3ebd8d25cf691a94fae66a957a480cf46430) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: split syscall_trace() into separate functions for enter/exitAKASHI Takahiro
As done in arm, this change makes it easy to confirm we invoke syscall related hooks, including syscall tracepoint, audit and seccomp which would be implemented later, in correct order. That is, undoing operations in the opposite order on exit that they were done on entry. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 3157858feff89196635b01495d5ec9ebe206639e) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: make a single hook to syscall_trace() for all syscall featuresAKASHI Takahiro
Currently syscall_trace() is called only for ptrace. With additional TIF_xx flags defined, it is now called in all the cases of audit, ftrace and seccomp in addition to ptrace. Acked-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 449f81a4da4d99980064943d504bb19d07e86aec) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: arch/arm64/kernel/entry.S
2014-06-18arm64: topology: Provide relative power numbers for coresMark Brown
Provide performance numbers to the scheduler to help it fill the cores in the system on big.LITTLE systems. With the current scheduler this may perform poorly for applications that try to do OpenMP style work over all cores but should help for more common workloads. The current 32 bit ARM implementation provides a similar estimate so this helps ensure that work to improve big.LITTLE systems on ARMv7 systems performs similarly on ARMv8 systems. The power numbers are the same as for ARMv7 since it seems that the expected differential between the big and little cores is very similar on both ARMv7 and ARMv8. In both ARMv7 and ARMv8 cases the numbers were based on the published DMIPS numbers. These numbers are just an initial and basic approximation for use with the current scheduler, it is likely that both experience with silicon and ongoing work on improving the scheduler will lead to further tuning or will tune automatically at runtime and so make the specific choice of numbers here less critical. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit a84034fddb11f30849dd7ce050689d615995c0d2) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: topology: Tell the scheduler about the relative power of coresMark Brown
In heterogeneous systems like big.LITTLE systems the scheduler will be able to make better use of the available cores if we provide power numbers to it indicating their relative performance. Do this by parsing the CPU nodes in the DT. This code currently has no effect as no information on the relative performance of the cores is provided. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from LSK commit 474bddad45eeb9457f5fae2a0cc133372a61a64c) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: mm: fix pmd_write CoW brokennessWill Deacon
Commit 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents") changed the pmd manipulator and accessor functions to convert the target pmd to a pte, process it with the pte functions, then convert it back. Along the way, we gained support for PTE_WRITE, however this is completely ignored by set_pmd_at, and so we fail to set the PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like CoW not working). Partially reverting the offending commit (by making use of PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions) leads to further issues because pmd_write can then return potentially incorrect values for page table entries marked as RDONLY, leading to BUG_ON(pmd_write(entry)) tripping under some THP workloads. This patch fixes the issue by routing set_pmd_at through set_pte_at, which correctly takes the PTE_WRITE flag into account. Given that THP mappings are always anonymous, the additional cache-flushing code in __sync_icache_dcache won't impose any significant overhead as the flush will be skipped. Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> (cherry picked from commit ceb218359de22e70980801d4fa04fffbfc44adb8) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18net: filter: s390: fix JIT address randomizationHeiko Carstens
[ Upstream commit e84d2f8d2ae33c8215429824e1ecf24cbca9645e ] This is the s390 variant of Alexei's JIT bug fix. (patch description below stolen from Alexei's patch) bpf_alloc_binary() adds 128 bytes of room to JITed program image and rounds it up to the nearest page size. If image size is close to page size (like 4000), it is rounded to two pages: round_up(4000 + 4 + 128) == 8192 then 'hole' is computed as 8192 - (4000 + 4) = 4188 If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header) then kernel will crash during bpf_jit_free(): kernel BUG at arch/x86/mm/pageattr.c:887! Call Trace: [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 [<ffffffff8106c90c>] worker_thread+0x11c/0x370 since bpf_jit_free() does: unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; struct bpf_binary_header *header = (void *)addr; to compute start address of 'bpf_binary_header' and header->pages will pass junk to: set_memory_rw(addr, header->pages); Fix it by making sure that &header->image[prandom_u32() % hole] and &header are in the same page. Fixes: aa2d2c73c21f2 ("s390/bpf,jit: address randomize and write protect jit code") Reported-by: Alexei Starovoitov <ast@plumgrid.com> Cc: <stable@vger.kernel.org> # v3.11+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18net: filter: x86: fix JIT address randomizationAlexei Starovoitov
[ Upstream commit 773cd38f40b8834be991dbfed36683acc1dd41ee ] bpf_alloc_binary() adds 128 bytes of room to JITed program image and rounds it up to the nearest page size. If image size is close to page size (like 4000), it is rounded to two pages: round_up(4000 + 4 + 128) == 8192 then 'hole' is computed as 8192 - (4000 + 4) = 4188 If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header) then kernel will crash during bpf_jit_free(): kernel BUG at arch/x86/mm/pageattr.c:887! Call Trace: [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 [<ffffffff8106c90c>] worker_thread+0x11c/0x370 since bpf_jit_free() does: unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; struct bpf_binary_header *header = (void *)addr; to compute start address of 'bpf_binary_header' and header->pages will pass junk to: set_memory_rw(addr, header->pages); Fix it by making sure that &header->image[prandom_u32() % hole] and &header are in the same page Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18xen/spinlock: Don't enable them unconditionally.Konrad Rzeszutek Wilk
commit e0fc17a936334c08b2729fff87168c03fdecf5b6 upstream. The git commit a945928ea2709bc0e8e8165d33aed855a0110279 ('xen: Do not enable spinlocks before jump_label_init() has executed') was added to deal with the jump machinery. Earlier the code that turned on the jump label was only called by Xen specific functions. But now that it had been moved to the initcall machinery it gets called on Xen, KVM, and baremetal - ouch!. And the detection machinery to only call it on Xen wasn't remembered in the heat of merge window excitement. This means that the slowpath is enabled on baremetal while it should not be. Reported-by: Waiman Long <waiman.long@hp.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18parisc: remove _STK_LIM_MAX overrideJohn David Anglin
commit e0d8898d76a785453bfaf6cd08b830a7d5189f78 upstream. There are only a couple of architectures that override _STK_LIM_MAX to a non-infinity value. This changes the stack allocation semantics in subtle ways. For example, GNU make changes its stack allocation to the hard maximum defined by _STK_LIM_MAX. As a results, threads executed by processes running under make are allocated a stack size of _STK_LIM_MAX rather than a sensible default value. This causes various thread stress tests to fail when they can't muster more than about 50 threads. The attached change implements the default behavior used by the majority of architectures. Signed-off-by: John David Anglin <dave.anglin@bell.net> Reviewed-by: Carlos O'Donell <carlos@systemhalted.org> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18parisc: fix epoll_pwait syscall on compat kernelHelge Deller
commit ab3e55b119c9653b19ea4edffb86f04db867ac98 upstream. This bug was detected with the libio-epoll-perl debian package where the test case IO-Ppoll-compat.t failed. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18parisc: change value of SHMLBA from 0x00400000 to PAGE_SIZEHelge Deller
commit 0ef36bd2b37815719e31a72d2beecc28ca8ecd26 upstream. On parisc, SHMLBA was defined to 0x00400000 (4MB) to reflect that we need to take care of our caches for shared mappings. But actually, we can map a file at any multiple address of PAGE_SIZE, so let us correct that now with a value of PAGE_SIZE for SHMLBA. Instead we now take care of this cache colouring via the constant SHM_COLOUR while we map shared pages. Signed-off-by: Helge Deller <deller@gmx.de> CC: Jeroen Roovers <jer@gentoo.org> CC: John David Anglin <dave.anglin@bell.net> CC: Carlos O'Donell <carlos@systemhalted.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18x86,preempt: Fix preemption for i386Peter Zijlstra
Many people reported preemption/reschedule problems with i386 kernels for .13 and .14. After Michele bisected this to a combination of 3e8e42c69bb ("sched: Revert need_resched() to look at TIF_NEED_RESCHED") ded79754754 ("irq: Force hardirq exit's softirq processing on its own stack") it finally dawned on me that i386's current_thread_info() was to blame. When we are on interrupt/exception stacks, we fail to observe the right TIF_NEED_RESCHED bit and therefore the PREEMPT_NEED_RESCHED folding malfunctions. Current upstream fixes this by making i386 behave the same as x86_64 already did: 2432e1364bbe ("x86: Nuke the supervisor_stack field in i386 thread_info") b807902a88c4 ("x86: Nuke GET_THREAD_INFO_WITH_ESP() macro for i386") 0788aa6a23cb ("x86: Prepare removal of previous_esp from i386 thread_info structure") 198d208df437 ("x86: Keep thread_info on thread stack in x86_32") However, that is far too much to stuff into -stable. Therefore I propose we merge the below patch which uses task_thread_info(current) for tif_need_resched() instead of the ESP based current_thread_info(). This makes sure we always observe the one true TIF_NEED_RESCHED bit and things will work as expected again. Cc: bp@alien8.de Cc: fweisbec@gmail.com Cc: david.a.cohen@linux.intel.com Cc: mingo@kernel.org Cc: fweisbec@gmail.com Cc: greg@kroah.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: gregkh@linuxfoundation.org Cc: pbonzini@redhat.com Cc: rostedt@goodmis.org Cc: stefan.bader@canonical.com Cc: mingo@kernel.org Cc: toralf.foerster@gmx.de Cc: David Cohen <david.a.cohen@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: torvalds@linux-foundation.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: David Cohen <david.a.cohen@linux.intel.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: <stable-commits@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: peterz@infradead.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: barra_cuda@katamail.com Tested-by: Stefan Bader <stefan.bader@canonical.com> Tested-by: Toralf F¿rster <toralf.foerster@gmx.de> Tested-by: Michele Ballabio <barra_cuda@katamail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140409142447.GD13658@twins.programming.kicks-ass.net
2014-06-18KVM: x86: remove WARN_ON from get_kernel_ns()Marcelo Tosatti
commit b351c39cc9e0151cee9b8d52a1e714928faabb38 upstream. Function and callers can be preempted. https://bugzilla.kernel.org/show_bug.cgi?id=73721 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18arm64: topology: Add support for topology DT bindingsMark Brown
Add support for parsing the explicit topology bindings to discover the topology of the system. Since it is not currently clear how to map multi-level clusters for the scheduler all leaf clusters are presented to the scheduler at the same level. This should be enough to provide good support for current systems. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> (cherry picked from commit 3252efc39608be2aac69c184559b9ae168003284) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: topology: Initialise default topology state immediatelyMark Brown
As a legacy of the way 32 bit ARM did things the topology code uses a null topology map by default and then overwrites it by mapping cores with no information to a cluster by themselves later. In order to make it simpler to reset things as part of recovering from parse failures in firmware information directly set this configuration on init. A core will always be its own sibling so there should be no risk of confusion with firmware provided information. Signed-off-by: Mark Brown <broonie@linaro.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> (cherry picked from commit 8759b2d0f8067d726c269602ffe310221437ce5e) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: debug: remove noisy, pointless warningWill Deacon
Sending a SIGTRAP to a user task after execution of a BRK instruction at EL0 is fundamental to the way in which software breakpoints work and doesn't deserve a warning to be logged in dmesg. Whilst the warning can be justified from EL1, do_debug_exception will already do the right thing, so simply remove the code altogether. Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org> Reported-by: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 43683afbcb32f7b7318ac1badd6469d91fe22711) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: Use bus notifiers to set per-device coherent DMA opsCatalin Marinas
Recently, the default DMA ops have been changed to non-coherent for alignment with 32-bit ARM platforms (and DT files). This patch adds bus notifiers to be able to set the coherent DMA ops (with no cache maintenance) for devices explicitly marked as coherent via the "dma-coherent" DT property. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 6ecba8eb51b7d23fda66388a5420be7d8688b186) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: dtbs: add VGA panel descriptionRyan Harkin
Signed-off-by: Ryan Harkin <ryan.harkin@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit 4cb9670f141f980afa9e68d1f44023976a166f0b) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: dts: Add panel for ARMv8 4xA53 4xA57 FVPMark Brown
This is not in mainline since the panel bindings have not yet been merged into mainline. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit 94eb302681ca87d0078ffb88dfc6d0e4a9b1141d) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18arm64: dts: Add CPU topology properties for ARMv8 4xA53 4xA57 FVPMark Brown
This provides for discovery of the clusters using the CPU topology bindings, it is added as a separate commit to aid testing of the code for systems without explicit topology bindings. Signed-off-by: Mark Brown <broonie@linaro.org> (cherry picked from commit d09f1f75c5e8f3b930cc3803af50c694eb903849) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2014-06-18x86-64, build: Fix stack protector Makefile breakage with 32-bit userlandGeorge Spelvin
commit 14262d67fe348018af368a07430fbc06eadeabb1 upstream. If you are using a 64-bit kernel with 32-bit userland, then scripts/gcc-x86_64-has-stack-protector.sh invokes 32-bit gcc with -mcmodel=kernel, which produces: <stdin>:1:0: error: code model 'kernel' not supported in the 32 bit mode and trips the "broken compiler" test at arch/x86/Makefile:120. There are several places a fix is possible, but the following seems cleanest. (But it's minimal; it would also be possible to factor out a bunch of stuff from the two branches of the if.) Signed-off-by: George Spelvin <linux@horizon.com> Link: http://lkml.kernel.org/r/20140507210552.7581.qmail@ns.horizon.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safeVineet Gupta
commit 8aa9e85adac609588eeec356e5a85059b3b819ba upstream. There was a very small race window where resume to kernel mode from a Exception Path (or pure kernel mode which is true for most of ARC exceptions anyways), was not disabling interrupts in restore_regs, clobbering the exception regs Anton found the culprit call flow (after many sleepless nights) | 1. we got a Trap from user land | 2. started to service it. | 3. While doing some stuff on user-land memory (I think it is padzero()), | we got a DataTlbMiss | 4. On return from it we are taking "resume_kernel_mode" path | 5. NEED_RESHED is not set, so we go to "return from exception" path in | restore regs. | 6. there seems to be IRQ happening Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Anton Kolesov <Anton.Kolesov@synopsys.com> Cc: Francois Bedard <Francois.Bedard@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ARC: Remove ARC_HAS_COH_RTSCRichard Weinberger
commit d345ea2892ae7a2b70f84cf881c20731e43e4993 upstream. The symbol is an orphan, get rid of it. Fixes: 7d0857a54aed ("ARC: [SMP] Disallow RTSC") Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSHMartin Schwidefsky
commit 6e0de817594c61f3b392a9245deeb09609ec707d upstream. The A register needs to be initialized to zero in the prolog if the first instruction of the BPF program is BPF_S_LDX_B_MSH to prevent leaking the content of %r5 to user space. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18powerpc: Fix Oops in rtas_stop_self()Li Zhong
commit 4fb8d027dca0236c811272d342cf185569d91311 upstream. commit 41dd03a9 may cause Oops in rtas_stop_self(). The reason is that the rtas_args was moved into stack space. For a box with more that 4GB RAM, the stack could easily be outside 32bit range, but RTAS is 32bit. So the patch moves rtas_args away from stack by adding static before it. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18powerpc/tm: Disable IRQ in tm_recheckpointMichael Neuling
commit e6b8fd028b584ffca7a7255b8971f254932c9fce upstream. We can't take an IRQ when we're about to do a trechkpt as our GPR state is set to user GPR values. We've hit this when running some IBM Java stress tests in the lab resulting in the following dump: cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40] pc: c000000000050074: restore_gprs+0xc0/0x148 lr: 00000000b52a8184 sp: ac57d360 msr: 8000000100201030 current = 0xc00000002c500000 paca = 0xc000000007dbfc00 softe: 0 irq_happened: 0x00 pid = 34535, comm = Pooled Thread # R00 = 00000000b52a8184 R16 = 00000000b3e48fda R01 = 00000000ac57d360 R17 = 00000000ade79bd8 R02 = 00000000ac586930 R18 = 000000000fac9bcc R03 = 00000000ade60000 R19 = 00000000ac57f930 R04 = 00000000f6624918 R20 = 00000000ade79be8 R05 = 00000000f663f238 R21 = 00000000ac218a54 R06 = 0000000000000002 R22 = 000000000f956280 R07 = 0000000000000008 R23 = 000000000000007e R08 = 000000000000000a R24 = 000000000000000c R09 = 00000000b6e69160 R25 = 00000000b424cf00 R10 = 0000000000000181 R26 = 00000000f66256d4 R11 = 000000000f365ec0 R27 = 00000000b6fdcdd0 R12 = 00000000f66400f0 R28 = 0000000000000001 R13 = 00000000ada71900 R29 = 00000000ade5a300 R14 = 00000000ac2185a8 R30 = 00000000f663f238 R15 = 0000000000000004 R31 = 00000000f6624918 pc = c000000000050074 restore_gprs+0xc0/0x148 cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4 lr = 00000000b52a8184 msr = 8000000100201030 cr = 24804888 ctr = 0000000000000000 xer = 0000000000000000 trap = 700 This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into that function. It then adds IRQ disabling over the trechkpt critical section. It also sets the TEXASR FS in the signals code to ensure this is never set now that we explictly write the TM sprs in tm_recheckpoint. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18powerpc/compat: 32-bit little endian machine name is ppcle, not ppcAnton Blanchard
commit 422b9b9684db3c511e65c91842275c43f5910ae9 upstream. I noticed this when testing setarch. No, we don't magically support a big endian userspace on a little endian kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume()Huacai Chen
commit c14af233fbe279d0e561ecf84f1208b1bae087ef upstream. The original MIPS hibernate code flushes cache and TLB entries in swsusp_arch_resume(). But they are removed in Commit 44eeab67416711 (MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A cross- CPU flush is surely unnecessary because all but the local CPU have already been disabled. But a local flush (at least the TLB flush) is needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is very easy to produce a kernel panic (kernel page fault, or unaligned access). The root cause is E1000E driver use vzalloc_node() to allocate pages, the stale TLB entries of the booting kernel will be misused by the resumed target kernel. Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Patchwork: https://patchwork.linux-mips.org/patch/6643/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=nAnton Blanchard
commit 7505258c5fcb0a1cc3c76a47b4cf9506d21d10e6 upstream. I noticed KVM is broken when KVM in-kernel XICS emulation (CONFIG_KVM_XICS) is disabled. The problem was introduced in 48eaef05 (KVM: PPC: Book3S HV: use xics_wake_cpu only when defined). It used CONFIG_KVM_XICS to wrap xics_wake_cpu, where CONFIG_PPC_ICP_NATIVE should have been used. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18MIPS: KVM: Pass reserved instruction exceptions to guestJames Hogan
commit 15505679362270d02c449626385cb74af8905514 upstream. Previously a reserved instruction exception while in guest code would cause a KVM internal error if kvm_mips_handle_ri() didn't recognise the instruction (including a RDHWR from an unrecognised hardware register). However the guest OS should really have the opportunity to catch the exception so that it can take the appropriate actions such as sending a SIGILL to the guest user process or emulating the instruction itself. Therefore in these cases emulate a guest RI exception and only return EMULATE_FAIL if that fails, being careful to revert the PC first in case the exception occurred in a branch delay slot in which case the PC will already point to the branch target. Also turn the printk messages relating to these cases into kvm_debug messages so that they aren't usually visible. This allows crashme to run in the guest without killing the entire VM. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18arm: KVM: fix possible misalignment of PGDs and bounce pageMark Salter
commit 5d4e08c45a6cf8f1ab3c7fa375007635ac569165 upstream. The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate a bounce page (if hypervisor init code crosses page boundary) and hypervisor PGDs. The problem is that kalloc() does not guarantee the proper alignment. In the case of the bounce page, the page sized buffer allocated may also cross a page boundary negating the purpose and leading to a hang during kvm initialization. Likewise the PGDs allocated may not meet the minimum alignment requirements of the underlying MMU. This patch uses __get_free_page() to guarantee the worst case alignment needs of the bounce page and PGDs on both arm and arm64. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>