Age | Commit message (Collapse) | Author |
|
commit 4797ec2dc83a43be35bad56037d1b53db9e2b5d5 upstream.
The following happens when trying to run a kvm guest on a kernel
configured for 64k pages. This doesn't happen with 4k pages:
BUG: failure at include/linux/mm.h:297/put_page_testzero()!
Kernel panic - not syncing: BUG!
CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF 3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1
Call trace:
[<fffffe0000096034>] dump_backtrace+0x0/0x16c
[<fffffe00000961b4>] show_stack+0x14/0x1c
[<fffffe000066e648>] dump_stack+0x84/0xb0
[<fffffe0000668678>] panic+0xf4/0x220
[<fffffe000018ec78>] free_reserved_area+0x0/0x110
[<fffffe000018edd8>] free_pages+0x50/0x88
[<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40
[<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44
[<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184
[<fffffe00000a1938>] kvm_vm_release+0x10/0x1c
[<fffffe00001edc1c>] __fput+0xb0/0x288
[<fffffe00001ede4c>] ____fput+0xc/0x14
[<fffffe00000d5a2c>] task_work_run+0xa8/0x11c
[<fffffe0000095c14>] do_notify_resume+0x54/0x58
In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page()
on the stage2 pgd which leads to the BUG in put_page_testzero(). This
happens because a pud_huge() test in unmap_range() returns true when it
should always be false with 2-level pages tables used by 64k pages.
This patch removes support for huge puds if 2-level pagetables are
being used.
Signed-off-by: Mark Salter <msalter@redhat.com>
[catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9844f5462392b53824e8b86726e7c33b5ecbb676 upstream.
The invalidation is required in order to maintain proper semantics
under CoW conditions. In scenarios where a process clones several
threads, a thread operating on a core whose DTLB entry for a
particular hugepage has not been invalidated, will be reading from
the hugepage that belongs to the forked child process, even after
hugetlb_cow().
The thread will not see the updated page as long as the stale DTLB
entry remains cached, the thread attempts to write into the page,
the child process exits, or the thread gets migrated to a different
processor.
Signed-off-by: Anthony Iliopoulos <anthony.iliopoulos@huawei.com>
Link: http://lkml.kernel.org/r/20140514092948.GA17391@server-36.huawei.corp
Suggested-by: Shay Goikhman <shay.goikhman@huawei.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 01f8fa4f01d8362358eb90e412bd7ae18a3ec1ad upstream.
The current implementation of irq_set_affinity() refuses rightfully to
route an interrupt to an offline cpu.
But there is a special case, where this is actually desired. Some of
the ARM SoCs have per cpu timers which require setting the affinity
during cpu startup where the cpu is not yet in the online mask.
If we can't do that, then the local timer interrupt for the about to
become online cpu is routed to some random online cpu.
The developers of the affected machines tried to work around that
issue, but that results in a massive mess in that timer code.
We have a yet unused argument in the set_affinity callbacks of the irq
chips, which I added back then for a similar reason. It was never
required so it got not used. But I'm happy that I never removed it.
That allows us to implement a sane handling of the above scenario. So
the affected SoC drivers can add the required force handling to their
interrupt chip, switch the timer code to irq_force_affinity() and
things just work.
This does not affect any existing user of irq_set_affinity().
Tagged for stable to allow a simple fix of the affected SoC clock
event drivers.
Reported-and-tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Tomasz Figa <t.figa@samsung.com>,
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>,
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: linux-arm-kernel@lists.infradead.org,
Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dfc44f8030653b345fc6fb337558c3a07536823f upstream.
A few platforms lack a 'device_type = "memory"' for their memory
nodes, relying on an old ppc quirk in order to discover its memory.
Add the missing data so that all parsing code can find memory nodes
correctly.
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8e8acb32960f42c81b1d50deac56a2c07bb6a18a upstream.
Loongson2 has been using (incorrectly) kHz for cpu_clk rate. This has
been unnoticed, as loongson2_cpufreq was the only place where the rate
was set/get. After commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656
(cpufreq: introduce cpufreq_generic_get() routine) things however broke,
and now loops_per_jiffy adjustments are incorrect (1000 times too long).
The patch fixes this by changing cpu_clk rate to Hz.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: cpufreq@vger.kernel.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Patchwork: https://patchwork.linux-mips.org/patch/6678/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The big.LITTLE cpufreq driver is useful on arm64 big.LITTLE systems even
without IKS support since it implements support for clusters with shared
clocks (a common big.LITTLE configuration). In order to allow it to be
built provide the non-IKS stubs for arm64, enabling cpufreq with all the
cores available.
It may make sense to make an asm-generic version of these stubs instead but
given that there's only likely to be these two architectures using the code
and asm-generic stubs also need per architecture updates it's probably more
trouble than it's worth.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit LSK 1eb3e5188edfffca2e5ed9e28e77e4ea4036ea45)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Upstream OPP has been converted into a selectable symbol by commit
049d595a4db3b (PM / OPP: Make OPP invisible to users in Kconfig) however
for v3.10 it is less invasive to follow the practice in v3.10.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 4c9563dfc768f176de0c9cdf398ec8a108f374d0)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/Kconfig
|
|
Signed-off-by: Mark Hambleton <mahamble@broadcom.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit 8ecd48091c61213762407f5ce02d3fa4ede66402)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/kernel/topology.c
|
|
Enable additional use of additional scheduler features with the topology
information.
Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit d474d10a1a827e0b3402943a62a97d95c0eabf8d)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
ftrace_init() failed since ftrace_dyn_arch_init() doesn't initialize
the argument to null.
This bug comes in only if arm64 ftrace is back-ported as
ftrace_dyn_arch_init() interface has been changed in the merge window of
3.15.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 0cc5286fc3ca12ec0d388e4d8c08a66b40d52233)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit d10fbc12d0c4c6e872de2a342b88b896e430cea3)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
CALLER_ADDRx returns caller's address at specified level in call stacks.
They are used for several tracers like irqsoff and preemptoff.
Strange to say, however, they are refered even without FTRACE.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit ea76465b82b819230ab16ab8ab1b460c09ed5b3b)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/kernel/Makefile
|
|
This patch allows "dynamic ftrace" if CONFIG_DYNAMIC_FTRACE is enabled.
Here we can turn on and off tracing dynamically per-function base.
On arm64, this is done by patching single branch instruction to _mcount()
inserted by gcc -pg option. The branch is replaced to NOP initially at
kernel start up, and later on, NOP to branch to ftrace_caller() when
enabled or branch to NOP when disabled.
Please note that ftrace_caller() is a counterpart of _mcount() in case of
'static' ftrace.
More details on architecture specific requirements are described in
Documentation/trace/ftrace-design.txt.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 95d7e5533398803e3ced340771cb3461c9443407)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/Kconfig
|
|
This patch implements arm64 specific part to support function tracers,
such as function (CONFIG_FUNCTION_TRACER), function_graph
(CONFIG_FUNCTION_GRAPH_TRACER) and function profiler
(CONFIG_FUNCTION_PROFILER).
With 'function' tracer, all the functions in the kernel are traced with
timestamps in ${sysfs}/tracing/trace. If function_graph tracer is
specified, call graph is generated.
The kernel must be compiled with -pg option so that _mcount() is inserted
at the beginning of functions. This function is called on every function's
entry as long as tracing is enabled.
In addition, function_graph tracer also needs to be able to probe function's
exit. ftrace_graph_caller() & return_to_handler do this by faking link
register's value to intercept function's return path.
More details on architecture specific requirements are described in
Documentation/trace/ftrace-design.txt.
Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit fc0c93936ae80bdffac7499a820d7e5105ef44d5)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Recordmcount utility under scripts is run, after compiling each object,
to find out all the locations of calling _mcount() and put them into
specific seciton named __mcount_loc.
Then linker collects all such information into a table in the kernel image
(between __start_mcount_loc and __stop_mcount_loc) for later use by ftrace.
This patch adds arm64 specific definitions to identify such locations.
There are two types of implementation, C and Perl. On arm64, only C version
is used to build the kernel now that CONFIG_HAVE_C_RECORDMCOUNT is on.
But Perl version is also maintained.
This patch also contains a workaround just in case where a header file,
elf.h, on host machine doesn't have definitions of EM_AARCH64 nor
R_AARCH64_ABS64. Without them, compiling C version of recordmcount will
fail.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 6470430d6d73aa46e00cc28ee732b292538419c9)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/Kconfig
|
|
walk_stackframe() calls unwind_frame(), and if walk_stackframe() is
"notrace", unwind_frame() should be also "notrace".
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 8370369c5c33b9d573575249f080a905323d7eb4)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Most archs with HAVE_ARCH_CALLER_ADDR have the almost same definitions
of CALLER_ADDRx(n), and so put them into linux/ftrace.h.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit f2f646bdd599cb551bfeb8f1934a9e14ccfeac84)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/xtensa/include/asm/ftrace.h
|
|
Some kernel files may include both linux/compat.h and asm/compat.h directly
or indirectly. Since both header files contain is_compat_task() under
!CONFIG_COMPAT, compiling them with !CONFIG_COMPAT will eventually fail.
Such files include kernel/auditsc.c, kernel/seccomp.c and init/do_mountfs.c
(do_mountfs.c may read asm/compat.h via asm/ftrace.h once ftrace is
implemented).
So this patch proactively
1) removes is_compat_task() under !CONFIG_COMPAT from asm/compat.h
2) replaces asm/compat.h to linux/compat.h in kernel/*.c,
but asm/compat.h is still necessary in ptrace.c and process.c because
they use is_compat_thread().
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit fd92d4a54a069953b4679958121317f2a25389cd)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
This macro, regs_return_value, is used mainly for audit to record system
call's results, but may also be used in test_kprobes.c.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit d34a3ebd8d25cf691a94fae66a957a480cf46430)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
As done in arm, this change makes it easy to confirm we invoke syscall
related hooks, including syscall tracepoint, audit and seccomp which would
be implemented later, in correct order. That is, undoing operations in the
opposite order on exit that they were done on entry.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 3157858feff89196635b01495d5ec9ebe206639e)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Currently syscall_trace() is called only for ptrace.
With additional TIF_xx flags defined, it is now called in all the cases
of audit, ftrace and seccomp in addition to ptrace.
Acked-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 449f81a4da4d99980064943d504bb19d07e86aec)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Conflicts:
arch/arm64/kernel/entry.S
|
|
Provide performance numbers to the scheduler to help it fill the cores in
the system on big.LITTLE systems. With the current scheduler this may
perform poorly for applications that try to do OpenMP style work over all
cores but should help for more common workloads. The current 32 bit ARM
implementation provides a similar estimate so this helps ensure that
work to improve big.LITTLE systems on ARMv7 systems performs similarly
on ARMv8 systems.
The power numbers are the same as for ARMv7 since it seems that the
expected differential between the big and little cores is very similar on
both ARMv7 and ARMv8. In both ARMv7 and ARMv8 cases the numbers were
based on the published DMIPS numbers.
These numbers are just an initial and basic approximation for use with
the current scheduler, it is likely that both experience with silicon
and ongoing work on improving the scheduler will lead to further tuning
or will tune automatically at runtime and so make the specific choice of
numbers here less critical.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit a84034fddb11f30849dd7ce050689d615995c0d2)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
In heterogeneous systems like big.LITTLE systems the scheduler will be
able to make better use of the available cores if we provide power numbers
to it indicating their relative performance. Do this by parsing the CPU
nodes in the DT.
This code currently has no effect as no information on the relative
performance of the cores is provided.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from LSK commit 474bddad45eeb9457f5fae2a0cc133372a61a64c)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Commit 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte
equivalents") changed the pmd manipulator and accessor functions to
convert the target pmd to a pte, process it with the pte functions, then
convert it back. Along the way, we gained support for PTE_WRITE, however
this is completely ignored by set_pmd_at, and so we fail to set the
PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
CoW not working).
Partially reverting the offending commit (by making use of
PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
leads to further issues because pmd_write can then return potentially
incorrect values for page table entries marked as RDONLY, leading to
BUG_ON(pmd_write(entry)) tripping under some THP workloads.
This patch fixes the issue by routing set_pmd_at through set_pte_at,
which correctly takes the PTE_WRITE flag into account. Given that
THP mappings are always anonymous, the additional cache-flushing code
in __sync_icache_dcache won't impose any significant overhead as the
flush will be skipped.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit ceb218359de22e70980801d4fa04fffbfc44adb8)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
[ Upstream commit e84d2f8d2ae33c8215429824e1ecf24cbca9645e ]
This is the s390 variant of Alexei's JIT bug fix.
(patch description below stolen from Alexei's patch)
bpf_alloc_binary() adds 128 bytes of room to JITed program image
and rounds it up to the nearest page size. If image size is close
to page size (like 4000), it is rounded to two pages:
round_up(4000 + 4 + 128) == 8192
then 'hole' is computed as 8192 - (4000 + 4) = 4188
If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header)
then kernel will crash during bpf_jit_free():
kernel BUG at arch/x86/mm/pageattr.c:887!
Call Trace:
[<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460
[<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff810378ff>] set_memory_rw+0x2f/0x40
[<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60
[<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0
[<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0
[<ffffffff8106c90c>] worker_thread+0x11c/0x370
since bpf_jit_free() does:
unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
struct bpf_binary_header *header = (void *)addr;
to compute start address of 'bpf_binary_header'
and header->pages will pass junk to:
set_memory_rw(addr, header->pages);
Fix it by making sure that &header->image[prandom_u32() % hole] and &header
are in the same page.
Fixes: aa2d2c73c21f2 ("s390/bpf,jit: address randomize and write protect jit code")
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: <stable@vger.kernel.org> # v3.11+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 773cd38f40b8834be991dbfed36683acc1dd41ee ]
bpf_alloc_binary() adds 128 bytes of room to JITed program image
and rounds it up to the nearest page size. If image size is close
to page size (like 4000), it is rounded to two pages:
round_up(4000 + 4 + 128) == 8192
then 'hole' is computed as 8192 - (4000 + 4) = 4188
If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header)
then kernel will crash during bpf_jit_free():
kernel BUG at arch/x86/mm/pageattr.c:887!
Call Trace:
[<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460
[<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50
[<ffffffff810378ff>] set_memory_rw+0x2f/0x40
[<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60
[<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0
[<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0
[<ffffffff8106c90c>] worker_thread+0x11c/0x370
since bpf_jit_free() does:
unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
struct bpf_binary_header *header = (void *)addr;
to compute start address of 'bpf_binary_header'
and header->pages will pass junk to:
set_memory_rw(addr, header->pages);
Fix it by making sure that &header->image[prandom_u32() % hole] and &header
are in the same page
Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e0fc17a936334c08b2729fff87168c03fdecf5b6 upstream.
The git commit a945928ea2709bc0e8e8165d33aed855a0110279
('xen: Do not enable spinlocks before jump_label_init() has executed')
was added to deal with the jump machinery. Earlier the code
that turned on the jump label was only called by Xen specific
functions. But now that it had been moved to the initcall machinery
it gets called on Xen, KVM, and baremetal - ouch!. And the detection
machinery to only call it on Xen wasn't remembered in the heat
of merge window excitement.
This means that the slowpath is enabled on baremetal while it should
not be.
Reported-by: Waiman Long <waiman.long@hp.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e0d8898d76a785453bfaf6cd08b830a7d5189f78 upstream.
There are only a couple of architectures that override _STK_LIM_MAX to
a non-infinity value. This changes the stack allocation semantics in
subtle ways. For example, GNU make changes its stack allocation to the
hard maximum defined by _STK_LIM_MAX. As a results, threads executed
by processes running under make are allocated a stack size of
_STK_LIM_MAX rather than a sensible default value. This causes various
thread stress tests to fail when they can't muster more than about 50
threads.
The attached change implements the default behavior used by the
majority of architectures.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Carlos O'Donell <carlos@systemhalted.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ab3e55b119c9653b19ea4edffb86f04db867ac98 upstream.
This bug was detected with the libio-epoll-perl debian package where the
test case IO-Ppoll-compat.t failed.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0ef36bd2b37815719e31a72d2beecc28ca8ecd26 upstream.
On parisc, SHMLBA was defined to 0x00400000 (4MB) to reflect that we need to
take care of our caches for shared mappings. But actually, we can map a file at
any multiple address of PAGE_SIZE, so let us correct that now with a value of
PAGE_SIZE for SHMLBA. Instead we now take care of this cache colouring via the
constant SHM_COLOUR while we map shared pages.
Signed-off-by: Helge Deller <deller@gmx.de>
CC: Jeroen Roovers <jer@gentoo.org>
CC: John David Anglin <dave.anglin@bell.net>
CC: Carlos O'Donell <carlos@systemhalted.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Many people reported preemption/reschedule problems with i386 kernels
for .13 and .14. After Michele bisected this to a combination of
3e8e42c69bb ("sched: Revert need_resched() to look at TIF_NEED_RESCHED")
ded79754754 ("irq: Force hardirq exit's softirq processing on its own stack")
it finally dawned on me that i386's current_thread_info() was to
blame.
When we are on interrupt/exception stacks, we fail to observe the
right TIF_NEED_RESCHED bit and therefore the PREEMPT_NEED_RESCHED
folding malfunctions.
Current upstream fixes this by making i386 behave the same as x86_64
already did:
2432e1364bbe ("x86: Nuke the supervisor_stack field in i386 thread_info")
b807902a88c4 ("x86: Nuke GET_THREAD_INFO_WITH_ESP() macro for i386")
0788aa6a23cb ("x86: Prepare removal of previous_esp from i386 thread_info structure")
198d208df437 ("x86: Keep thread_info on thread stack in x86_32")
However, that is far too much to stuff into -stable. Therefore I
propose we merge the below patch which uses task_thread_info(current)
for tif_need_resched() instead of the ESP based current_thread_info().
This makes sure we always observe the one true TIF_NEED_RESCHED bit
and things will work as expected again.
Cc: bp@alien8.de
Cc: fweisbec@gmail.com
Cc: david.a.cohen@linux.intel.com
Cc: mingo@kernel.org
Cc: fweisbec@gmail.com
Cc: greg@kroah.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: gregkh@linuxfoundation.org
Cc: pbonzini@redhat.com
Cc: rostedt@goodmis.org
Cc: stefan.bader@canonical.com
Cc: mingo@kernel.org
Cc: toralf.foerster@gmx.de
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: torvalds@linux-foundation.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: <stable-commits@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: peterz@infradead.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: barra_cuda@katamail.com
Tested-by: Stefan Bader <stefan.bader@canonical.com>
Tested-by: Toralf F¿rster <toralf.foerster@gmx.de>
Tested-by: Michele Ballabio <barra_cuda@katamail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140409142447.GD13658@twins.programming.kicks-ass.net
|
|
commit b351c39cc9e0151cee9b8d52a1e714928faabb38 upstream.
Function and callers can be preempted.
https://bugzilla.kernel.org/show_bug.cgi?id=73721
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add support for parsing the explicit topology bindings to discover the
topology of the system.
Since it is not currently clear how to map multi-level clusters for the
scheduler all leaf clusters are presented to the scheduler at the same
level. This should be enough to provide good support for current systems.
Signed-off-by: Mark Brown <broonie@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
(cherry picked from commit 3252efc39608be2aac69c184559b9ae168003284)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
As a legacy of the way 32 bit ARM did things the topology code uses a null
topology map by default and then overwrites it by mapping cores with no
information to a cluster by themselves later. In order to make it simpler
to reset things as part of recovering from parse failures in firmware
information directly set this configuration on init. A core will always be
its own sibling so there should be no risk of confusion with firmware
provided information.
Signed-off-by: Mark Brown <broonie@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
(cherry picked from commit 8759b2d0f8067d726c269602ffe310221437ce5e)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Sending a SIGTRAP to a user task after execution of a BRK instruction at
EL0 is fundamental to the way in which software breakpoints work and
doesn't deserve a warning to be logged in dmesg. Whilst the warning can
be justified from EL1, do_debug_exception will already do the right thing,
so simply remove the code altogether.
Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
Reported-by: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 43683afbcb32f7b7318ac1badd6469d91fe22711)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Recently, the default DMA ops have been changed to non-coherent for
alignment with 32-bit ARM platforms (and DT files). This patch adds bus
notifiers to be able to set the coherent DMA ops (with no cache
maintenance) for devices explicitly marked as coherent via the
"dma-coherent" DT property.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 6ecba8eb51b7d23fda66388a5420be7d8688b186)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Signed-off-by: Ryan Harkin <ryan.harkin@linaro.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit 4cb9670f141f980afa9e68d1f44023976a166f0b)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
This is not in mainline since the panel bindings have not yet been merged
into mainline.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit 94eb302681ca87d0078ffb88dfc6d0e4a9b1141d)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
This provides for discovery of the clusters using the CPU topology
bindings, it is added as a separate commit to aid testing of the code
for systems without explicit topology bindings.
Signed-off-by: Mark Brown <broonie@linaro.org>
(cherry picked from commit d09f1f75c5e8f3b930cc3803af50c694eb903849)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
commit 14262d67fe348018af368a07430fbc06eadeabb1 upstream.
If you are using a 64-bit kernel with 32-bit userland, then
scripts/gcc-x86_64-has-stack-protector.sh invokes 32-bit gcc
with -mcmodel=kernel, which produces:
<stdin>:1:0: error: code model 'kernel' not supported in the 32 bit mode
and trips the "broken compiler" test at arch/x86/Makefile:120.
There are several places a fix is possible, but the following seems
cleanest. (But it's minimal; it would also be possible to factor
out a bunch of stuff from the two branches of the if.)
Signed-off-by: George Spelvin <linux@horizon.com>
Link: http://lkml.kernel.org/r/20140507210552.7581.qmail@ns.horizon.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8aa9e85adac609588eeec356e5a85059b3b819ba upstream.
There was a very small race window where resume to kernel mode from a
Exception Path (or pure kernel mode which is true for most of ARC
exceptions anyways), was not disabling interrupts in restore_regs,
clobbering the exception regs
Anton found the culprit call flow (after many sleepless nights)
| 1. we got a Trap from user land
| 2. started to service it.
| 3. While doing some stuff on user-land memory (I think it is padzero()),
| we got a DataTlbMiss
| 4. On return from it we are taking "resume_kernel_mode" path
| 5. NEED_RESHED is not set, so we go to "return from exception" path in
| restore regs.
| 6. there seems to be IRQ happening
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Anton Kolesov <Anton.Kolesov@synopsys.com>
Cc: Francois Bedard <Francois.Bedard@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d345ea2892ae7a2b70f84cf881c20731e43e4993 upstream.
The symbol is an orphan, get rid of it.
Fixes: 7d0857a54aed ("ARC: [SMP] Disallow RTSC")
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6e0de817594c61f3b392a9245deeb09609ec707d upstream.
The A register needs to be initialized to zero in the prolog if the
first instruction of the BPF program is BPF_S_LDX_B_MSH to prevent
leaking the content of %r5 to user space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4fb8d027dca0236c811272d342cf185569d91311 upstream.
commit 41dd03a9 may cause Oops in rtas_stop_self().
The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.
So the patch moves rtas_args away from stack by adding static before
it.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e6b8fd028b584ffca7a7255b8971f254932c9fce upstream.
We can't take an IRQ when we're about to do a trechkpt as our GPR state is set
to user GPR values.
We've hit this when running some IBM Java stress tests in the lab resulting in
the following dump:
cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40]
pc: c000000000050074: restore_gprs+0xc0/0x148
lr: 00000000b52a8184
sp: ac57d360
msr: 8000000100201030
current = 0xc00000002c500000
paca = 0xc000000007dbfc00 softe: 0 irq_happened: 0x00
pid = 34535, comm = Pooled Thread #
R00 = 00000000b52a8184 R16 = 00000000b3e48fda
R01 = 00000000ac57d360 R17 = 00000000ade79bd8
R02 = 00000000ac586930 R18 = 000000000fac9bcc
R03 = 00000000ade60000 R19 = 00000000ac57f930
R04 = 00000000f6624918 R20 = 00000000ade79be8
R05 = 00000000f663f238 R21 = 00000000ac218a54
R06 = 0000000000000002 R22 = 000000000f956280
R07 = 0000000000000008 R23 = 000000000000007e
R08 = 000000000000000a R24 = 000000000000000c
R09 = 00000000b6e69160 R25 = 00000000b424cf00
R10 = 0000000000000181 R26 = 00000000f66256d4
R11 = 000000000f365ec0 R27 = 00000000b6fdcdd0
R12 = 00000000f66400f0 R28 = 0000000000000001
R13 = 00000000ada71900 R29 = 00000000ade5a300
R14 = 00000000ac2185a8 R30 = 00000000f663f238
R15 = 0000000000000004 R31 = 00000000f6624918
pc = c000000000050074 restore_gprs+0xc0/0x148
cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4
lr = 00000000b52a8184
msr = 8000000100201030 cr = 24804888
ctr = 0000000000000000 xer = 0000000000000000 trap = 700
This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into
that function. It then adds IRQ disabling over the trechkpt critical section.
It also sets the TEXASR FS in the signals code to ensure this is never set now
that we explictly write the TM sprs in tm_recheckpoint.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 422b9b9684db3c511e65c91842275c43f5910ae9 upstream.
I noticed this when testing setarch. No, we don't magically
support a big endian userspace on a little endian kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c14af233fbe279d0e561ecf84f1208b1bae087ef upstream.
The original MIPS hibernate code flushes cache and TLB entries in
swsusp_arch_resume(). But they are removed in Commit 44eeab67416711
(MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A cross-
CPU flush is surely unnecessary because all but the local CPU have
already been disabled. But a local flush (at least the TLB flush) is
needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is
very easy to produce a kernel panic (kernel page fault, or unaligned
access). The root cause is E1000E driver use vzalloc_node() to allocate
pages, the stale TLB entries of the booting kernel will be misused by
the resumed target kernel.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6643/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7505258c5fcb0a1cc3c76a47b4cf9506d21d10e6 upstream.
I noticed KVM is broken when KVM in-kernel XICS emulation
(CONFIG_KVM_XICS) is disabled.
The problem was introduced in 48eaef05 (KVM: PPC: Book3S HV: use
xics_wake_cpu only when defined). It used CONFIG_KVM_XICS to wrap
xics_wake_cpu, where CONFIG_PPC_ICP_NATIVE should have been
used.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 15505679362270d02c449626385cb74af8905514 upstream.
Previously a reserved instruction exception while in guest code would
cause a KVM internal error if kvm_mips_handle_ri() didn't recognise the
instruction (including a RDHWR from an unrecognised hardware register).
However the guest OS should really have the opportunity to catch the
exception so that it can take the appropriate actions such as sending a
SIGILL to the guest user process or emulating the instruction itself.
Therefore in these cases emulate a guest RI exception and only return
EMULATE_FAIL if that fails, being careful to revert the PC first in case
the exception occurred in a branch delay slot in which case the PC will
already point to the branch target.
Also turn the printk messages relating to these cases into kvm_debug
messages so that they aren't usually visible.
This allows crashme to run in the guest without killing the entire VM.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5d4e08c45a6cf8f1ab3c7fa375007635ac569165 upstream.
The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
a bounce page (if hypervisor init code crosses page boundary) and
hypervisor PGDs. The problem is that kalloc() does not guarantee
the proper alignment. In the case of the bounce page, the page sized
buffer allocated may also cross a page boundary negating the purpose
and leading to a hang during kvm initialization. Likewise the PGDs
allocated may not meet the minimum alignment requirements of the
underlying MMU. This patch uses __get_free_page() to guarantee the
worst case alignment needs of the bounce page and PGDs on both arm
and arm64.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|