Age | Commit message (Collapse) | Author |
|
Conflicts:
adopt s/cpufreq_val/cpufreq_freqs in drivers/thermal/cpu_cooling.c
|
|
|
|
commit bcaf669b4bdbad09888df086d266a34e293ace85 upstream
When boot arm64 kernel with KASAN enabled, the below error is reported by
kasan:
BUG: KASAN: out-of-bounds in unwind_frame+0xec/0x260 at addr ffffffc064d57ba0
Read of size 8 by task pidof/499
page:ffffffbdc39355c0 count:0 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 2 PID: 499 Comm: pidof Not tainted 4.5.0-rc1 #119
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<ffffffc00008d078>] dump_backtrace+0x0/0x290
[<ffffffc00008d32c>] show_stack+0x24/0x30
[<ffffffc0006a981c>] dump_stack+0x8c/0xd8
[<ffffffc0002e4400>] kasan_report_error+0x558/0x588
[<ffffffc0002e4958>] kasan_report+0x60/0x70
[<ffffffc0002e3188>] __asan_load8+0x60/0x78
[<ffffffc00008c92c>] unwind_frame+0xec/0x260
[<ffffffc000087e60>] get_wchan+0x110/0x160
[<ffffffc0003b647c>] do_task_stat+0xb44/0xb68
[<ffffffc0003b7730>] proc_tgid_stat+0x40/0x50
[<ffffffc0003ac840>] proc_single_show+0x88/0xd8
[<ffffffc000345be8>] seq_read+0x370/0x770
[<ffffffc00030aba0>] __vfs_read+0xc8/0x1d8
[<ffffffc00030c0ec>] vfs_read+0x94/0x168
[<ffffffc00030d458>] SyS_read+0xb8/0x128
[<ffffffc000086530>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffffffc064d57a80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 f4 f4
ffffffc064d57b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffc064d57b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
ffffffc064d57c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffffc064d57c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Since the shadow byte pointed by the report is 0, so it may mean it is just hit
oob in non-current task. So, disable the instrumentation to silence these
warnings.
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit 75edb54a1dea5ea1c8d3d82e27dc9ee3070f5935 upstream
thread_saved_pc() reads stack of a potentially running task.
This can cause false KASAN stack-out-of-bounds reports,
because the running task concurrently poisons and unpoisons
own stack.
The same happens in get_wchan(), and get get_wchan() was fixed
by using READ_ONCE_NOCHECK(). Do the same here.
Example KASAN report triggered by sysrq-t:
BUG: KASAN: out-of-bounds in sched_show_task+0x306/0x3b0 at addr ffff880043c97c18
Read of size 8 by task syz-executor/23839
[...]
page dumped because: kasan: bad access detected
[...]
Call Trace:
[<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40
[<ffffffff813e7a26>] sched_show_task+0x306/0x3b0
[<ffffffff813e7bf4>] show_state_filter+0x124/0x1a0
[<ffffffff82d2ca00>] fn_show_state+0x10/0x20
[<ffffffff82d2cf98>] k_spec+0xa8/0xe0
[<ffffffff82d3354f>] kbd_event+0xb9f/0x4000
[<ffffffff843ca8a7>] input_to_handler+0x3a7/0x4b0
[<ffffffff843d1954>] input_pass_values.part.5+0x554/0x6b0
[<ffffffff843d29bc>] input_handle_event+0x2ac/0x1070
[<ffffffff843d3a47>] input_inject_event+0x237/0x280
[<ffffffff843e8c28>] evdev_write+0x478/0x680
[<ffffffff817ac653>] __vfs_write+0x113/0x480
[<ffffffff817ae0e7>] vfs_write+0x167/0x4a0
[<ffffffff817b13d1>] SyS_write+0x111/0x220
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: glider@google.com
Cc: kasan-dev@googlegroups.com
Cc: kcc@google.com
Cc: linux-kernel@vger.kernel.org
Cc: ryabinin.a.a@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit a75ca545e8d57473da47ece828ad98a10727ec6f upstream
Declaration of memcpy() is hidden under #ifndef CONFIG_KMEMCHECK.
In asm/efi.h under #ifdef CONFIG_KASAN we #undef memcpy(), due to
which the following happens:
In file included from arch/x86/kernel/setup.c:96:0:
./arch/x86/include/asm/desc.h: In function 'native_write_idt_entry':
./arch/x86/include/asm/desc.h:122:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration] memcpy(&idt[entry], gate, sizeof
^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kernel/setup.o] Error 1
We will get rid of that #undef in asm/efi.h eventually.
But in the meanwhile move memcpy() declaration out of #ifdefs
to fix the build.
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444994933-28328-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit d6f2d75a7ae06ffd793bb504c4f0d1665548cffc upstream
KASAN_SHADOW_OFFSET is purely arch specific setting,
so it should be in arch's Kconfig file.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1435828178-10975-7-git-send-email-a.ryabinin@samsung.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
After backport patch "arm64/efi: isolate EFI stub from the kernel proper",
It will cause build error in LSK, that's because we remove the conflict when
backport patch "arm64: add KASAN support", so add the conflict code again.
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit e8f3010f7326c00368dbc057bd052bec80dfc072 upstream
Since arm64 does not use a builtin decompressor, the EFI stub is built
into the kernel proper. So far, this has been working fine, but actually,
since the stub is in fact a PE/COFF relocatable binary that is executed
at an unknown offset in the 1:1 mapping provided by the UEFI firmware, we
should not be seamlessly sharing code with the kernel proper, which is a
position dependent executable linked at a high virtual offset.
So instead, separate the contents of libstub and its dependencies, by
putting them into their own namespace by prefixing all of its symbols
with __efistub. This way, we have tight control over what parts of the
kernel proper are referenced by the stub.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit 4ac86a6dcec1c3878de9747bf5a2aa4455be69e3 upstream
With KMEMCHECK=y, KASAN=n we get this build failure:
arch/x86/platform/efi/efi.c:673:3: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
arch/x86/platform/efi/efi_64.c:139:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
arch/x86/include/asm/desc.h:121:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
Don't #undef memcpy if KASAN=n.
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 769a8089c1fd ("x86, efi, kasan: #undef memset/memcpy/memmove per arch")
Link: http://lkml.kernel.org/r/1443544814-20122-1-git-send-email-ryabinin.a.a@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
commit 769a8089c1fd2fe94c13e66fe6e03d7820953ee3 upstream
In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memove.
However, on x86 the EFI stub is not linked with the kernel. It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c
So we don't replace them with __mem*() variants in EFI stub.
On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.
So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.
Also, this will fix the warning in 32-bit build reported by kbuild test
robot:
efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'
[akpm@linux-foundation.org: use 80 cols in comment]
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
|
|
Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
warnings, e.g.
drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined
However, it looks like the dependency should not even there as
I do not see why __generic_dma_ops() cares about whether we have
an ACPI based system or not.
The current behavior is to fall back to the global dma_ops when
a device has not set its own dma_ops, but only for DT based systems.
This seems dangerous, as a random device might have different
requirements regarding IOMMU or coherency, so we should really
never have that fallback and just forbid DMA when we have not
initialized DMA for a device.
This removes the global dma_ops variable and the special-casing
for ACPI, and just returns the dma ops that got set for the
device, or the dummy_dma_ops if none were present.
The original code has apparently been copied from arm32 where we
rely on it for ISA devices things like the floppy controller, but
we should have no such devices on ARM64.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 1dccb598df549d892b6450c261da54cdd7af44b4)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
|
|
Boris reported that gcc version 4.4.4 20100503 (Red Hat
4.4.4-2) fails to build linux-next kernels that have
this fresh commit via the locking tree:
11276d5306b8 ("locking/static_keys: Add a new static_key interface")
The problem appears to be that even though @key and @branch are
compile time constants, it doesn't see the following expression
as an immediate value:
&((char *)key)[branch]
More recent GCCs don't appear to have this problem.
In particular, Red Hat backported the 'asm goto' feature into 4.4,
'normal' 4.4 compilers will not have this feature and thus not
run into this asm.
The workaround is to supply both values to the asm as immediates
and do the addition in asm.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit d420acd816c07c7be31bd19d09cbcb16e5572fa6)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
[ Upstream commit 2c2a63e301fd19ccae673e79de59b30a232ff7f9 ]
The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support
to ibm,client-architecture-support call") added a new PVR mask & value
to the start of the ibm_architecture_vec[] array.
However it missed the fact that further down in the array, we hard code
the offset of one of the fields, and then at boot use that value to
patch the value in the array. This means every update to the array must
also update the #define, ugh.
This means that on pseries machines we will misreport to firmware the
number of cores we support, by a factor of threads_per_core.
Fix it for now by updating the #define.
Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 9c77679cadb118c0aa99e6f88533d91765a131ba ]
For newer versions of Syslinux, we need ldlinux.c32 in addition to
isolinux.bin to reside on the boot disk, so if the latter is found,
copy it, too, to the isoimage tree.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Linux Stable Tree <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 8b78f260887df532da529f225c49195d18fef36b ]
One of the debian buildd servers had this crash in the syslog without
any other information:
Unaligned handler failed, ret = -2
clock_adjtime (pid 22578): Unaligned data reference (code 28)
CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001001111100000001111 Tainted: G E
r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000
sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff
CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff
ORIG_R28: 00000002369fe628
IAOQ[0]: compat_get_timex+0x2dc/0x3c0
IAOQ[1]: compat_get_timex+0x2e0/0x3c0
RP(r2): compat_get_timex+0x40/0x3c0
Backtrace:
[<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0
[<0000000040205024>] syscall_exit+0x0/0x14
This means the userspace program clock_adjtime called the clock_adjtime()
syscall and then crashed inside the compat_get_timex() function.
Syscalls should never crash programs, but instead return EFAULT.
The IIR register contains the executed instruction, which disassebles
into "ldw 0(sr3,r5),r9".
This load-word instruction is part of __get_user() which tried to read the word
at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The
unaligned handler is able to emulate all ldw instructions, but it fails if it
fails to read the source e.g. because of page fault.
The following program reproduces the problem:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
int main(void) {
/* allocate 8k */
char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
/* free second half (upper 4k) and make it invalid. */
munmap(ptr+4096, 4096);
/* syscall where first int is unaligned and clobbers into invalid memory region */
/* syscall should return EFAULT */
return syscall(__NR_clock_adjtime, 0, ptr+4095);
}
To fix this issue we simply need to check if the faulting instruction address
is in the exception fixup table when the unaligned handler failed. If it
is, call the fixup routine instead of crashing.
While looking at the unaligned handler I found another issue as well: The
target register should not be modified if the handler was unsuccessful.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d14bdb553f9196169f003058ae1cdabe514470e6 ]
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
any of bits 63:32. However, this is not detected at KVM_SET_DEBUGREGS
time, and the next KVM_RUN oopses:
general protection fault: 0000 [#1] SMP
CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
[...]
Call Trace:
[<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
[<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
[<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
[<ffffffff812418a9>] SyS_ioctl+0x79/0x90
[<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
RIP [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
RSP <ffff88005836bd50>
Testcase (beautified/reduced from syzkaller output):
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <fcntl.h>
#include <sys/ioctl.h>
long r[8];
int main()
{
struct kvm_debugregs dr = { 0 };
r[2] = open("/dev/kvm", O_RDONLY);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
memcpy(&dr,
"\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
"\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
"\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
"\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
48);
r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
r[6] = ioctl(r[4], KVM_RUN, 0);
}
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e2dfb4b880146bfd4b6aa8e138c0205407cebbaf ]
PTRACE_SETVFPREGS fails to properly mark the VFP register set to be
reloaded, because it undoes one of the effects of vfp_flush_hwstate().
Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to
an invalid CPU number, but vfp_set() overwrites this with the original
CPU number, thereby rendering the hardware state as apparently "valid",
even though the software state is more recent.
Fix this by reverting the previous change.
Cc: <stable@vger.kernel.org>
Fixes: 8130b9d7b9d8 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers")
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Simon Marchi <simon.marchi@ericsson.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 7cc851039d643a2ee7df4d18177150f2c3a484f5 ]
If we do not provide the PVR for POWER8NVL, a guest on this system
currently ends up in PowerISA 2.06 compatibility mode on KVM, since QEMU
does not provide a generic PowerISA 2.07 mode yet. So some new
instructions from POWER8 (like "mtvsrd") get disabled for the guest,
resulting in crashes when using code compiled explicitly for
POWER8 (e.g. with the "-mcpu=power8" option of GCC).
Fixes: ddee09c099c3 ("powerpc: Add PVR for POWER8NVL processor")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 8dd75ccb571f3c92c48014b3dabd3d51a115ab41 ]
We are already using the privileged versions of MMCR0, MMCR1
and MMCRA in the kernel, so for MMCR2, we should better use
the privileged versions, too, to be consistent.
Fixes: 240686c13687 ("powerpc: Initialise PMU related regs on Power8")
Cc: stable@vger.kernel.org # v3.10+
Suggested-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d23fac2b27d94aeb7b65536a50d32bfdc21fe01e ]
The SIAR and SDAR registers are available twice, one time as SPRs
780 / 781 (unprivileged, but read-only), and one time as the SPRs
796 / 797 (privileged, but read and write). The Linux kernel code
currently uses the unprivileged SPRs - while this is OK for reading,
writing to that register of course does not work.
Since the KVM code tries to write to this register, too (see the mtspr
in book3s_hv_rmhandlers.S), the contents of this register sometimes get
lost for the guests, e.g. during migration of a VM.
To fix this issue, simply switch to the privileged SPR numbers instead.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 871e178e0f2c4fa788f694721a10b4758d494ce1 ]
In the "ibm,configure-pe" and "ibm,configure-bridge" RTAS calls, the
spec states that values of 9900-9905 can be returned, indicating that
software should delay for 10^x (where x is the last digit, i.e. 990x)
milliseconds and attempt the call again. Currently, the kernel doesn't
know about this, and respecting it fixes some PCI failures when the
hypervisor is busy.
The delay is capped at 0.2 seconds.
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
This is the 4.1.26 stable release
|
|
Conflicts:
fs/dax.c
fs/ext2/file.c
fs/ext4/inode.c
include/linux/cgroup-defs.h
include/linux/cgroup.h
include/linux/fs.h
|
|
Add a little selftest that validates all combinations.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 1987c947d905baa05bee430178e8eef882f36bd4)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
There are various problems and short-comings with the current
static_key interface:
- static_key_{true,false}() read like a branch depending on the key
value, instead of the actual likely/unlikely branch depending on
init value.
- static_key_{true,false}() are, as stated above, tied to the
static_key init values STATIC_KEY_INIT_{TRUE,FALSE}.
- we're limited to the 2 (out of 4) possible options that compile to
a default NOP because that's what our arch_static_branch() assembly
emits.
So provide a new static_key interface:
DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);
Which define a key of different types with an initial true/false
value.
Then allow:
static_branch_likely()
static_branch_unlikely()
to take a key of either type and emit the right instruction for the
case.
This means adding a second arch_static_branch_jump() assembly helper
which emits a JMP per default.
In order to determine the right instruction for the right state,
encode the branch type in the LSB of jump_entry::key.
This is the final step in removing the naming confusion that has led to
a stream of avoidable bugs such as:
a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()")
... but it also allows new static key combinations that will give us
performance enhancements in the subsequent patches.
Tested-by: Rabin Vincent <rabin@rab.in> # arm
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 11276d5306b8e5b438a36bbff855fe792d7eaa61)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
Since we've already stepped away from ENABLE is a JMP and DISABLE is a
NOP with the branch_default bits, and are going to make it even worse,
rename it to make it all clearer.
This way we don't mix multiple levels of logic attributes, but have a
plain 'physical' name for what the current instruction patching status
of a jump label is.
This is a first step in removing the naming confusion that has led to
a stream of avoidable bugs such as:
a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Beefed up the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 76b235c6bcb16062d663e2ee96db0b69f2e6bc14)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
|
|
[ Upstream commit 702f926067d2a4b28c10a3c41a1172dd62d9e735 ]
b4ff8389ed14 is incomplete: relies on nr_legacy_irqs() to get the number
of legacy interrupts when actually nr_legacy_irqs() returns 0 after
probe_8259A(). Use NR_IRQS_LEGACY instead.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e4fe9e7dc3828bf6a5714eb3c55aef6260d823a2 ]
The EC field of the constructed ESR is conditionally modified by ORing in
ESR_ELx_EC_DABT_LOW for a data abort. However, ESR_ELx_EC_SHIFT is missing
from this condition.
Signed-off-by: Matt Evans <matt.evans@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit e49d38488515057dba8f0c2ba4cfde5be4a7281f ]
Fix a build regression from commit c9017757c532 ("MIPS: init upper 64b
of vector registers when MSA is first used"):
arch/mips/built-in.o: In function `enable_restore_fp_context':
traps.c:(.text+0xbb90): undefined reference to `_init_msa_upper'
traps.c:(.text+0xbb90): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
traps.c:(.text+0xbef0): undefined reference to `_init_msa_upper'
traps.c:(.text+0xbef0): relocation truncated to fit: R_MIPS_26 against `_init_msa_upper'
to !CONFIG_CPU_HAS_MSA configurations with older GCC versions, which are
unable to figure out that calls to `_init_msa_upper' are indeed dead.
Of the many ways to tackle this failure choose the approach we have
already taken in `thread_msa_context_live'.
[ralf@linux-mips.org: Drop patch segment to junk file.]
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: stable@vger.kernel.org # v3.16+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13271/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit bd239f1e1429e7781096bf3884bdb1b2b1bb4f28 ]
Whilst a PR_SET_FP_MODE prctl is performed there are decisions made
based upon whether the task is executing on the current CPU. This may
change if we're preempted, so disable preemption to avoid such changes
for the lifetime of the mode switch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 9791554b45a2 ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: stable <stable@vger.kernel.org> # v4.0+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13144/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit abf378be49f38c4d3e23581d3df3fa9f1b1b11d2 ]
Correct the cases missed with commit 9b26616c8d9d ("MIPS: Respect the
ISA level in FCSR handling") and prevent writes to read-only FCSR bits
there.
This in particular applies to FP context initialisation where any IEEE
754-2008 bits preset by `mips_set_personality_nan' are cleared before
the relevant ptrace(2) call takes effect and the PTRACE_POKEUSR request
addressing FPC_CSR where no masking of read-only FCSR bits is done.
Remove the FCSR clearing from FP context initialisation then and unify
PTRACE_POKEUSR/FPC_CSR and PTRACE_SETFPREGS handling, by factoring out
code from `ptrace_setfpregs' and calling it from both places.
This mostly matters to soft float configurations where the emulator can
be switched this way to a mode which should not be accessible and cannot
be set with the CTC1 instruction. With hard float configurations any
effect is transient anyway as read-only bits will retain their values at
the time the FP context is restored.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: stable@vger.kernel.org # v4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13239/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 4249548454f7ba4581aeee26bd83f42b48a14d15 ]
Fix a floating-point context restoration regression introduced with
commit 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling")
that causes a Floating Point exception and consequently a kernel oops
with hard float configurations when one or more FCSR Enable and their
corresponding Cause bits are set both at a time via a ptrace(2) call.
To do so reinstate Cause bit masking originally introduced with commit
b1442d39fac2 ("MIPS: Prevent user from setting FCSR cause bits") to
address this exact problem and then inadvertently removed from the
PTRACE_SETFPREGS request with the commit referred above.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: stable@vger.kernel.org # v4.0+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13238/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit ab4a92e66741b35ca12f8497896bafbe579c28a1 ]
When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.
Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable <stable@vger.kernel.org> # v3.10
Patchwork: https://patchwork.linux-mips.org/patch/13160/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 987e5b834467c9251ca584febda65ef8f66351a9 ]
Since commit 8cb48fe169dd ("MIPS: Provide correct siginfo_t.si_stime"),
MIPS' uapi/asm/siginfo.h has included uapi/asm-generic/siginfo.h
directly before defining MIPS' struct siginfo, in order to get the
necessary definitions needed for the siginfo struct without the generic
copy_siginfo() hitting compiler errors due to struct siginfo not yet
being defined.
Now that the generic copy_siginfo() is moved out to linux/signal.h we
can safely include asm-generic/siginfo.h before defining the MIPS
specific struct siginfo, which avoids the uapi/ include as well as
breakage due to generic copy_siginfo() being defined before struct
siginfo.
Reported-by: Christopher Ferris <cferris@google.com>
Fixes: 8cb48fe169dd ("MIPS: Provide correct siginfo_t.si_stime")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Petr Malat <oss@malat.biz>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0-
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 37d22a0d798b5c938b277d32cfd86dc231381342 ]
It's possible for pages to become visible prior to update_mmu_cache
running if a thread within the same address space preempts the current
thread or runs simultaneously on another CPU. That is, the following
scenario is possible:
CPU0 CPU1
write to page
flush_dcache_page
flush_icache_page
set_pte_at
map page
update_mmu_cache
If CPU1 maps the page in between CPU0's set_pte_at, which marks it valid
& visible, and update_mmu_cache where the dcache flush occurs then CPU1s
icache will fill from stale data (unless it fills from the dcache, in
which case all is good, but most MIPS CPUs don't have this property).
Commit 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
attempted to fix that by performing the dcache flush in
flush_icache_page such that it occurs before the set_pte_at call makes
the page visible. However it has the problem that not all code that
writes to pages exposed to userland call flush_icache_page. There are
many callers of set_pte_at under mm/ and only 2 of them do call
flush_icache_page. Thus the race window between a page becoming visible
& being coherent between the icache & dcache remains open in some cases.
To illustrate some of the cases, a WARN was added to __update_cache with
this patch applied that triggered in cases where a page about to be
flushed from the dcache was not the last page provided to
flush_icache_page. That is, backtraces were obtained for cases in which
the race window is left open without this patch. The 2 standout examples
follow.
When forking a process:
[ 15.271842] [<80417630>] __update_cache+0xcc/0x188
[ 15.277274] [<80530394>] copy_page_range+0x56c/0x6ac
[ 15.282861] [<8042936c>] copy_process.part.54+0xd40/0x17ac
[ 15.289028] [<80429f80>] do_fork+0xe4/0x420
[ 15.293747] [<80413808>] handle_sys+0x128/0x14c
When exec'ing an ELF binary:
[ 14.445964] [<80417630>] __update_cache+0xcc/0x188
[ 14.451369] [<80538d88>] move_page_tables+0x414/0x498
[ 14.457075] [<8055d848>] setup_arg_pages+0x220/0x318
[ 14.462685] [<805b0f38>] load_elf_binary+0x530/0x12a0
[ 14.468374] [<8055ec3c>] search_binary_handler+0xbc/0x214
[ 14.474444] [<8055f6c0>] do_execveat_common+0x43c/0x67c
[ 14.480324] [<8055f938>] do_execve+0x38/0x44
[ 14.485137] [<80413808>] handle_sys+0x128/0x14c
These code paths write into a page, call flush_dcache_page then call
set_pte_at without flush_icache_page inbetween. The end result is that
the icache can become corrupted & userland processes may execute
unexpected or invalid code, typically resulting in a reserved
instruction exception, a trap or a segfault.
Fix this race condition fully by performing any cache maintenance
required to keep the icache & dcache in sync in set_pte_at, before the
page is made valid. This has the added bonus of ensuring the cache
maintenance always happens in one location, rather than being duplicated
in flush_icache_page & update_mmu_cache. It also matches the way other
architectures solve the same problem (see arm, ia64 & powerpc).
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reported-by: Ionela Voinescu <ionela.voinescu@imgtec.com>
Cc: Lars Persson <lars.persson@axis.com>
Fixes: 4d46a67a3eb8 ("MIPS: Fix race condition in lazy cache flushing.")
Cc: Steven J. Hill <sjhill@realitydiluted.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable <stable@vger.kernel.org> # v4.1+
Patchwork: https://patchwork.linux-mips.org/patch/12722/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit f4281bba818105c7c91799abe40bc05c0dbdaa25 ]
The following patch will expose __update_cache to highmem pages. Handle
them by mapping them in for the duration of the cache maintenance, just
like in __flush_dcache_page. The code for that isn't shared because we
need the page address in __update_cache so sharing became messy. Given
that the entirity is an extra 5 lines, just duplicate it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lars Persson <lars.persson@axis.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable <stable@vger.kernel.org> # v4.1+
Patchwork: https://patchwork.linux-mips.org/patch/12721/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 234859e49a15323cf1b2331bdde7f658c4cb45fb ]
When flush_dcache_page is called on an executable page, that page is
about to be provided to userland & we can presume that the icache
contains no valid entries for its address range. However if the icache
does not fill from the dcache then we cannot presume that the pages
content has been written back as far as the memories that the dcache
will fill from (ie. L2 or further out).
This was being done for lowmem pages, but not for highmem which can lead
to icache corruption. Fix this by mapping highmem pages & flushing their
content from the dcache in __flush_dcache_page before providing the page
to userland, just as is done for lowmem pages.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Lars Persson <lars.persson@axis.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12720/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit c2078d9ef600bdbe568c89e5ddc2c6f15b7982c8 ]
This reverts commit 89a51df5ab1d38b257300b8ac940bbac3bb0eb9b.
The function eeh_add_device_early() is used to perform EEH
initialization in devices added later on the system, like in
hotplug/DLPAR scenarios. Since the commit 89a51df5ab1d ("powerpc/eeh:
Fix crash in eeh_add_device_early() on Cell") a new check was introduced
in this function - Cell has no EEH capabilities which led to kernel oops
if hotplug was performed, so checking for eeh_enabled() was introduced
to avoid the issue.
However, in architectures that EEH is present like pSeries or PowerNV,
we might reach a case in which no PCI devices are present on boot time
and so EEH is not initialized. Then, if a device is added via DLPAR for
example, eeh_add_device_early() fails because eeh_enabled() is false,
and EEH end up not being enabled at all.
This reverts the aforementioned patch since a new verification was
introduced by the commit d91dafc02f42 ("powerpc/eeh: Delay probing EEH
device during hotplug") and so the original Cell issue does not happen
anymore.
Cc: stable@vger.kernel.org # v4.1+
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 5a0cdbfd17b90a89c64a71d8aec9773ecdb20d0d ]
The function eeh_pe_reset_and_recover() is used to recover EEH
error when the passthrou device are transferred to guest and
backwards. The content in the device's config space will be lost
on PE reset issued in the middle of the recovery. The function
saves/restores it before/after the reset. However, config access
to some adapters like Broadcom BCM5719 at this point will causes
fenced PHB. The config space is always blocked and we save 0xFF's
that are restored at late point. The memory BARs are totally
corrupted, causing another EEH error upon access to one of the
memory BARs.
This restores the config space on those adapters like BCM5719
from the content saved to the EEH device when it's populated,
to resolve above issue.
Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Cc: stable@vger.kernel.org #v3.18+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit affeb0f2d3a9af419ad7ef4ac782e1540b2f7b28 ]
The function eeh_pe_reset_and_recover() is used to recover EEH
error when the passthrough device are transferred to guest and
backwards, meaning the device's driver is vfio-pci or none.
When the driver is vfio-pci that provides error_detected() error
handler only, the handler simply stops the guest and it's not
expected behaviour. On the other hand, no error handlers will
be called if we don't have a bound driver.
This ignores the error handler in eeh_pe_reset_and_recover()
that reports the error to device driver to avoid the exceptional
behaviour.
Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Cc: stable@vger.kernel.org #v3.18+
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit b45bacd2d048f405c7760e5cc9b60dd67708734f ]
Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.
We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.16.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 4355c44f063d3de4f072d796604c7f4ba4085cc3 ]
There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.
This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.
With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.
Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.16.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 41fa29e4d8cf4150568a0fe9bb4d62229f9caed5 ]
Error recovery pointers for fixups was improperly set as ".word"
which is unsuitable for MIPS64.
Replaced by STR(PTR)
[ralf@linux-mips.org: Apply changes as requested in the review process.]
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Fixes: b0a668fb2038 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
Cc: macro@linux-mips.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/9911/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 81a76d7119f63c359750e4adeff922a31ad1135f ]
When showing backtraces in response to traps, for example crashes and
address errors (usually unaligned accesses) when they are set in debugfs
to be reported, unwind_stack will be used if the PC was in the kernel
text address range. However since EVA it is possible for user and kernel
address ranges to overlap, and even without EVA userland can still
trigger an address error by jumping to a KSeg0 address.
Adjust the check to also ensure that it was running in kernel mode. I
don't believe any harm can come of this problem, since unwind_stack() is
sufficiently defensive, however it is only meant for unwinding kernel
code, so to be correct it should use the raw backtracing instead.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/11701/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit d2941a975ac745c607dfb590e92bb30bc352dad9)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit a816b306c62195b7c43c92cb13330821a96bdc27 ]
When unwinding through IRQs and exceptions, the unwinding only continues
if the PC is a kernel text address, however since EVA it is possible for
user and kernel address ranges to overlap, potentially allowing
unwinding to continue to user mode if the user PC happens to be in the
kernel text address range.
Adjust the check to also ensure that the register state from before the
exception is actually running in kernel mode, i.e. !user_mode(regs).
I don't believe any harm can come of this problem, since the PC is only
output, the stack pointer is checked to ensure it resides within the
task's stack page before it is dereferenced in search of the return
address, and the return address register is similarly only output (if
the PC is in a leaf function or the beginning of a non-leaf function).
However unwind_stack() is only meant for unwinding kernel code, so to be
correct the unwind should stop there.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15+
Patchwork: https://patchwork.linux-mips.org/patch/11700/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 5daebc477da4dfeb31ae193d83084def58fd2697 ]
Commit 85efde6f4e0d ("make exported headers use strict posix types")
changed the asm-generic siginfo.h to use the __kernel_* types, and
commit 3a471cbc081b ("remove __KERNEL_STRICT_NAMES") make the internal
types accessible only to the kernel, but the MIPS implementation hasn't
been updated to match.
Switch to proper types now so that the exported asm/siginfo.h won't
produce quite so many compiler errors when included alone by a user
program.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Christopher Ferris <cferris@google.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.30-
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12477/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit 5bb1cc0ff9a6b68871970737e6c4c16919928d8b ]
Currently, pmd_present() only checks for a non-zero value, returning
true even after pmd_mknotpresent() (which only clears the type bits).
This patch converts pmd_present() to using pte_present(), similar to the
other pmd_*() checks. As a side effect, it will return true for
PROT_NONE mappings, though they are not yet used by the kernel with
transparent huge pages.
For consistency, also change pmd_mknotpresent() to only clear the
PMD_SECT_VALID bit, even though the PMD_TABLE_BIT is already 0 for block
mappings (no functional change). The unused PMD_SECT_PROT_NONE
definition is removed as transparent huge pages use the pte page prot
values.
Fixes: 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents")
Cc: <stable@vger.kernel.org> # 3.15+
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit f5b556c94c8490d42fea79d7b4ae0ecbc291e69d ]
This makes the ath79 bootconsole behave the same way as the generic 8250
bootconsole.
Also waiting for TEMT (transmit buffer is empty) instead of just THRE
(transmit buffer is not full) ensures that all characters have been
transmitted before the real serial driver starts reconfiguring the serial
controller (which would sometimes result in garbage being transmitted.)
This change does not cause a visible performance loss.
In addition, this seems to fix a hang observed in certain configurations on
many AR7xxx/AR9xxx SoCs during autoconfig of the real serial driver.
A more complete follow-up patch will disable 8250 autoconfig for ath79
altogether (the serial controller is detected as a 16550A, which is not
fully compatible with the ath79 serial, and the autoconfig may lead to
undefined behavior on ath79.)
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|
|
[ Upstream commit d4b9e0790aa764c0b01e18d4e8d33e93ba36d51f ]
The ARM architecture mandates that when changing a page table entry
from a valid entry to another valid entry, an invalid entry is first
written, TLB invalidated, and only then the new entry being written.
The current code doesn't respect this, directly writing the new
entry and only then invalidating TLBs. Let's fix it up.
Cc: <stable@vger.kernel.org>
Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
|