Age | Commit message | Author
2015-09-09 | ARM64: perf: override arch_perf_userspace_access() [linux-lng-preempt-rt-3.18.16-2015.10, linux-lng-preempt-rt-3.18.16-2015.09, linux-linaro-lng-v3.18-rt-test, linux-linaro-lng-v3.18-rt] | Yogesh Tillu
Override the implementation of arch_perf_userspace_access() to enable or disable access to "perf hw counter" from userspace for mmap. Signed-off-by: Yogesh Tillu <yogesh.tillu@linaro.org> Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
2015-09-09 | ARM64: perf: enable/disable access to "perf hw counter" from userspace for mmap | Yogesh Tillu
__weak implementation of arch_perf_userspace_access() to enable or disable access to "perf hw counter" from userspace at runtime for mmap. Signed-off-by: Yogesh Tillu <yogesh.tillu@linaro.org> Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
2015-09-09 | ARM64: perf: add support for accessing counters from userspace with mmap way | Yogesh Tillu
This patch adds support for accessing perf hw counters from userspace with usage of perf_event_mmap_page. Signed-off-by: Yogesh Tillu <yogesh.tillu@linaro.org> Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
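As a rough illustration of the userspace side, the sketch below follows the sequence-count read loop described in the comments of the kernel's perf_event.h header for perf_event_mmap_page. The names mock_mmap_page, mock_read_hw_counter() and read_event_count() are hypothetical stand-ins and are not part of this patch; a real ARM64 reader would map the page obtained through perf_event_open() and read a PMU counter register instead of calling the mock function.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the fields of perf_event_mmap_page used here. */
struct mock_mmap_page {
    volatile uint32_t lock;  /* sequence count, changes while the kernel updates the page */
    uint32_t index;          /* hardware counter index + 1, or 0 if not directly readable */
    int64_t offset;          /* value to add to the raw hardware count */
};

/* Hypothetical counter read; real code would read a hardware register. */
static uint64_t mock_read_hw_counter(uint32_t idx)
{
    return 1000u + idx;
}

/* Read the event count, retrying if the page changed during the read. */
static int64_t read_event_count(const struct mock_mmap_page *pc)
{
    uint32_t seq;
    int64_t count;

    do {
        seq = pc->lock;
        atomic_thread_fence(memory_order_seq_cst);
        count = pc->offset;
        if (pc->index)
            count += (int64_t)mock_read_hw_counter(pc->index - 1);
        atomic_thread_fence(memory_order_seq_cst);
    } while (pc->lock != seq);

    return count;
}
```

When index is zero the counter is not directly readable from userspace and only the offset maintained by the kernel is returned.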
2015-09-09 | linaro/configs/kvm-host.conf: Add missing fragments for x86. | Christian Ziethén
Adds a few x86 specific config fragments to enable building a host kernel for x86. Signed-off-by: Christian Ziethen <christian.ziethen@linaro.org> Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
2015-09-09 | linaro/configs/kvm-guest: enable virtio-net-pci devices | Christian Ziethén
This enables running a virtio-net-pci device in a guest. Added to kvm-guest and not kvm-host as this affects the driver in the guest. The host side implementation is in userspace. Signed-off-by: Christian Ziethen <christian.ziethen@linaro.org> Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
2015-09-09 | Merge branch 'config-boards-3.19' into linux-linaro-lng-v3.18-rt | Gary S. Robertson
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org> Conflicts: linaro/configs/arndale.conf linaro/configs/omap4.conf linaro/configs/vexpress.conf linaro/configs/vexpress64.conf
2015-09-09 | Merge branch 'config-core-3.18' into linux-linaro-lng-v3.18-rt | Gary S. Robertson
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org> Conflicts: linaro/configs/android.conf linaro/configs/big-LITTLE-IKS.conf linaro/configs/distribution.conf linaro/configs/linaro-base.conf
2015-08-19 | nohz: Set isolcpus when nohz_full is set [linux-lng-preempt-rt-3.18.16-2015.08] | Chris Metcalf
nohz_full is only useful when isolcpus is also set, since otherwise the scheduler has to run periodically to try to determine whether to steal work from other cores. Accordingly, when booting with nohz_full=xxx on the command line, we should act as if isolcpus=xxx was also set, and set (or extend) the isolcpus set to include the nohz_full cpus. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Jones <davej@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430928266-24888-5-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-19 | nohz: Add tick_nohz_full_add_cpus_to() API | Chris Metcalf
This API is useful to modify a cpumask indicating some special nohz-type functionality so that the nohz cores are automatically added to that set. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Jones <davej@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429024675-18938-1-git-send-email-cmetcalf@ezchip.com Link: http://lkml.kernel.org/r/1430928266-24888-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
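The effect of the two nohz commits above can be pictured with a toy bitmask model. This is only a sketch: toy_cpumask, toy_nohz_full_add_cpus_to() and effective_isolcpus() are hypothetical names, and a 64-bit integer stands in for the kernel's cpumask type.

```c
#include <stdint.h>

/* One bit per CPU; a toy replacement for the kernel's cpumask type. */
typedef uint64_t toy_cpumask;

/* Toy model of the helper: add every nohz_full CPU to the given mask. */
static void toy_nohz_full_add_cpus_to(toy_cpumask nohz_full, toy_cpumask *mask)
{
    *mask |= nohz_full;
}

/* Compute the isolated set that results from the two boot parameters. */
static toy_cpumask effective_isolcpus(toy_cpumask isolcpus, toy_cpumask nohz_full)
{
    toy_nohz_full_add_cpus_to(nohz_full, &isolcpus);
    return isolcpus;
}
```

With nohz_full covering CPUs 2-3 and isolcpus covering CPU 0, the resulting isolated set covers CPUs 0, 2 and 3.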
2015-08-09 | Merge tag 'lsk-v3.18-15.07-rt' of http://git.linaro.org/kernel/linux-linaro-stable into linux-linaro-lng-v3.18-rt | Gary S. Robertson
LSK RT 15.07 v3.18
2015-07-22 | hrtimer.c: remove extraneous braces [linux-lng-preempt-rt-3.18.13-2015.07] | Gary S. Robertson
Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
2015-07-22 | Revert "tick: SHUTDOWN event-dev if no events are required for KTIME_MAX" | Gary S. Robertson
This reverts commit c817b87cb66410545e0b45f05a015d3b6bc2cec3. Per request from the patch's author.
2015-07-22 | clocksource: exynos: migrate to new per-mode set_mode_*() callbacks | Viresh Kumar
In order to support ONESHOT_STOPPED mode. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-22 | x86: apic: migrate to new per-mode set_mode_*() callbacks | Viresh Kumar
In order to support ONESHOT_STOPPED mode. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-22 | clockevents: Stop unused clockevent device | Viresh Kumar
A clockevent device can now be stopped by switching it to ONESHOT_STOPPED mode, to avoid getting spurious interrupts on a tickless CPU. This patch switches the mode to ONESHOT_STOPPED at three different places, and the reasoning behind them is as follows. 1.) NOHZ_MODE_LOWRES Timers & hrtimers depend on the tick in this mode, and the only place from which the clockevent device is programmed is the tick code. So, we only need to switch the clockevent device to ONESHOT_STOPPED mode once ticks aren't required anymore. And the only call site is: tick_nohz_stop_sched_tick(). In LOWRES mode we skip reprogramming the clockevent device here if expires == KTIME_MAX. In addition to that we must also switch the clockevent device to ONESHOT_STOPPED mode to avoid all spurious interrupts that may follow. 2.) NOHZ_MODE_HIGHRES Tick & timers depend on hrtimers in this mode, and the only place from which the clockevent device is programmed is the hrtimer code. There are two places here from which we reprogram the clockevent device or skip reprogramming it on expires == KTIME_MAX. Instead of only skipping the reprogramming of the clockevent device, also switch its mode to ONESHOT_STOPPED so that it doesn't generate any spurious interrupts. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18 - manually applied a few patches.] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | clockevents: Restart clockevent device before using it | Viresh Kumar
The clockevent device might have been switched to ONESHOT_STOPPED mode to avoid getting spurious interrupts on a tickless CPU. Before reprogramming the next event, we must reconfigure the clockevent device to ONESHOT mode if required. This patch switches the mode to ONESHOT at three different places, and the reasoning behind them is as follows. 1.) NOHZ_MODE_LOWRES Timers & hrtimers depend on the tick in this mode, and the only place from which the clockevent device is programmed is the tick code. So, we need to switch the clockevent device to ONESHOT mode before we start using it. Two routines can restart ticks here in LOWRES mode: tick_nohz_stop_sched_tick() and tick_nohz_restart(). 2.) NOHZ_MODE_HIGHRES Tick & timers depend on hrtimers in this mode, and the only place from which the clockevent device is programmed is the hrtimer code. Only hrtimer_reprogram() is responsible for programming the clockevent device for the next event, if the clockevent device was stopped earlier. And updating that alone is sufficient here. To make sure we haven't missed any corner case, add a WARN() for the case where we try to reprogram the clockevent device while we aren't configured in ONESHOT_STOPPED mode. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
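The stop and restart behaviour described in the two clockevents commits above can be sketched as a small state machine. This is a toy model under stated assumptions, not the kernel code: toy_clockevent, toy_mode and toy_program_event() are hypothetical names, and the real logic is spread across the tick and hrtimer code paths named in the commit messages.

```c
#include <stdint.h>

#define TOY_KTIME_MAX INT64_MAX

enum toy_mode { TOY_ONESHOT, TOY_ONESHOT_STOPPED };

struct toy_clockevent {
    enum toy_mode mode;
    int64_t next_event;  /* last programmed expiry, in nanoseconds */
};

/* Program the next event, stopping the device when nothing is pending. */
static void toy_program_event(struct toy_clockevent *dev, int64_t expires)
{
    if (expires == TOY_KTIME_MAX) {
        /* No timers pending: stop the device instead of leaving the old expiry armed. */
        dev->mode = TOY_ONESHOT_STOPPED;
        return;
    }
    /* A stopped device must be switched back to ONESHOT before it is programmed. */
    if (dev->mode == TOY_ONESHOT_STOPPED)
        dev->mode = TOY_ONESHOT;
    dev->next_event = expires;
}
```

The point of the model is that an expiry of KTIME_MAX changes the mode rather than the programmed time, and that any later real expiry restores ONESHOT mode first.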
2015-07-22 | clockevents: Introduce CLOCK_EVT_MODE_ONESHOT_STOPPED mode | Viresh Kumar
When no timers/hrtimers are pending, the expiry time is set to a special value: 'KTIME_MAX'. This normally happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes. When 'expiry == KTIME_MAX', we either cancel the 'tick-sched' hrtimer (NOHZ_MODE_HIGHRES) or skip reprogramming the clockevent device (NOHZ_MODE_LOWRES). But the clockevent device is already reprogrammed from the tick handler for the next tick. As the clockevent device is programmed in ONESHOT mode, it will fire at least one more time (unnecessarily). Timers on many implementations (like arm_arch_timer, powerpc, etc.) only support PERIODIC mode and their drivers emulate ONESHOT over that, which means that on these platforms we will get spurious interrupts at the last programmed interval rate, normally the tick rate. In order to avoid spurious interrupts/wakeups, the clockevent device should be stopped or its interrupts should be masked. A simple (yet hacky) solution to get this fixed could be: update hrtimer_force_reprogram() to always reprogram the clockevent device and update clockevent drivers to STOP generating events (or delay them to the max time) when 'expires' is set to KTIME_MAX. But the drawback here is that every clockevent driver has to be hacked for this particular case and it's very easy for new ones to miss this. However, Thomas suggested adding an optional mode, ONESHOT_STOPPED, to solve this problem: lkml.org/lkml/2014/5/9/508. This patch adds support for ONESHOT_STOPPED mode in the clockevents core. It will only be available to drivers that implement the mode-specific set-mode callbacks instead of the legacy ->set_mode() callback. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-22 | clocksource: Add (missing) default case for switch blocks | Viresh Kumar
Many clockevent drivers use a switch block for handling modes in their ->set_mode() callback. Some of these do not have a 'default' case, and adding a new mode to the enum clock_event_mode starts giving warnings for these platforms about unhandled modes. This patch adds default cases for them. In order to keep things simple, add these two lines to the switch blocks: default: break; This can lead to different behavior for individual cases. Some of the drivers don't do any special stuff in their ->set_mode() callback before or after the switch blocks. And so this default case would simply return for them without any updates to the clockevent device. But in some cases, the clockevent device is stopped as soon as we enter the ->set_mode() callback and so it will stay stopped if we hit the default case. The rationale behind this approach was that the default case *will never* be hit during execution of code. All new modes (beyond RESUME) are handled with mode specific ->set_mode_*() callbacks and ->set_mode() is never called for them. And all modes before and including RESUME are handled by the clockevent drivers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18; ignored these files as they don't exist in the 3.18 kernel: arch/mips/loongson/loongson-3/hpet.c, arch/nios2/kernel/time.c] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
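A minimal sketch of the pattern this commit applies, using a made-up driver: toy_evt_mode, toy_timer and toy_set_mode() are hypothetical, and only the trailing "default: break;" corresponds to what the patch adds to real drivers.

```c
/* Toy copy of the mode enum, with one mode newer than RESUME. */
enum toy_evt_mode {
    TOY_MODE_UNUSED,
    TOY_MODE_SHUTDOWN,
    TOY_MODE_PERIODIC,
    TOY_MODE_ONESHOT,
    TOY_MODE_RESUME,
    TOY_MODE_ONESHOT_STOPPED  /* newer mode the legacy callback is never called with */
};

struct toy_timer {
    int running;
    int periodic;
};

/* Legacy-style callback: handles the modes up to RESUME explicitly. */
static void toy_set_mode(enum toy_evt_mode mode, struct toy_timer *t)
{
    switch (mode) {
    case TOY_MODE_PERIODIC:
        t->running = 1;
        t->periodic = 1;
        break;
    case TOY_MODE_ONESHOT:
        t->running = 1;
        t->periodic = 0;
        break;
    case TOY_MODE_UNUSED:
    case TOY_MODE_SHUTDOWN:
        t->running = 0;
        break;
    case TOY_MODE_RESUME:
        break;
    default:
        /* Silences the unhandled-enumerator warning; leaves the device untouched. */
        break;
    }
}
```

In this toy driver nothing happens before the switch, so hitting the default case leaves the timer state exactly as it was.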
2015-07-22 | clockevents: Introduce mode specific callbacks | Viresh Kumar
It is not possible for the clockevents core to know which modes (other than those with a corresponding feature flag) are supported by a particular implementation. And drivers are expected to handle transition to all modes elegantly, as ->set_mode() would be issued for them unconditionally. Now, adding support for a new mode complicates things a bit if we want to use the legacy ->set_mode() callback. We need to closely review all clockevents drivers to see if they would break on addition of a new mode. And after such reviews, it was found that non-trivial changes are needed in most of the drivers [1]. Introduce mode-specific set_mode_*() callbacks, some of which the drivers may or may not implement. A missing callback would clearly convey the message that the corresponding mode isn't supported. A driver may still choose to keep supporting the legacy ->set_mode() callback, but ->set_mode() wouldn't be supporting any new modes beyond RESUME. If a driver wants to benefit from a new mode, it would be required to migrate to the mode specific callbacks. The legacy ->set_mode() callback and the newly introduced mode-specific callbacks are mutually exclusive. Only one of them should be supported by the driver. A sanity check is done at the time of registration to distinguish between optional and required callbacks and to make error recovery and handling simpler. If the legacy ->set_mode() callback is provided, all mode specific ones would be ignored by the core. Call sites calling ->set_mode() directly are also updated to use __clockevents_set_mode() instead, as ->set_mode() may not be available anymore for a few drivers. [1] https://lkml.org/lkml/2014/12/9/605 [2] https://lkml.org/lkml/2015/1/23/255 Suggested-by: Thomas Gleixner <tglx@linutronix.de> [2] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
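The registration-time sanity check could look roughly like the sketch below. The struct toy_ced, its feature flags and toy_sanity_check() are hypothetical; in particular, which callbacks count as required (shutdown always, periodic and oneshot only when the matching feature flag is set) is an assumption made for illustration, since the commit message does not spell the rules out.

```c
#include <stddef.h>

#define TOY_FEAT_PERIODIC 0x1u
#define TOY_FEAT_ONESHOT  0x2u

/* Toy device: either the legacy callback or the per-mode callbacks are set. */
struct toy_ced {
    unsigned int features;
    void (*set_mode)(int mode);        /* legacy, covers modes up to RESUME */
    int (*set_mode_periodic)(void);
    int (*set_mode_oneshot)(void);
    int (*set_mode_shutdown)(void);
};

/* Returns 0 if the device may be registered, -1 otherwise. */
static int toy_sanity_check(const struct toy_ced *dev)
{
    /* Legacy callback present: the per-mode callbacks are ignored. */
    if (dev->set_mode)
        return 0;
    /* Assumed rule: shutdown is always required. */
    if (!dev->set_mode_shutdown)
        return -1;
    /* Assumed rule: an advertised feature needs its callback. */
    if ((dev->features & TOY_FEAT_PERIODIC) && !dev->set_mode_periodic)
        return -1;
    if ((dev->features & TOY_FEAT_ONESHOT) && !dev->set_mode_oneshot)
        return -1;
    return 0;
}

/* Example callbacks for trying the check out. */
static int toy_noop(void) { return 0; }
static void toy_legacy(int mode) { (void)mode; }
```

A device that advertises ONESHOT but provides only a shutdown callback is rejected, while the same device with the legacy callback set is accepted without looking at the per-mode callbacks.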
2015-07-22 | no-hz_full: build fix | Santosh Shukla
Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | hrtimer.h: prevent pinned timer state from breaking inactive test | Gary S. Robertson
An hrtimer may be pinned to a CPU but inactive, so it is no longer valid to test the hrtimer.state struct member as having no bits set when inactive. Changed the test function to mask out the HRTIMER_STATE_PINNED bit when checking for inactive state. Signed-off-by: Gary S. Robertson <gary.robertson@linaro.org>
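The masked test can be sketched as follows. The TOY_ prefixed constants and toy_hrtimer_is_inactive() are hypothetical, and the bit position chosen for the pinned flag is an assumption; only the idea of masking out the pinned bit before comparing comes from the commit message.

```c
#include <stdbool.h>

#define TOY_HRTIMER_STATE_INACTIVE 0x00ul
#define TOY_HRTIMER_STATE_ENQUEUED 0x01ul
#define TOY_HRTIMER_STATE_PINNED   0x08ul  /* assumed bit position */

/* A timer is inactive when no state bit other than PINNED is set. */
static bool toy_hrtimer_is_inactive(unsigned long state)
{
    return (state & ~TOY_HRTIMER_STATE_PINNED) == TOY_HRTIMER_STATE_INACTIVE;
}
```

A pinned but otherwise idle timer is therefore reported as inactive, while a pinned and enqueued timer is not.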
2015-07-22 | hrtimer: make sure PINNED flag is cleared after removing hrtimer | Viresh Kumar
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | sched/nohz: add debugfs control over sched_tick_max_deferment | Kevin Hilman
Allow debugfs override of sched_tick_max_deferment in order to ease finding/fixing the remaining issues with full nohz. The value to be written is in jiffies, and -1 means the max deferment is disabled (scheduler_tick_max_deferment() returns KTIME_MAX.) Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>
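How the debugfs value might be interpreted can be sketched as below. This is a simplification: toy_tick_max_deferment_ns() is a hypothetical helper that converts the setting itself to nanoseconds, whereas the kernel function works relative to the time of the last scheduler tick, and TOY_HZ is an assumed tick rate.

```c
#include <stdint.h>

#define TOY_KTIME_MAX    INT64_MAX
#define TOY_HZ           250            /* assumed CONFIG_HZ */
#define TOY_NSEC_PER_SEC 1000000000LL

/* Convert a deferment given in jiffies to nanoseconds; -1 disables the limit. */
static int64_t toy_tick_max_deferment_ns(long deferment_jiffies)
{
    if (deferment_jiffies == -1)
        return TOY_KTIME_MAX;
    return (int64_t)deferment_jiffies * (TOY_NSEC_PER_SEC / TOY_HZ);
}
```

With the assumed tick rate, a setting of 250 jiffies corresponds to one second of allowed deferment.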
2015-07-22 | tick: SHUTDOWN event-dev if no events are required for KTIME_MAX | Viresh Kumar
When expires is set to KTIME_MAX in tick_program_event(), we are sure that there are no events enqueued for a very long time, and so there is no point in keeping the event device running. We would be interrupted many times without any work to do, for example when the timer's counter overflows. So, it's better to SHUTDOWN the event device then and restart it once we get a request for the next event. To implement this, a new field 'last_mode' is added to 'struct clock_event_device' to keep track of the last mode used. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-22 | hrtimer: reprogram event for expires=KTIME_MAX in hrtimer_force_reprogram() | Viresh Kumar
In hrtimer_force_reprogram(), we reprogram the event device only if the next timer event is before KTIME_MAX. But what if it is equal to KTIME_MAX? As we aren't reprogramming it again, it will stay at the last value it was set to, probably the tick interval, i.e. a few milliseconds. We will then get an interrupt, have no hrtimers to service, and return without doing much. The implementation of the event device's driver may make this worse. For example: drivers/clocksource/arm_arch_timer.c disables the event device only on SHUTDOWN/UNUSED requests in set-mode. Otherwise, it will keep giving interrupts at the tick interval even if hrtimer_interrupt() didn't reprogram the tick. To get this fixed, let's reprogram the event device even for KTIME_MAX, so that the timer is scheduled far enough in the future. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18 kernel] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | sched: don't queue timers on quiesced CPUs | Viresh Kumar
CPUSets have cpusets.quiesce sysfs file now, with which some CPUs can opt for isolating themselves from background kernel activities, like: timers & hrtimers. get_nohz_timer_target() is used for finding suitable CPU for firing a timer. To guarantee that new timers wouldn't be queued on quiesced CPUs, we need to modify this routine. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-22 | cpuset: Create sysfs file: cpusets.quiesce to isolate CPUs | Viresh Kumar
For networking applications, platforms need to provide one CPU per user space data plane thread. These CPUs shouldn't be interrupted by the kernel at all unless userspace has requested some functionality. Currently, there are background kernel activities that are running on almost every CPU, like: timers/hrtimers/watchdogs/etc, and these are required to be migrated to other CPUs. To achieve that, this patch adds another option to cpusets, i.e. 'quiesce'. Writing '1' to this file would migrate these unbound/unpinned timers/hrtimers away from the CPUs of the cpuset in question. It would also disallow addition of any new unpinned timers/hrtimers to isolated CPUs (this would be handled in the next patch). Writing '0' will disable isolation of CPUs in the current cpuset, and unpinned timers/hrtimers would be allowed on these CPUs in future. Currently, only timers and hrtimers are migrated. This would be followed by other kernel infrastructure later if required. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | hrtimer: create hrtimer_quiesce_cpu() to isolate CPU from hrtimers | Viresh Kumar
To isolate CPUs (isolate from hrtimers) from sysfs using cpusets, we need some support from the hrtimer core, i.e. a routine hrtimer_quiesce_cpu() which would migrate away all the unpinned hrtimers, but shouldn't touch the pinned ones. This patch creates this routine. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | hrtimer: update timer->state with 'pinned' information | Viresh Kumar
'Pinned' information would be required in migrate_hrtimers() now, as we can migrate non-pinned timers away without a hotplug (i.e. with cpuset.quiesce). And so we may need to identify pinned timers now, as we can't migrate them. This patch reuses the timer->state variable for setting this flag, as there were enough free bits available in this variable and there is no point increasing the size of this struct by adding another field. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | timer: create timer_quiesce_cpu() to isolate CPU from timers | Viresh Kumar
To isolate CPUs (isolate from timers) from sysfs using cpusets, we need some support from the timer core, i.e. a routine timer_quiesce_cpu() which would migrate away all the unpinned timers, but shouldn't touch the pinned ones. This patch creates this routine. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
2015-07-22 | timer: track pinned timers with TIMER_PINNED flag | Viresh Kumar
In order to quiesce a CPU on which isolation might be required, we need to move away all the timers queued on that CPU. There are two types of timers queued on any CPU: those that are pinned to that CPU, and those that can run on any CPU but are queued on the CPU in question. We need to migrate only the second type of timers away from the CPU entering the quiesce state. For this we need some basic infrastructure in the timer core to identify which timers are pinned and which are not. Hence, this patch adds another flag bit, TIMER_PINNED, which will be set only for the timers which are pinned to a CPU. It also removes the 'pinned' parameter of __mod_timer() as it is no longer required. NOTE: One functional change worth mentioning. Existing behavior: add_timer_on() followed by multiple mod_timer() calls wouldn't pin the timer on the CPU mentioned in add_timer_on(). New behavior: add_timer_on() followed by multiple mod_timer() calls would pin the timer on the CPU running mod_timer(). I didn't give much attention to this as we should call mod_timer_on() for the timers queued with add_timer_on(). Though if required we can simply clear the TIMER_PINNED flag in mod_timer(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [forward port to 3.18] Signed-off-by: Santosh Shukla <santosh.shukla@linaro.org>
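The split between pinned and unpinned timers during quiesce can be sketched with a flat array. This is a toy model only: toy_timer, TOY_TIMER_PINNED and toy_timer_quiesce_cpu() are hypothetical names, and the kernel keeps timers in per-CPU structures rather than in an array with a cpu field.

```c
#include <stddef.h>

#define TOY_TIMER_PINNED 0x1u

struct toy_timer {
    unsigned int flags;  /* TOY_TIMER_PINNED if the timer must stay on its CPU */
    int cpu;             /* CPU the timer is currently queued on */
};

/* Move every unpinned timer queued on `cpu` to `target`; return how many moved. */
static size_t toy_timer_quiesce_cpu(struct toy_timer *timers, size_t n,
                                    int cpu, int target)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        if (timers[i].cpu != cpu)
            continue;
        if (timers[i].flags & TOY_TIMER_PINNED)
            continue;  /* pinned timers are left alone */
        timers[i].cpu = target;
        moved++;
    }
    return moved;
}
```

Only timers that are both queued on the quiesced CPU and unpinned change CPU; everything else is left where it is.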
2015-06-29 | Merge branch 'linux-linaro-lsk-v3.18' into linux-linaro-lsk-v3.18-rt [lsk-v3.18-16.01-rt, lsk-v3.18-15.12-rt, lsk-v3.18-15.11-rt, lsk-v3.18-15.10-rt, lsk-v3.18-15.09-rt, lsk-v3.18-15.08-rt, lsk-v3.18-15.07-rt] | Kevin Hilman
2015-06-29 | Merge tag 'v3.18.16-rt13-lno1' of git://git.linaro.org/people/anders.roxell/linux-rt into linux-linaro-lsk-v3.18-rt | Kevin Hilman
Linux 3.18.16-rt13
Changes since v3.18.13-rt10:
- arch/x86/kvm/mmu.c: work around gcc-4.4.4 bug
- md/raid0: fix restore to sector variable in raid0_make_request
* tag 'v3.18.16-rt13-lno1' of git://git.linaro.org/people/anders.roxell/linux-rt: (339 commits)
  Linux 3.18.16-rt13 REBASE
  workqueue: Prevent deadlock/stall on RT
  sched: Do not clear PF_NO_SETAFFINITY flag in select_fallback_rq()
  md: disable bcache
  rt,ntp: Move call to schedule_delayed_work() to helper thread
  scheduling while atomic in cgroup code
  cgroups: use simple wait in css_release()
  a few open coded completions
  completion: Use simple wait queues
  rcu-more-swait-conversions.patch
  kernel/treercu: use a simple waitqueue
  work-simple: Simple work queue implemenation
  simple-wait: rename and export the equivalent of waitqueue_active()
  wait-simple: Rework for use with completions
  wait-simple: Simple waitqueue implementation
  wait.h: include atomic.h
  drm/i915: drop trace_i915_gem_ring_dispatch on rt
  gpu/i915: don't open code these things
  cpufreq: drop K8's driver from beeing selected
  mmc: sdhci: don't provide hard irq handler
  ...
2015-06-29 | Merge tag 'v3.18.16' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into linux-linaro-lsk-v3.18-rt | Kevin Hilman
Linux 3.18.16
* tag 'v3.18.16' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable: (394 commits)
  Linux 3.18.16
  arch/x86/kvm/mmu.c: work around gcc-4.4.4 bug
  md/raid0: fix restore to sector variable in raid0_make_request
  Linux 3.18.15
  ARM: OMAP3: Fix booting with thumb2 kernel
  xfrm: release dst_orig in case of error in xfrm_lookup()
  ARC: unbork !LLSC build
  power/reset: at91: fix return value check in at91_reset_platform_probe()
  vfs: read file_handle only once in handle_to_path
  drm/radeon: partially revert "fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling"
  drm/radeon: don't share plls if monitors differ in audio support
  drm/radeon: retry dcpd fetch
  drm/radeon: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling
  drm/radeon: add new bonaire pci id
  iwlwifi: pcie: prevent using unmapped memory in fw monitor
  ACPI / init: Fix the ordering of acpi_reserve_resources()
  sd: Disable support for 256 byte/sector disks
  storvsc: Set the SRB flags correctly when no data transfer is needed
  rtlwifi: rtl8192cu: Fix kernel deadlock
  md/raid5: don't record new size if resize_stripes fails.
  ...
2015-06-29 | Merge remote-tracking branch 'v3.18/topic/dm-crypt' into linux-linaro-lsk-v3.18 | Alex Shi
2015-06-29 | dm crypt: fix missing error code return from crypt_ctr error path | Wei Yongjun
Fix to return a negative error code from crypt_ctr()'s optional parameter processing error path. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 44c144f9c8e8fbd73ede2848da8253b3aae42ec2) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: leverage immutable biovecs when decrypting on read | Mike Snitzer
Commit 003b5c571 ("block: Convert drivers to immutable biovecs") stopped short of changing dm-crypt to leverage the fact that the biovec array of a bio will no longer be modified. Switch to using bio_clone_fast() when cloning bios for decryption after read. Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 5977907937afa2b5584a874d44ba6c0f56aeaa9c) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: update URLs to new cryptsetup project page | Milan Broz
Cryptsetup home page moved to GitLab. Also remove link to abandoned Truecrypt page. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit e44f23b32dc7916b2bc12817e2f723fefa21ba41) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: sort writes | Mikulas Patocka
Write requests are sorted in a red-black tree structure and are submitted in the sorted order. In theory the sorting should be performed by the underlying disk scheduler, however, in practice the disk scheduler only accepts and sorts a finite number of requests. To allow the sorting of all requests, dm-crypt needs to implement its own sorting. The overhead associated with rbtree-based sorting is considered negligible so it is not used conditionally. Even on SSD sorting can be beneficial since in-order request dispatch promotes lower latency IO completion to the upper layers. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit b3c5fd3052492f1b8d060799d4f18be5a5438add) Signed-off-by: Alex Shi <alex.shi@linaro.org>
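The ordering idea can be shown with a much simpler stand-in for the red-black tree: the sketch below sorts a batch of pending writes by sector with the C library's qsort() before they would be submitted. The names toy_write, toy_cmp_sector() and toy_sort_writes() are hypothetical, and sorting a fixed array is a deliberate simplification of the incremental rbtree insertion the commit describes.

```c
#include <stdint.h>
#include <stdlib.h>

struct toy_write {
    uint64_t sector;  /* starting sector of the write */
    int id;           /* arrival order, for illustration only */
};

/* Order two writes by ascending sector without risking integer overflow. */
static int toy_cmp_sector(const void *a, const void *b)
{
    const struct toy_write *x = a;
    const struct toy_write *y = b;

    return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Sort the pending writes so they can be submitted in sector order. */
static void toy_sort_writes(struct toy_write *writes, size_t n)
{
    qsort(writes, n, sizeof(*writes), toy_cmp_sector);
}
```

Writes that finished encryption in the order of sectors 30, 10, 20 would be handed to the lower layers as 10, 20, 30.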
2015-06-29 | dm crypt: add 'submit_from_crypt_cpus' option | Mikulas Patocka
Make it possible to disable offloading writes by setting the optional 'submit_from_crypt_cpus' table argument. There are some situations where offloading write bios from the encryption threads to a single thread degrades performance significantly. The default is to offload write bios to the same thread because it benefits CFQ to have writes submitted using the same IO context. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 0f5d8e6ee758f7023e4353cca75d785b2d4f6abe) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: offload writes to thread | Mikulas Patocka
Submitting write bios directly in the encryption thread caused serious performance degradation. On a multiprocessor machine, encryption requests finish in a different order than they were submitted. Consequently, write requests would be submitted in a different order and it could cause severe performance degradation. Move the submission of write requests to a separate thread so that the requests can be sorted before submitting. But this commit improves dm-crypt performance even without having dm-crypt perform request sorting (in particular it enables IO schedulers like CFQ to sort more effectively). Note: it is required that a previous commit ("dm crypt: don't allocate pages for a partial request") be applied before applying this patch. Otherwise, this commit could introduce a crash. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit dc2676210c425ee8e5cb1bec5bc84d004ddf4179) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: remove unused io_pool and _crypt_io_pool | Mikulas Patocka
The previous commit ("dm crypt: don't allocate pages for a partial request") stopped using the io_pool slab mempool and backing _crypt_io_pool kmem cache. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 94f5e0243c48aa01441c987743dc468e2d6eaca2) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: avoid deadlock in mempools | Mikulas Patocka
Fix a theoretical deadlock introduced in the previous commit ("dm crypt: don't allocate pages for a partial request"). The function crypt_alloc_buffer may be called concurrently. If we allocate from the mempool concurrently, there is a possibility of deadlock. For example, if we have a mempool of 256 pages and two processes, each wanting 256 pages, allocate from the mempool concurrently, it may deadlock in a situation where both processes have allocated 128 pages and the mempool is exhausted. To avoid such a scenario we allocate the pages under a mutex. In order to not degrade performance with excessive locking, we try non-blocking allocations without a mutex first and if that fails, we fall back to a blocking allocation with a mutex. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 7145c241a1bf2841952c3e297c4080b357b3e52d) Signed-off-by: Alex Shi <alex.shi@linaro.org>
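The first, non-blocking half of this scheme can be sketched in a single-threaded toy model. The names toy_mempool and toy_try_alloc_pages() are hypothetical, and the model assumes that a failed attempt gives back whatever it had taken before the caller falls back to the mutex-protected blocking path, which is not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_mempool {
    size_t free_pages;  /* pages currently available in the pool */
};

/*
 * Non-blocking attempt: take pages one at a time and undo everything
 * on failure, so a request never sits on a partial allocation.
 * Returns true on success; on false the caller would take the mutex
 * and retry with blocking allocations.
 */
static bool toy_try_alloc_pages(struct toy_mempool *pool, size_t need)
{
    size_t got = 0;

    while (got < need) {
        if (pool->free_pages == 0) {
            pool->free_pages += got;  /* give back the partial allocation */
            return false;
        }
        pool->free_pages--;
        got++;
    }
    return true;
}
```

In the 256-page example from the commit message, a request that cannot be satisfied in full leaves the pool exactly as it found it, so two competing requests cannot end up holding 128 pages each while both wait.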
2015-06-29 | dm crypt: don't allocate pages for a partial request | Mikulas Patocka
Change crypt_alloc_buffer so that it only ever allocates pages for a full request. This is a prerequisite for the commit "dm crypt: offload writes to thread". This change simplifies the dm-crypt code at the expense of reduced throughput in low memory conditions (where allocation for a partial request is most useful). Note: the next commit ("dm crypt: avoid deadlock in mempools") is needed to fix a theoretical deadlock. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit cf2f1abfbd0dba701f7f16ef619e4d2485de3366) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-29 | dm crypt: use unbound workqueue for request processing | Mikulas Patocka
Use unbound workqueue by default so that work is automatically balanced between available CPUs. The original behavior of encrypting using the same cpu that IO was submitted on can still be enabled by setting the optional 'same_cpu_crypt' table argument. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit f3396c58fd8442850e759843457d78b6ec3a9589) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-24 | Merge branch 'v3.18.16-rt13' into v3.18-rt | Anders Roxell
2015-06-24 | Merge branch 'v3.18.13-rt10' into v3.18.16-rt13 | Anders Roxell
Used the "ours" merge strategy to throw away the previous -rt releases
2015-06-24 | Linux 3.18.16-rt13 REBASE | Steven Rostedt (Red Hat)
2015-06-24 | workqueue: Prevent deadlock/stall on RT | Thomas Gleixner
Austin reported an XFS deadlock/stall on RT where scheduled work never gets executed and tasks wait for each other forever. The underlying problem is the modification the RT code makes to the handling of workers which are about to go to sleep. In mainline a worker thread which goes to sleep wakes an idle worker if there is more work to do. This happens from the guts of the schedule() function. On RT this must be outside and the accessed data structures are not protected against scheduling due to the spinlock to rtmutex conversion. So the naive solution to this was to move the code outside of the scheduler and protect the data structures by the pool lock. That approach turned out to be a little naive as we cannot call into that code when the thread blocks on a lock, as it is not allowed to block on two locks in parallel. So we don't call into the worker wakeup magic when the worker is blocked on a lock, which causes the deadlock/stall observed by Austin and Mike. Looking deeper into that worker code it turns out that the only relevant data structure which needs to be protected is the list of idle workers which can be woken up. So the solution is to protect the list manipulation operations with preempt_enable/disable pairs on RT and call unconditionally into the worker code even when the worker is blocked on a lock. The preemption protection is safe as there is nothing which can fiddle with the list outside of thread context. Reported-and_tested-by: Austin Schuh <austin@peloton-tech.com> Reported-and_tested-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://vger.kernel.org/r/alpine.DEB.2.10.1406271249510.5170@nanos Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: stable-rt@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-06-24 | sched: Do not clear PF_NO_SETAFFINITY flag in select_fallback_rq() | Steven Rostedt
I talked with Peter Zijlstra about this, and he told me that the clearing of the PF_NO_SETAFFINITY flag was to deal with the optimization of migrate_disable/enable() that ignores tasks that have that flag set. But that optimization was removed when I did a rework of the cpu hotplug code. I found that ignoring tasks that had that flag set would cause those tasks to not sync with the hotplug code and cause the kernel to crash. Thus it needed to not treat them specially, and those tasks had to go through the same work as tasks without that flag set. Now that those tasks are not treated specially, there's no reason to clear the flag. May still need to be tested as the migrate_me() code does not ignore those flags. Cc: stable-rt@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140701111444.0cfebaa1@gandalf.local.home Signed-off-by: Thomas Gleixner <tglx@linutronix.de>