2012-02-07  Linux 3.0.20-rt35 REBASE  [tag: v3.0.20-rt35-rebase]  (Steven Rostedt)
2012-02-07  acpi-gpe-use-wait-simple.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  wait-simple: Simple waitqueue implementation  (Thomas Gleixner)

wait_queue is a swiss army knife and in most of the cases the complexity is not needed. For RT waitqueues are a constant source of trouble as we can't convert the head lock to a raw spinlock due to fancy and long lasting callbacks.

Provide a slim version, which allows RT to replace wait queues. This should go mainline as well, as it lowers memory consumption and runtime overhead.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
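To make the idea concrete, here is a minimal sketch of such a slim waitqueue (illustrative only; the struct and function names are assumptions, not necessarily the API the patch introduces): a raw spinlock protecting a plain list of waiters, with wake_up_process() as the only wake policy, so no callbacks ever run under the head lock.

    #include <linux/list.h>
    #include <linux/sched.h>
    #include <linux/spinlock.h>

    struct swait_head {                     /* hypothetical name */
            raw_spinlock_t   lock;          /* safe as raw: no callbacks under it */
            struct list_head list;
    };

    struct swaiter {                        /* hypothetical name */
            struct task_struct *task;
            struct list_head    node;
    };

    static void swait_wake_all(struct swait_head *h)
    {
            struct swaiter *w, *n;
            unsigned long flags;

            raw_spin_lock_irqsave(&h->lock, flags);
            list_for_each_entry_safe(w, n, &h->list, node) {
                    wake_up_process(w->task);       /* fixed wake policy only */
                    list_del_init(&w->node);
            }
            raw_spin_unlock_irqrestore(&h->lock, flags);
    }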
2012-02-07  sched: ttwu: Return success when only changing the saved_state value  (Thomas Gleixner)

When a task blocks on a rt lock, it saves the current state in p->saved_state, so a lock related wake up will not destroy the original state.

When a real wakeup happens, while the task is running due to a lock wakeup already, we update p->saved_state to TASK_RUNNING, but we do not return success, which might cause another wakeup in the waitqueue code and the task remains in the waitqueue list. Return success in that case as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
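Schematically (a simplified sketch, not the literal diff against try_to_wake_up()), the fix is to treat a wakeup that only promotes p->saved_state as a successful wakeup:

    /* Simplified: the real code sits inside try_to_wake_up() under p->pi_lock. */
    static int ttwu_state_match_sketch(struct task_struct *p, unsigned int state)
    {
            int success = 0;
            unsigned long flags;

            raw_spin_lock_irqsave(&p->pi_lock, flags);
            if (p->state & state) {
                    success = 1;                    /* normal wakeup path */
            } else if (p->saved_state & state) {
                    p->saved_state = TASK_RUNNING;  /* task already runs via lock wakeup */
                    success = 1;                    /* the fix: report success here too */
            }
            raw_spin_unlock_irqrestore(&p->pi_lock, flags);
            return success;
    }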
2012-02-07  ACPI: Convert embedded controller lock to raw spinlock  (Clark Williams)

Was seeing multiple "scheduling while atomic" backtraces on the 3.2-rc2-rt5 realtime kernel. This patch converts the spinlock in the ACPI embedded controller structure (curr_lock) to be a raw spinlock.

Signed-off-by: Clark Williams <williams@redhat.com>
Link: http://lkml.kernel.org/r/20111203093537.7d805f64@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
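The shape of such a conversion (illustrative, not the exact driver diff): the lock's type and every lock/unlock site change together, so the lock stays a true busy-waiting spinlock even on PREEMPT_RT.

    struct acpi_ec_sketch {
            raw_spinlock_t curr_lock;       /* was: spinlock_t curr_lock; */
    };

    static void ec_complete_transaction(struct acpi_ec_sketch *ec)
    {
            unsigned long flags;

            /* was: spin_lock_irqsave(&ec->curr_lock, flags); */
            raw_spin_lock_irqsave(&ec->curr_lock, flags);
            /* ... update the current transaction state ... */
            raw_spin_unlock_irqrestore(&ec->curr_lock, flags);
    }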
2012-02-07  slab, lockdep: Annotate all slab caches  (Peter Zijlstra)

Currently we only annotate the kmalloc caches; annotate all of them.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hans Schillstrom <hans@schillstrom.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: linux-mm@kvack.org
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-10bey2cgpcvtbdkgigaoab8w@git.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  slab, lockdep: Fix silly bug  (Peter Zijlstra)

Commit 30765b92 ("slab, lockdep: Annotate the locks before using them") moves the init_lock_keys() call from after g_cpucache_up = FULL to before it, and overlooks the fact that init_node_lock_keys() tests for it and ignores everything !FULL.

Introduce a LATE stage and change the lockdep test to be < LATE.

Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hans Schillstrom <hans@schillstrom.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: linux-mm@kvack.org
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-gadqbdfxorhia1w5ewmoiodd@git.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  acpi: Make gbl_[hardware|gpe]_lock raw  (Thomas Gleixner)

These locks are taken in the guts of the idle code and cannot be converted to "sleeping" spinlocks on RT.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  pci: Use __wake_up_all_locked in pci_unblock_user_cfg_access()  (Thomas Gleixner)

The waitqueue is protected by the pci_lock, so we can avoid taking the waitqueue lock itself. That prevents the might_sleep()/scheduling while atomic problem on RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  wait: Provide __wake_up_all_locked  (Thomas Gleixner)

For code which protects the waitqueue itself with another lock it makes no sense to acquire the waitqueue lock for the wake-all case. Provide __wake_up_all_locked().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
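A plausible sketch of the helper (hedged; __wake_up_common() is the existing internal worker in kernel/sched.c of that era): identical to __wake_up() minus taking q->lock, with nr_exclusive == 0 meaning "wake everyone". The PCI change above is then the first user, calling it while already holding pci_lock.

    void __wake_up_all_locked(wait_queue_head_t *q, unsigned int mode)
    {
            /* Caller serializes the waitqueue with its own lock (e.g. pci_lock),
             * so q->lock is deliberately not taken here. */
            __wake_up_common(q, mode, 0, 0, NULL);  /* nr_exclusive == 0: wake all */
    }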
2012-02-07  KVM: fix XSAVE bit scanning (now properly)  (Andre Przywara)

commit 123108f1c1aafd51d6a5c79cc04d7999dd88a930 tried to fix KVM's XSAVE valid feature scanning, but it was wrong. It was not considering the sparse nature of this bitfield, instead reading values from uninitialized members of the entries array. This patch now separates subleaf indices from KVM's array indices and fills the entry before querying its value. This fixes AVX support in KVM guests.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-07  KVM: Sanitize cpuid  (Avi Kivity)

Instead of blacklisting known-unsupported cpuid leaves, whitelist known-supported leaves. This is more conservative and prevents us from reporting features we don't support. Also whitelist a few more leaves while at it.

Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-02-07  intel-iommu: Fix AB-BA lockdep report  (Roland Dreier)

When unbinding a device so that I could pass it through to a KVM VM, I got the lockdep report below. It looks like a legitimate lock ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls iommu_support_dev_iotlb(), which takes device_domain_lock (inside iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock. It looks like a safe fix is to simply release device_domain_lock a bit earlier, since as far as I can tell, it doesn't protect any of the stuff accessed at the end of domain_remove_one_dev_info() anyway.

BTW, the use of device_domain_lock looks a bit unsafe to me... it's at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()               domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if this is a real problem, let alone what the best fix is. Anyway here's the full lockdep output that prompted all of this:

  =======================================================
  [ INFO: possible circular locking dependency detected ]
  2.6.39.1+ #1
  -------------------------------------------------------
  bash/13954 is trying to acquire lock:
   (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

  but task is already holding lock:
   (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (device_domain_lock){-.-...}:
         [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
         [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
         [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
         [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
         [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
         [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
         [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
         [<ffffffff81002165>] do_one_initcall+0x45/0x190
         [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
         [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

  -> #0 (&(&iommu->lock)->rlock){......}:
         [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
         [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
         [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
         [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
         [<ffffffff812f8b42>] device_notifier+0x72/0x90
         [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
         [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
         [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
         [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
         [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
         [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
         [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
         [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
         [<ffffffff8117569e>] vfs_write+0xce/0x190
         [<ffffffff811759e4>] sys_write+0x54/0xa0
         [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

  other info that might help us debug this:

  6 locks held by bash/13954:
   #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
   #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
   #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
   #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
   #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
   #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

  stack backtrace:
  Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
  Call Trace:
   [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
   [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
   [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
   [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
   [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
   [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
   [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
   [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
   [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
   [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
   [<ffffffff812f8b42>] device_notifier+0x72/0x90
   [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
   [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
   [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
   [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
   [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
   [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
   [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
   [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
   [<ffffffff8117569e>] vfs_write+0xce/0x190
   [<ffffffff811759e4>] sys_write+0x54/0xa0
   [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2012-02-07  tasklet/rt: Prevent tasklets from going into infinite spin in RT  (Ingo Molnar)

When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads, and spinlocks turn into mutexes. But this can cause issues with tasks disabling tasklets. A tasklet runs under ksoftirqd, and if a tasklet is disabled with tasklet_disable(), the tasklet count is increased. When a tasklet runs, it checks this counter and, if it is set, it adds itself back on the softirq queue and returns.

The problem arises in RT because ksoftirqd will see that a softirq is ready to run (the tasklet softirq just re-armed itself), and will not sleep, but instead run the softirqs again. The tasklet softirq will still see that the count is non-zero and will not execute the tasklet, and requeue itself on the softirq again, which will cause ksoftirqd to run it again and again and again.

It gets worse because ksoftirqd runs as a real-time thread. If it preempted the task that disabled tasklets, and that task has migration disabled, or can't run for other reasons, the tasklet softirq will never run because the count will never be zero, and ksoftirqd will go into an infinite loop. As ksoftirqd is an RT task, this becomes a big problem.

This is a hack solution to have tasklet_disable() stop tasklets, and when a tasklet runs, instead of requeueing the tasklet softirq, it delays it. When tasklet_enable() is called, and tasklets are waiting, then tasklet_enable() will kick the tasklets to continue. This prevents the lock up from ksoftirqd going into an infinite loop.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ ported to 3.0-rt ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
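Schematically, the described behaviour looks like this (a sketch of the idea, not the actual patch; the extra PENDING state bit is a hypothetical name):

    #define TASKLET_STATE_PENDING_SKETCH 2  /* hypothetical extra state bit */

    /* Run side: a disabled tasklet is parked instead of re-raising the softirq. */
    static void tasklet_run_one_sketch(struct tasklet_struct *t)
    {
            if (!atomic_read(&t->count)) {
                    t->func(t->data);       /* enabled: just run it */
                    return;
            }
            /* disabled: remember it, do NOT requeue the softirq */
            set_bit(TASKLET_STATE_PENDING_SKETCH, &t->state);
    }

    /* Enable side: kick any tasklet that was parked while disabled. */
    static inline void tasklet_enable_sketch(struct tasklet_struct *t)
    {
            if (atomic_dec_and_test(&t->count) &&
                test_and_clear_bit(TASKLET_STATE_PENDING_SKETCH, &t->state))
                    tasklet_schedule(t);
    }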
2012-02-07  tracing: Show padding as unsigned short  (Steven Rostedt)

RT added two bytes of migrate disable counting to the trace events and used two bytes of the padding to make the change. The structures and all were updated correctly, but the display in the event formats was not:

  # cat /debug/tracing/events/sched/sched_switch/format
  name: sched_switch
  ID: 51
  format:
        field:unsigned short common_type;               offset:0;   size:2; signed:0;
        field:unsigned char common_flags;               offset:2;   size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;   size:1; signed:0;
        field:int common_pid;                           offset:4;   size:4; signed:1;
        field:unsigned short common_migrate_disable;    offset:8;   size:2; signed:0;
        field:int common_padding;                       offset:10;  size:2; signed:0;

The field for common_padding has the correct size and offset, but the use of "int" might confuse some parsers (and people that are reading it). This needs to be changed to "unsigned short".

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1321467575.4181.36.camel@frodo
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  dm: Make rt aware  (Thomas Gleixner)

Use the BUG_ON_NONRT variant for the irqs_disabled() checks. RT has interrupts legitimately enabled here as we can't deadlock against the irq thread due to the "sleeping spinlocks" conversion.

Reported-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
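BUG_ON_NONRT() presumably reduces to this (a sketch of the assumed definition; the check compiles away on RT, where enabled interrupts are legitimate, and stays a plain BUG_ON() otherwise):

    #ifdef CONFIG_PREEMPT_RT_FULL
    # define BUG_ON_NONRT(condition)        do { } while (0)
    #else
    # define BUG_ON_NONRT(condition)        BUG_ON(condition)
    #endif

    /* dm-style usage: must hold on !RT, legitimately false on RT */
    BUG_ON_NONRT(!irqs_disabled());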
2012-02-07  x86: crypto: Reduce preempt disabled regions  (Peter Zijlstra)

Restrict the preempt disabled regions to the actual floating point operations and enable preemption for the administrative actions. This is necessary on RT to avoid that kfree and other operations are called with preemption disabled.

Reported-and-tested-by: Carsten Emde <cbe@osadl.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
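The pattern being applied, schematically (not the literal glue-code diff; aesni_enc_one() stands in for whatever FPU-using primitive the loop calls):

    static void crypt_blocks_sketch(void *ctx, u8 *buf, unsigned int nblocks)
    {
            while (nblocks--) {
                    kernel_fpu_begin();     /* preemption off only around FPU work */
                    aesni_enc_one(ctx, buf, buf);   /* hypothetical per-block primitive */
                    kernel_fpu_end();       /* preemptible again between blocks */
                    buf += 16;
            }
            /* kfree()/administrative work now runs with preemption enabled */
    }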
2012-02-07  rcu: Fix macro substitution for synchronize_rcu_bh() on RT  (John Kacur)

kernel/rcutorture.c:492: error: ‘synchronize_rcu_bh’ undeclared here (not in a function)

synchronize_rcu_bh() is not just called as a normal function, but can also be referenced as a function pointer. When CONFIG_PREEMPT_RT_FULL is enabled, synchronize_rcu_bh() is defined as synchronize_rcu(), but needs to be defined without the parentheses because the compiler will complain when synchronize_rcu_bh is referenced as a function pointer and not a function.

Link: http://lkml.kernel.org/r/1321235083-21756-1-git-send-email-jkacur@redhat.com
Cc: stable-rt@vger.kernel.org
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
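The distinction, as a sketch (the exact RT #ifdef block may differ):

    #ifdef CONFIG_PREEMPT_RT_FULL
    /* Object-like macro, no parentheses: the alias then also works when
     * the name is taken as a function pointer, not only when called. */
    # define synchronize_rcu_bh    synchronize_rcu
    #endif

    /* rcutorture-style use that requires the object-like form: */
    static void (*sync_func)(void) = synchronize_rcu_bh;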
2012-02-07  softirq: Export in_serving_softirq()  (John Kacur)

ERROR: "in_serving_softirq" [net/sched/cls_cgroup.ko] undefined!

The above can be fixed by exporting in_serving_softirq.

Link: http://lkml.kernel.org/r/1321235083-21756-2-git-send-email-jkacur@redhat.com
Cc: stable-rt@vger.kernel.org
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-07  kconfig-preempt-rt-full.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  kconfig-disable-a-few-options-rt.patch  (Thomas Gleixner)

Disable stuff which is known to have issues on RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  scsi-fcoe-rt-aware.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  x86-kvm-require-const-tsc-for-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  sysrq: Allow immediate Magic SysRq output for PREEMPT_RT_FULL  (Frank Rowand)

Add a CONFIG option to allow the output from Magic SysRq to be output immediately, even if this causes large latencies.

If PREEMPT_RT_FULL, printk() will not try to acquire the console lock when interrupts or preemption are disabled. If the console lock is not acquired the printk() output will be buffered, but will not be output immediately. Some drivers call into the Magic SysRq code with interrupts or preemption disabled, so the output of Magic SysRq will be buffered instead of printing immediately if this option is not selected.

Even with this option selected, Magic SysRq output will be delayed if the attempt to acquire the console lock fails.

Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Link: http://lkml.kernel.org/r/4E7CEF60.5020508@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  ipc/sem: Rework semaphore wakeups  (Peter Zijlstra)

Current sysv sems have a weird ass wakeup scheme that involves keeping preemption disabled over a potential O(n^2) loop and busy waiting on that on other CPUs.

Kill this and simply wake the task directly from under the sem_lock.

This was discovered by a migrate_disable() debug feature that disallows:

  spin_lock();
  preempt_disable();
  spin_unlock();
  preempt_enable();

Cc: Manfred Spraul <manfred@colorfullife.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315994224.5040.1.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
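A sketch of the direct-wakeup idea (simplified from the description above, not the literal ipc/sem.c change):

    /* Called with sem_lock held: publish the result, then wake directly. */
    static void wake_up_sem_queue_sketch(struct sem_queue *q, int error)
    {
            q->status = error;              /* make the result visible ... */
            smp_wmb();                      /* ... before the waiter can run */
            wake_up_process(q->sleeper);    /* no pending list, no busy wait */
    }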
2012-02-07  mm, rt: kmap_atomic scheduling  (Peter Zijlstra)

In fact, with migrate_disable() existing one could play games with kmap_atomic. You could save/restore the kmap_atomic slots on context switch (if there are any in use of course), this should be esp easy now that we have a kmap_atomic stack.

Something like the below.. it wants replacing all the preempt_disable() stuff with pagefault_disable() && migrate_disable() of course, but then you can flip kmaps around like below.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[dvhart@linux.intel.com: build fix]
Link: http://lkml.kernel.org/r/1311842631.5890.208.camel@twins
2012-02-07  add /sys/kernel/realtime entry  (Clark Williams)

Add a /sys/kernel entry to indicate that the kernel is a realtime kernel.

Clark says that he needs this for udev rules: udev needs to evaluate whether it's a PREEMPT_RT kernel a few thousand times, and parsing uname output is too slow or so.

Are there better solutions? Should it exist and return 0 on !-rt?

Signed-off-by: Clark Williams <williams@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
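A minimal sketch of such an entry, assuming the usual kobj_attribute pattern in kernel/ksysfs.c (reads as "1" on an RT kernel):

    #ifdef CONFIG_PREEMPT_RT_FULL
    static ssize_t realtime_show(struct kobject *kobj,
                                 struct kobj_attribute *attr, char *buf)
    {
            return sprintf(buf, "%d\n", 1);
    }
    static struct kobj_attribute realtime_attr = __ATTR_RO(realtime);
    /* ... and &realtime_attr.attr gets added to the kernel_attrs[] array ... */
    #endif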
2012-02-07  kgdb/serial: Short term workaround  (Jason Wessel)

On 07/27/2011 04:37 PM, Thomas Gleixner wrote:
> - KGDB (not yet disabled) is reportedly unusable on -rt right now due
>   to missing hacks in the console locking which I dropped on purpose.

To work around this in the short term you can use this patch, in addition to the clocksource watchdog patch that Thomas brewed up. Comments are welcome of course. Ultimately the right solution is to change the separation between the console and the HW to have a polled mode + work queue so as not to introduce any kind of latency.

Thanks, Jason.
2012-02-07  ping-sysrq.patch  (Carsten Emde)

There are (probably rare) situations when a system has crashed and the system console becomes unresponsive but the network icmp layer still is alive. Wouldn't it be wonderful, if we then could submit a sysrq command via ping? This patch provides this facility. Please consult the updated documentation Documentation/sysrq.txt for details.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
2012-02-07  net: Avoid livelock in net_tx_action() on RT  (Steven Rostedt)

qdisc_lock is taken w/o disabling interrupts or bottom halves. So code holding a qdisc_lock() can be interrupted and softirqs can run on the return of interrupt in !RT.

The spin_trylock() in net_tx_action() makes sure that the softirq does not deadlock. When the lock can't be acquired q is requeued and the NET_TX softirq is raised. That causes the softirq to run over and over.

That works in mainline as do_softirq() has a retry loop limit and leaves the softirq processing in the interrupt return path and schedules ksoftirqd. The task which holds qdisc_lock cannot be preempted, so the lock is released and either ksoftirqd or the next softirq in the return from interrupt path can proceed. Though it's a bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it decides to bail out even if it's clear in the first iteration :)

On RT all softirq processing is done in a FIFO thread and we don't have a loop limit, so ksoftirqd preempts the lock holder forever and unqueues and requeues until the reset button is hit.

Due to the forced threading of ksoftirqd on RT we actually cannot deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to replace the spin_trylock() with a spin_lock(). When contended, ksoftirqd is scheduled out and the lock holder can proceed.

[ tglx: Massaged changelog and code comments ]

Solved-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Carsten Emde <cbe@osadl.org>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
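The substance of the change, schematically (a sketch, not the exact diff; see net/sched and net/core of that era for the real context):

    static void net_tx_qdisc_sketch(struct Qdisc *q)
    {
            spinlock_t *root_lock = qdisc_lock(q);

    #ifdef CONFIG_PREEMPT_RT_FULL
            spin_lock(root_lock);   /* "sleeping lock": contention just blocks */
            qdisc_run(q);
            spin_unlock(root_lock);
    #else
            if (spin_trylock(root_lock)) {
                    qdisc_run(q);
                    spin_unlock(root_lock);
            } else {
                    /* holder is active elsewhere: requeue and re-raise NET_TX */
                    __netif_reschedule(q);
            }
    #endif
    }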
2012-02-07  mips-disable-highmem-on-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  ARM: at91: tclib: Default to tclib timer for RT  (Thomas Gleixner)

RT is not too happy about the shared timer interrupt in AT91 devices. Default to tclib timer for RT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  arm-disable-highmem-on-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  power-disable-highmem-on-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  power-use-generic-rwsem-on-rt  (Thomas Gleixner)
2012-02-07  console-make-rt-friendly.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  x86-no-perf-irq-work-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  skbufhead-raw-lock.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  jump-label-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  debugobjects-rt.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  hotplug-stuff.patch  (Thomas Gleixner)

Do not take the lock for non-handled cases (might be atomic context).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  workqueue: Use get_cpu_light() in flush_gcwq()  (Yong Zhang)

  BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
  in_atomic(): 1, irqs_disabled(): 0, pid: 1739, name: bash
  Pid: 1739, comm: bash Not tainted 3.0.6-rt17-00284-gb76d419 #3
  Call Trace:
   [<c06e3b5d>] ? printk+0x1d/0x20
   [<c01390b6>] __might_sleep+0xe6/0x110
   [<c06e633c>] rt_spin_lock+0x1c/0x30
   [<c01655a6>] flush_gcwq+0x236/0x320
   [<c021c651>] ? kfree+0xe1/0x1a0
   [<c05b7178>] ? __cpufreq_remove_dev+0xf8/0x260
   [<c0183fad>] ? rt_down_write+0xd/0x10
   [<c06cd91e>] workqueue_cpu_down_callback+0x26/0x2d
   [<c06e9d65>] notifier_call_chain+0x45/0x60
   [<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30
   [<c014c9b4>] __cpu_notify+0x24/0x40
   [<c06cbc6f>] _cpu_down+0xdf/0x330
   [<c06cbef0>] cpu_down+0x30/0x50
   [<c06cd6b0>] store_online+0x50/0xa7
   [<c06cd660>] ? acpi_os_map_memory+0xec/0xec
   [<c04f2faa>] sysdev_store+0x2a/0x40
   [<c02887a4>] sysfs_write_file+0xa4/0x100
   [<c0229ab2>] vfs_write+0xa2/0x170
   [<c0288700>] ? sysfs_poll+0x90/0x90
   [<c0229d92>] sys_write+0x42/0x70
   [<c06ecedf>] sysenter_do_call+0x12/0x2d
  CPU 1 is now offline
  SMP alternatives: switching to UP code
  SMP alternatives: switching to SMP code
  Booting Node 0 Processor 1 APIC 0x1
  smpboot cpu 1: start_ip = 9b000
  Initializing CPU#1
  BUG: sleeping function called from invalid context at kernel/rtmutex.c:645
  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: kworker/0:0
  Pid: 0, comm: kworker/0:0 Not tainted 3.0.6-rt17-00284-gb76d419 #3
  Call Trace:
   [<c06e3b5d>] ? printk+0x1d/0x20
   [<c01390b6>] __might_sleep+0xe6/0x110
   [<c06e633c>] rt_spin_lock+0x1c/0x30
   [<c06cd85b>] workqueue_cpu_up_callback+0x56/0xf3
   [<c06e9d65>] notifier_call_chain+0x45/0x60
   [<c0171cfe>] __raw_notifier_call_chain+0x1e/0x30
   [<c014c9b4>] __cpu_notify+0x24/0x40
   [<c014c9ec>] cpu_notify+0x1c/0x20
   [<c06e1d43>] notify_cpu_starting+0x1e/0x20
   [<c06e0aad>] smp_callin+0xfb/0x10e
   [<c06e0ad9>] start_secondary+0x19/0xd7
  NMI watchdog enabled, takes one hw-pmu counter.
  Switched to NOHz mode on CPU #1

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1318762607-2261-5-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  workqueue: Fix PF_THREAD_BOUND abuse  (Peter Zijlstra)

PF_THREAD_BOUND is set by kthread_bind() and means the thread is bound to a particular cpu for correctness. The workqueue code abuses this flag and blindly sets it for all created threads, including those that are free to migrate.

Restore the original semantics now that the worst abuses in the cpu-hotplug path are gone. The only icky bit is the rescue thread for per-cpu workqueues; this cannot use kthread_bind() but will use set_cpus_allowed_ptr() to migrate itself to the desired cpu. Set and clear PF_THREAD_BOUND manually here.

XXX: I think worker_maybe_bind_and_lock()/worker_unbind_and_unlock() should also do a get_online_cpus(), this would likely allow us to remove the while loop.

XXX: should probably repurpose GCWQ_DISASSOCIATED to warn on adding works after CPU_DOWN_PREPARE -- its dual use to mark unbound gcwqs is a tad annoying though.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  workqueue: Fix cpuhotplug trainwreck  (Peter Zijlstra)

The current workqueue code does crazy stuff on cpu unplug: it relies on forced affine breakage, thereby violating per-cpu expectations. Worse, it tries to re-attach to a cpu if the thing comes up again before all previously queued works are finished. This breaks (admittedly bonkers) cpu-hotplug use that relies on a down-up cycle to push all usage away.

Introduce a new WQ_NON_AFFINE flag that indicates a per-cpu workqueue will not respect cpu affinity and use this to migrate all its pending works to whatever cpu is doing cpu-down.

This also adds a warning for queue_on_cpu() users which warns when it is used on WQ_NON_AFFINE workqueues, for the API implies you care about what cpu things are run on when such workqueues cannot guarantee this.

For the rest, simply flush all per-cpu works and don't mess about. This also means that currently all workqueues that are manually flushing things on cpu-down in order to provide the per-cpu guarantee no longer need to do so.

In short, we tell the WQ what we want it to do, provide validation for this and lose ~250 lines of code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  mm-vmalloc.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  epoll.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  workqueue-use-get-cpu-light.patch  (Thomas Gleixner)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-02-07  perf, x86: Avoid kfree in CPU_STARTING  (Peter Zijlstra)

On -rt kfree() can schedule, but CPU_STARTING is before the CPU is fully up and running. These are contradictory, so avoid it. Instead push the kfree() to CPU_ONLINE where we're free to schedule.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-b4ag6dk2jkuvupyz2l7txjmd@git.kernel.org
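The shape of the fix, hedged (not the exact x86 perf diff): move the freeing from the CPU_STARTING notifier to CPU_ONLINE, where sleeping is fine.

    static int perf_cpu_notify_sketch(struct notifier_block *nb,
                                      unsigned long action, void *hcpu)
    {
            unsigned int cpu = (unsigned long)hcpu;

            switch (action & ~CPU_TASKS_FROZEN) {
            case CPU_STARTING:
                    /* no kfree() here: atomic, and on -rt kfree() can schedule */
                    break;
            case CPU_ONLINE:
            case CPU_DEAD:
                    free_cpu_scratch(cpu);  /* hypothetical helper; free to sleep */
                    break;
            }
            return NOTIFY_OK;
    }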
2012-02-07  x86: Disable IST stacks for debug/int 3/stack fault for PREEMPT_RT  (Andi Kleen)

Normally the x86-64 trap handlers for debug/int 3/stack fault run on a special interrupt stack (IST) to make them more robust when dealing with kernel code.

The PREEMPT_RT kernel can sleep in locks even while allocating GFP_ATOMIC memory. When one of these trap handlers needs to send real time signals for ptrace it allocates memory and could then try to schedule. But it is not allowed to schedule on an IST stack. This can cause warnings and hangs.

This patch disables the IST stacks for these handlers for the PREEMPT_RT kernel. Instead let them run on the normal process stack. The kernel only really needs the ISTs here to make kernel debuggers more robust in case someone sets a break point somewhere where the stack is invalid. But there are no kernel debuggers in the standard kernel that do this.

It also means kprobes cannot be set in situations with an invalid stack; but that sounds like a reasonable restriction.

The stack fault change could minimally impact oops quality, but not very much because stack faults are fairly rare.

A better solution would be to use similar logic as the NMI "paranoid" path: check if the signal is for user space, if yes go back to entry.S, switch stack, call sync_regs, then do the signal sending etc. But this patch is much simpler and should work too with minimal impact.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
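The mechanism, sketched under assumptions (cf. arch/x86/include/asm/page_64_types.h; the exact constants may differ): an IST index of 0 in a gate descriptor means "no stack switch", so defining the stack slots to 0 on RT keeps those handlers on the normal process stack.

    #ifdef CONFIG_PREEMPT_RT_FULL
    # define DEBUG_STACK            0       /* 0 = no IST: stay on process stack */
    # define STACKFAULT_STACK       0
    #else
    # define STACKFAULT_STACK       1       /* conventional IST slot numbers */
    # define DEBUG_STACK            4
    #endif

    /* Trap setup then propagates the index, e.g.:
     *   set_intr_gate_ist(1, &debug, DEBUG_STACK);   // vector 1 = #DB
     */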
2012-02-07  x86: Use generic rwsem_spinlocks on -rt  (Thomas Gleixner)

Simplifies the separation of anon_rw_semaphores and rw_semaphores for -rt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>