cpu/rt: Rework cpu down for PREEMPT_RT

Bringing a CPU down is a pain with the PREEMPT_RT kernel because tasks can be preempted in many more places than in non-RT. In order to handle per_cpu variables, tasks may be pinned to a CPU for a while, and even sleep. But these tasks need to be off the CPU if that CPU is going down. Several synchronization methods have been tried, but when stressed they failed. This is a new approach. A sync_tsk thread is still created and tasks may still block on a lock when the CPU is going down, but how that works is a bit different. When cpu_down() starts, it will create the sync_tsk and wait on it to inform that current tasks that are pinned on the CPU are no longer pinned. But new tasks that are about to be pinned will still be allowed to do so at this time. Then the notifiers are called. Several notifiers will bring down tasks that will enter these locations. Some of these tasks will take locks of other tasks that are on the CPU. If we don't let those other tasks continue, but make them block until CPU down is done, the tasks that the notifiers are waiting on will never complete as they are waiting for the locks held by the tasks that are blocked. Thus we still let the task pin the CPU until the notifiers are done. After the notifiers run, we then make new tasks entering the pinned CPU sections grab a mutex and wait. This mutex is now a per CPU mutex in the hotplug_pcp descriptor. To help things along, a new function in the scheduler code is created called migrate_me(). This function will try to migrate the current task off the CPU this is going down if possible. When the sync_tsk is created, all tasks will then try to migrate off the CPU going down. There are several cases that this wont work, but it helps in most cases. After the notifiers are called and if a task can't migrate off but enters the pin CPU sections, it will be forced to wait on the hotplug_pcp mutex until the CPU down is complete. Then the scheduler will force the migration anyway. Also, I found that THREAD_BOUND need to also be accounted for in the pinned CPU, and the migrate_disable no longer treats them special. This helps fix issues with ksoftirqd and workqueue that unbind on CPU down. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
author: Steven Rostedt <srostedt@redhat.com> 2012-07-16 08:07:43 +0000
committer: Steven Rostedt <rostedt@rostedt.homelinux.com> 2012-11-27 21:17:39 -0500
commit: eabcaadcb2d5a49137b898e34048d89c30d453f8 (patch)
tree: e47f7cf97619560a63d5731b95d2b518e5c053f5 /include
parent: c3cf517713365e5862a660ac9f7c32f5f0b2d5e2 (diff)
1 files changed, 7 insertions, 0 deletions
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 117a468f5b85..8155129c9f01 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1911,6 +1911,10 @@ extern void do_set_cpus_allowed(struct task_struct *p,
 
 extern int set_cpus_allowed_ptr(struct task_struct *p,
 				const struct cpumask *new_mask);
+int migrate_me(void);
+void tell_sched_cpu_down_begin(int cpu);
+void tell_sched_cpu_down_done(int cpu);
+
 #else
 static inline void do_set_cpus_allowed(struct task_struct *p,
 				      const struct cpumask *new_mask)
@@ -1923,6 +1927,9 @@ static inline int set_cpus_allowed_ptr(struct task_struct *p,
 		return -EINVAL;
 	return 0;
 }
+static inline int migrate_me(void) { return 0; }
+static inline void tell_sched_cpu_down_begin(int cpu) { }
+static inline void tell_sched_cpu_down_done(int cpu) { }
 #endif
 
 #ifndef CONFIG_CPUMASK_OFFSTACK
author	Steven Rostedt <srostedt@redhat.com>	2012-07-16 08:07:43 +0000
committer	Steven Rostedt <rostedt@rostedt.homelinux.com>	2012-11-27 21:17:39 -0500
commit	eabcaadcb2d5a49137b898e34048d89c30d453f8 (patch)
tree	e47f7cf97619560a63d5731b95d2b518e5c053f5 /include
parent	c3cf517713365e5862a660ac9f7c32f5f0b2d5e2 (diff)