aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2014-03-30AUDIT: Allow login in non-init namespacesEric Paris
It its possible to configure your PAM stack to refuse login if audit messages (about the login) were unable to be sent. This is common in many distros and thus normal configuration of many containers. The PAM modules determine if audit is enabled/disabled in the kernel based on the return value from sending an audit message on the netlink socket. If userspace gets back ECONNREFUSED it believes audit is disabled in the kernel. If it gets any other error else it refuses to let the login proceed. Just about ever since the introduction of namespaces the kernel audit subsystem has returned EPERM if the task sending a message was not in the init user or pid namespace. So many forms of containers have never worked if audit was enabled in the kernel. BUT if the container was not in net_init then the kernel network code would send ECONNREFUSED (instead of the audit code sending EPERM). Thus by pure accident/dumb luck/bug if an admin configured the PAM stack to reject all logins that didn't talk to audit, but then ran the login untility in the non-init_net namespace, it would work!! Clearly this was a bug, but it is a bug some people expected. With the introduction of network namespace support in 3.14-rc1 the two bugs stopped cancelling each other out. Now, containers in the non-init_net namespace refused to let users log in (just like PAM was configfured!) Obviously some people were not happy that what used to let users log in, now didn't! This fix is kinda hacky. We return ECONNREFUSED for all non-init relevant namespaces. That means that not only will the old broken non-init_net setups continue to work, now the broken non-init_pid or non-init_user setups will 'work'. They don't really work, since audit isn't logging things. But it's what most users want. In 3.15 we should have patches to support not only the non-init_net (3.14) namespace but also the non-init_pid and non-init_user namespace. So all will be right in the world. This just opens the doors wide open on 3.14 and hopefully makes users happy, if not the audit system... Reported-by: Andre Tomt <andre@tomt.net> Reported-by: Adam Richter <adam_richter2004@yahoo.com> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-28Revert "PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock"Ajay Nandakumar
This reverts commit 11388c87d2abca1f01975ced28ce9eacea239104. The issue is that no wake lock is held at the user space i.e by Power Manager service.This is because the PowerManagerService fails to acquire the Wakelock.In 3.8 the wakelock module in the kernel expects the user process to have the capability of CAP_BLOCK_SUSPEND.Which the powermangersevice does not have. Bug 1274297 Bug 1384311 Change-Id: I3b696108d47278cf40abce8d5a9bd012f98f2925 Signed-off-by: Ajay Nandakumar <anandakumarm@nvidia.com> (cherry picked from commit e8464e785027a15279a13e6e32cd1aecd22d5a00) Reviewed-on: http://git-master/r/282698 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com> Tested-by: Bharat Nihalani <bnihalani@nvidia.com>
2014-03-28time: Revert to calling clock_was_set_delayed() while in irq contextJohn Stultz
In commit 47a1b796306356f35 ("tick/timekeeping: Call update_wall_time outside the jiffies lock"), we moved to calling clock_was_set() due to the fact that we were no longer holding the timekeeping or jiffies lock. However, there is still the problem that clock_was_set() triggers an IPI, which cannot be done from the timer's hard irq context, and will generate WARN_ON warnings. Apparently in my earlier testing, I'm guessing I didn't bump the dmesg log level, so I somehow missed the WARN_ONs. Thus we need to revert back to calling clock_was_set_delayed(). Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1395963049-11923-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-26Merge tag 'trace-fixes-v3.14-rc7-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "While on my flight to Linux Collaboration Summit, I was working on my slides for the event trigger tutorial. I booted a 3.14-rc7 kernel to perform what I wanted to teach and cut and paste it into my slides. When I tried the traceon event trigger with a condition attached to it (turns tracing on only if a field of the trigger event matches a condition set by the user), nothing happened. Tracing would not turn on. I stopped working on my presentation in order to find what was wrong. It ended up being the way trace event triggers work when they have conditions. Instead of copying the fields, the condition code just looks at the fields that were copied into the ring buffer. This works great, unless tracing is off. That's because when the event is reserved on the ring buffer, the ring buffer returns a NULL pointer, this tells the tracing code that the ring buffer is disabled. This ends up being a problem for the traceon trigger if it is using this information to check its condition. Luckily the code that checks if tracing is on returns the ring buffer to use (because the ring buffer is determined by the event file also passed to that field). I was able to easily solve this bug by checking in that helper function if the returned ring buffer entry is NULL, and if so, also check the file flag if it has a trace event trigger condition, and if so, to pass back a temp ring buffer to use. This will allow the trace event trigger condition to still test the event fields, but nothing will be recorded" * tag 'trace-fixes-v3.14-rc7-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix traceon trigger condition to actually turn tracing on
2014-03-25tracing: Fix traceon trigger condition to actually turn tracing onSteven Rostedt (Red Hat)
While working on my tutorial for 2014 Linux Collaboration Summit I found that the traceon trigger did not work when conditions were used. The other triggers worked fine though. Looking into it, it is because of the way the triggers use the ring buffer to store the fields it will use for the condition. But if tracing is off, nothing is stored in the buffer, and the tracepoint exits before calling the trigger to test the condition. This is fine for all the triggers that only work when tracing is on, but for traceon trigger that is to work when tracing is off, nothing happens. The fix is simple, just use a temp ring buffer to record the event if tracing is off and the event has a trace event conditional trigger enabled. The rest of the tracepoint code will work just fine, but the tracepoint wont be recorded in the other buffers. Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-03-20futex: revert back to the explicit waiter counting codeLinus Torvalds
Srikar Dronamraju reports that commit b0c29f79ecea ("futexes: Avoid taking the hb->lock if there's nothing to wake up") causes java threads getting stuck on futexes when runing specjbb on a power7 numa box. The cause appears to be that the powerpc spinlocks aren't using the same ticket lock model that we use on x86 (and other) architectures, which in turn result in the "spin_is_locked()" test in hb_waiters_pending() occasionally reporting an unlocked spinlock even when there are pending waiters. So this reinstates Davidlohr Bueso's original explicit waiter counting code, which I had convinced Davidlohr to drop in favor of figuring out the pending waiters by just using the existing state of the spinlock and the wait queue. Reported-and-tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Original-code-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20Merge tag 'trace-fixes-v3.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull trace fix from Steven Rostedt: "Vaibhav Nagarnaik discovered that since 3.10 a clean-up patch made the array index in the trace event format bogus. He supplied an elegant solution that uses __stringify() and also removes the need for the event_storage and event_storage_mutex and also cuts off a few K of overhead from the trace events" * tag 'trace-fixes-v3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix array size mismatch in format string
2014-03-20tracing: Fix array size mismatch in format stringVaibhav Nagarnaik
In event format strings, the array size is reported in two locations. One in array subscript and then via the "size:" attribute. The values reported there have a mismatch. For e.g., in sched:sched_switch the prev_comm and next_comm character arrays have subscript values as [32] where as the actual field size is 16. name: sched_switch ID: 301 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1;signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[32]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[32]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; After bisection, the following commit was blamed: 92edca0 tracing: Use direct field, type and system names This commit removes the duplication of strings for field->name and field->type assuming that all the strings passed in __trace_define_field() are immutable. This is not true for arrays, where the type string is created in event_storage variable and field->type for all array fields points to event_storage. Use __stringify() to create a string constant for the type string. Also, get rid of event_storage and event_storage_mutex that are not needed anymore. also, an added benefit is that this reduces the overhead of events a bit more: text data bss dec hex filename 8424787 2036472 1302528 11763787 b3804b vmlinux 8420814 2036408 1302528 11759750 b37086 vmlinux.patched Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com Cc: Laurent Chavey <chavey@google.com> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-03-19Merge branch 'for-3.14-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "One really late cgroup patch to fix error path in create_css(). Hitting this bug would be pretty rare but still possible and it gets delayed we'd need to backport it through -stable anyway. It only updates error path in create_css() and has low chance of new breakages" * 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix a failure path in create_css()
2014-03-19power: wakeup_reason: rename irq_count to irqcountGreg Hackmann
On x86, irq_count conflicts with a declaration in arch/x86/include/asm/processor.h Change-Id: I3e4fde0ff64ef59ff5ed2adc0ea3a644641ee0b7 Signed-off-by: Greg Hackmann <ghackmann@google.com>
2014-03-19Power: Add guard condition for maximum wakeup reasonsRuchi Kandoi
Ensure the array for the wakeup reason IRQs does not overflow. Change-Id: Iddc57a3aeb1888f39d4e7b004164611803a4d37c Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com> (cherry picked from commit b5ea40cdfcf38296535f931a7e5e7bf47b6fad7f)
2014-03-19POWER: fix compile warnings in log_wakeup_reasonRuchi Kandoi
Change I81addaf420f1338255c5d0638b0d244a99d777d1 introduced compile warnings, fix these. Change-Id: I05482a5335599ab96c0a088a7d175c8d4cf1cf69 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2014-03-19Power: add an API to log wakeup reasonsRuchi Kandoi
Add API log_wakeup_reason() and expose it to userspace via sysfs path /sys/kernel/wakeup_reasons/last_resume_reason Change-Id: I81addaf420f1338255c5d0638b0d244a99d777d1 Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
2014-03-19ARM: Fix "Make low-level printk work" to use a separate config optionArve Hjønnevåg
Signed-off-by: Arve Hjønnevåg <arve@android.com>
2014-03-19mm: add a field to store names for private anonymous memoryColin Cross
Userspace processes often have multiple allocators that each do anonymous mmaps to get memory. When examining memory usage of individual processes or systems as a whole, it is useful to be able to break down the various heaps that were allocated by each layer and examine their size, RSS, and physical memory usage. This patch adds a user pointer to the shared union in vm_area_struct that points to a null terminated string inside the user process containing a name for the vma. vmas that point to the same address will be merged, but vmas that point to equivalent strings at different addresses will not be merged. Userspace can set the name for a region of memory by calling prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name); Setting the name to NULL clears it. The names of named anonymous vmas are shown in /proc/pid/maps as [anon:<name>] and in /proc/pid/smaps in a new "Name" field that is only present for named vmas. If the userspace pointer is no longer valid all or part of the name will be replaced with "<fault>". The idea to store a userspace pointer to reduce the complexity within mm (at the expense of the complexity of reading /proc/pid/mem) came from Dave Hansen. This results in no runtime overhead in the mm subsystem other than comparing the anon_name pointers when considering vma merging. The pointer is stored in a union with fieds that are only used on file-backed mappings, so it does not increase memory usage. Change-Id: Ie2ffc0967d4ffe7ee4c70781313c7b00cf7e3092 Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19add extra free kbytes tunableRik van Riel
Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks. This is useful for realtime applications that call system calls and have a bound on the number of allocations that happen in any short time period. In this application, extra_free_kbytes would be left at an amount equal to or larger than than the maximum number of allocations that happen in any burst. It may also be useful to reduce the memory use of virtual machines (temporarily?), in a way that does not cause memory fragmentation like ballooning does. [ccross] Revived for use on old kernels where no other solution exists. The tunable will be removed on kernels that do better at avoiding direct reclaim. Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e Signed-off-by: Rik van Riel<riel@redhat.com> Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19trace: add non-hierarchical function_graph optionJamie Gennis
Add the 'funcgraph-flat' option to the function_graph tracer to use the default trace printing format rather than the hierarchical formatting normally used. Change-Id: If2900bfb86e6f8f51379f56da4f6fabafa630909 Signed-off-by: Jamie Gennis <jgennis@google.com>
2014-03-19trace: Add an option to show tgids in trace outputJamie Gennis
The tgids are tracked along side the saved_cmdlines tracking, and can be included in trace output by enabling the 'print-tgid' trace option. This is useful when doing post-processing of the trace data, as it allows events to be grouped by tgid. Change-Id: I52ed04c3a8ca7fddbb868b792ce5d21ceb76250e Signed-off-by: Jamie Gennis <jgennis@google.com>
2014-03-19trace/events: add gpu trace eventsJamie Gennis
Change-Id: I0607b9c776acf61cb796b8572cf8cfb8b2dc1377 Signed-off-by: Jamie Gennis <jgennis@google.com>
2014-03-19hardlockup: detect hard lockups without NMIs using secondary cpusColin Cross
Emulate NMIs on systems where they are not available by using timer interrupts on other cpus. Each cpu will use its softlockup hrtimer to check that the next cpu is processing hrtimer interrupts by verifying that a counter is increasing. This patch is useful on systems where the hardlockup detector is not available due to a lack of NMIs, for example most ARM SoCs. Without this patch any cpu stuck with interrupts disabled can cause a hardware watchdog reset with no debugging information, but with this patch the kernel can detect the lockup and panic, which can result in useful debugging info. Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19PM / Suspend: Print wall time at suspend entry and exitTodd Poynor
Change-Id: I92f252414c013b018b9a392eae1ee039aa0e89dc Signed-off-by: Todd Poynor <toddpoynor@google.com>
2014-03-19debug: add parameters to prevent entering debug mode on errorsColin Cross
On non-developer devices kgdb prevents CONFIG_PANIC_TIMEOUT from rebooting the device after a panic. Add module parameters debug_core.break_on_exception and debug_core.break_on_panic to allow skipping debug on panics and exceptions respectively. Both default to true to preserve existing behavior. Change-Id: I75dce7263e96cee069a9750920cce83dc6f98e8c Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19kdb: support new lines without carriage returnsColin Cross
kdb expects carriage returns through the serial port to terminate commands. Modify it to accept the first seen carriage return or new line as a terminator, but not treat \r\n as two terminators. Change-Id: I06166017e7703d24310eefcb71c3a7d427088db7 Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19Move x86_64 idle notifiers to genericTodd Poynor
Move the x86_64 idle notifiers originally by Andi Kleen and Venkatesh Pallipadi to generic. Change-Id: Idf29cda15be151f494ff245933c12462643388d5 Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Todd Poynor <toddpoynor@google.com>
2014-03-19sched: Add a generic notifier when a task struct is about to be freedSan Mehat
This patch adds a notifier which can be used by subsystems that may be interested in when a task has completely died and is about to have it's last resource freed. The Android lowmemory killer uses this to determine when a task it has killed has finally given up its goods. Signed-off-by: San Mehat <san@google.com>
2014-03-19proc: smaps: Allow smaps access for CAP_SYS_RESOURCESan Mehat
Signed-off-by: San Mehat <san@google.com>
2014-03-19PM / Sleep: Add wake lock api wrapper on top of wakeup sourcesArve Hjønnevåg
Change-Id: Icaad02fe1e8856fdc2e4215f380594a5dde8e002 Signed-off-by: Arve Hjønnevåg <arve@android.com>
2014-03-19mm: Add min_free_order_shift tunable.Arve Hjønnevåg
By default the kernel tries to keep half as much memory free at each order as it does for one order below. This can be too agressive when running without swap. Change-Id: I5efc1a0b50f41ff3ac71e92d2efd175dedd54ead Signed-off-by: Arve Hjønnevåg <arve@android.com>
2014-03-19ARM: Make low-level printk workTony Lindgren
Makes low-level printk work. Signed-off-by: Tony Lindgren <tony@atomide.com>
2014-03-19cgroup: Add generic cgroup subsystem permission checksColin Cross
Rather than using explicit euid == 0 checks when trying to move tasks into a cgroup via CFS, move permission checks into each specific cgroup subsystem. If a subsystem does not specify a 'allow_attach' handler, then we fall back to doing our checks the old way. Use the 'allow_attach' handler for the 'cpu' cgroup to allow non-root processes to add arbitrary processes to a 'cpu' cgroup if it has the CAP_SYS_NICE capability set. This version of the patch adds a 'allow_attach' handler instead of reusing the 'can_attach' handler. If the 'can_attach' handler is reused, a new cgroup that implements 'can_attach' but not the permission checks could end up with no permission checks at all. Change-Id: Icfa950aa9321d1ceba362061d32dc7dfa2c64f0c Original-Author: San Mehat <san@google.com> Signed-off-by: Colin Cross <ccross@android.com>
2014-03-19power: Add option to log time spent in suspendColin Cross
Prints the time spent in suspend in the kernel log, and keeps statistics on the time spent in suspend in /sys/kernel/debug/suspend_time Change-Id: Ia6b9ebe4baa0f7f5cd211c6a4f7e813aefd3fa1d Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Todd Poynor <toddpoynor@google.com>
2014-03-19panic: Add board ID to panic outputNishanth Menon
At times, it is necessary for boards to provide some additional information as part of panic logs. Provide information on the board hardware as part of panic logs. It is safer to print this information at the very end in case something bad happens as part of the information retrieval itself. To use this, set global mach_panic_string to an appropriate string in the board file. Change-Id: Id12cdda87b0cd2940dd01d52db97e6162f671b4d Signed-off-by: Nishanth Menon <nm@ti.com>
2014-03-19PM: Print pending wakeup IRQ preventing suspend to dmesgTodd Poynor
Prints the name of the first action for a pending wakeup IRQ. Change-Id: I36f90735c75fb7c7ab1084775ec0d0ab02336e6e Signed-off-by: Todd Poynor <toddpoynor@google.com>
2014-03-19Revert "genirq: Do not consider disabled wakeup irqs"Arve Hjønnevåg
This reverts commit 9c6079aa1bfcf7e14de10b824779ce39b679bcb8.
2014-03-19sched: Enable might_sleep before initializing drivers.Arve Hjønnevåg
This allows detection of init bugs in built-in drivers. Signed-off-by: Arve Hjønnevåg <arve@android.com>
2014-03-18cgroup: fix a failure path in create_css()Li Zefan
If online_css() fails, we should remove cgroup files belonging to css->ss. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-03-18uprobes: allow ignoring of probe hitsDavid A. Long
Allow arches to decided to ignore a probe hit. ARM will use this to only call handlers if the conditions to execute a conditionally executed instruction are satisfied. Signed-off-by: David A. Long <dave.long@linaro.org> Acked-by: Oleg Nesterov <oleg@redhat.com>
2014-03-18uprobes: Kconfig dependency fixDavid A. Long
Suggested change from Oleg Nesterov. Fixes incomplete dependencies for uprobes feature. Signed-off-by: David A. Long <dave.long@linaro.org> Acked-by: Oleg Nesterov <oleg@redhat.com>
2014-03-16Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three small fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/clock: Prevent tracing recursion in sched_clock_cpu() stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() sched/deadline: Deny unprivileged users to set/change SCHED_DEADLINE policy
2014-03-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull audit namespace fixes from Eric Biederman: "Starting with 3.14-rc1 the audit code is faulty (think oopses and races) with respect to how it computes the network namespace of which socket to reply to, and I happened to notice by chance when reading through the code. My testing and the automated build bots don't find any problems with these fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: audit: Update kdoc for audit_send_reply and audit_list_rules_send audit: Send replies in the proper network namespace. audit: Use struct net not pid_t to remember the network namespce to reply in
2014-03-11sched/clock: Prevent tracing recursion in sched_clock_cpu()Fernando Luis Vazquez Cao
Prevent tracing of preempt_disable/enable() in sched_clock_cpu(). When CONFIG_DEBUG_PREEMPT is enabled, preempt_disable/enable() are traced and this causes trace_clock() users (and probably others) to go into an infinite recursion. Systems with a stable sched_clock() are not affected. This problem is similar to that fixed by upstream commit 95ef1e52922 ("KVM guest: prevent tracing recursion with kvmclock"). Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1394083528.4524.3.camel@nexus Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()Peter Zijlstra
We must use smp_call_function_single(.wait=1) for the irq_cpu_stop_queue_work() to ensure the queueing is actually done under stop_cpus_lock. Without this we could have dropped the lock by the time we do the queueing and get the race we tried to fix. Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()") Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20140228123905.GK3104@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11sched/deadline: Deny unprivileged users to set/change SCHED_DEADLINE policyJuri Lelli
Deny the use of SCHED_DEADLINE policy to unprivileged users. Even if root users can set the policy for normal users, we don't want the latter to be able to change their parameters (safest behavior). Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1393844961-18097-1-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-10mm: fix GFP_THISNODE callers and clarifyJohannes Weiner
GFP_THISNODE is for callers that implement their own clever fallback to remote nodes. It restricts the allocation to the specified node and does not invoke reclaim, assuming that the caller will take care of it when the fallback fails, e.g. through a subsequent allocation request without GFP_THISNODE set. However, many current GFP_THISNODE users only want the node exclusive aspect of the flag, without actually implementing their own fallback or triggering reclaim if necessary. This results in things like page migration failing prematurely even when there is easily reclaimable memory available, unless kswapd happens to be running already or a concurrent allocation attempt triggers the necessary reclaim. Convert all callsites that don't implement their own fallback strategy to __GFP_THISNODE. This restricts the allocation a single node too, but at the same time allows the allocator to enter the slowpath, wake kswapd, and invoke direct reclaim if necessary, to make the allocation happen when memory is full. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-08audit: Update kdoc for audit_send_reply and audit_list_rules_sendEric W. Biederman
The kbuild test robot reported: > tree: git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-next > head: 6f285b19d09f72e801525f5eea1bdad22e559bf0 > commit: 6f285b19d09f72e801525f5eea1bdad22e559bf0 [2/2] audit: Send replies in the proper network namespace. > reproduce: make htmldocs > > >> Warning(kernel/audit.c:575): No description found for parameter 'request_skb' > >> Warning(kernel/audit.c:575): Excess function parameter 'portid' description in 'audit_send_reply' > >> Warning(kernel/auditfilter.c:1074): No description found for parameter 'request_skb' > >> Warning(kernel/auditfilter.c:1074): Excess function parameter 'portid' description in 'audit_list_rules_s Which was caused by my failure to update the kdoc annotations when I updated the functions. Fix that small oversight now. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2014-03-08Merge branch 'for-3.14-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two cpuset locking fixes from Li. Both tagged for -stable" * 'for-3.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: fix a race condition in __cpuset_node_allowed_softwall() cpuset: fix a locking issue in cpuset_migrate_mm()
2014-03-07Merge tag 'trace-fixes-v3.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "In the past, I've had lots of reports about trace events not working. Developers would say they put a trace_printk() before and after the trace event but when they enable it (and the trace event said it was enabled) they would see the trace_printks but not the trace event. I was not able to reproduce this, but that's because I wasn't looking at the right location. Recently, another bug came up that showed the issue. If your kernel supports signed modules but allows for non-signed modules to be loaded, then when one is, the kernel will silently set the MODULE_FORCED taint on the module. Although, this taint happens without the need for insmod --force or anything of the kind, it labels the module with that taint anyway. If this tainted module has tracepoints, the tracepoints will be ignored because of the MODULE_FORCED taint. But no error message will be displayed. Worse yet, the event infrastructure will still be created letting users enable the trace event represented by the tracepoint, although that event will never actually be enabled. This is because the tracepoint infrastructure allows for non-existing tracepoints to be enabled for new modules to arrive and have their tracepoints set. Although there are several things wrong with the above, this change only addresses the creation of the trace event files for tracepoints that are not created when a module is loaded and is tainted. This change will print an error message about the module being tainted and not the trace events will not be created, and it does not create the trace event infrastructure" * tag 'trace-fixes-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Do not add event files for modules that fail tracepoints
2014-03-07Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - a bugfix for a long standing waitqueue race - a trivial fix for a missing include * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Include missing header file in irqdomain.c genirq: Remove racy waitqueue_active check
2014-03-03tracing: Do not add event files for modules that fail tracepointsSteven Rostedt (Red Hat)
If a module fails to add its tracepoints due to module tainting, do not create the module event infrastructure in the debugfs directory. As the events will not work and worse yet, they will silently fail, making the user wonder why the events they enable do not display anything. Having a warning on module load and the events not visible to the users will make the cause of the problem much clearer. Link: http://lkml.kernel.org/r/20140227154923.265882695@goodmis.org Fixes: 6d723736e472 "tracing/events: add support for modules to TRACE_EVENT" Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org # 2.6.31+ Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-03-03Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes, most of them SCHED_DEADLINE fallout" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Prevent rt_time growth to infinity sched/deadline: Switch CPU's presence test order sched/deadline: Cleanup RT leftovers from {inc/dec}_dl_migration sched: Fix double normalization of vruntime