aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-05-10sched: Remove a stale commentLi Zefan
This comment should have been removed together with uids_mutex when removing user sched. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dhaval Giani <dhaval.giani@gmail.com> LKML-Reference: <4BE77C6B.5010402@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09ondemand: Make the iowait-is-busy time a sysfs tunableArjan van de Ven
Pavel Machek pointed out that not all CPUs have an efficient idle at high frequency. Specifically, older Intel and various AMD cpus would get a higher powerusage when copying files from USB. Mike Chan pointed out that the same is true for various ARM chips as well. Thomas Renninger suggested to make this a sysfs tunable with a reasonable default. This patch adds a sysfs tunable for the new behavior, and uses a very simple function to determine a reasonable default, depending on the CPU vendor/type. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082651.46914d04@infradead.org> [ minor tidyup ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09ondemand: Solve a big performance issue by counting IOWAIT time as busyArjan van de Ven
The ondemand cpufreq governor uses CPU busy time (e.g. not-idle time) as a measure for scaling the CPU frequency up or down. If the CPU is busy, the CPU frequency scales up, if it's idle, the CPU frequency scales down. Effectively, it uses the CPU busy time as proxy variable for the more nebulous "how critical is performance right now" question. This algorithm falls flat on its face in the light of workloads where you're alternatingly disk and CPU bound, such as the ever popular "git grep", but also things like startup of programs and maildir using email clients... much to the chagarin of Andrew Morton. This patch changes the ondemand algorithm to count iowait time as busy, not idle, time. As shown in the breakdown cases above, iowait is performance critical often, and by counting iowait, the proxy variable becomes a more accurate representation of the "how critical is performance" question. The problem and fix are both verified with the "perf timechar" tool. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100509082606.3d9f00d0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Intoduce get_cpu_iowait_time_us()Arjan van de Ven
For the ondemand cpufreq governor, it is desired that the iowait time is microaccounted in a similar way as idle time is. This patch introduces the infrastructure to account and expose this information via the get_cpu_iowait_time_us() function. [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082523.284feab6@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Eliminate the ts->idle_lastupdate fieldArjan van de Ven
Now that the only user of ts->idle_lastupdate is update_ts_time_stats(), the entire field can be eliminated. In update_ts_time_stats(), idle_lastupdate is first set to "now", and a few lines later, the only user is an if() statement that assigns a variable either to "now" or to ts->idle_lastupdate, which has the value of "now" at that point. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082439.2fab0b4f@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Fold updating of the last_update_time_info into update_ts_time_stats()Arjan van de Ven
This patch folds the updating of the last_update_time into the update_ts_time_stats() function, and updates the callers. This allows for further cleanups that are done in the next patch. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082403.60072967@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Update the idle statistics in get_cpu_idle_time_us()Arjan van de Ven
Right now, get_cpu_idle_time_us() only reports the idle statistics upto the point the CPU entered last idle; not what is valid right now. This patch adds an update of the idle statistics to get_cpu_idle_time_us(), so that calling this function always returns statistics that are accurate at the point of the call. This includes resetting the start of the idle time for accounting purposes to avoid double accounting. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082323.2d2f1945@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Introduce a function to update the idle statisticsArjan van de Ven
Currently, two places update the idle statistics (and more to come later in this series). This patch creates a helper function for updating these statistics. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082245.163e67ed@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-09sched: Add a comment to get_cpu_idle_time_us()Arjan van de Ven
The exported function get_cpu_idle_time_us() has no comment describing it; add a kerneldoc comment Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: davej@redhat.com LKML-Reference: <20100509082208.7cb721f0@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08Merge branch 'cpu_stop' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into sched/core
2010-05-08cpu_stop: add dummy implementation for UPTejun Heo
When !CONFIG_SMP, cpu_stop functions weren't defined at all which could lead to build failures if UP code uses cpu_stop facility. Add dummy cpu_stop implementation for UP. The waiting variants execute the work function directly with preempt disabled and stop_one_cpu_nowait() schedules a workqueue work. Makefile and ifdefs around stop_machine implementation are updated to accomodate CONFIG_SMP && !CONFIG_STOP_MACHINE case. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ingo Molnar <mingo@elte.hu>
2010-05-07sched: Remove rq argument to the tracepointsPeter Zijlstra
struct rq isn't visible outside of sched.o so its near useless to expose the pointer, also there are no users of it, so remove it. Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1272997616.1642.207.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07Merge commit 'v2.6.34-rc6' into sched/coreIngo Molnar
2010-05-07rcu: need barrier() in UP synchronize_sched_expedited()Paul E. McKenney
If synchronize_sched_expedited() is ever to be called from within kernel/sched.c in a !SMP PREEMPT kernel, the !SMP implementation needs a barrier(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-05-06sched: correctly place paranioa memory barriers in synchronize_sched_expedited()Paul E. McKenney
The memory barriers must be in the SMP case, not in the !SMP case. Also add a barrier after the atomic_inc() in order to ensure that other CPUs see post-synchronize_sched_expedited() actions as following the expedited grace period. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2010-05-06sched: kill paranoia check in synchronize_sched_expedited()Tejun Heo
The paranoid check which verifies that the cpu_stop callback is actually called on all online cpus is completely superflous. It's guaranteed by cpu_stop facility and if it didn't work as advertised other things would go horribly wrong and trying to recover using synchronize_sched() wouldn't be very meaningful. Kill the paranoid check. Removal of this feature is done as a separate step so that it can serve as a bisection point if something actually goes wrong. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06sched: replace migration_thread with cpu_stopTejun Heo
Currently migration_thread is serving three purposes - migration pusher, context to execute active_load_balance() and forced context switcher for expedited RCU synchronize_sched. All three roles are hardcoded into migration_thread() and determining which job is scheduled is slightly messy. This patch kills migration_thread and replaces all three uses with cpu_stop. The three different roles of migration_thread() are splitted into three separate cpu_stop callbacks - migration_cpu_stop(), active_load_balance_cpu_stop() and synchronize_sched_expedited_cpu_stop() - and each use case now simply asks cpu_stop to execute the callback as necessary. synchronize_sched_expedited() was implemented with private preallocated resources and custom multi-cpu queueing and waiting logic, both of which are provided by cpu_stop. synchronize_sched_expedited_count is made atomic and all other shared resources along with the mutex are dropped. synchronize_sched_expedited() also implemented a check to detect cases where not all the callback got executed on their assigned cpus and fall back to synchronize_sched(). If called with cpu hotplug blocked, cpu_stop already guarantees that and the condition cannot happen; otherwise, stop_machine() would break. However, this patch preserves the paranoid check using a cpumask to record on which cpus the stopper ran so that it can serve as a bisection point if something actually goes wrong theree. Because the internal execution state is no longer visible, rcu_expedited_torture_stats() is removed. This patch also renames cpu_stop threads to from "stopper/%d" to "migration/%d". The names of these threads ultimately don't matter and there's no reason to make unnecessary userland visible changes. With this patch applied, stop_machine() and sched now share the same resources. stop_machine() is faster without wasting any resources and sched migration users are much cleaner. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06stop_machine: reimplement using cpu_stopTejun Heo
Reimplement stop_machine using cpu_stop. As cpu stoppers are guaranteed to be available for all online cpus, stop_machine_create/destroy() are no longer necessary and removed. With resource management and synchronization handled by cpu_stop, the new implementation is much simpler. Asking the cpu_stop to execute the stop_cpu() state machine on all online cpus with cpu hotplug disabled is enough. stop_machine itself doesn't need to manage any global resources anymore, so all per-instance information is rolled into struct stop_machine_data and the mutex and all static data variables are removed. The previous implementation created and destroyed RT workqueues as necessary which made stop_machine() calls highly expensive on very large machines. According to Dimitri Sivanich, preventing the dynamic creation/destruction makes booting faster more than twice on very large machines. cpu_stop resources are preallocated for all online cpus and should have the same effect. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-05-06cpu_stop: implement stop_cpu[s]()Tejun Heo
Implement a simplistic per-cpu maximum priority cpu monopolization mechanism. A non-sleeping callback can be scheduled to run on one or multiple cpus with maximum priority monopolozing those cpus. This is primarily to replace and unify RT workqueue usage in stop_machine and scheduler migration_thread which currently is serving multiple purposes. Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(), stop_cpus() and try_stop_cpus(). This is to allow clean sharing of resources among stop_cpu and all the migration thread users. One stopper thread per cpu is created which is currently named "stopper/CPU". This will eventually replace the migration thread and take on its name. * This facility was originally named cpuhog and lived in separate files but Peter Zijlstra nacked the name and thus got renamed to cpu_stop and moved into stop_machine.c. * Better reporting of preemption leak as per Peter's suggestion. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>
2010-04-29Linux 2.6.34-rc6Linus Torvalds
2010-04-29Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: kgdb: don't needlessly skip PAGE_USER test for Fsl booke
2010-04-29Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: add a shrinker to background inode reclaim
2010-04-29kgdb: don't needlessly skip PAGE_USER test for Fsl bookeWufei
The bypassing of this test is a leftover from 2.4 vintage kernels, and is no longer appropriate, or even used by KGDB. Currently KGDB uses probe_kernel_write() for all access to memory via the KGDB core, so it can simply be deleted. This fixes CVE-2010-1446. CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Wufei <fei.wu@windriver.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-04-29Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: exofs: Fix "add bdi backing to mount session" fall out fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCK
2010-04-29Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds
* master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is not working. ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIB ARM: 6030/1: KS8695: enable console ARM: 6060/1: PL061 GPIO: Setting gpio val after changing direction to OUT. ARM: 6059/1: PL061 GPIO: Changing *_irq_chip_data with *_irq_data for real irqs. ARM: 6023/1: update bcmring_defconfig to latest version and fix build error ARM: fix build error in arch/arm/kernel/process.c
2010-04-29Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/ps3: Update ps3_defconfig powerpc/ps3: Update platform maintainer powerpc/pseries: Flush lazy kernel mappings after unplug operations powerpc/numa: Add form 1 NUMA affinity powerpc/fsl-booke: Fix CONFIG_RELOCATABLE support on FSL Book-E ppc32 powerpc: 2.6.34 update of defconfigs for embedded 6xx/7xxx, 8xx, 8xxx powerpc/mpc8xxx defconfigs - turn off SYSFS_DEPRECATED powerpc/83xx: configure SIL SATA driver in 83xx-wide defconfig powerpc/83xx: enable EPOLL syscall in defconfig powerpc/83xx: add RTC drivers in 83xx defconfig powerpc/fsl-cpm: Configure clock correctly for SCC powerpc/fsl_booke: Correct test for MMU_FTR_BIG_PHYS powerpc/85xx/86xx: Fix build w/ CONFIG_PCI=n
2010-04-29ARM: 6061/1: PL061 GPIO: Bug fix - setting gpio for HIGH_LEVEL interrupt is ↵viresh kumar
not working. In current implementation of PL061, setting type of irq to HIGH_LEVEL is not working. This patch fixes this bug. Signed-off-by: Viresh Kumar <viresh.kumar@st.com> Acked-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-29xfs: add a shrinker to background inode reclaimDave Chinner
On low memory boxes or those with highmem, kernel can OOM before the background reclaims inodes via xfssyncd. Add a shrinker to run inode reclaim so that it inode reclaim is expedited when memory is low. This is more complex than it needs to be because the VM folk don't want a context added to the shrinker infrastructure. Hence we need to add a global list of XFS mount structures so the shrinker can traverse them. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-04-29exofs: Fix "add bdi backing to mount session" fall outBoaz Harrosh
The patch: add bdi backing to mount session (b3d0ab7e60d1865bb6f6a79a77aaba22f2543236) Has a bug in the placement of the bdi member at struct exofs_sb_info. The layout member must be kept last. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29fs: fs/super.c needs to include backing-dev.h for !CONFIG_BLOCKJens Axboe
When CONFIG_BLOCK is set, it ends up getting backing-dev.h included. But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is include <linux/backing-dev.h> directly from the file it's used from, so do that. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6Linus Torvalds
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: nfs: fix memory leak in nfs_get_sb with CONFIG_NFS_V4 nfs: fix some issues in nfs41_proc_reclaim_complete() NFS: Ensure that nfs_wb_page() waits for Pg_writeback to clear NFS: Fix an unstable write data integrity race nfs: testing for null instead of ERR_PTR() NFS: rsize and wsize settings ignored on v4 mounts NFSv4: Don't attempt an atomic open if the file is a mountpoint SUNRPC: Fix a bug in rpcauth_prune_expired
2010-04-29pktcdvd: improve BKL and compat_ioctl.c usageArnd Bergmann
The pktcdvd driver uses proper locking and does not need the BKL in the ioctl and llseek functions of the character device, so kill both. Moving the compat_ioctl handling from common code into the driver itself fixes build problems when CONFIG_BLOCK is disabled. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-29exofs: Fix "add bdi backing to mount session" fall outBoaz Harrosh
Commit b3d0ab7e60d1865bb6f6a79a77aaba22f2543236 ("exofs: add bdi backing to mount session") has a bug in the placement of the bdi member at struct exofs_sb_info. The layout member must be kept last. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-28Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Disable large pages on CPUs with Atom erratum AAE44 x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero x86, mrst: Conditionally register cpu hotplug notifier for apbt
2010-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: x86/PCI: compute Address Space length rather than using _LEN x86/PCI: never allocate PCI MMIO resources below BIOS_END
2010-04-28nfs d_revalidate() is too trigger-happy with d_drop()Al Viro
If dentry found stale happens to be a root of disconnected tree, we can't d_drop() it; its d_hash is actually part of s_anon and d_drop() would simply hide it from shrink_dcache_for_umount(), leading to all sorts of fun, including busy inodes on umount and oopsen after that. Bug had been there since at least 2006 (commit c636eb already has it), so it's definitely -stable fodder. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-28ARM: 5957/1: ARM: RealView SD/MMC Card detection and write-protect using GPIOLIBColin Tuckley
The switch to using GPIOLIB broke the sd/mmc card detection on the RealView development boards if GPIO_PL061 was not selected. This patch selects GPIO_PL061 if GPIOLIB is selected. The sense of the return value from mmc_status has also changed and is corrected. Signed-off-by: Colin Tuckley <colin.tuckley@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator: fix enabling regulator issue on max8925
2010-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) sfc: Change falcon_probe_board() to fail for unsupported boards sfc: Always close net device at the end of a disabling reset sfc: Wait at most 10ms for the MC to finish reading out MAC statistics sctp: Fix oops when sending queued ASCONF chunks sctp: fix to calc the INIT/INIT-ACK chunk length correctly is set sctp: per_cpu variables should be in bh_disabled section sctp: fix potential reference of a freed pointer sctp: avoid irq lock inversion while call sk->sk_data_ready() Revert "tcp: bind() fix when many ports are bound" net/usb: add sierra_net.c driver cdc_ether: fix autosuspend for mbm devices bluetooth: handle l2cap_create_connless_pdu() errors gianfar: Wait for both RX and TX to stop ipheth: potential null dereferences on error path smc91c92_cs: spin_unlock_irqrestore before calling smc_interrupt() drivers/usb/net/kaweth.c: add device "Allied Telesyn AT-USB10 USB Ethernet Adapter" bnx2: Update version to 2.0.9. bnx2: Prevent "scheduling while atomic" warning with cnic, bonding and vlan. bnx2: Fix lost MSI-X problem on 5709 NICs. cxgb3: Wait longer for control packets on initialization ...
2010-04-28sfc: Change falcon_probe_board() to fail for unsupported boardsBen Hutchings
The driver needs specific PHY and board support code for each SFC4000 board; there is no point trying to continue if it is missing. Currently unsupported boards can trigger an 'oops'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sfc: Always close net device at the end of a disabling resetBen Hutchings
This fixes a regression introduced by commit eb9f6744cbfa97674c13263802259b5aa0034594 "sfc: Implement ethtool reset operation". Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sfc: Wait at most 10ms for the MC to finish reading out MAC statisticsBen Hutchings
The original code would wait indefinitely if MAC stats DMA failed. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sctp: Fix oops when sending queued ASCONF chunksVlad Yasevich
When we finish processing ASCONF_ACK chunk, we try to send the next queued ASCONF. This action runs the sctp state machine recursively and it's not prepared to do so. kernel BUG at kernel/timer.c:790! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/module/ipv6/initstate Modules linked in: sha256_generic sctp libcrc32c ipv6 dm_multipath uinput 8139too i2c_piix4 8139cp mii i2c_core pcspkr virtio_net joydev floppy virtio_blk virtio_pci [last unloaded: scsi_wait_scan] Pid: 0, comm: swapper Not tainted 2.6.34-rc4 #15 /Bochs EIP: 0060:[<c044a2ef>] EFLAGS: 00010286 CPU: 0 EIP is at add_timer+0xd/0x1b EAX: cecbab14 EBX: 000000f0 ECX: c0957b1c EDX: 03595cf4 ESI: cecba800 EDI: cf276f00 EBP: c0957aa0 ESP: c0957aa0 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process swapper (pid: 0, ti=c0956000 task=c0988ba0 task.ti=c0956000) Stack: c0957ae0 d1851214 c0ab62e4 c0ab5f26 0500ffff 00000004 00000005 00000004 <0> 00000000 d18694fd 00000004 1666b892 cecba800 cecba800 c0957b14 00000004 <0> c0957b94 d1851b11 ceda8b00 cecba800 cf276f00 00000001 c0957b14 000000d0 Call Trace: [<d1851214>] ? sctp_side_effects+0x607/0xdfc [sctp] [<d1851b11>] ? sctp_do_sm+0x108/0x159 [sctp] [<d1863386>] ? sctp_pname+0x0/0x1d [sctp] [<d1861a56>] ? sctp_primitive_ASCONF+0x36/0x3b [sctp] [<d185657c>] ? sctp_process_asconf_ack+0x2a4/0x2d3 [sctp] [<d184e35c>] ? sctp_sf_do_asconf_ack+0x1dd/0x2b4 [sctp] [<d1851ac1>] ? sctp_do_sm+0xb8/0x159 [sctp] [<d1863334>] ? sctp_cname+0x0/0x52 [sctp] [<d1854377>] ? sctp_assoc_bh_rcv+0xac/0xe1 [sctp] [<d1858f0f>] ? sctp_inq_push+0x2d/0x30 [sctp] [<d186329d>] ? sctp_rcv+0x797/0x82e [sctp] Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Yuansong Qiao <ysqiao@research.ait.ie> Signed-off-by: Shuaijun Zhang <szhang@research.ait.ie> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sctp: fix to calc the INIT/INIT-ACK chunk length correctly is setWei Yongjun
When calculating the INIT/INIT-ACK chunk length, we should not only account the length of parameters, but also the parameters zero padding length, such as AUTH HMACS parameter and CHUNKS parameter. Without the parameters zero padding length we may get following oops. skb_over_panic: text:ce2068d2 len:130 put:6 head:cac3fe00 data:cac3fe00 tail:0xcac3fe82 end:0xcac3fe80 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:127! invalid opcode: 0000 [#2] SMP last sysfs file: /sys/module/aes_generic/initstate Modules linked in: authenc ...... Pid: 4102, comm: sctp_darn Tainted: G D 2.6.34-rc2 #6 EIP: 0060:[<c0607630>] EFLAGS: 00010282 CPU: 0 EIP is at skb_over_panic+0x37/0x3e EAX: 00000078 EBX: c07c024b ECX: c07c02b9 EDX: cb607b78 ESI: 00000000 EDI: cac3fe7a EBP: 00000002 ESP: cb607b74 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process sctp_darn (pid: 4102, ti=cb607000 task=cabdc990 task.ti=cb607000) Stack: c07c02b9 ce2068d2 00000082 00000006 cac3fe00 cac3fe00 cac3fe82 cac3fe80 <0> c07c024b cac3fe7c cac3fe7a c0608dec ca986e80 ce2068d2 00000006 0000007a <0> cb8120ca ca986e80 cb812000 00000003 cb8120c4 ce208a25 cb8120ca cadd9400 Call Trace: [<ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp] [<c0608dec>] ? skb_put+0x2e/0x32 [<ce2068d2>] ? sctp_addto_chunk+0x45/0x85 [sctp] [<ce208a25>] ? sctp_make_init+0x279/0x28c [sctp] [<c0686a92>] ? apic_timer_interrupt+0x2a/0x30 [<ce1fdc0b>] ? sctp_sf_do_prm_asoc+0x2b/0x7b [sctp] [<ce202823>] ? sctp_do_sm+0xa0/0x14a [sctp] [<ce2133b9>] ? sctp_pname+0x0/0x14 [sctp] [<ce211d72>] ? sctp_primitive_ASSOCIATE+0x2b/0x31 [sctp] [<ce20f3cf>] ? sctp_sendmsg+0x7a0/0x9eb [sctp] [<c064eb1e>] ? inet_sendmsg+0x3b/0x43 [<c04244b7>] ? task_tick_fair+0x2d/0xd9 [<c06031e1>] ? sock_sendmsg+0xa7/0xc1 [<c0416afe>] ? smp_apic_timer_interrupt+0x6b/0x75 [<c0425123>] ? dequeue_task_fair+0x34/0x19b [<c0446abb>] ? sched_clock_local+0x17/0x11e [<c052ea87>] ? _copy_from_user+0x2b/0x10c [<c060ab3a>] ? verify_iovec+0x3c/0x6a [<c06035ca>] ? sys_sendmsg+0x186/0x1e2 [<c042176b>] ? __wake_up_common+0x34/0x5b [<c04240c2>] ? __wake_up+0x2c/0x3b [<c057e35c>] ? tty_wakeup+0x43/0x47 [<c04430f2>] ? remove_wait_queue+0x16/0x24 [<c0580c94>] ? n_tty_read+0x5b8/0x65e [<c042be02>] ? default_wake_function+0x0/0x8 [<c0604e0e>] ? sys_socketcall+0x17f/0x1cd [<c040264c>] ? sysenter_do_call+0x12/0x22 Code: 0f 45 de 53 ff b0 98 00 00 00 ff b0 94 ...... EIP: [<c0607630>] skb_over_panic+0x37/0x3e SS:ESP 0068:cb607b74 To reproduce: # modprobe sctp # echo 1 > /proc/sys/net/sctp/addip_enable # echo 1 > /proc/sys/net/sctp/auth_enable # sctp_test -H 3ffe:501:ffff:100:20c:29ff:fe4d:f37e -P 800 -l # sctp_darn -H 3ffe:501:ffff:100:20c:29ff:fe4d:f37e -P 900 -h 192.168.0.21 -p 800 -I -s -t sctp_darn ready to send... 3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.0.21 3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> bindx-add=192.168.1.21 3ffe:501:ffff:100:20c:29ff:fe4d:f37e:900-192.168.0.21:800 Interactive mode> snd=10 ------------------------------------------------------------------ eth0 has addresses: 3ffe:501:ffff:100:20c:29ff:fe4d:f37e and 192.168.0.21 eth1 has addresses: 192.168.1.21 ------------------------------------------------------------------ Reported-by: George Cheimonidis <gchimon@gmail.com> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sctp: per_cpu variables should be in bh_disabled sectionVlad Yasevich
Since the change of the atomics to percpu variables, we now have to disable BH in process context when touching percpu variables. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sctp: fix potential reference of a freed pointerVlad Yasevich
When sctp attempts to update an assocition, it removes any addresses that were not in the updated INITs. However, the loop may attempt to refrence a transport with address after removing it. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28sctp: avoid irq lock inversion while call sk->sk_data_ready()Wei Yongjun
sk->sk_data_ready() of sctp socket can be called from both BH and non-BH contexts, but the default sk->sk_data_ready(), sock_def_readable(), can not be used in this case. Therefore, we have to make a new function sctp_data_ready() to grab sk->sk_data_ready() with BH disabling. ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.33-rc6 #129 --------------------------------------------------------- sctp_darn/1517 just changed the state of lock: (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80 but this lock took another, SOFTIRQ-unsafe lock in the past: (slock-AF_INET){+.-...} and interrupts could create inverse lock ordering between them. other info that might help us debug this: 1 lock held by sctp_darn/1517: #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp] Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28Revert "tcp: bind() fix when many ports are bound"David S. Miller
This reverts two commits: fda48a0d7a8412cedacda46a9c0bf8ef9cd13559 tcp: bind() fix when many ports are bound and a follow-on fix for it: 6443bb1fc2050ca2b6585a3fa77f7833b55329ed ipv6: Fix inet6_csk_bind_conflict() It causes problems with binding listening sockets when time-wait sockets from a previous instance still are alive. It's too late to keep fiddling with this so late in the -rc series, and we'll deal with it in net-next-2.6 instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-28nfs: fix memory leak in nfs_get_sb with CONFIG_NFS_V4Xiaotian Feng
With CONFIG_NFS_V4 and data version 4, nfs_get_sb will allocate memory for export_path in nfs4_validate_text_mount_data, so we need to free it then. This is addressed in following kmemleak report: unreferenced object 0xffff88016bf48a50 (size 16): comm "mount.nfs", pid 22567, jiffies 4651574704 (age 175471.200s) hex dump (first 16 bytes): 2f 6f 70 74 2f 77 6f 72 6b 00 6b 6b 6b 6b 6b a5 /opt/work.kkkkk. backtrace: [<ffffffff814b34f9>] kmemleak_alloc+0x60/0xa7 [<ffffffff81102c76>] kmemleak_alloc_recursive.clone.5+0x1b/0x1d [<ffffffff811046b3>] __kmalloc_track_caller+0x18f/0x1b7 [<ffffffff810e1b08>] kstrndup+0x37/0x54 [<ffffffffa0336971>] nfs_parse_devname+0x152/0x204 [nfs] [<ffffffffa0336af3>] nfs4_validate_text_mount_data+0xd0/0xdc [nfs] [<ffffffffa0338deb>] nfs_get_sb+0x325/0x736 [nfs] [<ffffffff81113671>] vfs_kern_mount+0xbd/0x17c [<ffffffff81113798>] do_kern_mount+0x4d/0xed [<ffffffff81129a87>] do_mount+0x787/0x7fe [<ffffffff81129b86>] sys_mount+0x88/0xc2 [<ffffffff81009b42>] system_call_fastpath+0x16/0x1b Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-28nfs: fix some issues in nfs41_proc_reclaim_complete()Dan Carpenter
The original code passed an ERR_PTR() to rpc_put_task() and instead of returning zero on success it returned -ENOMEM. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>