Age / Commit message / Author
2012-05-15quota: Fix double lock in add_dquot_ref() with CONFIG_QUOTA_DEBUGJan Kara
When CONFIG_QUOTA_DEBUG is enabled we call inode_get_rsv_space() from add_dquot_ref() while holding i_lock. But inode_get_rsv_space() is trying to get i_lock as well resulting in double lock. Fix the problem by moving inode_get_rsv_space() call out of i_lock. Reported-and-analyzed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
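The double lock described above can be illustrated with a small user-space sketch. It is Python rather than kernel C, and `Inode`, `get_rsv_space()`, `add_ref_buggy()` and `add_ref_fixed()` are hypothetical stand-ins for the kernel's inode, inode_get_rsv_space() and add_dquot_ref(). The non-blocking acquire is only there so that the sketch reports the double lock instead of hanging.

```python
import threading


class Inode:
    """Hypothetical stand-in for a kernel inode with its i_lock."""

    def __init__(self, reserved):
        self.i_lock = threading.Lock()  # non-reentrant, like a spinlock
        self.reserved = reserved


def get_rsv_space(inode):
    """Takes i_lock itself, so callers must not already hold it."""
    if not inode.i_lock.acquire(blocking=False):
        raise RuntimeError("double lock on i_lock")
    try:
        return inode.reserved
    finally:
        inode.i_lock.release()


def add_ref_buggy(inode):
    with inode.i_lock:
        # Called while i_lock is held: the helper tries to take it again.
        return get_rsv_space(inode)


def add_ref_fixed(inode):
    # Call the helper outside the locked region, then take i_lock.
    rsv = get_rsv_space(inode)
    with inode.i_lock:
        return rsv
```

With a real spinlock the buggy variant would spin forever rather than raise; the fixed variant simply never nests the two acquisitions.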
2012-05-15jbd: Write journal superblock with WRITE_FUA after checkpointingJan Kara
If the journal superblock is written only to the disk's caches and another transaction starts reusing space of the transaction cleaned from the log, it can happen that blocks of a new transaction reach the disk before the journal superblock. If a power failure happens in such a case, subsequent journal replay would still try to replay the old transaction but some of its blocks may already be overwritten by the new transaction. For this reason we must use WRITE_FUA when updating log tail and we must first write new log tail to disk and update in-memory information only after that. Signed-off-by: Jan Kara <jack@suse.cz>
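The ordering rule in this message (make the new log tail durable first, update in-memory state second) can be sketched as a toy model. This is not the jbd code: `ToyJournal` and its methods are hypothetical, and `write_superblock_durably()` merely stands in for a WRITE_FUA superblock write.

```python
class ToyJournal:
    """Toy model: space is only reusable once the on-disk tail has moved."""

    def __init__(self):
        self.disk_tail = 0   # tail recorded in the on-disk superblock
        self.mem_tail = 0    # tail used to decide which log space is free
        self.events = []     # order of operations, for inspection

    def write_superblock_durably(self, tail):
        # Stand-in for a WRITE_FUA write: durable when this returns.
        self.disk_tail = tail
        self.events.append(f"fua_write:{tail}")

    def update_log_tail(self, new_tail):
        # 1. Persist the new tail, 2. only then let memory see it.
        self.write_superblock_durably(new_tail)
        self.mem_tail = new_tail
        self.events.append(f"mem_update:{new_tail}")

    def reusable(self, block):
        # Blocks below the in-memory tail may be overwritten.
        return block < self.mem_tail
```

Because `mem_tail` never runs ahead of `disk_tail` in this model, a block is never treated as reusable while a replay after power failure could still start from it.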
2012-05-15jbd: protect all log tail updates with j_checkpoint_mutexJan Kara
There are some log tail updates that are not protected by j_checkpoint_mutex. Some of these are harmless because they happen during startup or shutdown but updates in journal_commit_transaction() and journal_flush() can really race with other log tail updates (e.g. someone doing journal_flush() with someone running cleanup_journal_tail()). So protect all log tail updates with j_checkpoint_mutex. Signed-off-by: Jan Kara <jack@suse.cz>
2012-05-15jbd: Split updating of journal superblock and marking journal emptyJan Kara
There are three cases of updating journal superblock. In the first case, we want to mark journal as empty (setting s_sequence to 0), in the second case we want to update log tail, in the third case we want to update s_errno. Split these cases into separate functions. It makes the code slightly more straightforward and later patches will make the distinction even more important. Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11ext2: do not register write_super within VFSArtem Bityutskiy
Jan Kara removed 'sb->s_dirt' VFS flag references, so we do not need to register the ext2 'ext2_write_super()' method in the VFS superblock operations, because 'sb->s_dirt' won't be ever set to 1 and VFS won't ever call '->write_super()' anyway. Thus, remove the method. Tested using xfstests. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11ext2: Remove s_dirt handlingJan Kara
Places which modify superblock feature / state fields mark the superblock buffer dirty so it is written out by flusher thread. Thus there's no need to set s_dirt there. The only other fields changing in the superblock are the numbers of free blocks, free inodes and s_wtime. There's no real need to write (or even compute) these periodically. Free blocks / inodes counters are recomputed on every mount from group counters anyway and value of s_wtime is only informational and imprecise anyway. So it should be enough to write these opportunistically on mount, remount, umount, and sync_fs times. Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11ext2: write superblock only once on unmountArtem Bityutskiy
Currently on unmount if we are mounted R/W, we first write the superblock to the media if it is dirty, and then write it again, which is not optimal. This patch makes ext2 write the superblock on unmount fewer times. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11ext3: update documentation with barrier=1 defaultStefan Hajnoczi
Commit 00eacd6 ("ext3: make ext3 mount default to barrier=1") changed the default barrier mount option for ext3. The documentation needs to be updated, so this patch does that. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11ext3: remove max_debt in find_group_orlov()Akira Fujita
max_debt, involved variables and calculations are no longer needed, clean them up. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz>
2012-04-11jbd: Refine commit writeout logicJan Kara
Currently we write out all journal buffers in WRITE_SYNC mode. This improves performance for fsync heavy workloads but hinders performance when writes are mostly asynchronous; most noticeably, it slows down readers and users complain about slow desktop response etc. So submit writes as asynchronous in the normal case and only submit writes as WRITE_SYNC if we detect someone is waiting for current transaction commit.

I've gathered some numbers to back this change. The first is the read latency test. It measures time to read 1 MB after several seconds of sleeping in presence of streaming writes. Top 10 times (out of 90) in us:

Before      After
2131586     697473
1709932     557487
1564598     535642
1480462     347573
1478579     323153
1408496     222181
1388960     181273
1329565     181070
1252486     172832
1223265     172278
Average:
619377      82180

So the improvement in both maximum and average latency is massive.

I've measured fsync throughput by:
fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4
in presence of streaming reader. The numbers (fsyncs/s) are:

Before      After
9.9         6.3
6.8         6.0
6.3         6.2
5.8         6.1

So fsync performance seems unharmed by this change.

Signed-off-by: Jan Kara <jack@suse.cz>
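The latency figures quoted in this message can be turned into improvement ratios with a few lines of Python. The numbers are copied from the commit message; the variable names are illustrative, and the "Average" row is presumably taken over all 90 samples, since it is lower than every top-10 value.

```python
# Top-10 read latencies in microseconds, copied from the commit message.
before = [2131586, 1709932, 1564598, 1480462, 1478579,
          1408496, 1388960, 1329565, 1252486, 1223265]
after = [697473, 557487, 535642, 347573, 323153,
         222181, 181273, 181070, 172832, 172278]

# Averages as reported in the commit message.
avg_before, avg_after = 619377, 82180

max_ratio = max(before) / max(after)   # worst-case improvement
avg_ratio = avg_before / avg_after     # average improvement

print(f"max latency improved {max_ratio:.1f}x, average {avg_ratio:.1f}x")
```

On these numbers the worst observed latency drops by roughly a factor of three and the average by roughly a factor of 7.5.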
2012-04-10Smack: build when CONFIG_AUDIT not definedKees Cook
This fixes builds where CONFIG_AUDIT is not defined and CONFIG_SECURITY_SMACK=y. This got introduced by the stack-usage reduction commit 48c62af68a40 ("LSM: shrink the common_audit_data data union"). Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-10Merge tag 'dmaengine-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: 1/ regression fix for Xen as it now trips over a broken assumption about the dma address size on 32-bit builds 2/ new quirk for netdma to ignore dma channels that cannot meet netdma alignment requirements 3/ fixes for two long standing issues in ioatdma (ring size overflow) and iop-adma (potential stack corruption) * tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: netdma: adding alignment check for NETDMA ops ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata ioat: ring size variables need to be 32bit to avoid overflow iop-adma: Corrected array overflow in RAID6 Xscale(R) test. ioat: fix size of 'completion' for Xen
2012-04-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller: 1) Build fix for LEON, from Sam Ravnborg. 2) Make the sparc side changes that go along with the infrastructure to retry faults when blocking on a disk transfer. From Kautuk Consul. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc32,leon: fix leon build sparc/mm/fault_32.c: Port OOM changes to do_sparc_fault sparc/mm/fault_64.c: Port OOM changes to do_sparc64_fault
2012-04-10Merge tag 'regulator-3.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull a regulator build fix from Mark Brown: "Fix a build warning in the anatop driver for 3.4 This is a trivial rename to stop the build system complaining that we're referencing things we shouldn't be." * tag 'regulator-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: anatop: fix 'anatop_regulator' name collision
2012-04-10Merge branch 'for-3.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fixes from Richard Weinberger. * 'for-3.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: uml_setup_stubs': warning: unused variable 'pages' um: Use asm-generic/switch_to.h um: Disintegrate asm/system.h um: switch cow_user.h to htobe{32,64}/betoh{32,64} um: several x86 hw-dependent crypto modules won't build on uml um: fix linker script generation
2012-04-10i2c: prevent spurious interrupt on Designware controllersKristen Carlson Accardi
Don't call i2c_enable on resume because it causes a spurious interrupt. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-09modpost: Fix modpost license checking of vmlinux.oFrank Rowand
Commit f02e8a6596b7 ("module: Sort exported symbols") sorts symbols, placing each of them in its own ELF section. This sorting and merging into the canonical sections are done by the linker. Unfortunately, to generate the Module.symvers file, modpost parses vmlinux.o (which is not linked yet) and all module object files (which aren't linked yet). These aren't sanitized by the linker yet. That breaks modpost, which can't detect the license properly for modules. This patch makes modpost aware of the new exported symbols structure. [ The above is a slightly corrected version of the explanation of the problem, copied from commit 62a2635610db ("modpost: Fix modpost's license checking V3"). That commit fixed the problem for module object files, but not for vmlinux.o. This patch fixes modpost for vmlinux.o. ] Signed-off-by: Frank Rowand <frank.rowand@am.sony.com> Signed-off-by: Alessio Igor Bogani <abogani@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-09android, lowmemorykiller: remove task handoff notifierDavid Rientjes
The task handoff notifier leaks task_struct since it never gets freed after the callback returns NOTIFY_OK, which means it is responsible for doing so. It turns out the lowmemorykiller actually doesn't need this notifier at all. It's used to prevent unnecessary killing by waiting for a thread to exit as a result of lowmem_shrink(), however, it's possible to do this in the same way the kernel oom killer works by setting TIF_MEMDIE and avoid killing if we're still waiting for it to exit. The kernel oom killer will already automatically set TIF_MEMDIE for threads that are attempting to allocate memory that have a fatal signal. The thread selected by lowmem_shrink() will have such a signal after the lowmemorykiller sends it a SIGKILL, so this won't result in an unnecessary use of memory reserves for the thread to exit. This has the added benefit that we don't have to rely on CONFIG_PROFILING to prevent needlessly killing tasks. Reported-by: Werner Landgraf <w.landgraf@ru.ru> Cc: stable@vger.kernel.org Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Colin Cross <ccross@android.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-10um: uml_setup_stubs': warning: unused variable 'pages'Boaz Harrosh
Fix the following gcc complaint: arch/um/kernel/skas/mmu.c: In function 'uml_setup_stubs': arch/um/kernel/skas/mmu.c:106:16: warning: unused variable 'pages' [-Wunused-variable] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2012-04-10um: Use asm-generic/switch_to.hRichard Weinberger
Signed-off-by: Richard Weinberger <richard@nod.at>
2012-04-10um: Disintegrate asm/system.hRichard Weinberger
Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: Toralf Förster <toralf.foerster@gmx.de> CC: dhowells@redhat.com
2012-04-10um: switch cow_user.h to htobe{32,64}/betoh{32,64}Al Viro
... rather than open-coding the 64bit versions. endian.h has those guys. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>
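The difference between open-coding a 64-bit big-endian conversion and using a ready-made one can be shown with a short sketch. It uses Python's `struct` module as an analogue of the C `htobe64()` family; the two function names are hypothetical and this is not the cow_user.h code.

```python
import struct


def be64_open_coded(value):
    """Big-endian bytes of a 64-bit value, built by hand from two 32-bit halves."""
    high = (value >> 32) & 0xFFFFFFFF
    low = value & 0xFFFFFFFF
    return struct.pack(">I", high) + struct.pack(">I", low)


def be64_library(value):
    """The same conversion using the ready-made 64-bit format code."""
    return struct.pack(">Q", value)
```

Both functions return the same eight bytes for any 64-bit value; the library form is shorter and leaves no room to swap the halves by mistake.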
2012-04-09um: several x86 hw-dependent crypto modules won't build on umlAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-09um: fix linker script generationAl Viro
while we can't just use -U$(SUBARCH), we still need to kill idiotic define (implicit -Di386=1), both for SUBARCH=i386 and SUBARCH=x86/CONFIG_64BIT=n builds. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-07Linux 3.4-rc2Linus Torvalds
2012-04-07Merge tag 'regmap-3.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull two more small regmap fixes from Mark Brown: - Now we have users for it that aren't running Android it turns out that regcache_sync_region() is much more useful to drivers if it's exported for use by modules. Who knew? - Make sure we don't divide by zero when doing debugfs dumps of rbtrees, not visible up until now because everything was providing at least some cache on startup. * tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: prevent division by zero in rbtree_show regmap: Export regcache_sync_region()
2012-04-07Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull a few KVM fixes from Avi Kivity: "A bunch of powerpc KVM fixes, a guest and a host RCU fix (unrelated), and a small build fix." * 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Resolve RCU vs. async page fault problem KVM: VMX: vmx_set_cr0 expects kvm->srcu locked KVM: PMU: Fix integer constant is too large warning in kvm_pmu_set_msr() KVM: PPC: Book3S: PR: Fix preemption KVM: PPC: Save/Restore CR over vcpu_run KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code
2012-04-07Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-shLinus Torvalds
Pull SuperH fixes from Paul Mundt. * tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver serial: sh-sci: use serial_port_in/out vs sci_in/out. sh: vsyscall: Fix up .eh_frame generation. sh: dma: Fix up device attribute mismatch from sysdev fallout. sh: dwarf unwinder depends on SHcompact. sh: fix up fallout from system.h disintegration.
2012-04-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer fixlet from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: sysctl: fix write access to dmesg_restrict/kptr_restrict
2012-04-06Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull ACPI & Power Management patches from Len Brown: "Two fixes for cpuidle merge-window changes, plus a URL fix in MAINTAINERS" * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: MAINTAINERS: Update git url for ACPI cpuidle: Fix panic in CPU off-lining with no idle driver ACPI processor: Use safe_halt() rather than halt() in acpi_idle_play_dead()
2012-04-06Merge branch '3.4-rc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull target fixes from Nicholas Bellinger: "Pull two tcm_fc fabric related fixes for -rc2: Note that both have been CC'ed to stable, and patch #1 is the important one that addresses a memory corruption bug related to FC exchange timeouts + command abort. Thanks again to MDR for tracking down this issue!" * '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: tcm_fc: Do not free tpg structure during wq allocation failure tcm_fc: Add abort flag for gracefully handling exchange timeout
2012-04-06tcm_fc: Do not free tpg structure during wq allocation failureMark Rustad
Avoid freeing a registered tpg structure if an alloc_workqueue call fails. This fixes a bug where the failure was leaking memory associated with se_portal_group setup during the original core_tpg_register() call. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Acked-by: Kiran Patil <Kiran.patil@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-04-06tcm_fc: Add abort flag for gracefully handling exchange timeoutMark Rustad
Add abort flag and use it to terminate processing when an exchange is timed out or is reset. The abort flag is used in place of the transport_generic_free_cmd function call in the reset and timeout cases, because calling that function in that context would free memory that was in use. The aborted flag allows the lifetime to be managed in a more normal way, while truncating the processing. This change eliminates a source of memory corruption which manifested in a variety of ugly ways. (nab: Drop unused struct fc_exch *ep in ft_recv_seq) Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Acked-by: Kiran Patil <Kiran.patil@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-04-06Merge branches 'idle-fix' and 'misc' into releaseLen Brown
2012-04-06MAINTAINERS: Update git url for ACPIIgor Murzov
Signed-off-by: Igor Murzov <e-mail@date.by> Signed-off-by: Len Brown <len.brown@intel.com>
2012-04-06Merge branch 'stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull arch/tile bug fixes from Chris Metcalf: "This includes Paul Gortmaker's change to fix the <asm/system.h> disintegration issues on tile, a fix to unbreak the tilepro ethernet driver, and a backlog of bugfix-only changes from internal Tilera development over the last few months. They have all been to LKML and on linux-next for the last few days. The EDAC change to MAINTAINERS is an oddity but discussion on the linux-edac list suggested I ask you to pull that change through my tree since they don't have a tree to pull edac changes from at the moment." * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (39 commits) drivers/net/ethernet/tile: fix netdev_alloc_skb() bombing MAINTAINERS: update EDAC information tilepro ethernet driver: fix a few minor issues tile-srom.c driver: minor code cleanup edac: say "TILEGx" not "TILEPro" for the tilegx edac driver arch/tile: avoid accidentally unmasking NMI-type interrupt accidentally arch/tile: remove bogus performance optimization arch/tile: return SIGBUS for addresses that are unaligned AND invalid arch/tile: fix finv_buffer_remote() for tilegx arch/tile: use atomic exchange in arch_write_unlock() arch/tile: stop mentioning the "kvm" subdirectory arch/tile: export the page_home() function. arch/tile: fix pointer cast in cacheflush.c arch/tile: fix single-stepping over swint1 instructions on tilegx arch/tile: implement panic_smp_self_stop() arch/tile: add "nop" after "nap" to help GX idle power draw arch/tile: use proper memparse() for "maxmem" options arch/tile: fix up locking in pgtable.c slightly arch/tile: don't leak kernel memory when we unload modules arch/tile: fix bug in delay_backoff() ...
2012-04-06Merge tag 'stable/for-linus-3.4-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes for regressions: * one is a workaround that will be removed in v3.5 with proper fix in the tip/x86 tree, * the other is to fix drivers to load on PV (a previous patch made them only load in PVonHVM mode). The rest are just minor fixes in the various drivers and some cleanup in the core code." * tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success xen/pciback: fix XEN_PCI_OP_enable_msix result xen/smp: Remove unnecessary call to smp_processor_id() xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries' xen: only check xen_platform_pci_unplug if hvm
2012-04-06Merge tag 'mmc-fixes-for-3.4-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc Pull MMC fixes from Chris Ball: - Disable use of MSI in sdhci-pci, which caused multiple chipsets to stop working in 3.4-rc1. I'll wait to turn this on again until we have a chipset whitelist for it. - Fix a libertas SDIO powered-resume regression introduced in 3.3; thanks to Neil Brown and Rafael Wysocki for this fix. - Fix module reloading on omap_hsmmc. - Stop trusting the spec/card's specified maximum data timeout length, and use three seconds instead. Previously we used 300ms. Also cleanups and fixes for s3c, atmel, sh_mmcif and omap_hsmmc. * tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (28 commits) mmc: use really long write timeout to deal with crappy cards mmc: sdhci-dove: Fix compile error by including module.h mmc: Prevent 1.8V switch for SD hosts that don't support UHS modes. Revert "mmc: sdhci-pci: Add MSI support" Revert "mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers" mmc: core: fix power class selection mmc: omap_hsmmc: fix module re-insertion mmc: omap_hsmmc: convert to module_platform_driver mmc: omap_hsmmc: make it behave well as a module mmc: omap_hsmmc: trivial cleanups mmc: omap_hsmmc: context save after enabling runtime pm mmc: omap_hsmmc: use runtime put sync in probe error patch mmc: sdio: Use empty system suspend/resume callbacks at the bus level mmc: bus: print bus speed mode of UHS-I card mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers mmc: sh_mmcif: Simplify calculation of mmc->f_min mmc: sh_mmcif: mmc->f_max should be half of the bus clock mmc: sh_mmcif: double clock speed mmc: block: Remove use of mmc_blk_set_blksize mmc: atmel-mci: add support for odd clock dividers ...
2012-04-06Make the "word-at-a-time" helper functions more commonly usableLinus Torvalds
I have a new optimized x86 "strncpy_from_user()" that will use these same helper functions for all the same reasons the name lookup code uses them. This is preparation for that. This moves them into an architecture-specific header file. It's architecture-specific for two reasons: - some of the functions are likely to want architecture-specific implementations. Even if the current code happens to be "generic" in the sense that it should work on any little-endian machine, it's likely that the "multiply by a big constant and shift" implementation is less than optimal for an architecture that has a guaranteed fast bit count instruction, for example. - I expect that if architectures like sparc want to start playing around with this, we'll need to abstract out a few more details (in particular the actual unaligned accesses). So we're likely to have more architecture-specific stuff if non-x86 architectures start using this. (and if it turns out that non-x86 architectures don't start using this, then having it in an architecture-specific header is still the right thing to do, of course) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
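The "word-at-a-time" idea can be sketched in user space. The zero-byte test below is the classic bit trick on a 64-bit little-endian word; the function names and the use of Python integers are assumptions made for illustration, and the byte index is found with a lowest-set-bit computation rather than the "multiply by a big constant and shift" that the message mentions.

```python
ONE_BYTES = 0x0101010101010101
HIGH_BITS = 0x8080808080808080


def load_word(chunk):
    """Interpret exactly 8 bytes as a little-endian 64-bit word."""
    return int.from_bytes(chunk, "little")


def has_zero(word):
    """Return a nonzero mask if any byte of the 64-bit word is zero."""
    # Subtracting 1 from a zero byte borrows and sets its high bit;
    # "& ~word" discards bytes whose high bit was already set.
    return (word - ONE_BYTES) & ~word & HIGH_BITS


def first_zero_byte(word):
    """Index of the lowest-addressed zero byte, or None if there is none."""
    mask = has_zero(word)
    if mask == 0:
        return None
    lowest = mask & -mask                # isolate the lowest set bit
    return (lowest.bit_length() - 1) // 8
```

Bits above the first zero byte in the mask can be spurious because of the borrow, which is why only the lowest set bit is used to locate the terminator.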
2012-04-06cpuidle: Fix panic in CPU off-lining with no idle driverToshi Kani
Fix a NULL pointer dereference panic in cpuidle_play_dead() during CPU off-lining when no cpuidle driver is registered. A cpuidle driver may be registered at boot-time based on CPU type. This patch allows an off-lined CPU to enter HLT-based idle in this condition. Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: Boris Ostrovsky <boris.ostrovsky@amd.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Len Brown <len.brown@intel.com>
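The shape of this fix, checking for a missing driver before using it and letting the caller fall back, can be sketched as follows. This is a Python illustration under assumptions: `FakeDriver`, `play_dead()` and the string return values are hypothetical, and only the idea of returning an error when no driver is registered is taken from the message.

```python
import errno


class FakeDriver:
    """Hypothetical cpuidle driver that offers a 'dead' state."""

    def enter_dead(self):
        return 0


def cpuidle_play_dead(driver):
    # Without this check, a missing driver would be dereferenced.
    if driver is None:
        return -errno.ENODEV
    return driver.enter_dead()


def play_dead(driver):
    """Use cpuidle if possible, otherwise fall back to HLT-based idle."""
    if cpuidle_play_dead(driver) < 0:
        return "hlt"
    return "cpuidle"
```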
2012-04-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking updates from David Miller:

 1) Fix inaccuracies in network driver interface documentation, from Ben Hutchings.
 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.
 3) Compile warning, locking, and refcounting fixes in netfilter's xt_CT, from Pablo Neira Ayuso.
 4) phonet sendmsg needs to validate user length just like any other datagram protocol, fix from Sasha Levin.
 5) Ipv6 multicast code uses wrong loop index, from RongQing Li.
 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner and Yuval Mintz.
 7) mlx4 erroneously allocates 4 pages at a time, regardless of page size, fix from Thadeu Lima de Souza Cascardo.
 8) SCTP socket option wasn't extended in a backwards compatible way, fix from Thomas Graf.
 9) Add missing address change event emissions to bonding, from Shlomo Pongratz.
10) /proc/net/dev regressed because it uses a private offset to track where we are in the hash table, but this doesn't track the offset pullback that the seq_file code does resulting in some entries being missed in large dumps. Fix from Eric Dumazet.
11) do_tcp_sendpage() unloads the send queue way too fast, because it invokes tcp_push() when it shouldn't. Let the natural sequence generated by the splice paths, and the associated MSG_MORE settings, guide the tcp_push() calls. Otherwise what goes out of TCP is spaghetti and doesn't batch effectively into GSO/TSO clusters. From Eric Dumazet.
12) Once we put a SKB into either the netlink receiver's queue or a socket error queue, it can be consumed and freed up, therefore we cannot touch it after queueing it like that. Fixes from Eric Dumazet.
13) PPP has this annoying behavior in that for every transmit call it immediately stops the TX queue, then calls down into the next layer to transmit the PPP frame. But if that next layer can take it immediately, it just un-stops the TX queue right before returning from the transmit method. Besides being useless work, it makes several facilities unusable, in particular things like the equalizers. Well behaved devices should only stop the TX queue when they really are full, and in PPP's case when it gets backlogged to the downstream device. David Woodhouse therefore fixed PPP to not stop the TX queue until its downstream can't take data any more.
14) IFF_UNICAST_FLT got accidentally lost in some recent stmmac driver changes, re-add. From Marc Kleine-Budde.
15) Fix link flaps in ixgbe, from Eric W. Multanen.
16) Descriptor writeback fixes in e1000e from Matthew Vick.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  net: fix a race in sock_queue_err_skb()
  netlink: fix races after skb queueing
  doc, net: Update ndo_start_xmit return type and values
  doc, net: Remove instruction to set net_device::trans_start
  doc, net: Update netdev operation names
  doc, net: Update documentation of synchronisation for TX multiqueue
  doc, net: Remove obsolete reference to dev->poll
  ethtool: Remove exception to the requirement of holding RTNL lock
  MAINTAINERS: update for Marvell Ethernet drivers
  bonding: properly unset current_arp_slave on slave link up
  phonet: Check input from user before allocating
  tcp: tcp_sendpages() should call tcp_push() once
  ipv6: fix array index in ip6_mc_add_src()
  mlx4: allocate just enough pages instead of always 4 pages
  stmmac: re-add IFF_UNICAST_FLT for dwmac1000
  bnx2x: Clear MDC/MDIO warning message
  bnx2x: Fix BCM57711+BCM84823 link issue
  bnx2x: Clear BCM84833 LED after fan failure
  bnx2x: Fix BCM84833 PHY FW version presentation
  bnx2x: Fix link issue for BCM8727 boards.
  ...
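Item 7 in the list above (the mlx4 fix) comes down to sizing an allocation in bytes and deriving the page count from the page size, instead of hard-coding a count. A minimal arithmetic sketch follows; the 16 KiB buffer size is an assumption chosen because it equals 4 pages of 4 KiB, and `pages_needed()` is a hypothetical helper, not driver code.

```python
def pages_needed(nbytes, page_size):
    """Smallest number of whole pages that covers nbytes."""
    return -(-nbytes // page_size)   # ceiling division


BUFFER_BYTES = 16 * 1024             # assumed buffer size for illustration

for page_size in (4 * 1024, 64 * 1024):
    count = pages_needed(BUFFER_BYTES, page_size)
    print(f"page size {page_size}: {count} page(s)")
```

Under that assumption a 4 KiB-page system needs 4 pages while a 64 KiB-page system needs only 1, so a fixed count of 4 would over-allocate on the latter.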
2012-04-06xen/pcifront: avoid pci_frontend_enable_msix() falsely returning successJan Beulich
The original XenoLinux code has always had things this way, and for compatibility reasons (in particular with a subsequent pciback adjustment) upstream Linux should behave the same way (allowing for two distinct error indications to be returned by the backend). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06xen/pciback: fix XEN_PCI_OP_enable_msix resultJan Beulich
Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a positive value to indicate the number of vectors (less than the amount requested) that can be set up for a given device. Returning this as an operation value (secondary result) is fine, but (primary) operation results are expected to be negative (error) or zero (success) according to the protocol. With the frontend fixed to match the XenoLinux behavior, the backend can now validly return zero (success) here, passing the upper limit on the number of vectors in op->value. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
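The result convention described in this message can be sketched as a small mapping function. It is an illustration only: `msix_op_result()` is a hypothetical name, and returning 0 as the value in the non-positive cases is an assumption of the sketch rather than something the message states.

```python
def msix_op_result(ret):
    """Map a pci_enable_msix()-style return to (primary result, op value).

    The primary result must be negative (error) or zero (success);
    a positive vector limit travels in the secondary value instead.
    """
    if ret > 0:
        return 0, ret    # success; value carries the upper limit on vectors
    return ret, 0        # plain success (0) or a negative error code
```

For example, a return of 3 from the enable call becomes a primary result of 0 with 3 as the value, so the frontend never sees a positive primary result.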
2012-04-06xen/smp: Remove unnecessary call to smp_processor_id()Srivatsa S. Bhat
There is an extra and unnecessary call to smp_processor_id() in cpu_bringup(). Remove it. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus ↵Konrad Rzeszutek Wilk
io-apic entries'

The above mentioned patch checks the IOAPIC and if it contains -1, then it unmaps said IOAPIC. But under Xen we get this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
PGD 0
Oops: 0002 [#1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper/0 Not tainted 3.2.10-3.fc16.x86_64 #1 Dell Inc. Inspiron 1525 /0U990C
RIP: e030:[<ffffffff8134e51f>] [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0
RSP: e02b: ffff8800d42cbb70 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000ffffffef RCX: 0000000000000001
RDX: 0000000000000040 RSI: 00000000ffffffef RDI: 0000000000000001
RBP: ffff8800d42cbb80 R08: ffff8800d6400000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffef
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000010
FS: 0000000000000000(0000) GS:ffff8800df5fe000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0:000000008005003b
CR2: 0000000000000040 CR3: 0000000001a05000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 1, threadinfo ffff8800d42ca000, task ffff8800d42d0000)
Stack:
 00000000ffffffef 0000000000000010 ffff8800d42cbbe0 ffffffff8134f157
 ffffffff8100a9b2 ffffffff8182ffd1 00000000000000a0 00000000829e7384
 0000000000000002 0000000000000010 00000000ffffffff 0000000000000000
Call Trace:
 [<ffffffff8134f157>] xen_bind_pirq_gsi_to_irq+0x87/0x230
 [<ffffffff8100a9b2>] ? check_events+0x12+0x20
 [<ffffffff814bab42>] xen_register_pirq+0x82/0xe0
 [<ffffffff814bac1a>] xen_register_gsi.part.2+0x4a/0xd0
 [<ffffffff814bacc0>] acpi_register_gsi_xen+0x20/0x30
 [<ffffffff8103036f>] acpi_register_gsi+0xf/0x20
 [<ffffffff8131abdb>] acpi_pci_irq_enable+0x12e/0x202
 [<ffffffff814bc849>] pcibios_enable_device+0x39/0x40
 [<ffffffff812dc7ab>] do_pci_enable_device+0x4b/0x70
 [<ffffffff812dc878>] __pci_enable_device_flags+0xa8/0xf0
 [<ffffffff812dc8d3>] pci_enable_device+0x13/0x20

The reason we are dying is because the call acpi_get_override_irq() is used, which returns the polarity and trigger for the IRQs. That function calls mp_find_ioapics to get the 'struct ioapic' structure - which along with the mp_irq[x] is used to figure out the default values and the polarity/trigger overrides. Since the mp_find_ioapics now returns -1 [because the IOAPIC is filled with 0xffffffff], the acpi_get_override_irq() stops trying to look up in the mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt.

The proper fix for this is going in v3.5 and adds an x86_io_apic_ops struct so that platforms can override it. But for v3.4 let's carry this work-around. This patch does that by providing a slightly different variant of the fake IOAPIC entries.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06xen: only check xen_platform_pci_unplug if hvmIgor Mammedov
Commit b9136d207f08 ("xen: initialize platform-pci even if xen_emul_unplug=never") breaks blkfront/netfront by not loading them, because xen_platform_pci_unplug is 0 and is never set for a PV guest. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-06net: fix a race in sock_queue_err_skb()Eric Dumazet
As soon as an skb is queued into socket error queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06netlink: fix races after skb queueingEric Dumazet
As soon as an skb is queued into socket receive_queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
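The rule behind this commit and the sock_queue_err_skb() fix before it (read whatever you need from the skb before queueing it, never after) can be sketched in user space. Everything here is hypothetical: `Skb`, `EagerQueue` and the two functions are Python stand-ins, and `EagerQueue` models the worst-case interleaving by consuming and freeing the buffer the moment it is queued.

```python
class Skb:
    """Hypothetical stand-in for a socket buffer."""

    def __init__(self, payload):
        self.payload = payload
        self.freed = False

    def length(self):
        if self.freed:
            raise RuntimeError("use after free")
        return len(self.payload)


class EagerQueue:
    """Queue whose consumer runs immediately, the worst-case interleaving."""

    def put(self, skb):
        skb.freed = True   # another thread consumed and freed the skb


def queue_buggy(q, skb):
    q.put(skb)
    return skb.length()    # touches skb after handing it over


def queue_fixed(q, skb):
    length = skb.length()  # read what we need first
    q.put(skb)             # ownership passes to the queue here
    return length
```

In the kernel the buggy pattern only fails when the consumer happens to win the race, which is what makes it hard to spot; the sketch makes the consumer always win.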
2012-04-06doc, net: Update ndo_start_xmit return type and valuesBen Hutchings
Commit dc1f8bf68b311b1537cb65893430b6796118498a ('netdev: change transmit to limited range type') changed the required return type and 9a1654ba0b50402a6bd03c7b0fe9b0200a5ea7b1 ('net: Optimize hard_start_xmit() return checking') changed the valid numerical return values. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Remove instruction to set net_device::trans_startBen Hutchings
Commit 08baf561083bc27a953aa087dd8a664bb2b88e8e ('net: txq_trans_update() helper') made it unnecessary for most drivers to set net_device::trans_start (or netdev_queue::trans_start). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>