aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-02-17Linux 3.7.9v3.7.9Greg Kroah-Hartman
2013-02-17efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameterSatoru Takeuchi
commit 1de63d60cd5b0d33a812efa455d5933bf1564a51 upstream. There was a serious problem in samsung-laptop that its platform driver is designed to run under BIOS and running under EFI can cause the machine to become bricked or can cause Machine Check Exceptions. Discussion about this problem: https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557 https://bugzilla.kernel.org/show_bug.cgi?id=47121 The patches to fix this problem: efi: Make 'efi_enabled' a function to query EFI facilities 83e68189745ad931c2afd45d8ee3303929233e7f samsung-laptop: Disable on EFI hardware e0094244e41c4d0c7ad69920681972fc45d8ce34 Unfortunately this problem comes back again if users specify "noefi" option. This parameter clears EFI_BOOT and that driver continues to run even if running under EFI. Refer to the document, this parameter should clear EFI_RUNTIME_SERVICES instead. Documentation/kernel-parameters.txt: =============================================================================== ... noefi [X86] Disable EFI runtime services support. ... =============================================================================== Documentation/x86/x86_64/uefi.txt: =============================================================================== ... - If some or all EFI runtime services don't work, you can try following kernel command line parameters to turn off some or all EFI runtime services. noefi turn off all EFI runtime services ... =============================================================================== Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17PCI/PM: Clean up PME state when removing a deviceRafael J. Wysocki
commit 249bfb83cf8ba658955f0245ac3981d941f746ee upstream. Devices are added to pci_pme_list when drivers use pci_enable_wake() or pci_wake_from_d3(), but they aren't removed from the list unless the driver explicitly disables wakeup. Many drivers never disable wakeup, so their devices remain on the list even after they are removed, e.g., via hotplug. A subsequent PME poll will oops when it tries to touch the device. This patch disables PME# on a device before removing it, which removes the device from pci_pme_list. This is safe even if the device never had PME# enabled. This oops can be triggered by unplugging a Thunderbolt ethernet adapter on a Macbook Pro, as reported by Daniel below. [bhelgaas: changelog] Reference: http://lkml.kernel.org/r/CAMVG2svG21yiM1wkH4_2pen2n+cr2-Zv7TbH3Gj+8MwevZjDbw@mail.gmail.com Reported-and-tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.Jan Beulich
commit 13d2b4d11d69a92574a55bfd985cfb0ca77aebdc upstream. This fixes CVE-2013-0228 / XSA-42 Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user in 32bit PV guest can use to crash the > guest with the panic like this: ------------- general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/vbd-51712/block/xvda/dev Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1 EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0 EIP is at xen_iret+0x12/0x2b EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010 ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069 Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000) Stack: 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000 Call Trace: Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02 EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0 general protection fault: 0000 [#2] ---[ end trace ab0d29a492dcd330 ]--- Kernel panic - not syncing: Fatal exception Pid: 1250, comm: r Tainted: G D --------------- 2.6.32-356.el6.i686 #1 Call Trace: [<c08476df>] ? panic+0x6e/0x122 [<c084b63c>] ? oops_end+0xbc/0xd0 [<c084b260>] ? do_general_protection+0x0/0x210 [<c084a9b7>] ? error_code+0x73/ ------------- Petr says: " I've analysed the bug and I think that xen_iret() cannot cope with mangled DS, in this case zeroed out (null selector/descriptor) by either xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT entry was invalidated by the reproducer. " Jan took a look at the preliminary patch and came up a fix that solves this problem: "This code gets called after all registers other than those handled by IRET got already restored, hence a null selector in %ds or a non-null one that got loaded from a code or read-only data descriptor would cause a kernel mode fault (with the potential of crashing the kernel as a whole, if panic_on_oops is set)." The way to fix this is to realize that the we can only relay on the registers that IRET restores. The two that are guaranteed are the %cs and %ss as they are always fixed GDT selectors. Also they are inaccessible from user mode - so they cannot be altered. This is the approach taken in this patch. Another alternative option suggested by Jan would be to relay on the subtle realization that using the %ebp or %esp relative references uses the %ss segment. In which case we could switch from using %eax to %ebp and would not need the %ss over-rides. That would also require one extra instruction to compensate for the one place where the register is used as scaled index. However Andrew pointed out that is too subtle and if further work was to be done in this code-path it could escape folks attention and lead to accidents. Reviewed-by: Petr Matousek <pmatouse@redhat.com> Reported-by: Petr Matousek <pmatouse@redhat.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17x86/mm: Check if PUD is large when validating a kernel addressMel Gorman
commit 0ee364eb316348ddf3e0dfcd986f5f13f528f821 upstream. A user reported the following oops when a backup process reads /proc/kcore: BUG: unable to handle kernel paging request at ffffbb00ff33b000 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110 [...] Call Trace: [<ffffffff811b8aaa>] read_kcore+0x17a/0x370 [<ffffffff811ad847>] proc_reg_read+0x77/0xc0 [<ffffffff81151687>] vfs_read+0xc7/0x130 [<ffffffff811517f3>] sys_read+0x53/0xa0 [<ffffffff81449692>] system_call_fastpath+0x16/0x1b Investigation determined that the bug triggered when reading system RAM at the 4G mark. On this system, that was the first address using 1G pages for the virt->phys direct mapping so the PUD is pointing to a physical address, not a PMD page. The problem is that the page table walker in kern_addr_valid() is not checking pud_large() and treats the physical address as if it was a PMD. If it happens to look like pmd_none then it'll silently fail, probably returning zeros instead of real data. If the data happens to look like a present PMD though, it will be walked resulting in the oops above. This patch adds the necessary pud_large() check. Unfortunately the problem was not readily reproducible and now they are running the backup program without accessing /proc/kcore so the patch has not been validated but I think it makes sense. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.coM> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systemsStoney Wang
commit cb214ede7657db458fd0b2a25ea0b28dbf900ebc upstream. When a HP ProLiant DL980 G7 Server boots a regular kernel, there will be intermittent lost interrupts which could result in a hang or (in extreme cases) data loss. The reason is that this system only supports x2apic physical mode, while the kernel boots with a logical-cluster default setting. This bug can be worked around by specifying the "x2apic_phys" or "nox2apic" boot option, but we want to handle this system without requiring manual workarounds. The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table. As all apicids are smaller than 255, BIOS need to pass the control to the OS with xapic mode, according to x2apic-spec, chapter 2.9. Current code handle x2apic when BIOS pass with xapic mode enabled: When user specifies x2apic_phys, or FADT indicates PHYSICAL: 1. During madt oem check, apic driver is set with xapic logical or xapic phys driver at first. 2. enable_IR_x2apic() will enable x2apic_mode. 3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe() will install the correct x2apic phys driver and use x2apic phys mode. Otherwise it will skip the driver will let x2apic_cluster_probe to take over to install x2apic cluster driver (wrong one) even though FADT indicates PHYSICAL, because x2apic_phys_probe does not check FADT PHYSICAL. Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the problem. Signed-off-by: Stoney Wang <song-bo.wang@hp.com> [ updated the changelog and simplified the code ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17x86: Do not leak kernel page mapping locationsKees Cook
commit e575a86fdc50d013bf3ad3aa81d9100e8e6cc60d upstream. Without this patch, it is trivial to determine kernel page mappings by examining the error code reported to dmesg[1]. Instead, declare the entire kernel memory space as a violation of a present page. Additionally, since show_unhandled_signals is enabled by default, switch branch hinting to the more realistic expectation, and unobfuscate the setting of the PF_PROT bit to improve readability. [1] http://vulnfactory.org/blog/2013/02/06/a-linux-memory-trick/ Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Suggested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130207174413.GA12485@www.outflux.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17s390/timer: avoid overflow when programming clock comparatorHeiko Carstens
commit d911e03d097bdc01363df5d81c43f69432eb785c upstream. Since ed4f209 "s390/time: fix sched_clock() overflow" a new helper function is used to avoid overflows when converting TOD format values to nanosecond values. The kvm interrupt code formerly however only worked by accident because of an overflow. It tried to program a timer that would expire in more than ~29 years. Because of the old TOD-to-nanoseconds overflow bug the real expiry value however was much smaller, but now it isn't anymore. This however triggers yet another bug in the function that programs the clock comparator s390_next_ktime(): if the absolute "expires" value is after 2042 this will result in an overflow and the programmed value is lower than the current TOD value which immediatly triggers a clock comparator (= timer) interrupt. Since the timer isn't expired it will be programmed immediately again and so on... the result is a dead system. To fix this simply program the maximum possible value if an overflow is detected. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17mm: don't overwrite mm->def_flags in do_mlockall()Gerald Schaefer
commit 9977f0f164d46613288e0b5778eae500dfe06f31 upstream. With commit 8e72033f2a48 ("thp: make MADV_HUGEPAGE check for mm->def_flags") the VM_NOHUGEPAGE flag may be set on s390 in mm->def_flags for certain processes, to prevent future thp mappings. This would be overwritten by do_mlockall(), which sets it back to 0 with an optional VM_LOCKED flag set. To fix this, instead of overwriting mm->def_flags in do_mlockall(), only the VM_LOCKED flag should be set or cleared. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17drivers/rtc/rtc-pl031.c: restore ST variant functionalityLinus Walleij
commit 3399cfb5df9594495b876d1843a7165f77366b2b upstream. Commit e7e034e18a0a ("drivers/rtc/rtc-pl031.c: fix the missing operation on enable") accidentally broke the ST variants of PL031. The bit that is being poked as "clockwatch" enable bit for the ST variants does the work of bit 0 on this variant. Bit 0 is used for a clock divider on the ST variants, and setting it to 1 will affect timekeeping in a very bad way. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> Cc: Mian Yousaf KAUKAB <mian.yousaf.kaukab@stericsson.com> Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-17Revert: xfs: fix _xfs_buf_find oops on blocks beyond the filesystem endGreg Kroah-Hartman
This reverts commit a56040731e5b00081c6d6c26b99e6e257a5d63d7 which was commit eb178619f930fa2ba2348de332a1ff1c66a31424 upstream. It has been reported to cause problems: http://bugzilla.redhat.com/show_bug.cgi?id=909602 Acked-by: Ben Myers <bpm@sgi.com> Acked-by: Dave Chinner <dchinner@redhat.com> Cc: Brian Foster <bfoster@redhat.com> Cc: CAI Qian <caiqian@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14Linux 3.7.8v3.7.8Greg Kroah-Hartman
2013-02-14drm/nouveau: add lockdep annotationsMarcin Slusarz
commit 5f97ab913cf0fbc378ea8ffc3ee66f4890d11c55 upstream. 1) Lockdep thinks all nouveau subdevs belong to the same class and can be locked in arbitrary order, which is not true (at least in general case). Tell it to distinguish subdevs by (o)class type. 2) DRM client can be locked under user client lock - tell lockdep to put DRM client lock in a separate class. Reported-by: Arend van Spriel <arend@broadcom.com> Reported-by: Peter Hurley <peter@hurleysoftware.com> Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reported-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: splice: fix __splice_segment()Eric Dumazet
[ Upstream commit bc9540c637c3d8712ccbf9dcf28621f380ed5e64 ] commit 9ca1b22d6d2 (net: splice: avoid high order page splitting) forgot that skb->head could need a copy into several page frags. This could be the case for loopback traffic mostly. Also remove now useless skb argument from linear_to_page() and __splice_segment() prototypes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: splice: avoid high order page splittingEric Dumazet
[ Upstream commit 82bda6195615891181115f579a480aa5001ce7e9 ] splice() can handle pages of any order, but network code tries hard to split them in PAGE_SIZE units. Not quite successfully anyway, as __splice_segment() assumed poff < PAGE_SIZE. This is true for the skb->data part, not necessarily for the fragments. This patch removes this logic to give the pages as they are in the skb. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix splice() and tcp collapsing interactionEric Dumazet
[ Upstream commit f26845b43c75d3f32f98d194c1327b5b1e6b3fb0 ] Under unusual circumstances, TCP collapse can split a big GRO TCP packet while its being used in a splice(socket->pipe) operation. skb_splice_bits() releases the socket lock before calling splice_to_pipe(). [ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc() [ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]- [ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13 To fix this problem, we must eat skbs in tcp_recv_skb(). Remove the inline keyword from tcp_recv_skb() definition since it has three call sites. Reported-by: Christian Becker <c.becker@traviangames.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: splice: fix an infinite loop in tcp_read_sock()Eric Dumazet
[ Upstream commit ff905b1e4aad8ccbbb0d42f7137f19482742ff07 ] commit 02275a2ee7c0 (tcp: don't abort splice() after small transfers) added a regression. [ 83.843570] INFO: rcu_sched self-detected stall on CPU [ 83.844575] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 0, t=21002 jiffies, g=4457, c=4456, q=13132) [ 83.844582] Task dump for CPU 6: [ 83.844584] netperf R running task 0 8966 8952 0x0000000c [ 83.844587] 0000000000000000 0000000000000006 0000000000006c6c 0000000000000000 [ 83.844589] 000000000000006c 0000000000000096 ffffffff819ce2bc ffffffffffffff10 [ 83.844592] ffffffff81088679 0000000000000010 0000000000000246 ffff880c4b9ddcd8 [ 83.844594] Call Trace: [ 83.844596] [<ffffffff81088679>] ? vprintk_emit+0x1c9/0x4c0 [ 83.844601] [<ffffffff815ad449>] ? schedule+0x29/0x70 [ 83.844606] [<ffffffff81537bd2>] ? tcp_splice_data_recv+0x42/0x50 [ 83.844610] [<ffffffff8153beaa>] ? tcp_read_sock+0xda/0x260 [ 83.844613] [<ffffffff81537b90>] ? tcp_prequeue_process+0xb0/0xb0 [ 83.844615] [<ffffffff8153c0f0>] ? tcp_splice_read+0xc0/0x250 [ 83.844618] [<ffffffff814dc0c2>] ? sock_splice_read+0x22/0x30 [ 83.844622] [<ffffffff811b820b>] ? do_splice_to+0x7b/0xa0 [ 83.844627] [<ffffffff811ba4bc>] ? sys_splice+0x59c/0x5d0 [ 83.844630] [<ffffffff8119745b>] ? putname+0x2b/0x40 [ 83.844633] [<ffffffff8118bcb4>] ? do_sys_open+0x174/0x1e0 [ 83.844636] [<ffffffff815b6202>] ? system_call_fastpath+0x16/0x1b if recv_actor() returns 0, we should stop immediately, because looping wont give a chance to drain the pipe. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: don't abort splice() after small transfersWilly Tarreau
[ Upstream commit 02275a2ee7c0ea475b6f4a6428f5df592bc9d30b ] TCP coalescing added a regression in splice(socket->pipe) performance, for some workloads because of the way tcp_read_sock() is implemented. The reason for this is the break when (offset + 1 != skb->len). As we released the socket lock, this condition is possible if TCP stack added a fragment to the skb, which can happen with TCP coalescing. So let's go back to the beginning of the loop when this happens, to give a chance to splice more frags per system call. Doing so fixes the issue and makes GRO 10% faster than LRO on CPU-bound splice() workloads instead of the opposite. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix for zero packets_in_flight was too broadIlpo Järvinen
[ Upstream commit 6731d2095bd4aef18027c72ef845ab1087c3ba63 ] There are transients during normal FRTO procedure during which the packets_in_flight can go to zero between write_queue state updates and firing the resulting segments out. As FRTO processing occurs during that window the check must be more precise to not match "spuriously" :-). More specificly, e.g., when packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic branch that set cwnd into zero would not be taken and new segments might be sent out later. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: frto should not set snd_cwnd to 0Eric Dumazet
[ Upstream commit 2e5f421211ff76c17130b4597bc06df4eeead24f ] Commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()) uncovered a bug in FRTO code : tcp_process_frto() is setting snd_cwnd to 0 if the number of in flight packets is 0. As Neal pointed out, if no packet is in flight we lost our chance to disambiguate whether a loss timeout was spurious. We should assume it was a proper loss. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix an infinite loop in tcp_slow_start()Eric Dumazet
[ Upstream commit 973ec449bb4f2b8c514bacbcb4d9506fc31c8ce3 ] Since commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start()), a nul snd_cwnd triggers an infinite loop in tcp_slow_start() Avoid this infinite loop and log a one time error for further analysis. FRTO code is suspected to cause this bug. Reported-by: Pasi Kärkkäinen <pasik@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: detect SYN/data drop when F-RTO is disabledYuchung Cheng
[ Upstream commit 66555e92fb7a619188c02cceae4bbc414f15f96d ] On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN retransmission to detect SYN/data drops. But if F-RTO is disabled, icsk_retransmit is reset at step D of tcp_fastretrans_alert() ( under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use total_retrans instead which accounts for SYN retransmission regardless the use of F-RTO. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: sctp: sctp_endpoint_free: zero out secret key dataDaniel Borkmann
[ Upstream commit b5c37fe6e24eec194bb29d22fdd55d73bcc709bf ] On sctp_endpoint_destroy, previously used sensitive keying material should be zeroed out before the memory is returned, as we already do with e.g. auth keys when released. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfreeDaniel Borkmann
[ Upstream commit 6ba542a291a5e558603ac51cda9bded347ce7627 ] In sctp_setsockopt_auth_key, we create a temporary copy of the user passed shared auth key for the endpoint or association and after internal setup, we free it right away. Since it's sensitive data, we should zero out the key before returning the memory back to the allocator. Thus, use kzfree instead of kfree, just as we do in sctp_auth_key_put(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14sctp: refactor sctp_outq_teardown to insure proper re-initalizationNeil Horman
[ Upstream commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86 ] Jamie Parsons reported a problem recently, in which the re-initalization of an association (The duplicate init case), resulted in a loss of receive window space. He tracked down the root cause to sctp_outq_teardown, which discarded all the data on an outq during a re-initalization of the corresponding association, but never reset the outq->outstanding_data field to zero. I wrote, and he tested this fix, which does a proper full re-initalization of the outq, fixing this problem, and hopefully future proofing us from simmilar issues down the road. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> Tested-by: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Jamie Parsons <Jamie.Parsons@metaswitch.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Fix route refcount on pmtu discoverySteffen Klassert
[ Upstream commit b44108dbdbaa07c609bb5755e8dd6c2035236251 ] git commit 9cb3a50c (ipv4: Invalidate the socket cached route on pmtu events if possible) introduced a refcount problem. We don't get a refcount on the route if we get it from__sk_dst_get(), but we need one if we want to reuse this route because __sk_dst_set() releases the refcount of the old route. This patch adds proper refcount handling for that case. We introduce a 'new' flag to indicate that we are going to use a new route and we release the old route only if we replace it by a new one. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Add a socket release callback for datagram socketsSteffen Klassert
[ Upstream commit 8141ed9fcedb278f4a3a78680591bef1e55f75fb ] This implements a socket release callback function to check if the socket cached route got invalid during the time we owned the socket. The function is used from udp, raw and ping sockets. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Invalidate the socket cached route on pmtu events if possibleSteffen Klassert
[ Upstream commit 9cb3a50c5f63ed745702972f66eaee8767659acd ] The route lookup in ipv4_sk_update_pmtu() might return a route different from the route we cached at the socket. This is because standart routes are per cpu, so each cpu has it's own struct rtable. This means that we do not invalidate the socket cached route if the NET_RX_SOFTIRQ is not served by the same cpu that the sending socket uses. As a result, the cached route reused until we disconnect. With this patch we invalidate the socket cached route if possible. If the socket is owened by the user, we can't update the cached route directly. A followup patch will implement socket release callback functions for datagram sockets to handle this case. Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: Add an error handler for icmp6Steffen Klassert
[ Upstream commit 6f809da27c94425e07be4a64d5093e1df95188e9 ] pmtu and redirect events are now handled in the protocols error handler, so add an error handler for icmp6 to do this. It is needed in the case when we have no socket context. Based on a patch by Duan Jiong. Reported-by: Duan Jiong <djduanjiong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Don't update the pmtu on mtu locked routesSteffen Klassert
[ Upstream commit fa1e492aa3cbafba9f8fc6d05e5b08a3091daf4a ] Routes with locked mtu should not use learned pmtu informations, so do not update the pmtu on these routes. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv4: Remove output route check in ipv4_mtuSteffen Klassert
[ Upstream commit 38d523e2948162776903349c89d65f7b9370dadb ] The output route check was introduced with git commit 261663b0 (ipv4: Don't use the cached pmtu informations for input routes) during times when we cached the pmtu informations on the inetpeer. Now the pmtu informations are back in the routes, so this check is obsolete. It also had some unwanted side effects, as reported by Timo Teras and Lukas Tribus. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14netback: correct netbk_tx_err to handle wrap around.Ian Campbell
[ Upstream commit b9149729ebdcfce63f853aa54a404c6a8f6ebbf3 ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: free already allocated memory on failure in xen_netbk_get_requestsIan Campbell
[ Upstream commit 4cc7c1cb7b11b6f3515bd9075527576a1eecc4aa ] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.Matthew Daley
[ Upstream commit 7d5145d8eb2b9791533ffe4dc003b129b9696c48 ] Signed-off-by: Matthew Daley <mattjd@gmail.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14xen/netback: shutdown the ring if it contains garbage.Ian Campbell
[ Upstream commit 48856286b64e4b66ec62b94e504d0b29c1ade664 ] A buggy or malicious frontend should not be able to confuse netback. If we spot anything which is not as it should be then shutdown the device and don't try to continue with the ring in a potentially hostile state. Well behaved and non-hostile frontends will not be penalised. As well as making the existing checks for such errors fatal also add a new check that ensures that there isn't an insane number of requests on the ring (i.e. more than would fit in the ring). If the ring contains garbage then previously is was possible to loop over this insane number, getting an error each time and therefore not generating any more pending requests and therefore not exiting the loop in xen_netbk_tx_build_gops for an externded period. Also turn various netdev_dbg calls which no precipitate a fatal error into netdev_err, they are rate limited because the device is shutdown afterwards. This fixes at least one known DoS/softlockup of the backend domain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14atm/iphase: rename fregt_t -> ffreg_tHeiko Carstens
[ Upstream commit ab54ee80aa7585f9666ff4dd665441d7ce41f1e8 ] We have conflicting type qualifiers for "freg_t" in s390's ptrace.h and the iphase atm device driver, which causes the compile error below. Unfortunately the s390 typedef can't be renamed, since it's a user visible api, nor can I change the include order in s390 code to avoid the conflict. So simply rename the iphase typedef to a new name. Fixes this compile error: In file included from drivers/atm/iphase.c:66:0: drivers/atm/iphase.h:639:25: error: conflicting type qualifiers for 'freg_t' In file included from next/arch/s390/include/asm/ptrace.h:9:0, from next/arch/s390/include/asm/lowcore.h:12, from next/arch/s390/include/asm/thread_info.h:30, from include/linux/thread_info.h:54, from include/linux/preempt.h:9, from include/linux/spinlock.h:50, from include/linux/seqlock.h:29, from include/linux/time.h:5, from include/linux/stat.h:18, from include/linux/module.h:10, from drivers/atm/iphase.c:43: next/arch/s390/include/uapi/asm/ptrace.h:197:3: note: previous declaration of 'freg_t' was here Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()Tommi Rantala
[ Upstream commit 41ab3e31bd50b42c85ac0aa0469642866aee2a9a ] ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch, and the -1 return value from this function is bogus. Use the error handling we already have in place in ip6gre_tunnel_xmit() for this error case to fix this. Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14packet: fix leakage of tx_ring memoryPhil Sutter
[ Upstream commit 9665d5d62487e8e7b1f546c00e11107155384b9a ] When releasing a packet socket, the routine packet_set_ring() is reused to free rings instead of allocating them. But when calling it for the first time, it fills req->tp_block_nr with the value of rb->pg_vec_len which in the second invocation makes it bail out since req->tp_block_nr is greater zero but req->tp_block_size is zero. This patch solves the problem by passing a zeroed auto-variable to packet_set_ring() upon each invocation from packet_release(). As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING and packet mmap), i.e. the original inclusion of TX ring support into af_packet, but applies only to sockets with both RX and TX ring allocated, which is probably why this was unnoticed all the time. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> Cc: Johann Baudy <johann.baudy@gnu-log.net> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14via-rhine: Fix bugs in NAPI support.David S. Miller
[ Upstream commit 559bcac35facfed49ab4f408e162971612dcfdf3 ] 1) rhine_tx() should use dev_kfree_skb() not dev_kfree_skb_irq() 2) rhine_slow_event_task's NAPI triggering logic is racey, it should just hit the interrupt mask register. This is the same as commit 7dbb491878a2c51d372a8890fa45a8ff80358af1 ("r8169: avoid NAPI scheduling delay.") made to fix the same problem in the r8169 driver. From Francois Romieu. Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com> Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14ipv6: do not create neighbor entries for local deliveryMarcelo Ricardo Leitner
[ Upstream commit bd30e947207e2ea0ff2c08f5b4a03025ddce48d3 ] They will be created at output, if ever needed. This avoids creating empty neighbor entries when TPROXYing/Forwarding packets for addresses that are not even directly reachable. Note that IPv4 already handles it this way. No neighbor entries are created for local input. Tested by myself and customer. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14pktgen: correctly handle failures when adding a deviceCong Wang
[ Upstream commit 604dfd6efc9b79bce432f2394791708d8e8f6efc ] The return value of pktgen_add_device() is not checked, so even if we fail to add some device, for example, non-exist one, we still see "OK:...". This patch fixes it. After this patch, I got: # echo "add_device non-exist" > /proc/net/pktgen/kpktgend_0 -bash: echo: write error: No such device # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: Result: ERROR: can not add device non-exist # echo "add_device eth0" > /proc/net/pktgen/kpktgend_0 # cat /proc/net/pktgen/kpktgend_0 Running: Stopped: eth0 Result: OK: add_device=eth0 (Candidate for -stable) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14IP_GRE: Fix kernel panic in IP_GRE with GRE csum.Pravin B Shelar
[ Upstream commit 5465740ace36f179de5bb0ccb5d46ddeb945e309 ] Due to IP_GRE GSO support, GRE can recieve non linear skb which results in panic in case of GRE_CSUM. Following patch fixes it by using correct csum API. Bug introduced in commit 6b78f16e4bdde3936b (gre: add GSO support) Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: loopback: fix a dst refcounting issueEric Dumazet
[ Upstream commit 794ed393b707f01858f5ebe2ae5eabaf89d00022 ] Ben Greear reported crashes in ip_rcv_finish() on a stress test involving many macvlans. We tracked the bug to a dst use after free. ip_rcv_finish() was calling dst->input() and got garbage for dst->input value. It appears the bug is in loopback driver, lacking a skb_dst_force() before calling netif_rx(). As a result, a non refcounted dst, normally protected by a RCU read_lock section, was escaping this section and could be freed before the packet being processed. [<ffffffff813a3c4d>] loopback_xmit+0x64/0x83 [<ffffffff81477364>] dev_hard_start_xmit+0x26c/0x35e [<ffffffff8147771a>] dev_queue_xmit+0x2c4/0x37c [<ffffffff81477456>] ? dev_hard_start_xmit+0x35e/0x35e [<ffffffff8148cfa6>] ? eth_header+0x28/0xb6 [<ffffffff81480f09>] neigh_resolve_output+0x176/0x1a7 [<ffffffff814ad835>] ip_finish_output2+0x297/0x30d [<ffffffff814ad6d5>] ? ip_finish_output2+0x137/0x30d [<ffffffff814ad90e>] ip_finish_output+0x63/0x68 [<ffffffff814ae412>] ip_output+0x61/0x67 [<ffffffff814ab904>] dst_output+0x17/0x1b [<ffffffff814adb6d>] ip_local_out+0x1e/0x23 [<ffffffff814ae1c4>] ip_queue_xmit+0x315/0x353 [<ffffffff814adeaf>] ? ip_send_unicast_reply+0x2cc/0x2cc [<ffffffff814c018f>] tcp_transmit_skb+0x7ca/0x80b [<ffffffff814c3571>] tcp_connect+0x53c/0x587 [<ffffffff810c2f0c>] ? getnstimeofday+0x44/0x7d [<ffffffff810c2f56>] ? ktime_get_real+0x11/0x3e [<ffffffff814c6f9b>] tcp_v4_connect+0x3c2/0x431 [<ffffffff814d6913>] __inet_stream_connect+0x84/0x287 [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49 [<ffffffff8108d695>] ? _local_bh_enable_ip+0x84/0x9f [<ffffffff8108d6c8>] ? local_bh_enable+0xd/0x11 [<ffffffff8146763c>] ? lock_sock_nested+0x6e/0x79 [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49 [<ffffffff814d6b49>] inet_stream_connect+0x33/0x49 [<ffffffff814632c6>] sys_connect+0x75/0x98 This bug was introduced in linux-2.6.35, in commit 7fee226ad2397b (net: add a noref bit on skb dst) skb_dst_force() is enforced in dev_queue_xmit() for devices having a qdisc. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14r8169: remove the obsolete and incorrect AMD workaroundTimo Teräs
[ Upstream commit 5d0feaff230c0abfe4a112e6f09f096ed99e0b2d ] This was introduced in commit 6dccd16 "r8169: merge with version 6.001.00 of Realtek's r8169 driver". I did not find the version 6.001.00 online, but in 6.002.00 or any later r8169 from Realtek this hunk is no longer present. Also commit 05af214 "r8169: fix Ethernet Hangup for RTL8110SC rev d" claims to have fixed this issue otherwise. The magic compare mask of 0xfffe000 is dubious as it masks parts of the Reserved part, and parts of the VLAN tag. But this does not make much sense as the VLAN tag parts are perfectly valid there. In matter of fact this seems to be triggered with any VLAN tagged packet as RxVlanTag bit is matched. I would suspect 0xfffe0000 was intended to test reserved part only. Finally, this hunk is evil as it can cause more packets to be handled than what was NAPI quota causing net/core/dev.c: net_rx_action(): WARN_ON_ONCE(work > weight) to trigger, and mess up the NAPI state causing device to hang. As result, any system using VLANs and having high receive traffic (so that NAPI poll budget limits rtl_rx) would result in device hang. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14netxen: fix off by one bug in netxen_release_tx_buffer()Eric Dumazet
[ Upstream commit a05948f296ce103989b28a2606e47d2e287c3c89 ] Christoph Paasch found netxen could trigger a BUG in its dismantle phase, in netxen_release_tx_buffer(), using full size TSO packets. cmd_buf->frag_count includes the skb->data part, so the loop must start at index 1 instead of 0, or else we can make an out of bound access to cmd_buff->frag_array[MAX_SKB_FRAGS + 2] Christoph provided the fixes in netxen_map_tx_skb() function. In case of a dma mapping error, its better to clear the dma fields so that we don't try to unmap them again in netxen_release_tx_buffer() Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Christoph Paasch <christoph.paasch@uclouvain.be> Cc: Sony Chacko <sony.chacko@qlogic.com> Cc: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14isdn/gigaset: fix zero size border case in debug dumpTilman Schmidt
[ Upstream commit d721a1752ba544df8d7d36959038b26bc92bdf80 ] If subtracting 12 from l leaves zero we'd do a zero size allocation, leading to an oops later when we try to set the NUL terminator. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14tcp: fix incorrect LOCKDROPPEDICMPS counterEric Dumazet
[ Upstream commit b74aa930ef49a3c0d8e4c1987f89decac768fb2c ] commit 563d34d057 (tcp: dont drop MTU reduction indications) added an error leading to incorrect accounting of LINUX_MIB_LOCKDROPPEDICMPS If socket is owned by the user, we want to increment this SNMP counter, unless the message is a (ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one. Reported-by: Maciej ¯enczykowski <maze@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej ¯enczykowski <maze@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaultsOr Gerlitz
[ Upstream commit ca4c7b35f75492de7fbf5ee95be07481c348caee ] The lines if (mlx4_is_mfunc(dev)) { nreq = 2; } else { which hard code the number of requested msi-x vectors under multi-function mode to two can be removed completely, since the firmware sets num_eqs and reserved_eqs appropriately Thus, the code line: nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq); is by itself sufficient and correct for all cases. Currently, for mfunc mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled. This triples (one vector is used for the async events and commands EQ) the horse power provided for processing of incoming packets on netdev RSS scheme, IO initiators/targets commands processing flows, etc. Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV modeYan Burman
[ Upstream commit 213815a1e6ae70b9648483b110bc5081795f99e8 ] Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on same host" introduced a regression under which a bridge acting as vSwitch whose uplink is an mlx4 Ethernet device become non-operative in native (non sriov) mode. This happens since broadcast ARP requests sent by VMs were loopback-ed by the HW and hence the bridge learned VM source MACs on both the VM and the uplink ports. The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch configuration or when the device is in selftest. Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-14net: calxedaxgmac: throw away overrun framesRob Herring
[ Upstream commit d6fb3be544b46a7611a3373fcaa62b5b0be01888 ] The xgmac driver assumes 1 frame per descriptor. If a frame larger than the descriptor's buffer size is received, the frame will spill over into the next descriptor. So check for received frames that span more than one descriptor and discard them. This prevents a crash if we receive erroneous large packets. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>