aboutsummaryrefslogtreecommitdiff
path: root/net/netfilter
AgeCommit message (Collapse)Author
2013-07-01netfilter: xt_qtaguid: 3.10 fixesArve Hjønnevåg
Stop using obsolete procfs api. Signed-off-by: Arve Hjønnevåg <arve@android.com>
2013-07-01netfilter: xt_quota2: 3.10 fixes.Arve Hjønnevåg
- Stop using obsolete create_proc_entry api. - Use proc_set_user instead of directly accessing the private structure. Signed-off-by: Arve Hjønnevåg <arve@android.com>
2013-07-01netfilter: xt_qtaguid: fix bad tcp_time_wait sock handlingJP Abgrall
Since (41063e9 ipv4: Early TCP socket demux), skb's can have an sk which is not a struct sock but the smaller struct inet_timewait_sock without an sk->sk_socket. Now we bypass sk_state == TCP_TIME_WAIT Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: qtaguid: rate limit some of the printksJP Abgrall
Some of the printks are in the packet handling path. We now ratelimit the very unlikely errors to avoid kmsg spamming. Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_qtaguid: Allow tracking loopbackJP Abgrall
In the past it would always ignore interfaces with loopback addresses. Now we just treat them like any other. This also helps with writing tests that check for the presence of the qtaguid module. Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_qtaguid: extend iface stat to report protocolsJP Abgrall
In the past the iface_stat_fmt would only show global bytes/packets for the skb-based numbers. For stall detection in userspace, distinguishing tcp vs other protocols makes it easier. Now we report ifname total_skb_rx_bytes total_skb_rx_packets total_skb_tx_bytes total_skb_tx_packets {rx,tx}_{tcp,udp,ohter}_{bytes,packets} Bug: 6818637 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_qtaguid: remove AID_* dependency for access controlJP Abgrall
qtaguid limits what can be done with /ctrl and /stats based on group membership. This changes removes AID_NET_BW_STATS and AID_NET_BW_ACCT, and picks up the groups from the gid of the matching proc entry files. Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: I42e477adde78a12ed5eb58fbc0b277cdaadb6f94
2013-07-01netfilter: qtaguid: Don't BUG_ON if create_if_tag_stat failsPontus Fuchs
If create_if_tag_stat fails to allocate memory (GFP_ATOMIC) the following will happen: qtaguid: iface_stat: tag stat alloc failed ... kernel BUG at xt_qtaguid.c:1482! Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
2013-07-01netfilter: xt_qtaguid: fix error exit that would keep a spinlock.JP Abgrall
qtudev_open() could return with a uid_tag_data_tree_lock held when an kzalloc(..., GFP_ATOMIC) would fail. Very unlikely to get triggered AND survive the mayhem of running out of mem. Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_qtaguid: report only uid tags to non-privileged processesJP Abgrall
In the past, a process could only see its own stats (uid-based summary, and details). Now we allow any process to see other UIDs uid-based stats, but still hide the detailed stats. Change-Id: I7666961ed244ac1d9359c339b048799e5db9facc Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_IDLETIMER: Rename INTERFACE to LABEL in netlink notification.Ashish Sharma
Change-Id: Iaeca5dd2d7878c0733923ae03309a2a7b86979ca Signed-off-by: Ashish Sharma <ashishsharma@google.com>
2013-07-01netfilter: xt_qtaguid: start tracking iface rx/tx at low levelJP Abgrall
qtaguid tracks the device stats by monitoring when it goes up and down, then it gets the dev_stats(). But devs don't correctly report stats (either they don't count headers symmetrically between rx/tx, or they count internal control messages). Now qtaguid counts the rx/tx bytes/packets during raw:prerouting and mangle:postrouting (nat is not available in ipv6). The results are in /proc/net/xt_qtaguid/iface_stat_fmt which outputs a format line (bash expansion): ifname total_skb_{rx,tx}_{bytes,packets} Added event counters for pre/post handling. Added extra ctrl_*() pid/uid debugging. Change-Id: Id84345d544ad1dd5f63e3842cab229e71d339297 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_IDLETIMER: Add new netlink msg typeJP Abgrall
Send notifications when the label becomes active after an idle period. Send netlink message notifications in addition to sysfs notifications. Using a uevent with subsystem=xt_idletimer INTERFACE=... STATE={active,inactive} This is backport from common android-3.0 commit: beb914e987cbbd368988d2b94a6661cb907c4d5a with uevent support instead of a new netlink message type. Change-Id: I31677ef00c94b5f82c8457e5bf9e5e584c23c523 Signed-off-by: Ashish Sharma <ashishsharma@google.com> Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: xt_qtaguid: fix ipv6 protocol lookupJP Abgrall
When updating the stats for a given uid it would incorrectly assume IPV4 and pick up the wrong protocol when IPV6. Change-Id: Iea4a635012b4123bf7aa93809011b7b2040bb3d5 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: qtaguid: initialize a local var to keep compiler happy.JP Abgrall
There was a case that might have seemed like new_tag_stat was not initialized and actually used. Added comment explaining why it was impossible, and a BUG() in case the logic gets changed. Change-Id: I1eddd1b6f754c08a3bf89f7e9427e5dce1dfb081 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: fixup the quota2, and enable.JP Abgrall
The xt_quota2 came from http://sourceforge.net/projects/xtables-addons/develop It needed tweaking for it to compile within the kernel tree. Fixed kmalloc() and create_proc_entry() invocations within a non-interruptible context. Removed useless copying of current quota back to the iptable's struct matchinfo: - those are per CPU: they will change randomly based on which cpu gets to update the value. - they prevent matching a rule: e.g. -A chain -m quota2 --name q1 --quota 123 can't be followed by -D chain -m quota2 --name q1 --quota 123 as the 123 will be compared to the struct matchinfo's quota member. Use the NETLINK NETLINK_NFLOG family to log a single message when the quota limit is reached. It uses the same packet type as ipt_ULOG, but - never copies skb data, - uses 112 as the event number (ULOG's +1) It doesn't log if the module param "event_num" is 0. Change-Id: I021d3b743db3b22158cc49acb5c94d905b501492 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: adding the original quota2 from xtables-addonsJP Abgrall
The original xt_quota in the kernel is plain broken: - counts quota at a per CPU level (was written back when ubiquitous SMP was just a dream) - provides no way to count across IPV4/IPV6. This patch is the original unaltered code from: http://sourceforge.net/projects/xtables-addons at commit e84391ce665cef046967f796dd91026851d6bbf3 Change-Id: I19d49858840effee9ecf6cff03c23b45a97efdeb Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01netfilter: add xt_qtaguid matching moduleJP Abgrall
This module allows tracking stats at the socket level for given UIDs. It replaces xt_owner. If the --uid-owner is not specified, it will just count stats based on who the skb belongs to. This will even happen on incoming skbs as it looks into the skb via xt_socket magic to see who owns it. If an skb is lost, it will be assigned to uid=0. To control what sockets of what UIDs are tagged by what, one uses: echo t $sock_fd $accounting_tag $the_billed_uid \ > /proc/net/xt_qtaguid/ctrl So whenever an skb belongs to a sock_fd, it will be accounted against $the_billed_uid and matching stats will show up under the uid with the given $accounting_tag. Because the number of allocations for the stats structs is not that big: ~500 apps * 32 per app we'll just do it atomic. This avoids walking lists many times, and the fancy worker thread handling. Slabs will grow when needed later. It use netdevice and inetaddr notifications instead of hooks in the core dev code to track when a device comes and goes. This removes the need for exposed iface_stat.h. Put procfs dirs in /proc/net/xt_qtaguid/ ctrl stats iface_stat/<iface>/... The uid stats are obtainable in ./stats. Change-Id: I01af4fd91c8de651668d3decb76d9bdc1e343919 Signed-off-by: JP Abgrall <jpa@google.com>
2013-07-01nf: xt_socket: export the fancy sock finder codeJP Abgrall
The socket matching function has some nifty logic to get the struct sock from the skb or from the connection tracker. We export this so other xt_* can use it, similarly to ho how xt_socket uses nf_tproxy_get_sock. Change-Id: I11c58f59087e7f7ae09e4abd4b937cd3370fa2fd Signed-off-by: JP Abgrall <jpa@google.com>
2013-06-24netfilter: ctnetlink: send event when conntrack label was modifiedFlorian Westphal
commit 0ceabd83875b72a29f33db4ab703d6ba40ea4c58 (netfilter: ctnetlink: deliver labels to userspace) sets the event bit when we raced with another packet, instead of raising the event bit when the label bit is set for the first time. commit 9b21f6a90924dfe8e5e686c314ddb441fb06501e (netfilter: ctnetlink: allow userspace to modify labels) forgot to update the event mask in the "conntrack already exists" case. Both issues result in CTA_LABELS attribute not getting included in the conntrack event. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-24netfilter: nf_nat_sip: fix manglingBalazs Peter Odor
In (b20ab9c netfilter: nf_ct_helper: better logging for dropped packets) there were some missing brackets around the logging information, thus always returning drop. Closes https://bugzilla.kernel.org/show_bug.cgi?id=60061 Signed-off-by: Balazs Peter Odor <balazs@obiserver.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-19ipvs: SCTP ports should be writable in ICMP packetsJulian Anastasov
Make sure that SCTP ports are writable when embedded in ICMP from client, so that ip_vs_nat_icmp can translate them safely. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-12netfilter: xt_TCPMSS: Fix missing fragmentation handlingPhil Oester
Similar to commit bc6bcb59 ("netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary"), add safe fragment handling to xt_TCPMSS. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-12netfilter: xt_TCPMSS: Fix IPv6 default MSS tooPhil Oester
As a followup to commit 409b545a ("netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS option"), John Heffner points out that IPv6 has a higher MTU than IPv4, and thus a higher minimum MSS. Update TCPMSS target to account for this, and update RFC comment. While at it, point to more recent reference RFC1122 instead of RFC879. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-11netfilter: xt_TCPOPTSTRIP: don't use tcp_hdr()Pablo Neira Ayuso
In (bc6bcb5 netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary), the use of tcp_hdr was introduced. However, we cannot assume that skb->transport_header is set for non-local packets. Cc: Florian Westphal <fw@strlen.de> Reported-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-10ipvs: info leak in __ip_vs_get_dest_entries()Dan Carpenter
The entry struct has a 2 byte hole after ->port and another 4 byte hole after ->stats.outpkts. You must have CAP_NET_ADMIN in your namespace to hit this information leak. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-07netfilter: nfnetlink_queue: fix missing HW protocolPablo Neira Ayuso
Locally generated IPv4 and IPv6 traffic gets skb->protocol unset, thus passing zero. ip6tables -I OUTPUT -j NFQUEUE libmnl/examples/netfilter# ./nf-queue 0 & ping6 ::1 packet received (id=1 hw=0x0000 hook=3) ^^^^^^ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS optionPhil Oester
The clamp-mss-to-pmtu option of the xt_TCPMSS target can cause issues connecting to websites if there was no MSS option present in the original SYN packet from the client. In these cases, it may add a MSS higher than the default specified in RFC879. Fix this by never setting a value > 536 if no MSS option was specified by the client. This closes netfilter's bugzilla #662. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05netfilter: nfnetlink_cttimeout: fix incomplete dumping of objectsPablo Neira Ayuso
Fix broken incomplete object dumping if the list of objects does not fit into one single netlink message. Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05netfilter: nfnetlink_acct: fix incomplete dumping of objectsPablo Neira Ayuso
Fix broken incomplete object dumping if the list of objects does not fit into one single netlink message. Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-29ipvs: ip_vs_sh: fix buildJan Beulich
kfree_rcu() requires offsetof(..., rcu_head) < 4096, which can get violated with a sufficiently high CONFIG_IP_VS_SH_TAB_BITS. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-29netfilter: xt_LOG: fix mark logging for IPv6 packetsMichal Kubeček
In dump_ipv6_packet(), the "recurse" parameter is zero only if dumping contents of a packet embedded into an ICMPv6 error message. Therefore we want to log packet mark if recurse is non-zero, not when it is zero. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-27ipvs: Fix reuse connection if real server is deadGrzegorz Lyczba
Expire cached connection for new TCP/SCTP connection if real server is down. Otherwise, IPVS uses the dead server for the reused connection, instead of a new working one. Signed-off-by: Grzegorz Lyczba <grzegorz.lyczba@gmail.com> Acked-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-23netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6Florian Westphal
Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812: [ ip6tables -m addrtype ] When I tried to use in the nat/PREROUTING it messes up the routing cache even if the rule didn't matched at all. [..] If I remove the --limit-iface-in from the non-working scenario, so just use the -m addrtype --dst-type LOCAL it works! This happens when LOCAL type matching is requested with --limit-iface-in, and the default ipv6 route is via the interface the packet we test arrived on. Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation creates an unwanted cached entry, and the packet won't make it to the real/expected destination. Silently ignoring --limit-iface-in makes the routing work but it breaks rule matching (--dst-type LOCAL with limit-iface-in is supposed to only match if the dst address is configured on the incoming interface; without --limit-iface-in it will match if the address is reachable via lo). The test should call ipv6_chk_addr() instead. However, this would add a link-time dependency on ipv6. There are two possible solutions: 1) Revert the commit that moved ipt_addrtype to xt_addrtype, and put ipv6 specific code into ip6t_addrtype. 2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions. While the former might seem preferable, Pablo pointed out that there are more xt modules with link-time dependeny issues regarding ipv6, so lets go for 2). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-16netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundaryPablo Neira Ayuso
This target assumes that tcph->doff is well-formed, that may be well not the case. Add extra sanity checkings to avoid possible crash due to read/write out of the real packet boundary. After this patch, the default action on malformed TCP packets is to drop them. Moreover, fragments are skipped. Reported-by: Rafal Kupka <rkupka@telemetry.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-15netfilter: log: netns NULL ptr bug when calling from conntrackHans Schillstrom
Since (69b34fb netfilter: xt_LOG: add net namespace support for xt_LOG), we hit this: [ 4224.708977] BUG: unable to handle kernel NULL pointer dereference at 0000000000000388 [ 4224.709074] IP: [<ffffffff8147f699>] ipt_log_packet+0x29/0x270 when callling log functions from conntrack both in and out are NULL i.e. the net pointer is invalid. Adding struct net *net in call to nf_logfn() will secure that there always is a vaild net ptr. Reported as netfilter's bugzilla bug 818: https://bugzilla.netfilter.org/show_bug.cgi?id=818 Reported-by: Ronald <ronald645@gmail.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-06netfilter: nf_{log,queue}: fix compilation without CONFIG_PROC_FSPablo Neira Ayuso
This patch fixes the following compilation error: net/netfilter/nf_log.c:373:38: error: 'struct netns_nf' has no member named 'proc_netfilter' if procfs is not set. The netns support for nf_log, nfnetlink_log and nfnetlink_queue_core requires CONFIG_PROC_FS in the removal path of their respective /proc interface since net->nf.proc_netfilter is undefined in that case. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-05-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...
2013-05-01proc: Supply PDE attribute setting accessor functionsDavid Howells
Supply accessor functions to set attributes in proc_dir_entry structs. The following are supplied: proc_set_size() and proc_set_user(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> cc: linuxppc-dev@lists.ozlabs.org cc: linux-media@vger.kernel.org cc: netdev@vger.kernel.org cc: linux-wireless@vger.kernel.org cc: linux-pci@vger.kernel.org cc: netfilter-devel@vger.kernel.org cc: alsa-devel@alsa-project.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-05-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights (1721 non-merge commits, this has to be a record of some sort): 1) Add 'random' mode to team driver, from Jiri Pirko and Eric Dumazet. 2) Make it so that any driver that supports configuration of multiple MAC addresses can provide the forwarding database add and del calls by providing a default implementation and hooking that up if the driver doesn't have an explicit set of handlers. From Vlad Yasevich. 3) Support GSO segmentation over tunnels and other encapsulating devices such as VXLAN, from Pravin B Shelar. 4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton. 5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita Dukkipati. 6) In the PHY layer, allow supporting wake-on-lan in situations where the PHY registers have to be written for it to be configured. Use it to support wake-on-lan in mv643xx_eth. From Michael Stapelberg. 7) Significantly improve firewire IPV6 support, from YOSHIFUJI Hideaki. 8) Allow multiple packets to be sent in a single transmission using network coding in batman-adv, from Martin Hundebøll. 9) Add support for T5 cxgb4 chips, from Santosh Rastapur. 10) Generalize the VXLAN forwarding tables so that there is more flexibility in configurating various aspects of the endpoints. From David Stevens. 11) Support RSS and TSO in hardware over GRE tunnels in bxn2x driver, from Dmitry Kravkov. 12) Zero copy support in nfnelink_queue, from Eric Dumazet and Pablo Neira Ayuso. 13) Start adding networking selftests. 14) In situations of overload on the same AF_PACKET fanout socket, or per-cpu packet receive queue, minimize drop by distributing the load to other cpus/fanouts. From Willem de Bruijn and Eric Dumazet. 15) Add support for new payload offset BPF instruction, from Daniel Borkmann. 16) Convert several drivers over to mdoule_platform_driver(), from Sachin Kamat. 17) Provide a minimal BPF JIT image disassembler userspace tool, from Daniel Borkmann. 18) Rewrite F-RTO implementation in TCP to match the final specification of it in RFC4138 and RFC5682. From Yuchung Cheng. 19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear you like netlink, so I implemented netlink dumping of netlink sockets.") From Andrey Vagin. 20) Remove ugly passing of rtnetlink attributes into rtnl_doit functions, from Thomas Graf. 21) Allow userspace to be able to see if a configuration change occurs in the middle of an address or device list dump, from Nicolas Dichtel. 22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes Frederic Sowa. 23) Increase accuracy of packet length used by packet scheduler, from Jason Wang. 24) Beginning set of changes to make ipv4/ipv6 fragment handling more scalable and less susceptible to overload and locking contention, from Jesper Dangaard Brouer. 25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*() instead. From Hong Zhiguo. 26) Optimize route usage in IPVS by avoiding reference counting where possible, from Julian Anastasov. 27) Convert IPVS schedulers to RCU, also from Julian Anastasov. 28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger Eitzenberger. 29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG, nfnetlink_log, and nfnetlink_queue. From Gao feng. 30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa. 31) Support several new r8169 chips, from Hayes Wang. 32) Support tokenized interface identifiers in ipv6, from Daniel Borkmann. 33) Use usbnet_link_change() helper in USB net driver, from Ming Lei. 34) Add 802.1ad vlan offload support, from Patrick McHardy. 35) Support mmap() based netlink communication, also from Patrick McHardy. 36) Support HW timestamping in mlx4 driver, from Amir Vadai. 37) Rationalize AF_PACKET packet timestamping when transmitting, from Willem de Bruijn and Daniel Borkmann. 38) Bring parity to what's provided by /proc/net/packet socket dumping and the info provided by netlink socket dumping of AF_PACKET sockets. From Nicolas Dichtel. 39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin Poirier" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) filter: fix va_list build error af_unix: fix a fatal race with bit fields bnx2x: Prevent memory leak when cnic is absent bnx2x: correct reading of speed capabilities net: sctp: attribute printl with __printf for gcc fmt checks netlink: kconfig: move mmap i/o into netlink kconfig netpoll: convert mutex into a semaphore netlink: Fix skb ref counting. net_sched: act_ipt forward compat with xtables mlx4_en: fix a build error on 32bit arches Revert "bnx2x: allow nvram test to run when device is down" bridge: avoid OOPS if root port not found drivers: net: cpsw: fix kernel warn on cpsw irq enable sh_eth: use random MAC address if no valid one supplied 3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA) tg3: fix to append hardware time stamping flags unix/stream: fix peeking with an offset larger than data in queue unix/dgram: fix peeking with an offset larger than data in queue unix/dgram: peek beyond 0-sized skbs openvswitch: Remove unneeded ovs_netdev_get_ifindex() ...
2013-04-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/net/ethernet/emulex/benet/be.h include/net/tcp.h net/mac802154/mac802154.h Most conflicts were minor overlapping stuff. The be2net driver brought in some fixes that added __vlan_put_tag calls, which in net-next take an additional argument. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-29net/netfilter: rename random32() to prandom_u32()Akinobu Mita
Use preferable function name which implies using a pseudo-random number generator. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29xt_hashlimit: allocate a copy of name explicitly, don't rely on procfs gutsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29sctp: Correct type and usage of sctp_end_cksum()Simon Horman
Change the type of the crc32 parameter of sctp_end_cksum() from __be32 to __u32 to reflect that fact that it is passed to cpu_to_le32(). There are five in-tree users of sctp_end_cksum(). The following four had warnings flagged by sparse which are no longer present with this change. net/netfilter/ipvs/ip_vs_proto_sctp.c:sctp_nat_csum() net/netfilter/ipvs/ip_vs_proto_sctp.c:sctp_csum_check() net/sctp/input.c:sctp_rcv_checksum() net/sctp/output.c:sctp_packet_transmit() The fifth user is net/netfilter/nf_nat_proto_sctp.c:sctp_manip_pkt(). It has been updated to pass a __u32 instead of a __be32, the value in question was already calculated in cpu byte-order. net/netfilter/nf_nat_proto_sctp.c:sctp_manip_pkt() has also been updated to assign the return value of sctp_end_cksum() directly to a variable of type __le32, matching the type of the return value. Previously the return value was assigned to a variable of type __be32 and then that variable was finally assigned to another variable of type __le32. Problems flagged by sparse. Compile and sparse tested only. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: nfnetlink_queue: avoid expensive gso segmentation and checksum fixupFlorian Westphal
Userspace can now indicate that it can cope with larger-than-mtu sized packets and packets that have invalid ipv4/tcp checksums. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: nfnetlink_queue: add skb info attributeFlorian Westphal
Once we allow userspace to receive gso/gro packets, userspace needs to be able to determine when checksums appear to be broken, but are not. NFQA_SKB_CSUMNOTREADY means 'checksums will be fixed in kernel later, pretend they are ok'. NFQA_SKB_GSO could be used for statistics, or to determine when packet size exceeds mtu. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: move skb_gso_segment into nfnetlink_queue moduleFlorian Westphal
skb_gso_segment is expensive, so it would be nice if we could avoid it in the future. However, userspace needs to be prepared to receive larger-than-mtu-packets (which will also have incorrect l3/l4 checksums), so we cannot simply remove it. The plan is to add a per-queue feature flag that userspace can set when binding the queue. The problem is that in nf_queue, we only have a queue number, not the queue context/configuration settings. This patch should have no impact other than the skb_gso_segment call now being in a function that has access to the queue config data. A new size attribute in nf_queue_entry is needed so nfnetlink_queue can duplicate the entry of the gso skb when segmenting the skb while also copying the route key. The follow up patch adds switch to disable skb_gso_segment when queue config says so. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: nf_queue: move device refcount bump to extra functionFlorian Westphal
required by future patch that will need to duplicate the nf_queue_entry, bumping refcounts of the copy. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: ipset: set match: add support to match the countersJozsef Kadlecsik
The new revision of the set match supports to match the counters and to suppress updating the counters at matching too. At the set:list types, the updating of the subcounters can be suppressed as well. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29netfilter: ipset: The list:set type with counter supportJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>