path: root/net/core/rtnetlink.c
Age  Commit message  Author
2013-04-25net: fix address check in rtnl_fdb_delVlad Yasevich
Commit 6681712d67eef14c4ce793561c3231659153a320 vxlan: generalize forwarding tables relaxed the address checks in rtnl_fdb_add() to use is_zero_ether_addr(). This allows users to add multicast addresses using the fdb API. However, the check in rtnl_fdb_del() still uses a more strict is_valid_ether_addr() which rejects multicast addresses. Thus it is possible to add an fdb entry that cannot later be removed. Relax the check in rtnl_fdb_del() as well. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
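The difference between the two checks can be sketched in userspace C. This is a minimal re-implementation for illustration, assuming the usual semantics (a multicast address has the low bit of its first octet set; a "valid" address is neither multicast nor all-zero); the wrappers `fdb_addr_ok_relaxed()` and `fdb_addr_ok_strict()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Group (multicast) bit: least significant bit of the first octet. */
static bool is_multicast_ether_addr(const uint8_t *addr)
{
	assert(addr != NULL);
	return (addr[0] & 0x01) != 0;
}

/* True only for 00:00:00:00:00:00. */
static bool is_zero_ether_addr(const uint8_t *addr)
{
	assert(addr != NULL);
	for (int i = 0; i < ETH_ALEN; i++)
		if (addr[i] != 0)
			return false;
	return true;
}

/* Strict check: rejects all-zero and multicast addresses. */
static bool is_valid_ether_addr(const uint8_t *addr)
{
	return !is_multicast_ether_addr(addr) && !is_zero_ether_addr(addr);
}

/* Hypothetical wrapper: the relaxed policy, rejecting only all-zero. */
static bool fdb_addr_ok_relaxed(const uint8_t *addr)
{
	return !is_zero_ether_addr(addr);
}

/* Hypothetical wrapper: the strict policy the delete path still used. */
static bool fdb_addr_ok_strict(const uint8_t *addr)
{
	return is_valid_ether_addr(addr);
}
```

With a multicast key such as 01:00:5e:00:00:01, the relaxed check accepts the address while the strict one rejects it, which is the add-but-cannot-delete asymmetry the commit describes.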
2013-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/emulex/benet/be_main.c drivers/net/ethernet/intel/igb/igb_main.c drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c include/net/scm.h net/batman-adv/routing.c net/ipv4/tcp_input.c The e{uid,gid} --> {uid,gid} credentials fix conflicted with the cleanup in net-next to now pass cred structs around. The be2net driver had a bug fix in 'net' that overlapped with the VLAN interface changes by Patrick McHardy in net-next. An IGB conflict existed because in 'net' the build_skb() support was reverted, and in 'net-next' there was a comment style fix within that code. Several batman-adv conflicts were resolved by making sure that all calls to batadv_is_my_mac() are changed to have a new bat_priv first argument. Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO rewrite in 'net-next', mostly overlapping changes. Thanks to Stephen Rothwell and Antonio Quartulli for help with several of these merge resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-08rtnetlink: Call nlmsg_parse() with correct header lengthMichael Riesch
Signed-off-by: Michael Riesch <michael.riesch@omicron.at> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Benc <jbenc@redhat.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-kernel@vger.kernel.org Acked-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/mac80211/sta_info.c net/wireless/core.h Two minor conflicts in wireless. Overlapping additions of extern declarations in net/wireless/core.h and a bug fix overlapping with the addition of a boolean parameter to __ieee80211_key_free(). Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29net: rtnetlink: fdb dflt dump must set idx used for cb->arg[0]John Fastabend
In rtnl_fdb_dump(), when the fdb_dump ndo op is not populated, we never set the idx value, so cb->arg[0] is always 0, resulting in an endless loop of messages. Introduced with this commit, commit 090096bf3db1c281ddd034573260045888a68fea Author: Vlad Yasevich <vyasevic@redhat.com> Date: Wed Mar 6 15:39:42 2013 +0000 net: generic fdb support for drivers without ndo_fdb_<op> CC: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
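Why a missing resume index loops forever can be shown with a small userspace model. Everything here is a hypothetical stand-in (`struct dump_cb`, `fdb_dump_pass()`, `run_dump()`, the table size and buffer room), assuming only the general pattern that a dump is called repeatedly and uses a saved index to skip entries already sent.

```c
#include <assert.h>
#include <stddef.h>

#define NENTRIES 5 /* hypothetical number of fdb entries */
#define ROOM     2 /* hypothetical entries that fit in one message buffer */

/* Stand-in for the netlink callback state; args[0] is the resume index. */
struct dump_cb {
	long args[1];
};

/* One dump pass: emit up to ROOM entries, skipping those already sent.
 * Returns the number of entries emitted; 0 means the dump is complete. */
static int fdb_dump_pass(struct dump_cb *cb, int *out, int save_idx)
{
	int idx = 0, emitted = 0;

	assert(cb != NULL && out != NULL);
	for (int i = 0; i < NENTRIES; i++) {
		if (idx < cb->args[0])
			goto skip;      /* sent in an earlier pass */
		if (emitted == ROOM)
			break;          /* buffer full, resume here next pass */
		out[emitted++] = i;
skip:
		idx++;
	}
	if (save_idx)
		cb->args[0] = idx;      /* the step the buggy path forgot */
	return emitted;
}

/* Run passes until one emits nothing or max_passes is hit.
 * Returns the number of passes; *total counts emitted entries. */
static int run_dump(int save_idx, int max_passes, int *total)
{
	struct dump_cb cb = { { 0 } };
	int out[ROOM];
	int passes = 0;

	*total = 0;
	while (passes < max_passes) {
		int n = fdb_dump_pass(&cb, out, save_idx);

		passes++;
		if (n == 0)
			break;
		*total += n;
	}
	return passes;
}
```

With the index saved, the model finishes after four passes and five entries; without it, every pass re-sends the first two entries until the pass limit stops it.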
2013-03-28net-next: replace obsolete NLMSG_* with type safe nlmsg_*Hong zhi guo
Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-27rtnetlink: fix error return code in rtnl_link_fill()Wei Yongjun
Fix to return a negative error code from the error handling case instead of 0 (possibly overwritten to 0 by the ops->fill_xstats call), as returned elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24ipv4: provide addr and netconf dump consistency infoNicolas Dichtel
This patch takes benefit of dev_addr_genid and dev_base_seq to check if a change occurs during a netlink dump. If a change is detected, the flag NLM_F_DUMP_INTR is set in the first message after the dump was interrupted. Note that seq and prev_seq must be reset between each family in rtnl_dump_all() because they are specific to each family. Reported-by: Junwei Zhang <junwei.zhang@6wind.com> Reported-by: Hongjun Li <hongjun.li@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
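The consistency check can be sketched as follows. This is a userspace model loosely following the kernel's dump consistency helper; the struct and function names are hypothetical, and the NLM_F_DUMP_INTR value is copied from <linux/netlink.h> so the block stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NLM_F_DUMP_INTR 0x10 /* value as defined in <linux/netlink.h> */

/* Hypothetical stand-ins for the callback state and message header. */
struct dump_state {
	unsigned int seq;      /* generation id sampled for this pass */
	unsigned int prev_seq; /* generation id seen on the previous pass */
};

struct msg_hdr {
	uint16_t flags;
};

/* Flag the message if the generation id changed since the last pass. */
static void dump_check_consistent(struct dump_state *cb, struct msg_hdr *nlh)
{
	assert(cb != NULL && nlh != NULL);
	if (cb->prev_seq && cb->seq != cb->prev_seq)
		nlh->flags |= NLM_F_DUMP_INTR;
	cb->prev_seq = cb->seq;
}

/* Reset between address families: their generation ids are unrelated. */
static void dump_state_reset(struct dump_state *cb)
{
	assert(cb != NULL);
	cb->seq = 0;
	cb->prev_seq = 0;
}
```

The reset mirrors the note above about rtnl_dump_all(): without it, the last generation id of one family would be compared against the first of the next and raise a spurious interruption flag.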
2013-03-22rtnetlink: Remove passing of attributes into rtnl_doit functionsThomas Graf
With decnet converted, we can finally get rid of rta_buf and its computations around it. It also gets rid of the minimal header length verification since all message handlers do that explicitly anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Pull in the 'net' tree to get Daniel Borkmann's flow dissector infrastructure change. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17vxlan: generalize forwarding tablesDavid Stevens
This patch generalizes VXLAN forwarding table entries allowing an administrator to: 1) specify multiple destinations for a given MAC 2) specify alternate vni's in the VXLAN header 3) specify alternate destination UDP ports 4) use multicast MAC addresses as fdb lookup keys 5) specify multicast destinations 6) specify the outgoing interface for forwarded packets The combination allows configuration of more complex topologies using VXLAN encapsulation. Changes since v1: rebase to 3.9.0-rc2 Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17rtnetlink: Mask the rta_type when range checkingVlad Yasevich
Range/validity checks on rta_type in rtnetlink_rcv_msg() do not account for flags that may be set. This causes the function to return -EINVAL when flags are set on the type (for example NLA_F_NESTED). Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
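A minimal sketch of the masking, assuming the flag layout from <linux/netlink.h> (the top two bits of the 16-bit type carry NLA_F_NESTED and NLA_F_NET_BYTEORDER). `MAX_ATTR` and the two helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits carried in the attribute type, as in <linux/netlink.h>. */
#define NLA_F_NESTED        (1 << 15)
#define NLA_F_NET_BYTEORDER (1 << 14)
#define NLA_TYPE_MASK       (~(NLA_F_NESTED | NLA_F_NET_BYTEORDER))

#define MAX_ATTR 32 /* hypothetical highest valid attribute number */

static_assert(MAX_ATTR < NLA_F_NET_BYTEORDER,
	      "attribute numbers must fit below the flag bits");

/* Range check that ignores flags: a nested attribute looks out of range. */
static bool type_in_range_unmasked(uint16_t rta_type)
{
	return rta_type <= MAX_ATTR;
}

/* Range check on the type with the flag bits masked off. */
static bool type_in_range_masked(uint16_t rta_type)
{
	return (rta_type & NLA_TYPE_MASK) <= MAX_ATTR;
}
```

An attribute of type 5 with NLA_F_NESTED set has the raw value 0x8005, so the unmasked comparison rejects it even though type 5 is valid.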
2013-03-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/intel/e1000e/netdev.c Minor conflict in e1000e, a line that got fixed in 'net' has been removed in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-10rtnl: fix info leak on RTM_GETLINK request for VF devicesMathias Krause
Initialize the mac address buffer with 0 as the driver specific function will probably not fill the whole buffer. In fact, all in-kernel drivers fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible bytes. Therefore we currently leak 26 bytes of stack memory to userland via the netlink interface. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
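The leak and the fix can be modelled in userspace. In this sketch `driver_get_vf_mac()` is a hypothetical stand-in for a driver callback that writes only ETH_ALEN bytes, and the caller-provided buffer plays the role of a stack slot that still holds old data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN     6
#define MAX_ADDR_LEN 32

static_assert(MAX_ADDR_LEN - ETH_ALEN == 26,
	      "26 bytes of the buffer are never written by the driver");

/* Hypothetical driver callback: fills only the first ETH_ALEN bytes. */
static void driver_get_vf_mac(uint8_t *mac)
{
	static const uint8_t vf_mac[ETH_ALEN] = {
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01
	};

	memcpy(mac, vf_mac, ETH_ALEN);
}

/* Fill buf as the RTM_GETLINK path would, optionally zeroing it first.
 * Returns how many bytes past ETH_ALEN still hold non-zero stale data,
 * i.e. data that would be copied out to userspace. */
static int fill_vf_mac(uint8_t buf[MAX_ADDR_LEN], int zero_first)
{
	int stale = 0;

	if (zero_first)
		memset(buf, 0, MAX_ADDR_LEN); /* the fix */
	driver_get_vf_mac(buf);
	for (int i = ETH_ALEN; i < MAX_ADDR_LEN; i++)
		if (buf[i] != 0)
			stale++;
	return stale;
}
```

Pre-filling the buffer with a marker byte before each call makes the effect visible: without the memset all 26 trailing bytes survive, with it none do.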
2013-03-07net: generic fdb support for drivers without ndo_fdb_<op>Vlad Yasevich
If the driver does not support the ndo_op use the generic handler for it. This should work in the majority of cases. Eventually the fdb_dflt_add call gets translated into a __dev_set_rx_mode() call which should handle hardware support for filtering via the IFF_UNICAST_FLT flag. Namely IFF_UNICAST_FLT indicates if the hardware can do unicast address filtering. If no support is available the device is put into promisc mode. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-27hlist: drop the node parameter from iteratorsSasha Levin
I'm not sure why, but the hlist for each entry iterators were conceived

    list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;
        <+... when != b
    (
    hlist_for_each_entry(a,
    - b,
    c, d) S
    |
    hlist_for_each_entry_continue(a,
    - b,
    c) S
    |
    hlist_for_each_entry_from(a,
    - b,
    c) S
    |
    hlist_for_each_entry_rcu(a,
    - b,
    c, d) S
    |
    hlist_for_each_entry_rcu_bh(a,
    - b,
    c, d) S
    |
    hlist_for_each_entry_continue_rcu_bh(a,
    - b,
    c) S
    |
    for_each_busy_worker(a, c,
    - b,
    d) S
    |
    ax25_uid_for_each(a,
    - b,
    c) S
    |
    ax25_for_each(a,
    - b,
    c) S
    |
    inet_bind_bucket_for_each(a,
    - b,
    c) S
    |
    sctp_for_each_hentry(a,
    - b,
    c) S
    |
    sk_for_each(a,
    - b,
    c) S
    |
    sk_for_each_rcu(a,
    - b,
    c) S
    |
    sk_for_each_from
    -(a, b)
    +(a)
    S
    + sk_for_each_from(a) S
    |
    sk_for_each_safe(a,
    - b,
    c, d) S
    |
    sk_for_each_bound(a,
    - b,
    c) S
    |
    hlist_for_each_entry_safe(a,
    - b,
    c, d, e) S
    |
    hlist_for_each_entry_continue_rcu(a,
    - b,
    c) S
    |
    nr_neigh_for_each(a,
    - b,
    c) S
    |
    nr_neigh_for_each_safe(a,
    - b,
    c, d) S
    |
    nr_node_for_each(a,
    - b,
    c) S
    |
    nr_node_for_each_safe(a,
    - b,
    c, d) S
    |
    - for_each_gfn_sp(a, c, d, b) S
    + for_each_gfn_sp(a, c, d) S
    |
    - for_each_gfn_indirect_valid_sp(a, c, d, b) S
    + for_each_gfn_indirect_valid_sp(a, c, d) S
    |
    for_each_host(a,
    - b,
    c) S
    |
    for_each_host_safe(a,
    - b,
    c, d) S
    |
    for_each_mesh_entry(a,
    - b,
    c, d) S
    )
        ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
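A three-argument iterator of the kind this commit introduces can be sketched in userspace. This is a simplified model, not the kernel's definition: the node has no pprev pointer, `hlist_entry_safe()` here evaluates its pointer argument twice, and `__typeof__` is a GCC/Clang extension. `struct item` and `sum_items()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified singly linked hash-list node and head (no pprev). */
struct hlist_node {
	struct hlist_node *next;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Recover the enclosing structure from a pointer to its member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* NULL-safe container lookup; evaluates ptr twice in this sketch. */
#define hlist_entry_safe(ptr, type, member) \
	((ptr) ? container_of(ptr, type, member) : NULL)

/* Iterate with only the entry cursor: no separate node cursor needed. */
#define hlist_for_each_entry(pos, head, member)                               \
	for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member); \
	     pos;                                                             \
	     pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Hypothetical payload type used to exercise the iterator. */
struct item {
	int value;
	struct hlist_node node;
};

static int sum_items(struct hlist_head *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	hlist_for_each_entry(it, head, node)
		sum += it->value;
	return sum;
}
```

The call site now has the same shape as list_for_each_entry(pos, head, member): the node pointer is derived from the entry itself through `pos->member.next`.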
2013-02-13bridge: Add vlan support to static neighborsVlad Yasevich
When a user adds bridge neighbors, allow him to specify VLAN id. If the VLAN id is not specified, the neighbor will be added for VLANs currently in the ports filter list. If no VLANs are configured on the port, we use vlan 0 and only add 1 entry. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13bridge: Dump vlan information from a bridge portVlad Yasevich
Using the RTM_GETLINK dump the vlan filter list of a given bridge port. The information depends on setting the filter flag similar to how nic VF info is dumped. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13bridge: Add netlink interface to configure vlans on bridge portsVlad Yasevich
Add a netlink interface to add and remove vlan configuration on bridge port. The interface uses the RTM_SETLINK message and encodes the vlan configuration inside the IFLA_AF_SPEC. It is possible to include multiple vlans to either add or remove in a single message. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04netns: fdb: allow unprivileged users to add/del fdb entriesGao feng
Right now, only ixgbe, macvlan, vxlan and bridge implement fdb_add/fdb_del operations. These operations only operate on the private data of the net device, so allowing unprivileged users who create the userns and netns to add/del fdb entries will do no harm to other netns. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-06ethtool: set addr_assign_type to NET_ADDR_SET when addr is passed on createJiri Pirko
In case user passed address via netlink during create, NET_ADDR_PERM was set. That is not correct so fix this by setting NET_ADDR_SET. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04bonding: remove usage of dev->masterJiri Pirko
Benefit from new upper dev list and free bonding from dev->master usage. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-04rtnetlink: remove usage of dev->masterJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-03rtnl: use dev_set_mac_address() instead of plain ndo_Jiri Pirko
Benefit from existence of dev_set_mac_address() and remove duplicate code. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-28rtnl: expose carrier value with possibility to set itJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30rtnelink: remove unused parameter from rtnl_create_link().Rami Rosen
This patch removes an unused parameter (src_net) from the rtnl_create_link() method and from the method's single invocation, in veth. This parameter was used in the past when calling ops->get_tx_queues(src_net, tb) in rtnl_create_link(). The get_tx_queues() member of rtnl_link_ops was replaced by two methods, get_num_tx_queues() and get_num_rx_queues(), which do not take any parameter. This was done in commit d40156aa5ecbd51fed932ed4813df82b56e5ff4d by Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count"). Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-18net: Enable a userns root rtnl calls that are safe for unprivilged usersEric W. Biederman
- Only allow moving network devices to network namespaces you have CAP_NET_ADMIN privileges over. - Enable creating/deleting/modifying interfaces - Enable adding/deleting addresses - Enable adding/setting/deleting neighbour entries - Enable adding/removing routes - Enable adding/removing fib rules - Enable setting the forwarding state - Enable adding/removing ipv6 address labels - Enable setting bridge parameter Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-18net: Push capable(CAP_NET_ADMIN) into the rtnl methodsEric W. Biederman
- In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check to ns_capable(net->user_ns, CAP_NET_ADMIN), allowing unprivileged users to make netlink calls to modify their local network namespace.
- In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so that calls that are not safe for unprivileged users are still protected.
Later patches will remove the extra capable calls from methods that are safe for unprivileged users. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c Minor conflict between the BCM_CNIC define removal in net-next and a bug fix added to net. Based upon a conflict resolution patch posted by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03net: fix bridge notify hook to manage flags correctlyJohn Fastabend
The bridge notify hook rtnl_bridge_notify() was not handling the case where the master flags was set or with both flags set. First flags are not being passed correctly and second the logic to parse them is broken. This patch passes the original flags value and fixes the logic. Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dumpJohn Fastabend
Change the dflt fdb dump handler to use RTM_NEWNEIGH to be compatible with bridge dump routines. The dump reply from the network driver handlers should match the reply from bridge handler. The fact they were not in the ixgbe case was effectively a bug. This patch resolves it. Applications that were not checking the nlmsg type will continue to work. And now applications that do check the type will work as expected. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02net: Fix continued iteration in rtnl_bridge_getlink()Ben Hutchings
Commit e5a55a898720096f43bc24938f8875c0a1b34cd7 ('net: create generic bridge ops') broke the handling of a non-zero starting index in rtnl_bridge_getlink() (based on the old br_dump_ifinfo()). When the starting index is non-zero, we need to increment the current index for each entry that we are skipping. Also, we need to check the index before both cases, since we may previously have stopped iteration between getting information about a device from its master and from itself. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Tested-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-31ixgbe: add setlink, getlink support to ixgbe and ixgbevfJohn Fastabend
This adds support for the net device ops to manage the embedded hardware bridge on ixgbe devices. With this patch the bridge mode can be toggled between VEB and VEPA to support stacking macvlan devices or using the embedded switch without any SW component in 802.1Qbg/br environments. Additionally, this adds source address pruning to the ixgbevf driver to prune any frames sent back from a reflective relay on the switch. This is required because the existing hardware does not support this. Without it frames get pushed into the stack with its own src mac which is invalid per 802.1Qbg VEPA definition. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-31net: set and query VEB/VEPA bridge mode via PF_BRIDGEJohn Fastabend
Hardware switches may support enabling and disabling the loopback switch which puts the device in a VEPA mode defined in the IEEE 802.1Qbg specification. In this mode frames are not switched in the hardware but sent directly to the switch. SR-IOV capable NICs will likely support this mode; I am aware of at least two such devices. Also I am told (but don't have any of this hardware available) that there are devices that only support VEPA modes. In these cases it is important at a minimum to be able to query these attributes.

This patch adds an additional IFLA_BRIDGE_MODE attribute that can be set and dumped via the PF_BRIDGE:{SET|GET}LINK operations. Also, anticipating bridge attributes that may be common for both embedded bridges and software bridges, this adds a flags attribute IFLA_BRIDGE_FLAGS currently used to determine if the command or event is being generated to/from an embedded bridge or software bridge. Finally, the event generation is pulled out of the bridge module and into rtnetlink proper.

For example using the macvlan driver in VEPA mode on top of an embedded switch requires putting the embedded switch into a VEPA mode to get the expected results.

    --------  --------
    | VEPA |  | VEPA |       <-- macvlan vepa edge relays
    --------  --------
       |         |
       |         |
    ------------------
    |      VEPA      |       <-- embedded switch in NIC
    ------------------
            |
            |
    -------------------
    | external switch |      <-- shiny new physical
    -------------------          switch with VEPA support

A packet sent from the macvlan VEPA at the top could be loopbacked on the embedded switch and never seen by the external switch. So in order for this to work the embedded switch needs to be set in the VEPA state via the above described commands.

By making these attributes nested in IFLA_AF_SPEC we allow future extensions to be made as needed.

CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-31net: create generic bridge opsJohn Fastabend
The PF_BRIDGE:RTM_{GET|SET}LINK nlmsg family and type are currently embedded in the ./net/bridge module. This prohibits them from being used by other bridging devices. One example of this being hardware that has embedded bridging components. In order to use these nlmsg types more generically this patch adds two net_device_ops hooks. One to set link bridge attributes and another to dump the current bridge attributes. ndo_bridge_setlink() ndo_bridge_getlink() CC: Lennert Buytenhek <buytenh@wantstofly.org> CC: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-23netlink: cleanup the unnecessary return value checkHans Zhang
There is no need to check the return value of tab, since the NULL situation has already been handled and rtnl_msg_handlers[PF_UNSPEC] has been initialized as non-NULL during rtnetlink_init(). Signed-off-by: Hans Zhang <zhanghonghui@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-01netlink: add attributes to fdb interfacestephen hemminger
Later changes need to be able to refer to neighbour attributes when doing fdb_add. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10netlink: Rename pid to portid to avoid confusionEric W. Biederman
It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08netlink: hide struct module parameter in netlink_kernel_createPablo Neira Ayuso
This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08netlink: kill netlink_set_nonrootPablo Neira Ayuso
Replace netlink_set_nonroot by one new field `flags' in struct netlink_kernel_cfg that is passed to netlink_kernel_create. This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since now the flags field in nl_table is generic (so we can add more flags if needed in the future). Also adjust all callers in the net-next tree to use these flags instead of netlink_set_nonroot. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-22net: remove delay at device dismantleEric Dumazet
I noticed an extra one second delay in device dismantle, tracked down to a call to dst_dev_event() while some call_rcu() are still in RCU queues. These call_rcu() were posted by rt_free(struct rtable *rt) calls. We then wait a little (but one second) in netdev_wait_allrefs() before kicking again NETDEV_UNREGISTER. As the call_rcu() are now completed, dst_dev_event() can do the needed device swap on busy dst. To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called after a rcu_barrier(), but outside of RTNL lock. Use NETDEV_UNREGISTER_FINAL with care ! Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL. Also remove NETDEV_UNREGISTER_BATCH, as it is not used anymore after IP cache removal. With help from Gao feng. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09net: Allow to create links with given ifindexPavel Emelyanov
Currently the RTM_NEWLINK results in -EOPNOTSUPP if the ifinfomsg->ifi_index is not zero. I propose to allow requesting ifindices on link creation. This is required by the checkpoint-restore to correctly restore a net namespace (i.e. -- a container). Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09time: jiffies_delta_to_clock_t() helper to the rescueEric Dumazet
Various /proc/net files sometimes report crazy timer values, expressed in clock_t units. This happens when an expired timer delta (expires - jiffies) is passed to jiffies_to_clock_t(). This function has an overflow in: return div_u64((u64)x * TICK_NSEC, NSEC_PER_SEC / USER_HZ); commit cbbc719fccdb8cb (time: Change jiffies_to_clock_t() argument type to unsigned long) only got around the problem. As we cannot output negative values in /proc/net/tcp without breaking various tools, I suggest adding a jiffies_delta_to_clock_t() wrapper that caps the negative delta to a 0 value. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: hank <pyu@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
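The capping wrapper can be sketched in userspace as below. The conversion is deliberately simplified and assumes HZ = 1000 and USER_HZ = 100 (the kernel's jiffies_to_clock_t() has more cases); the `_sketch` function names are hypothetical.

```c
#include <assert.h>

/* Assumed tick rates for this sketch; real kernels are configurable. */
#define HZ      1000L
#define USER_HZ 100L

static_assert(HZ % USER_HZ == 0, "sketch only handles an integral ratio");

/* Simplified conversion: valid only for non-negative tick counts. */
static unsigned long jiffies_to_clock_t_sketch(unsigned long x)
{
	return x / (unsigned long)(HZ / USER_HZ);
}

/* Wrapper in the spirit of the commit: a negative (already expired)
 * delta is capped to 0 before the unsigned conversion sees it. */
static unsigned long jiffies_delta_to_clock_t_sketch(long delta)
{
	return jiffies_to_clock_t_sketch(delta > 0 ? (unsigned long)delta : 0UL);
}
```

Passing an expired delta such as -5 straight into the unsigned conversion yields an enormous value, which is the kind of "crazy timer value" described above; the wrapper reports 0 instead.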
2012-07-31Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/randomLinus Torvalds
Pull random subsystem patches from Ted Ts'o: "This patch series contains a major revamp of how we collect entropy from interrupts for /dev/random and /dev/urandom. The goal is to address weaknesses discussed in the paper "Mining your Ps and Qs: Detection of Widespread Weak Keys in Network Devices", by Nadia Heninger, Zakir Durumeric, Eric Wustrow, J. Alex Halderman, which will be published in the Proceedings of the 21st Usenix Security Symposium, August 2012. (See https://factorable.net for more information and an extended version of the paper.)" Fix up trivial conflicts due to nearby changes in drivers/{mfd/ab3100-core.c, usb/gadget/omap_udc.c}

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: (33 commits)
    random: mix in architectural randomness in extract_buf()
    dmi: Feed DMI table to /dev/random driver
    random: Add comment to random_initialize()
    random: final removal of IRQF_SAMPLE_RANDOM
    um: remove IRQF_SAMPLE_RANDOM which is now a no-op
    sparc/ldc: remove IRQF_SAMPLE_RANDOM which is now a no-op
    [ARM] pxa: remove IRQF_SAMPLE_RANDOM which is now a no-op
    board-palmz71: remove IRQF_SAMPLE_RANDOM which is now a no-op
    isp1301_omap: remove IRQF_SAMPLE_RANDOM which is now a no-op
    pxa25x_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op
    omap_udc: remove IRQF_SAMPLE_RANDOM which is now a no-op
    goku_udc: remove IRQF_SAMPLE_RANDOM which was commented out
    uartlite: remove IRQF_SAMPLE_RANDOM which is now a no-op
    drivers: hv: remove IRQF_SAMPLE_RANDOM which is now a no-op
    xen-blkfront: remove IRQF_SAMPLE_RANDOM which is now a no-op
    n2_crypto: remove IRQF_SAMPLE_RANDOM which is now a no-op
    pda_power: remove IRQF_SAMPLE_RANDOM which is now a no-op
    i2c-pmcmsp: remove IRQF_SAMPLE_RANDOM which is now a no-op
    input/serio/hp_sdc.c: remove IRQF_SAMPLE_RANDOM which is now a no-op
    mfd: remove IRQF_SAMPLE_RANDOM which is now a no-op
    ...
2012-07-29ipv6: fix incorrect route 'expires' value passed to userspaceLi Wei
When userspace uses RTM_GETROUTE to dump the route table with an already expired route entry, we always get an 'expires' value (2147157) calculated based on INT_MAX. The cause of this problem is the following statement: rt->dst.expires - jiffies < INT_MAX. gcc promotes both sides of '<' to unsigned long, so a small negative value is considered greater than INT_MAX. With the help of Eric Dumazet, do the out of bound checks in rtnl_put_cacheinfo(), _after_ conversion to clock_t. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
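The promotion problem can be reproduced in a few lines of userspace C. Both helper functions are hypothetical; the signed conversion in `expires_delta()` relies on the two's-complement behaviour that common compilers provide for out-of-range unsigned-to-signed casts.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

static_assert(ULONG_MAX > INT_MAX, "unsigned long must out-range int here");

/* The comparison from the commit message: the subtraction is done in
 * unsigned long and INT_MAX is converted to unsigned long, so an
 * expired entry wraps to a huge value and the test is false. */
static bool expires_check_buggy(unsigned long expires, unsigned long now)
{
	return expires - now < INT_MAX;
}

/* Signed view of the same difference; negative when already expired.
 * (Implementation-defined cast, two's complement on common platforms.) */
static long expires_delta(unsigned long expires, unsigned long now)
{
	return (long)(expires - now);
}
```

For an entry that expired five ticks ago, the unsigned comparison claims the delta is not below INT_MAX, while the signed view correctly gives -5.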
2012-07-27net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handlingJiri Benc
When device flags are set using rtnetlink, IFF_PROMISC and IFF_ALLMULTI flags are handled specially. Function dev_change_flags sets IFF_PROMISC and IFF_ALLMULTI bits in dev->gflags according to the passed value but do_setlink passes a result of rtnl_dev_combine_flags which takes those bits from dev->flags. This can be easily triggered by doing:

    tcpdump -i eth0 &
    ip l s up eth0

ip sets IFF_UP flag in ifi_flags and ifi_change, which is combined with IFF_PROMISC by rtnl_dev_combine_flags, causing __dev_change_flags to set IFF_PROMISC in gflags. Reported-by: Max Matveev <makc@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
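The flag mix-up can be modelled in userspace. This is a sketch under assumptions: the flag values are copied from <linux/if.h>, `struct dev_sketch` and all three functions are hypothetical, and the fixed variant assumes the remedy is to read the two special bits from gflags before combining.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as in <linux/if.h>. */
#define IFF_UP       0x1u
#define IFF_PROMISC  0x100u
#define IFF_ALLMULTI 0x200u

/* Hypothetical device: flags is the effective state, gflags records
 * what userspace explicitly requested for PROMISC/ALLMULTI. */
struct dev_sketch {
	unsigned int flags;
	unsigned int gflags;
};

/* Merge requested bits into the current view; change == 0 means "all". */
static unsigned int combine(unsigned int cur, unsigned int ifi_flags,
			    unsigned int ifi_change)
{
	if (!ifi_change)
		return ifi_flags;
	return (ifi_flags & ifi_change) | (cur & ~ifi_change);
}

/* Pre-fix behaviour: unchanged bits are taken from dev->flags. */
static unsigned int combine_buggy(const struct dev_sketch *dev,
				  unsigned int ifi_flags,
				  unsigned int ifi_change)
{
	assert(dev != NULL);
	return combine(dev->flags, ifi_flags, ifi_change);
}

/* Assumed fix: PROMISC/ALLMULTI are taken from dev->gflags instead. */
static unsigned int combine_fixed(const struct dev_sketch *dev,
				  unsigned int ifi_flags,
				  unsigned int ifi_change)
{
	const unsigned int special = IFF_PROMISC | IFF_ALLMULTI;

	assert(dev != NULL);
	return combine((dev->flags & ~special) | (dev->gflags & special),
		       ifi_flags, ifi_change);
}
```

In the tcpdump scenario the device has IFF_PROMISC in flags but not in gflags; a request that only changes IFF_UP then carries IFF_PROMISC along in the buggy variant and leaves it out in the fixed one.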
2012-07-22rtnl: Add #ifdef CONFIG_RPS around num_rx_queues referenceMark A. Greer
Commit 76ff5cc91935c51fcf1a6a99ffa28b97a6e7a884 (rtnl: allow to specify number of rx and tx queues on device creation) added a reference to the net_device structure's 'num_rx_queues' member in net/core/rtnetlink.c:rtnl_fill_ifinfo() However, the definition for 'num_rx_queues' is surrounded by an '#ifdef CONFIG_RPS' while the new reference to it is not. This causes a compile error when CONFIG_RPS is not defined. Fix the compile error by surrounding the new reference to 'num_rx_queues' by an '#ifdef CONFIG_RPS'. CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Mark A. Greer <mgreer@animalcreek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20rtnl: allow to specify number of rx and tx queues on device creationJiri Pirko
This patch introduces IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES by which userspace can set number of rx and/or tx queues to be allocated for newly created netdevice. This overrides ops->get_num_[tr]x_queues() Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20rtnl: allow to specify different num for rx and tx queue countJiri Pirko
Also cut out unused function parameters and possible err in return value. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-14net: feed /dev/random with the MAC address when registering a deviceTheodore Ts'o
Cc: David Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org