aboutsummaryrefslogtreecommitdiff
path: root/include/uapi
AgeCommit message (Collapse)Author
2014-10-03gue: Receive side for Generic UDP EncapsulationTom Herbert
This patch adds support receiving for GUE packets in the fou module. The fou module now supports direct foo-over-udp (no encapsulation header) and GUE. To support this a type parameter is added to the fou netlink parameters. For a GUE socket we define gue_udp_recv, gue_gro_receive, and gue_gro_complete to handle the specifics of the GUE protocol. Most of the code to manage and configure sockets is common with the fou. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03target: Add a user-passthrough backstoreAndy Grover
Add a LIO storage engine that presents commands to userspace for execution. This would allow more complex backstores to be implemented out-of-kernel, and also make experimentation a-la FUSE (but at the SCSI level -- "SUSE"?) possible. It uses a mmap()able UIO device per LUN to share a command ring and data area. The commands are raw SCSI CDBs and iovs for in/out data. The command ring is also reused for returning scsi command status and optional sense data. This implementation is based on Shaohua Li's earlier version but heavily modified. Differences include: * Shared memory allocated by kernel, not locked-down user pages * Single ring for command request and response * Offsets instead of embedded pointers * Generic SCSI CDB passthrough instead of per-cmd specialization in ring format. * Uses UIO device instead of anon_file passed in mailbox. * Optional in-kernel handling of some commands. The main reason for these differences is to permit greater resiliency if the user process dies or hangs. Things not yet implemented (on purpose): * Zero copy. The data area is flexible enough to allow page flipping or backend-allocated pages to be used by fabrics, but it's not clear these are performance wins. Can come later. * Out-of-order command completion by userspace. Possible to add by just allowing userspace to change cmd_id in rsp cmd entries, but currently not supported. * No locks between kernel cmd submission and completion routines. Sounds like it's possible, but this can come later. * Sparse allocation of mmaped area. Current code vmallocs the whole thing. If the mapped area was larger and not fully mapped then the driver would have more freedom to change cmd and data area sizes based on demand. Current code open issues: * The use of idrs may be overkill -- we maybe can replace them with a simple counter to generate cmd_ids, and a hash table to get a cmd_id's associated pointer. * Use of a free-running counter for cmd ring instead of explicit modulo math. This would require power-of-2 cmd ring size. (Add kconfig depends NET - Randy) Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-10-02wil6210: atomic I/O for the card memoryVladimir Kondratiev
Introduce netdev IOCTLs, to be used by the debug tools. Allows to read/write single dword value or memory block, aligned to dword Different address modes supported: - BAR offset - Firmware "linker" address - target's AHB bus Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02netfilter: nft_reject: introduce icmp code abstraction for inet and bridgePablo Neira Ayuso
This patch introduces the NFT_REJECT_ICMPX_UNREACH type which provides an abstraction to the ICMP and ICMPv6 codes that you can use from the inet and bridge tables, they are: * NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable * NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable * NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable * NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratevely prohibited You can still use the specific codes when restricting the rule to match the corresponding layer 3 protocol. I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have different semantics depending on the table family and to allow the user to specify ICMP family specific codes if they restrict it to the corresponding family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-01Merge branches 'pci/aer' and 'pci/virtualization' into nextBjorn Helgaas
* pci/aer: PCI/AER: Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UND PCI/AER: Add additional PCIe AER error strings trace, RAS: Add additional PCIe AER error strings trace, RAS: Replace bare numbers with #defines for PCIe AER error strings * pci/virtualization: PCI: Add ACS quirk for Intel 10G NICs
2014-09-29macvlan: add source modeMichael Braun
This patch adds a new mode of operation to macvlan, called "source". It allows one to set a list of allowed mac address, which is used to match against source mac address from received frames on underlying interface. This enables creating mac based VLAN associations, instead of standard port or tag based. The feature is useful to deploy 802.1x mac based behavior, where drivers of underlying interfaces doesn't allows that. Configuration is done through the netlink interface using e.g.: ip link add link eth0 name macvlan0 type macvlan mode source ip link add link eth0 name macvlan1 type macvlan mode source ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11 ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22 ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33 ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33 ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44 This allows clients with MAC addresses 00:11:11:11:11:11, 00:22:22:22:22:22 to be part of only VLAN associated with macvlan0 interface. Clients with MAC addresses 00:44:44:44:44:44 with only VLAN associated with macvlan1 interface. And client with MAC address 00:33:33:33:33:33 to be associated with both VLANs. Based on work of Stefan Gula <steweg@gmail.com> v8: last version of Stefan Gula for Kernel 3.2.1 v9: rework onto linux-next 2014-03-12 by Michael Braun add MACADDR_SET command, enable to configure mac for source mode while creating interface v10: - reduce indention level - rename source_list to source_entry - use aligned 64bit ether address - use hash_64 instead of addr[5] v11: - rebase for 3.14 / linux-next 20.04.2014 v12 - rebase for linux-next 2014-09-25 Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== pull request: netfilter/ipvs updates for net-next The following patchset contains Netfilter/IPVS updates for net-next, most relevantly they are: 1) Four patches to make the new nf_tables masquerading support independent of the x_tables infrastructure. This also resolves a compilation breakage if the masquerade target is disabled but the nf_tables masq expression is enabled. 2) ipset updates via Jozsef Kadlecsik. This includes the addition of the skbinfo extension that allows you to store packet metainformation in the elements. This can be used to fetch and restore this to the packets through the iptables SET target, patches from Anton Danilov. 3) Add the hash:mac set type to ipset, from Jozsef Kadlecsick. 4) Add simple weighted fail-over scheduler via Simon Horman. This provides a fail-over IPVS scheduler (unlike existing load balancing schedulers). Connections are directed to the appropriate server based solely on highest weight value and server availability, patch from Kenny Mathis. 5) Support IPv6 real servers in IPv4 virtual-services and vice versa. Simon Horman informs that the motivation for this is to allow more flexibility in the choice of IP version offered by both virtual-servers and real-servers as they no longer need to match: An IPv4 connection from an end-user may be forwarded to a real-server using IPv6 and vice versa. No ip_vs_sync support yet though. Patches from Alex Gartrell and Julian Anastasov. 6) Add global generation ID to the nf_tables ruleset. When dumping from several different object lists, we need a way to identify that an update has ocurred so userspace knows that it needs to refresh its lists. This also includes a new command to obtain the 32-bits generation ID. The less significant 16-bits of this ID is also exposed through res_id field in the nfnetlink header to quickly detect the interference and retry when there is no risk of ID wraparound. 7) Move br_netfilter out of the bridge core. The br_netfilter code is built in the bridge core by default. This causes problems of different kind to people that don't want this: Jesper reported performance drop due to the inconditional hook registration and I remember to have read complains on netdev from people regarding the unexpected behaviour of our bridging stack when br_netfilter is enabled (fragmentation handling, layer 3 and upper inspection). People that still need this should easily undo the damage by modprobing the new br_netfilter module. 8) Dump the set policy nf_tables that allows set parameterization. So userspace can keep user-defined preferences when saving the ruleset. From Arturo Borrero. 9) Use __seq_open_private() helper function to reduce boiler plate code in x_tables, From Rob Jones. 10) Safer default behaviour in case that you forget to load the protocol tracker. Daniel Borkmann and Florian Westphal detected that if your ruleset is stateful, you allow traffic to at least one single SCTP port and the SCTP protocol tracker is not loaded, then any SCTP traffic may be pass through unfiltered. After this patch, the connection tracking classifies SCTP/DCCP/UDPlite/GRE packets as invalid if your kernel has been compiled with support for these modules. ==================== Trivially resolved conflict in include/linux/skbuff.h, Eric moved some netfilter skbuff members around, and the netfilter tree adjusted the ifdef guards for the bridging info pointer. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29vfio/iommu_type1: add new VFIO_TYPE1_NESTING_IOMMU IOMMU typeWill Deacon
VFIO allows devices to be safely handed off to userspace by putting them behind an IOMMU configured to ensure DMA and interrupt isolation. This enables userspace KVM clients, such as kvmtool and qemu, to further map the device into a virtual machine. With IOMMUs such as the ARM SMMU, it is then possible to provide SMMU translation services to the guest operating system, which are nested with the existing translation installed by VFIO. However, enabling this feature means that the IOMMU driver must be informed that the VFIO domain is being created for the purposes of nested translation. This patch adds a new IOMMU type (VFIO_TYPE1_NESTING_IOMMU) to the VFIO type-1 driver. The new IOMMU type acts identically to the VFIO_TYPE1v2_IOMMU type, but additionally sets the DOMAIN_ATTR_NESTING attribute on its IOMMU domains. Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29net: tcp: add DCTCP congestion control algorithmDaniel Borkmann
This work adds the DataCenter TCP (DCTCP) congestion control algorithm [1], which has been first published at SIGCOMM 2010 [2], resp. follow-up analysis at SIGMETRICS 2011 [3] (and also, more recently as an informational IETF draft available at [4]). DCTCP is an enhancement to the TCP congestion control algorithm for data center networks. Typical data center workloads are i.e. i) partition/aggregate (queries; bursty, delay sensitive), ii) short messages e.g. 50KB-1MB (for coordination and control state; delay sensitive), and iii) large flows e.g. 1MB-100MB (data update; throughput sensitive). DCTCP has therefore been designed for such environments to provide/achieve the following three requirements: * High burst tolerance (incast due to partition/aggregate) * Low latency (short flows, queries) * High throughput (continuous data updates, large file transfers) with commodity, shallow buffered switches The basic idea of its design consists of two fundamentals: i) on the switch side, packets are being marked when its internal queue length > threshold K (K is chosen so that a large enough headroom for marked traffic is still available in the switch queue); ii) the sender/host side maintains a moving average of the fraction of marked packets, so each RTT, F is being updated as follows: F := X / Y, where X is # of marked ACKs, Y is total # of ACKs alpha := (1 - g) * alpha + g * F, where g is a smoothing constant The resulting alpha (iow: probability that switch queue is congested) is then being used in order to adaptively decrease the congestion window W: W := (1 - (alpha / 2)) * W The means for receiving marked packets resp. marking them on switch side in DCTCP is the use of ECN. RFC3168 describes a mechanism for using Explicit Congestion Notification from the switch for early detection of congestion, rather than waiting for segment loss to occur. However, this method only detects the presence of congestion, not the *extent*. In the presence of mild congestion, it reduces the TCP congestion window too aggressively and unnecessarily affects the throughput of long flows [4]. DCTCP, as mentioned, enhances Explicit Congestion Notification (ECN) processing to estimate the fraction of bytes that encounter congestion, rather than simply detecting that some congestion has occurred. DCTCP then scales the TCP congestion window based on this estimate [4], thus it can derive multibit feedback from the information present in the single-bit sequence of marks in its control law. And thus act in *proportion* to the extent of congestion, not its *presence*. Switches therefore set the Congestion Experienced (CE) codepoint in packets when internal queue lengths exceed threshold K. Resulting, DCTCP delivers the same or better throughput than normal TCP, while using 90% less buffer space. It was found in [2] that DCTCP enables the applications to handle 10x the current background traffic, without impacting foreground traffic. Moreover, a 10x increase in foreground traffic did not cause any timeouts, and thus largely eliminates TCP incast collapse problems. The algorithm itself has already seen deployments in large production data centers since then. We did a long-term stress-test and analysis in a data center, short summary of our TCP incast tests with iperf compared to cubic: This test measured DCTCP throughput and latency and compared it with CUBIC throughput and latency for an incast scenario. In this test, 19 senders sent at maximum rate to a single receiver. The receiver simply ran iperf -s. The senders ran iperf -c <receiver> -t 30. All senders started simultaneously (using local clocks synchronized by ntp). This test was repeated multiple times. Below shows the results from a single test. Other tests are similar. (DCTCP results were extremely consistent, CUBIC results show some variance induced by the TCP timeouts that CUBIC encountered.) For this test, we report statistics on the number of TCP timeouts, flow throughput, and traffic latency. 1) Timeouts (total over all flows, and per flow summaries): CUBIC DCTCP Total 3227 25 Mean 169.842 1.316 Median 183 1 Max 207 5 Min 123 0 Stddev 28.991 1.600 Timeout data is taken by measuring the net change in netstat -s "other TCP timeouts" reported. As a result, the timeout measurements above are not restricted to the test traffic, and we believe that it is likely that all of the "DCTCP timeouts" are actually timeouts for non-test traffic. We report them nevertheless. CUBIC will also include some non-test timeouts, but they are drawfed by bona fide test traffic timeouts for CUBIC. Clearly DCTCP does an excellent job of preventing TCP timeouts. DCTCP reduces timeouts by at least two orders of magnitude and may well have eliminated them in this scenario. 2) Throughput (per flow in Mbps): CUBIC DCTCP Mean 521.684 521.895 Median 464 523 Max 776 527 Min 403 519 Stddev 105.891 2.601 Fairness 0.962 0.999 Throughput data was simply the average throughput for each flow reported by iperf. By avoiding TCP timeouts, DCTCP is able to achieve much better per-flow results. In CUBIC, many flows experience TCP timeouts which makes flow throughput unpredictable and unfair. DCTCP, on the other hand, provides very clean predictable throughput without incurring TCP timeouts. Thus, the standard deviation of CUBIC throughput is dramatically higher than the standard deviation of DCTCP throughput. Mean throughput is nearly identical because even though cubic flows suffer TCP timeouts, other flows will step in and fill the unused bandwidth. Note that this test is something of a best case scenario for incast under CUBIC: it allows other flows to fill in for flows experiencing a timeout. Under situations where the receiver is issuing requests and then waiting for all flows to complete, flows cannot fill in for timed out flows and throughput will drop dramatically. 3) Latency (in ms): CUBIC DCTCP Mean 4.0088 0.04219 Median 4.055 0.0395 Max 4.2 0.085 Min 3.32 0.028 Stddev 0.1666 0.01064 Latency for each protocol was computed by running "ping -i 0.2 <receiver>" from a single sender to the receiver during the incast test. For DCTCP, "ping -Q 0x6 -i 0.2 <receiver>" was used to ensure that traffic traversed the DCTCP queue and was not dropped when the queue size was greater than the marking threshold. The summary statistics above are over all ping metrics measured between the single sender, receiver pair. The latency results for this test show a dramatic difference between CUBIC and DCTCP. CUBIC intentionally overflows the switch buffer which incurs the maximum queue latency (more buffer memory will lead to high latency.) DCTCP, on the other hand, deliberately attempts to keep queue occupancy low. The result is a two orders of magnitude reduction of latency with DCTCP - even with a switch with relatively little RAM. Switches with larger amounts of RAM will incur increasing amounts of latency for CUBIC, but not for DCTCP. 4) Convergence and stability test: This test measured the time that DCTCP took to fairly redistribute bandwidth when a new flow commences. It also measured DCTCP's ability to remain stable at a fair bandwidth distribution. DCTCP is compared with CUBIC for this test. At the commencement of this test, a single flow is sending at maximum rate (near 10 Gbps) to a single receiver. One second after that first flow commences, a new flow from a distinct server begins sending to the same receiver as the first flow. After the second flow has sent data for 10 seconds, the second flow is terminated. The first flow sends for an additional second. Ideally, the bandwidth would be evenly shared as soon as the second flow starts, and recover as soon as it stops. The results of this test are shown below. Note that the flow bandwidth for the two flows was measured near the same time, but not simultaneously. DCTCP performs nearly perfectly within the measurement limitations of this test: bandwidth is quickly distributed fairly between the two flows, remains stable throughout the duration of the test, and recovers quickly. CUBIC, in contrast, is slow to divide the bandwidth fairly, and has trouble remaining stable. CUBIC DCTCP Seconds Flow 1 Flow 2 Seconds Flow 1 Flow 2 0 9.93 0 0 9.92 0 0.5 9.87 0 0.5 9.86 0 1 8.73 2.25 1 6.46 4.88 1.5 7.29 2.8 1.5 4.9 4.99 2 6.96 3.1 2 4.92 4.94 2.5 6.67 3.34 2.5 4.93 5 3 6.39 3.57 3 4.92 4.99 3.5 6.24 3.75 3.5 4.94 4.74 4 6 3.94 4 5.34 4.71 4.5 5.88 4.09 4.5 4.99 4.97 5 5.27 4.98 5 4.83 5.01 5.5 4.93 5.04 5.5 4.89 4.99 6 4.9 4.99 6 4.92 5.04 6.5 4.93 5.1 6.5 4.91 4.97 7 4.28 5.8 7 4.97 4.97 7.5 4.62 4.91 7.5 4.99 4.82 8 5.05 4.45 8 5.16 4.76 8.5 5.93 4.09 8.5 4.94 4.98 9 5.73 4.2 9 4.92 5.02 9.5 5.62 4.32 9.5 4.87 5.03 10 6.12 3.2 10 4.91 5.01 10.5 6.91 3.11 10.5 4.87 5.04 11 8.48 0 11 8.49 4.94 11.5 9.87 0 11.5 9.9 0 SYN/ACK ECT test: This test demonstrates the importance of ECT on SYN and SYN-ACK packets by measuring the connection probability in the presence of competing flows for a DCTCP connection attempt *without* ECT in the SYN packet. The test was repeated five times for each number of competing flows. Competing Flows 1 | 2 | 4 | 8 | 16 ------------------------------ Mean Connection Probability 1 | 0.67 | 0.45 | 0.28 | 0 Median Connection Probability 1 | 0.65 | 0.45 | 0.25 | 0 As the number of competing flows moves beyond 1, the connection probability drops rapidly. Enabling DCTCP with this patch requires the following steps: DCTCP must be running both on the sender and receiver side in your data center, i.e.: sysctl -w net.ipv4.tcp_congestion_control=dctcp Also, ECN functionality must be enabled on all switches in your data center for DCTCP to work. The default ECN marking threshold (K) heuristic on the switch for DCTCP is e.g., 20 packets (30KB) at 1Gbps, and 65 packets (~100KB) at 10Gbps (K > 1/7 * C * RTT, [4]). In above tests, for each switch port, traffic was segregated into two queues. For any packet with a DSCP of 0x01 - or equivalently a TOS of 0x04 - the packet was placed into the DCTCP queue. All other packets were placed into the default drop-tail queue. For the DCTCP queue, RED/ECN marking was enabled, here, with a marking threshold of 75 KB. More details however, we refer you to the paper [2] under section 3). There are no code changes required to applications running in user space. DCTCP has been implemented in full *isolation* of the rest of the TCP code as its own congestion control module, so that it can run without a need to expose code to the core of the TCP stack, and thus nothing changes for non-DCTCP users. Changes in the CA framework code are minimal, and DCTCP algorithm operates on mechanisms that are already available in most Silicon. The gain (dctcp_shift_g) is currently a fixed constant (1/16) from the paper, but we leave the option that it can be chosen carefully to a different value by the user. In case DCTCP is being used and ECN support on peer site is off, DCTCP falls back after 3WHS to operate in normal TCP Reno mode. ss {-4,-6} -t -i diag interface: ... dctcp wscale:7,7 rto:203 rtt:2.349/0.026 mss:1448 cwnd:2054 ssthresh:1102 ce_state 0 alpha 15 ab_ecn 0 ab_tot 735584 send 10129.2Mbps pacing_rate 20254.1Mbps unacked:1822 retrans:0/15 reordering:101 rcv_space:29200 ... dctcp-reno wscale:7,7 rto:201 rtt:0.711/1.327 ato:40 mss:1448 cwnd:10 ssthresh:1102 fallback_mode send 162.9Mbps pacing_rate 325.5Mbps rcv_rtt:1.5 rcv_space:29200 More information about DCTCP can be found in [1-4]. [1] http://simula.stanford.edu/~alizade/Site/DCTCP.html [2] http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf [3] http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf [4] http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00 Joint work with Florian Westphal and Glenn Judd. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-09-25 1) Remove useless hash_resize_mutex in xfrm_hash_resize(). This mutex is used only there, but xfrm_hash_resize() can't be called concurrently at all. From Ying Xue. 2) Extend policy hashing to prefixed policies based on prefix lenght thresholds. From Christophe Gouault. 3) Make the policy hash table thresholds configurable via netlink. From Christophe Gouault. 4) Remove the maximum authentication length for AH. This was needed to limit stack usage. We switched already to allocate space, so no need to keep the limit. From Herbert Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26Merge tag 'master-2014-09-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-09-22 Please pull this batch of updates intended for the 3.18 stream... For the mac80211 bits, Johannes says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." For the bluetooth bits, Johan says: "Here are some more patches intended for 3.18. Most of them are cleanups or fixes for SMP. The only exception is a fix for BR/EDR L2CAP fixed channels which should now work better together with the L2CAP information request procedure." For the iwlwifi bits, Emmanuel says: "I fix here dvm which was broken by my last pull request. Arik continues to work on TDLS and Luca solved a few issues in CT-Kill. Eyal keeps digging into rate scaling code, more to come soon. Besides this, nothing really special here." Beyond that, there are the usual big batches of updates to ath9k, b43, mwifiex, and wil6210 as well as a handful of other bits here and there. Also, rtlwifi gets some btcoexist attention from Larry. Please let me know if there are problems! ==================== Had to adjust the wil6210 code to comply with Joe Perches's recent change in net-next to make the netdev_*() routines return void instead of 'int'. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26bpf: verifier (add ability to receive verification log)Alexei Starovoitov
add optional attributes for BPF_PROG_LOAD syscall: union bpf_attr { struct { ... __u32 log_level; /* verbosity level of eBPF verifier */ __u32 log_size; /* size of user buffer */ __aligned_u64 log_buf; /* user supplied 'char *buffer' */ }; }; when log_level > 0 the verifier will return its verification log in the user supplied buffer 'log_buf' which can be used by program author to analyze why verifier rejected given program. 'Understanding eBPF verifier messages' section of Documentation/networking/filter.txt provides several examples of these messages, like the program: BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0), BPF_MOV64_REG(BPF_REG_2, BPF_REG_10), BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8), BPF_LD_MAP_FD(BPF_REG_1, 0), BPF_CALL_FUNC(BPF_FUNC_map_lookup_elem), BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1), BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0), BPF_EXIT_INSN(), will be rejected with the following multi-line message in log_buf: 0: (7a) *(u64 *)(r10 -8) = 0 1: (bf) r2 = r10 2: (07) r2 += -8 3: (b7) r1 = 0 4: (85) call 1 5: (15) if r0 == 0x0 goto pc+1 R0=map_ptr R10=fp 6: (7a) *(u64 *)(r0 +4) = 0 misaligned access off 4 size 8 The format of the output can change at any time as verifier evolves. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26bpf: expand BPF syscall with program load/unloadAlexei Starovoitov
eBPF programs are similar to kernel modules. They are loaded by the user process and automatically unloaded when process exits. Each eBPF program is a safe run-to-completion set of instructions. eBPF verifier statically determines that the program terminates and is safe to execute. The following syscall wrapper can be used to load the program: int bpf_prog_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, int insn_cnt, const char *license) { union bpf_attr attr = { .prog_type = prog_type, .insns = ptr_to_u64(insns), .insn_cnt = insn_cnt, .license = ptr_to_u64(license), }; return bpf(BPF_PROG_LOAD, &attr, sizeof(attr)); } where 'insns' is an array of eBPF instructions and 'license' is a string that must be GPL compatible to call helper functions marked gpl_only Upon succesful load the syscall returns prog_fd. Use close(prog_fd) to unload the program. User space tests and examples follow in the later patches Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26bpf: add lookup/update/delete/iterate methods to BPF mapsAlexei Starovoitov
'maps' is a generic storage of different types for sharing data between kernel and userspace. The maps are accessed from user space via BPF syscall, which has commands: - create a map with given type and attributes fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size) returns fd or negative error - lookup key in a given map referenced by fd err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size) using attr->map_fd, attr->key, attr->value returns zero and stores found elem into value or negative error - create or update key/value pair in a given map err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size) using attr->map_fd, attr->key, attr->value returns zero or negative error - find and delete element by key in a given map err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size) using attr->map_fd, attr->key - iterate map elements (based on input key return next_key) err = bpf(BPF_MAP_GET_NEXT_KEY, union bpf_attr *attr, u32 size) using attr->map_fd, attr->key, attr->next_key - close(fd) deletes the map Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26bpf: enable bpf syscall on x64 and i386Alexei Starovoitov
done as separate commit to ease conflict resolution Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26bpf: introduce BPF syscall and mapsAlexei Starovoitov
BPF syscall is a multiplexor for a range of different operations on eBPF. This patch introduces syscall with single command to create a map. Next patch adds commands to access maps. 'maps' is a generic storage of different types for sharing data between kernel and userspace. Userspace example: /* this syscall wrapper creates a map with given type and attributes * and returns map_fd on success. * use close(map_fd) to delete the map */ int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size, int max_entries) { union bpf_attr attr = { .map_type = map_type, .key_size = key_size, .value_size = value_size, .max_entries = max_entries }; return bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); } 'union bpf_attr' is backwards compatible with future extensions. More details in Documentation/networking/filter.txt and in manpage Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26next: openrisc: Fix buildGuenter Roeck
openrisc:defconfig fails to build in next-20140926 with the following error. In file included from arch/openrisc/kernel/signal.c:31:0: ./arch/openrisc/include/asm/syscall.h: In function 'syscall_get_arch': ./arch/openrisc/include/asm/syscall.h:77:9: error: 'EM_OPENRISC' undeclared Fix by moving EM_OPENRISC to include/uapi/linux/elf-em.h. Fixes: ce5d112827e5 ("ARCH: AUDIT: implement syscall_get_arch for all arches") Cc: Eric Paris <eparis@redhat.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-09-26[media] v4l2-dv-timings: fix a sparse warningMauro Carvalho Chehab
This is detected with: gcc-4.8.3-7.fc20.x86_64 sparse-0.5.0-3.fc20.x86_64 drivers/media/v4l2-core/v4l2-dv-timings.c:34:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:35:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:36:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:37:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:38:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:39:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:40:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:41:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:42:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:43:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:44:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:45:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:46:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:47:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:48:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:49:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:50:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:51:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:52:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:53:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:54:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:55:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:56:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:57:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:58:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:59:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:60:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:61:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:62:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:63:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:64:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:65:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:66:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:67:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:68:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:69:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:70:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:71:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:72:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:73:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:74:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:75:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:76:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:77:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:78:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:79:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:80:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:81:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:82:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:83:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:84:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:85:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:86:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:87:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:88:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:89:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:90:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:91:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:92:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:93:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:94:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:95:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:96:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:97:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:98:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:99:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:100:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:101:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:102:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:103:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:104:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:105:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:106:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:107:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:108:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:109:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:110:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:111:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:112:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:113:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:114:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:115:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:116:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:117:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:118:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:119:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:120:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:121:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:122:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:123:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:124:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:125:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:126:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:127:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:128:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:129:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:130:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:131:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:132:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:133:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:134:9: error: unknown field name in initializer drivers/media/v4l2-core/v4l2-dv-timings.c:135:9: error: too many errors drivers/media/usb/hdpvr/hdpvr-video.c:42:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:43:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:44:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:45:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:46:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:47:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:48:9: error: unknown field name in initializer drivers/media/usb/hdpvr/hdpvr-video.c:49:9: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:484:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:485:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:486:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:487:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:488:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:489:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:490:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:491:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:492:18: error: unknown field name in initializer drivers/media/platform/s5p-tv/hdmi_drv.c:493:18: error: unknown field name in initializer Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-09-25PCI/AER: Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UNDChen, Gong
In PCIe r1.0, sec 5.10.2, bit 0 of the Uncorrectable Error Status, Mask, and Severity Registers was for "Training Error." In PCIe r1.1, sec 7.10.2, bit 0 was redefined to be "Undefined." Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UND to reflect this change. No functional change. [bhelgaas: changelog] Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-23Drivers: hv: util: Properly pack the data for file copy functionalityK. Y. Srinivasan
Properly pack the data for file copy functionality. Patch based on investigation done by Matej Muzila <mmuzila@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reported-by: <qge@redhat.com> Cc: <stable@vger.kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23GenWQE: Update author informationFrank Haverkamp
Updated email address of co-author. Signed-off-by: Frank Haverkamp <haver@linux.vnet.ibm.com> Signed-off-by: Michael Jung <mijung@gmx.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23ARCH: AUDIT: implement syscall_get_arch for all archesEric Paris
For all arches which support audit implement syscall_get_arch() They are all pretty easy and straight forward, stolen from how the call to audit_syscall_entry() determines the arch. Based-on-patch-by: Richard Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: linux-ia64@vger.kernel.org Cc: microblaze-uclinux@itee.uq.edu.au Cc: linux-mips@linux-mips.org Cc: linux@lists.openrisc.net Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org
2014-09-23audit: drop unused struct audit_rule definitionEric Paris
The kernel only uses struct audit_rule_data. We dropped support for struct audit_rule a long time ago. Drop the definition in the header file. Signed-off-by: Eric Paris <eparis@redhat.com>
2014-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-21[media] v4l: Add ARGB555X and XRGB555X pixel formatsLaurent Pinchart
The existing RGB555X pixel format is ill-defined in respect to its alpha bit and its meaning is driver dependent. Create new standard ARGB555X and XRGB555X variants with clearly defined meanings and make the existing variant deprecated. The new pixel formats 4CC values have been selected to match the DRM 4CCs for the same in-memory formats. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-09-21[media] v4l: Add camera pan/tilt speed controlsVincent Palatin
The V4L2_CID_PAN_SPEED and V4L2_CID_TILT_SPEED controls allow to move the camera by setting its rotation speed around its axis. Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Reviewed-by: Pawel Osciak <posciak@chromium.org> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2014-09-21Merge remote-tracking branch 'linus/master' into patchworkMauro Carvalho Chehab
There are some patches that depends on media-v3.16-rc6. So, merge back from upstream before applying them. * linus/master: (1123 commits) drm/nouveau: ltc/gf100-: fix cbc issues on certain boards drm/bochs: add missing drm_connector_register call drm/cirrus: add missing drm_connector_register call staging: vt6655: buffer overflow in ioctl USB: storage: Add quirks for Entrega/Xircom USB to SCSI converters USB: storage: Add quirk for Ariston Technologies iConnect USB to SCSI adapter USB: storage: Add quirk for Adaptec USBConnect 2000 USB-to-SCSI Adapter USB: EHCI: unlink QHs even after the controller has stopped [SCSI] fix for bidi use after free [SCSI] fix regression that accidentally disabled block-based tcq [SCSI] libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu drm/radeon: Fix typo 'addr' -> 'entry' in rs400_gart_set_page drm/nouveau/runpm: fix module unload drm/radeon/px: fix module unload vgaswitcheroo: add vga_switcheroo_fini_domain_pm_ops drm/radeon: don't reset dma on r6xx-evergreen init drm/radeon: don't reset sdma on CIK init drm/radeon: don't reset dma on NI/SI init drm/radeon/dpm: fix resume on mullins drm/radeon: Disable HDP flush before every CS again for < r600 ...
2014-09-19gre: Setup and TX path for gre/UDP foo-over-udp encapsulationTom Herbert
Added netlink attrs to configure FOU encapsulation for GRE, netlink handling of these flags, and properly adjust MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19net: Changes to ip_tunnel to support foo-over-udp encapsulationTom Herbert
This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19fou: Support for foo-over-udp RX pathTom Herbert
This patch provides a receive path for foo-over-udp. This allows direct encapsulation of IP protocols over UDP. The bound destination port is used to map to an IP protocol, and the XFRM framework (udp_encap_rcv) is used to receive encapsulated packets. Upon reception, the encapsulation header is logically removed (pointer to transport header is advanced) and the packet is reinjected into the receive path with the IP protocol indicated by the mapping. Netlink is used to configure FOU ports. The configuration information includes the port number to bind to and the IP protocol corresponding to that port. This should support GRE/UDP (http://tools.ietf.org/html/draft-yong-tsvwg-gre-in-udp-encap-02), as will as the other IP tunneling protocols (IPIP, SIT). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-20drm/exynos: use drm generic mmap interfaceInki Dae
This patch removes DRM_EXYNOS_GEM_MMAP ictrl feature specific to Exynos drm and instead uses drm generic mmap. We had used the interface specific to Exynos drm to do mmap directly, not to use demand paging which maps each page with physical memory at page fault handler. We don't need the specific mmap interface because the drm generic mmap which uses vm offset manager stuff can also do mmap directly. This patch makes a userspace region to be mapped with whole physical memory region allocated by userspace request when mmap system call is requested. Changelog v2: - do not set VM_IO, VM_DONTEXPEND and VM_DONTDUMP. These flags were already set by drm_gem_mmap - do not include <linux/anon_inodes.h>, which isn't needed anymore. Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-09-20drm/exynos: remove DRM_EXYNOS_GEM_MAP_OFFSET ioctlInki Dae
This interface and relevant codes aren't used anymore. Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-09-19netfilter: nf_tables: export rule-set generation IDPablo Neira Ayuso
This patch exposes the ruleset generation ID in three ways: 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset generation ID. This ID is incremented in every commit and it should be large enough to avoid wraparound problems. 2) The less significant 16-bits of the generation ID are exposed through the nfgenmsg->res_id header field. This allows us to quickly catch if the ruleset has change between two consecutive list dumps from different object lists (in this specific case I think the risk of wraparound is unlikely). 3) Userspace subscribers may receive notifications of new rule-set generation after every commit. This also provides an alternative way to monitor the generation ID. If the events are lost, the userspace process hits a overrun error, so it knows that it is working with a stale ruleset anyway. Patrick spotted that rule-set transformations in userspace may take quite some time. In that case, it annotates the 32-bits generation ID before fetching the rule-set, then: 1) it compares it to what we obtain after the transformation to make sure it is not working with a stale rule-set and no wraparound has ocurred. 2) it subscribes to ruleset notifications, so it can watch for new generation ID. This is complementary to the NLM_F_DUMP_INTR approach, which allows us to detect an interference in the middle one single list dumping. There is no way to explicitly check that an interference has occurred between two list dumps from the kernel, since it doesn't know how many lists the userspace client is actually going to dump. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-18Merge branch 'ipvs-next'Pablo Neira Ayuso
Simon Horman says: ==================== This pull requests makes the following changes: * Add simple weighted fail-over scheduler. - Unlike other IPVS schedulers this offers fail-over rather than load balancing. Connections are directed to the appropriate server based solely on highest weight value and server availability. - Thanks to Kenny Mathis * Support IPv6 real servers in IPv4 virtual-services and vice versa - This feature is supported in conjunction with the tunnel (IPIP) forwarding mechanism. That is, IPv4 may be forwarded in IPv6 and vice versa. - The motivation for this is to allow more flexibility in the choice of IP version offered by both virtual-servers and real-servers as they no longer need to match: An IPv4 connection from an end-user may be forwarded to a real-server using IPv6 and vice versa. - Further work need to be done to support this feature in conjunction with connection synchronisation. For now such configurations are not allowed. - This change includes update to netlink protocol, adding a new destination address family attribute. And the necessary changes to plumb this information throughout IPVS. - Thanks to Alex Gartrell and Julian Anastasov ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-17KVM: device: add simple registration mechanism for kvm_device_opsWill Deacon
kvm_ioctl_create_device currently has knowledge of all the device types and their associated ops. This is fairly inflexible when adding support for new in-kernel device emulations, so move what we currently have out into a table, which can support dynamic registration of ops by new drivers for virtual hardware. Cc: Alex Williamson <Alex.Williamson@redhat.com> Cc: Alex Graf <agraf@suse.de> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-16usb: gadget: f_fs: virtual endpoint address mappingRobert Baldyga
This patch introduces virtual endpoint address mapping. It separates function logic form physical endpoint addresses making it more hardware independent. Following modifications changes user space API, so to enable them user have to switch on the FUNCTIONFS_VIRTUAL_ADDR flag in descriptors. Endpoints are now refered using virtual endpoint addresses chosen by user in endpoint descpriptors. This applies to each context when endpoint address can be used: - when accessing endpoint files in FunctionFS filesystemi (in file name), - in setup requests directed to specific endpoint (in wIndex field), - in descriptors returned by FUNCTIONFS_ENDPOINT_DESC ioctl. In endpoint file names the endpoint address number is formatted as double-digit hexadecimal value ("ep%02x") which has few advantages - it is easy to parse, allows to easly recognize endpoint direction basing on its name (IN endpoint number starts with digit 8, and OUT with 0) which can be useful for debugging purpose, and it makes easier to introduce further features allowing to use each endpoint number in both directions to have more endpoints available for function if hardware supports this (for example we could have ep01 which is endpoint 1 with OUT direction, and ep81 which is endpoint 1 with IN direction). Physical endpoint address can be still obtained using ioctl named FUNCTIONFS_ENDPOINT_REVMAP, but now it's not neccesary to handle USB transactions properly. Signed-off-by: Robert Baldyga <r.baldyga@samsung.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2014-09-16Merge tag 'v3.17-rc5' into nextFelipe Balbi
Linux 3.17-rc5 Signed-off-by: Felipe Balbi <balbi@ti.com> Conflicts: Documentation/devicetree/bindings/usb/mxs-phy.txt drivers/usb/phy/phy-mxs-usb.c
2014-09-15openvswitch: Add recirc and hash action.Andy Zhou
Recirc action allows a packet to reenter openvswitch processing. currently openvswitch lookup flow for packet received and execute set of actions on that packet, with help of recirc action we can process/modify the packet and recirculate it back in openvswitch for another pass. OVS hash action calculates 5-tupple hash and set hash in flow-key hash. This can be used along with recirculation for distributing packets among different ports for bond devices. For example: OVS bonding can use following actions: Match on: bond flow; Action: hash, recirc(id) Match on: recirc-id == id and hash lower bits == a; Action: output port_bond_a Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-16drm: backmerge tag 'v3.17-rc5' into drm-nextDave Airlie
This is requested to get the fixes for intel and radeon into the same tree for future development work. i915_display.c: fix missing dev_priv conflict.
2014-09-16ipvs: Add destination address family to netlink interfaceAlex Gartrell
This is necessary to support heterogeneous pools. For example, if you have an ipv6 addressed network, you'll want to be able to forward ipv4 traffic into it. This patch enforces that destination address family is the same as service family, as none of the forwarding mechanisms support anything else. For the old setsockopt mechanism, we simply set the dest address family to AF_INET as we do with the service. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-09-15netfilter: ipset: Add skbinfo extension support to SET target.Anton Danilov
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-09-15netfilter: ipset: Add skbinfo extension kernel support in the ipset core.Anton Danilov
Skbinfo extension provides mapping of metainformation with lookup in the ipset tables. This patch defines the flags, the constants, the functions and the structures for the data type independent support of the extension. Note the firewall mark stores in the kernel structures as two 32bit values, but transfered through netlink as one 64bit value. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-09-15Merge tag 'mac80211-next-for-john-2014-09-12' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." Conflicts: net/mac80211/iface.c Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-09-14Merge 3.17-rc5 into tty-nextGreg Kroah-Hartman
We want those fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-12Merge tag 'usb-3.17-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB and PHY fixes for 3.17-rc5. Nothing major here, just a number of tiny fixes for reported issues, and some new device ids as well. All have been tested in linux-next" * tag 'usb-3.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (46 commits) xhci: fix oops when xhci resumes from hibernate with hw lpm capable devices usb: xhci: Fix OOPS in xhci error handling code xhci: Fix null pointer dereference if xhci initialization fails storage: Add single-LUN quirk for Jaz USB Adapter uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check usb: chipidea: msm: Initialize PHY on reset event usb: chipidea: msm: Use USB PHY API to control PHY state usb: hub: take hub->hdev reference when processing from eventlist uas: Disable uas on ASM1051 devices usb: dwc2/gadget: avoid disabling ep0 usb: dwc2/gadget: delay enabling irq once hardware is configured properly usb: dwc2/gadget: do not call disconnect method in pullup usb: dwc2/gadget: break infinite loop in endpoint disable code usb: dwc2/gadget: fix phy initialization sequence usb: dwc2/gadget: fix phy disable sequence uwb: init beacon cache entry before registering uwb device USB: ftdi_sio: Add support for GE Healthcare Nemo Tracker device USB: document the 'u' flag for usb-storage quirks parameter usb: host: xhci: fix compliance mode workaround usb: dwc3: fix TRB completion when multiple TRBs are started ...
2014-09-12drm/vmwgfx: Fix drm.h includeJosh Boyer
The userspace drm.h include doesn't prefix the drm directory. This can lead to compile failures as /usr/include/drm/ isn't in the standard gcc include paths. Fix it to be <drm/drm.h>, which matches the rest of the driver drm header files that get installed into /usr/include/drm. Red Hat Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1138759 Fixes: 1d7a5cbf8f74e Reported-by: Jeffrey Bastian <jbastian@redhat.com> Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "An update to Synaptics PS/2 driver to handle "ForcePads" (currently found in HP EliteBook 1040 laptops), a change for Elan PS/2 driver to detect newer touchpads, bunch of devices get annotated as Trackpoint and/or Pointer to help userspace classify and handle them, plus assorted driver fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: serport - add compat handling for SPIOCSTYPE ioctl Input: atmel_mxt_ts - fix double free of input device Input: synaptics - add support for ForcePads Input: matrix_keypad - use request_any_context_irq() Input: atmel_mxt_ts - downgrade warning about empty interrupts Input: wm971x - fix typo in module parameter description Input: cap1106 - fix register definition Input: add missing POINTER / DIRECT properties to a bunch of drivers Input: add INPUT_PROP_POINTING_STICK property Input: elantech - fix detection of touchpad on ASUS s301l
2014-09-11netfilter: nf_tables: add NFTA_MASQ_UNSPEC to nft_masq_attributesPablo Neira Ayuso
To keep this consistent with other nft_*_attributes. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-11Merge branch 'for-linus' into for-nextTakashi Iwai
Merging for-linus branch for syncing the latest STAC/IDT codec changes to be affected by the upcoming hda-jack rewrites.
2014-09-11cfg80211: allow requesting SMPS mode on ap startEliad Peller
Add feature bits to indicate device support for static-smps and dynamic-smps modes. Add a new NL80211_ATTR_SMPS_MODE attribue to allow configuring the smps mode to be used by the ap (e.g. configuring to ap to dynamic smps mode will reduce power consumption while having minor effect on throughput) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>