aboutsummaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)Author
2013-04-05NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_listTrond Myklebust
It is unsafe to use list_for_each_entry_safe() here, because when we drop the nn->nfs_client_lock, we pin the _current_ list entry and ensure that it stays in the list, but we don't do the same for the _next_ list entry. Use of list_for_each_entry() is therefore the correct thing to do. Also fix the refcounting in nfs41_walk_client_list(). Finally, ensure that the nfs_client has finished being initialised and, in the case of NFSv4.1, that the session is set up. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org [>= 3.7]
2013-04-05NFSv4: Fix a memory leak in nfs4_discover_server_trunkingTrond Myklebust
When we assign a new rpc_client to clp->cl_rpcclient, we need to destroy the old one. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org [>=3.7]
2013-04-05NFSv4: Don't clear the machine cred when client establish returns EACCESTrond Myklebust
The expected behaviour is that the client will decide at mount time whether or not to use a krb5i machine cred, or AUTH_NULL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com>
2013-04-05NFSv4: Fix issues in nfs4_discover_server_trunkingTrond Myklebust
- Ensure that we exit with ENOENT if the call to ops->get_clid_cred() fails. - Handle the case where ops->detect_trunking() exits with an unexpected error, and return EIO. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-04NFSv4: Fix the fallback to AUTH_NULL if krb5i is not availableTrond Myklebust
If the rpcsec_gss_krb5 module cannot be loaded, the attempt to create an rpc_client in nfs4_init_client will currently fail with an EINVAL. Fix is to retry with AUTH_NULL. Regression introduced by the commit "NFS: Use "krb5i" to establish NFSv4 state whenever possible" Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com>
2013-04-04NFS: Use server-recommended security flavor by default (NFSv3)Chuck Lever
Since commit ec88f28d in 2009, checking if the user-specified flavor is in the server's flavor list has been the source of a few noticeable regressions (now fixed), but there is one that is still vexing. An NFS server can list AUTH_NULL in its flavor list, which suggests a client should try to mount the server with the flavor of the client's choice, but the server will squash all accesses. In some cases, our client fails to mount a server because of this check, when the mount could have proceeded successfully. Skip this check if the user has specified "sec=" on the mount command line. But do consult the server-provided flavor list to choose a security flavor if no sec= option is specified on the mount command. If a server lists Kerberos pseudoflavors before "sys" in its export options, our client now chooses Kerberos over AUTH_UNIX for mount points, when no security flavor is specified by the mount command. This could be surprising to some administrators or users, who would then need to have Kerberos credentials to access the export. Or, a client administrator may not have enabled rpc.gssd. In this case, auth_rpcgss.ko might still be loadable, which is enough for the new logic to choose Kerberos over AUTH_UNIX. But the mount would fail since no GSS context can be created without rpc.gssd running. To retain the use of AUTH_UNIX by default: o The server administrator can ensure that "sys" is listed before Kerberos flavors in its export security options (see exports(5)), o The client administrator can explicitly specify "sec=sys" on its mount command line (see nfs(5)), o The client administrator can use "Sec=sys" in an appropriate section of /etc/nfsmount.conf (see nfsmount.conf(5)), or o The client administrator can blacklist auth_rpcgss.ko. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-02selinux: make security_sb_clone_mnt_opts return an error on context mismatchJeff Layton
I had the following problem reported a while back. If you mount the same filesystem twice using NFSv4 with different contexts, then the second context= option is ignored. For instance: # mount server:/export /mnt/test1 # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 # ls -dZ /mnt/test1 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1 # ls -dZ /mnt/test2 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2 When we call into SELinux to set the context of a "cloned" superblock, it will currently just bail out when it notices that we're reusing an existing superblock. Since the existing superblock is already set up and presumably in use, we can't go overwriting its context with the one from the "original" sb. Because of this, the second context= option in this case cannot take effect. This patch fixes this by turning security_sb_clone_mnt_opts into an int return operation. When it finds that the "new" superblock that it has been handed is already set up, it checks to see whether the contexts on the old superblock match it. If it does, then it will just return success, otherwise it'll return -EBUSY and emit a printk to tell the admin why the second mount failed. Note that this patch may cause casualties. The NFSv4 code relies on being able to walk down to an export from the pseudoroot. If you mount filesystems that are nested within one another with different contexts, then this patch will make those mounts fail in new and "exciting" ways. For instance, suppose that /export is a separate filesystem on the server: # mount server:/ /mnt/test1 # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 mount.nfs: an incorrect mount option was specified ...with the printk in the ring buffer. Because we *might* eventually walk down to /mnt/test1/export, the mount is denied due to this patch. The second mount needs the pseudoroot superblock, but that's already present with the wrong context. OTOH, if we mount these in the reverse order, then both mounts work, because the pseudoroot superblock created when mounting /export is discarded once that mount is done. If we then however try to walk into that directory, the automount fails for the similar reasons: # cd /mnt/test1/scratch/ -bash: cd: /mnt/test1/scratch: Device or resource busy The story I've gotten from the SELinux folks that I've talked to is that this is desirable behavior. In SELinux-land, mounting the same data under different contexts is wrong -- there can be only one. Cc: Steve Dickson <steved@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-03-29NFS: Use "krb5i" to establish NFSv4 state whenever possibleChuck Lever
Currently our client uses AUTH_UNIX for state management on Kerberos NFS mounts in some cases. For example, if the first mount of a server specifies "sec=sys," the SETCLIENTID operation is performed with AUTH_UNIX. Subsequent mounts using stronger security flavors can not change the flavor used for lease establishment. This might be less security than an administrator was expecting. Dave Noveck's migration issues draft recommends the use of an integrity-protecting security flavor for the SETCLIENTID operation. Let's ignore the mount's sec= setting and use krb5i as the default security flavor for SETCLIENTID. If our client can't establish a GSS context (eg. because it doesn't have a keytab or the server doesn't support Kerberos) we fall back to using AUTH_NULL. For an operation that requires a machine credential (which never represents a particular user) AUTH_NULL is as secure as AUTH_UNIX. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSECChuck Lever
Most NFSv4 servers implement AUTH_UNIX, and administrators will prefer this over AUTH_NULL. It is harmless for our client to try this flavor in addition to the flavors mandated by RFC 3530/5661. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29NFS: Use static list of security flavors during root FH lookup recoveryChuck Lever
If the Linux NFS client receives an NFS4ERR_WRONGSEC error while trying to look up an NFS server's root file handle, it retries the lookup operation with various security flavors to see what flavor the NFS server will accept for pseudo-fs access. The list of flavors the client uses during retry consists only of flavors that are currently registered in the kernel RPC client. This list may not include any GSS pseudoflavors if auth_rpcgss.ko has not yet been loaded. Let's instead use a static list of security flavors that the NFS standard requires the server to implement (RFC 3530bis, section 3.2.1). The RPC client should now be able to load support for these dynamically; if not, they are skipped. Recovery behavior here is prescribed by RFC 3530bis, section 15.33.5: > For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to > use the SECINFO operation since SECINFO requires a current > filehandle and none exist for these two [sic] operations. Therefore, > the client must iterate through the security triples available at > the client and reattempt the PUTROOTFH or PUTPUBFH operation. In > the unfortunate event none of the MANDATORY security triples are > supported by the client and server, the client SHOULD try using > others that support integrity. Failing that, the client can try > using AUTH_NONE, but because such forms lack integrity checks, > this puts the client at risk. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29NFS: Avoid PUTROOTFH when managing leasesChuck Lever
Currently, the compound operation the Linux NFS client sends to the server to confirm a client ID looks like this: { SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) } Once the lease is confirmed, it makes sense to know how long before the client will have to renew it. And, performing these operations in the same compound saves a round trip. Unfortunately, this arrangement assumes that the security flavor used for establishing a client ID can also be used to access the server's pseudo-fs. If the server requires a different security flavor to access its pseudo-fs than it allowed for the client's SETCLIENTID operation, the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC. Even though the SETCLIENTID_CONFIRM succeeded, our client's trunking detection logic interprets the failure of the compound as a failure by the server to confirm the client ID. As part of server trunking detection, the client then begins another SETCLIENTID pass with the same nfs4_client_id. This fails with NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM already succeeded in confirming that client ID -- it was the PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to fail. To address this issue, separate the "establish client ID" step from the "accessing the server's pseudo-fs root" step. The first access of the server's pseudo-fs may require retrying the PUTROOTFH operation with different security flavors. This access is done in nfs4_proc_get_rootfh(). That leaves the matter of how to retrieve the server's lease time. nfs4_proc_fsinfo() already retrieves the lease time value, though none of its callers do anything with the retrieved value (nor do they mark the lease as "renewed"). Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time() using the lease management security flavor. This may cause some heartburn if that security flavor isn't the same as the security flavor the server requires for accessing the pseudo-fs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29NFS: Clean up nfs4_proc_get_rootfhChuck Lever
The long lines with no vertical white space make this function difficult for humans to read. Add a proper documenting comment while we're here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29NFS: Handle missing rpc.gssd when looking up root FHChuck Lever
When rpc.gssd is not running, any NFS operation that needs to use a GSS security flavor of course does not work. If looking up a server's root file handle results in an NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of security flavors until one works or all reasonable flavors have been tried. When rpc.gssd isn't running, this loop seems to fail immediately after rpcauth_create() craps out on the first GSS flavor. When the rpcauth_create() call in nfs4_lookup_root_sec() fails because rpc.gssd is not available, nfs4_lookup_root_sec() unconditionally returns -EIO. This prevents nfs4_find_root_sec() from retrying any other flavors; it drops out of its loop and fails immediately. Having nfs4_lookup_root_sec() return -EACCES instead allows nfs4_find_root_sec() to try all flavors in its list. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29SUNRPC: Introduce rpcauth_get_pseudoflavor()Chuck Lever
A SECINFO reply may contain flavors whose kernel module is not yet loaded by the client's kernel. A new RPC client API, called rpcauth_get_pseudoflavor(), is introduced to do proper checking for support of a security flavor. When this API is invoked, the RPC client now tries to load the module for each flavor first before performing the "is this supported?" check. This means if a module is available on the client, but has not been loaded yet, it will be loaded and registered automatically when the SECINFO reply is processed. The new API can take a full GSS tuple (OID, QoP, and service). Previously only the OID and service were considered. nfs_find_best_sec() is updated to verify all flavors requested in a SECINFO reply, including AUTH_NULL and AUTH_UNIX. Previously these two flavors were simply assumed to be supported without consulting the RPC client. Note that the replaced version of nfs_find_best_sec() can return RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an unsupported "service" value. nfs_find_best_sec() now returns RPC_AUTH_UNIX in this case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29SUNRPC: Define rpcsec_gss_info structureChuck Lever
The NFSv4 SECINFO procedure returns a list of security flavors. Any GSS flavor also has a GSS tuple containing an OID, a quality-of- protection value, and a service value, which specifies a particular GSS pseudoflavor. For simplicity and efficiency, I'd like to return each GSS tuple from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC client. Define a data structure that is visible to both the NFS client and the RPC client. Take structure and field names from the relevant standards to avoid confusion. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-28NFSv4: Fix Oopses in the fs_locations codeTrond Myklebust
If the server sends us a pathname with more components than the client limit of NFS4_PATHNAME_MAXCOMPONENTS, more server entries than the client limit of NFS4_FS_LOCATION_MAXSERVERS, or sends a total number of fs_locations entries than the client limit of NFS4_FS_LOCATIONS_MAXENTRIES then we will currently Oops because the limit checks are done _after_ we've decoded the data into the arrays. Reported-by: fanchaoting<fanchaoting@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-28NFSv4: Fix another reboot recovery raceTrond Myklebust
If the open_context for the file is not yet fully initialised, then open recovery cannot succeed, and since nfs4_state_find_open_context returns an ENOENT, we end up treating the file as being irrecoverable. What we really want to do, is just defer the recovery until later. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-27NFSv4: Add a mapping for NFS4ERR_FILE_OPEN in nfs4_map_errorsTrond Myklebust
With unlink is an asynchronous operation in the sillyrename case, it expects nfs4_async_handle_error() to map the error correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-26Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: - Fix an NFSv4 idmapper regression - Fix an Oops in the pNFS blocks client - Fix up various issues with pNFS layoutcommit - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked * tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked NFSv4.1: Add a helper pnfs_commit_and_return_layout NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn NFSv4.1: Fix a race in pNFS layoutcommit pnfs-block: removing DM device maybe cause oops when call dev_remove NFSv4: Fix the string length returned by the idmapper
2013-03-25NFSv4.1: Use CLAIM_DELEG_CUR_FH opens when availableTrond Myklebust
Now that we do CLAIM_FH opens, we may run into situations where we get a delegation but don't have perfect knowledge of the file path. When returning the delegation, we might therefore not be able to us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN stateids and locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4.1: Enable open-by-filehandleTrond Myklebust
Sometimes, we actually _want_ to do open-by-filehandle, for instance when recovering opens after a network partition, or when called from nfs4_file_open. Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1, and which is only enabled for NFSv4.1 servers that support it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4.1: Add xdr support for CLAIM_FH and CLAIM_DELEG_CUR_FH opensTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4: Clean up nfs4_opendata_alloc in preparation for NFSv4.1 open modesTrond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4.1: Select the "most recent locking state" for read/write/setattr stateidsTrond Myklebust
Follow the practice described in section 8.2.2 of RFC5661: When sending a read/write or setattr stateid, set the seqid field to zero in order to signal that the NFS server should apply the most recent locking state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4: Prepare for minorversion-specific nfs_server capabilitiesTrond Myklebust
Clean up the setting of the nfs_server->caps, by shoving it all into nfs4_server_common_setup(). Then add an 'initial capabilities' field into struct nfs4_minor_version_ops. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4: Resend the READ/WRITE RPC call if a stateid change causes an errorTrond Myklebust
Adds logic to ensure that if the server returns a BAD_STATEID, or other state related error, then we check if the stateid has already changed. If it has, then rather than start state recovery, we should just resend the failed RPC call with the new stateid. Allow nfs4_select_rw_stateid to notify that the stateid is unstable by having it return -EWOULDBLOCK if an RPC is underway that might change the stateid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4: The stateid must remain the same for replayed RPC callsTrond Myklebust
If we replay a READ or WRITE call, we should not be changing the stateid. Currently, we may end up doing so, because the stateid is only selected at xdr encode time. This patch ensures that we select the stateid after we get an NFSv4.1 session slot, and that we keep that same stateid across retries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFS: __nfs_find_lock_context needs to check ctx->lock_context for a match tooTrond Myklebust
Currently, we're forcing an unnecessary duplication of the initial nfs_lock_context in calls to nfs_get_lock_context, since __nfs_find_lock_context ignores the ctx->lock_context. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFS: Don't accept more reads/writes if the open context recovery failedTrond Myklebust
If the state recovery failed, we want to ensure that the application doesn't try to use the same file descriptor for more reads or writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25NFSv4: Fail I/O if the state recovery fails irrevocablyTrond Myklebust
If state recovery fails with an ESTALE or a ENOENT, then we shouldn't keep retrying. Instead, mark the stateid as being invalid and fail the I/O with an EIO error. For other operations such as POSIX and BSD file locking, truncate etc, fail with an EBADF to indicate that this file descriptor is no longer valid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-21NFSv4.1: Add a helper pnfs_commit_and_return_layoutTrond Myklebust
In order to be able to safely return the layout in nfs4_proc_setattr, we need to block new uses of the layout, wait for all outstanding users of the layout to complete, commit the layout and then return it. This patch adds a helper in order to do all this safely. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Boaz Harrosh <bharrosh@panasas.com>
2013-03-21NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturnTrond Myklebust
Note that clearing NFS_INO_LAYOUTCOMMIT is tricky, since it requires you to also clear the NFS_LSEG_LAYOUTCOMMIT bits from the layout segments. The only two sites that need to do this are the ones that call pnfs_return_layout() without first doing a layout commit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@vger.kernel.org
2013-03-21NFSv4.1: Fix a race in pNFS layoutcommitTrond Myklebust
We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations where the two are out of sync. The first half of the problem is to ensure that pnfs_layoutcommit_inode clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg. We still need to keep the reference to those segments until the RPC call is finished, so in order to make it clear _where_ those references come from, we add a helper pnfs_list_write_lseg_done() that cleans up after pnfs_list_write_lseg. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Benny Halevy <bhalevy@tonian.com> Cc: stable@vger.kernel.org
2013-03-21pnfs-block: removing DM device maybe cause oops when call dev_removefanchaoting
when pnfs block using device mapper,if umounting later,it maybe cause oops. we apply "1 + sizeof(bl_umount_request)" memory for msg->data, the memory maybe overflow when we do "memcpy(&dataptr [sizeof(bl_msg)], &bl_umount_request, sizeof(bl_umount_request))", because the size of bl_msg is more than 1 byte. Signed-off-by: fanchaoting<fanchaoting@cn.fujitsu.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-20NFSv4: Fix the string length returned by the idmapperTrond Myklebust
Functions like nfs_map_uid_to_name() and nfs_map_gid_to_group() are expected to return a string without any terminating NUL character. Regression introduced by commit 57e62324e469e092ecc6c94a7a86fe4bd6ac5172 (NFS: Store the legacy idmapper result in the keyring). Reported-by: Dave Chiluk <dave.chiluk@canonical.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org [>=3.4]
2013-03-12fs: Readd the fs module aliases.Eric W. Biederman
I had assumed that the only use of module aliases for filesystems prior to "fs: Limit sys_mount to only request filesystem modules." was in request_module. It turns out I was wrong. At least mkinitcpio in Arch linux uses these aliases. So readd the preexising aliases, to keep from breaking userspace. Userspace eventually will have to follow and use the same aliases the kernel does. So at some point we may be delete these aliases without problems. However that day is not today. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-03fs: Limit sys_mount to only request filesystem modules.Eric W. Biederman
Modify the request_module to prefix the file system type with "fs-" and add aliases to all of the filesystems that can be built as modules to match. A common practice is to build all of the kernel code and leave code that is not commonly needed as modules, with the result that many users are exposed to any bug anywhere in the kernel. Looking for filesystems with a fs- prefix limits the pool of possible modules that can be loaded by mount to just filesystems trivially making things safer with no real cost. Using aliases means user space can control the policy of which filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf with blacklist and alias directives. Allowing simple, safe, well understood work-arounds to known problematic software. This also addresses a rare but unfortunate problem where the filesystem name is not the same as it's module name and module auto-loading would not work. While writing this patch I saw a handful of such cases. The most significant being autofs that lives in the module autofs4. This is relevant to user namespaces because we can reach the request module in get_fs_type() without having any special permissions, and people get uncomfortable when a user specified string (in this case the filesystem type) goes all of the way to request_module. After having looked at this issue I don't think there is any particular reason to perform any filtering or permission checks beyond making it clear in the module request that we want a filesystem module. The common pattern in the kernel is to call request_module() without regards to the users permissions. In general all a filesystem module does once loaded is call register_filesystem() and go to sleep. Which means there is not much attack surface exposed by loading a filesytem module unless the filesystem is mounted. In a user namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT, which most filesystems do not set today. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Kees Cook <keescook@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-03-02Merge tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "We've just concluded another Connectathon interoperability testing week, and so here are the fixes for the bugs that were discovered: - Don't allow NFS silly-renamed files to be deleted - Don't start the retransmission timer when out of socket space - Fix a couple of pnfs-related Oopses. - Fix one more NFSv4 state recovery deadlock - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER" * tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: One line comment fix NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS SUNRPC: add call to get configured timeout PNFS: set the default DS timeout to 60 seconds NFSv4: Fix another open/open_recovery deadlock nfs: don't allow nfs_find_actor to match inodes of the wrong type NFSv4.1: Hold reference to layout hdr in layoutget pnfs: fix resend_to_mds for directio SUNRPC: Don't start the retransmission timer when out of socket space NFS: Don't allow NFS silly-renamed files to be deleted, no signal
2013-02-28Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd changes from J Bruce Fields: "Miscellaneous bugfixes, plus: - An overhaul of the DRC cache by Jeff Layton. The main effect is just to make it larger. This decreases the chances of intermittent errors especially in the UDP case. But we'll need to watch for any reports of performance regressions. - Containerized nfsd: with some limitations, we now support per-container nfs-service, thanks to extensive work from Stanislav Kinsbursky over the last year." Some notes about conflicts, since there were *two* non-data semantic conflicts here: - idr_remove_all() had been added by a memory leak fix, but has since become deprecated since idr_destroy() does it for us now. - xs_local_connect() had been added by this branch to make AF_LOCAL connections be synchronous, but in the meantime Trond had changed the calling convention in order to avoid a RCU dereference. There were a couple of more obvious actual source-level conflicts due to the hlist traversal changes and one just due to code changes next to each other, but those were trivial. * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits) SUNRPC: make AF_LOCAL connect synchronous nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum svcrpc: fix rpc server shutdown races svcrpc: make svc_age_temp_xprts enqueue under sv_lock lockd: nlmclnt_reclaim(): avoid stack overflow nfsd: enable NFSv4 state in containers nfsd: disable usermode helper client tracker in container nfsd: use proper net while reading "exports" file nfsd: containerize NFSd filesystem nfsd: fix comments on nfsd_cache_lookup SUNRPC: move cache_detail->cache_request callback call to cache_read() SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function SUNRPC: rework cache upcall logic SUNRPC: introduce cache_detail->cache_request callback NFS: simplify and clean cache library NFS: use SUNRPC cache creation and destruction helper for DNS cache nfsd4: free_stid can be static nfsd: keep a checksum of the first 256 bytes of request sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer sunrpc: fix comment in struct xdr_buf definition ...
2013-02-28NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDSWeston Andros Adamson
The client will currently try LAYOUTGETs forever if a server is returning NFS4ERR_LAYOUTTRYLATER or NFS4ERR_RECALLCONFLICT - even if the client no longer needs the layout (ie process killed, unmounted). This patch uses the DS timeout value (module parameter 'dataserver_timeo' via rpc layer) to set an upper limit of how long the client tries LATOUTGETs in this situation. Once the timeout is reached, IO is redirected to the MDS. This also changes how the client checks if a layout is on the clp list to avoid a double list_add. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-28PNFS: set the default DS timeout to 60 secondsWeston Andros Adamson
The client should have 60 second default timeouts for DS operations, not 6 seconds. NFS4_DEF_DS_TIMEO is used as "timeout in tenths of a second" in nfs_init_timeout_values (and is not used anywhere else). This matches up with the description of the module param dataserver_timeo. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-28NFSv4: Fix another open/open_recovery deadlockTrond Myklebust
If we don't release the open seqid before we wait for state recovery, then we may end up deadlocking the state recovery thread. This patch addresses a new deadlock that was introduced by commit c21443c2c792cd9b463646d982b0fe48aa6feb0f (NFSv4: Fix a reboot recovery race when opening a file) Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-27hlist: drop the node parameter from iteratorsSasha Levin
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27nfs4client: convert to idr_alloc()Tejun Heo
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27nfs: idr_destroy() no longer needs idr_remove_all()Tejun Heo
idr_destroy() can destroy idr by itself and idr_remove_all() is being deprecated. Drop reference to idr_remove_all(). Note that the code wasn't completely correct before because idr_remove() on all entries doesn't necessarily release all idr_layers which could lead to memory leak. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27nfs: don't allow nfs_find_actor to match inodes of the wrong typeJeff Layton
Benny Halevy reported the following oops when testing RHEL6: <7>nfs_update_inode: inode 892950 mode changed, 0040755 to 0100644 <1>BUG: unable to handle kernel NULL pointer dereference at (null) <1>IP: [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs] <4>PGD 81448a067 PUD 831632067 PMD 0 <4>Oops: 0000 [#1] SMP <4>last sysfs file: /sys/kernel/mm/redhat_transparent_hugepage/enabled <4>CPU 6 <4>Modules linked in: fuse bonding 8021q garp ebtable_nat ebtables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi softdog bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_round_robin dm_multipath objlayoutdriver2(U) nfs(U) lockd fscache auth_rpcgss nfs_acl sunrpc vhost_net macvtap macvlan tun kvm_intel kvm be2net igb dca ptp pps_core microcode serio_raw sg iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] <4> <4>Pid: 6332, comm: dd Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant DL170e G6 /ProLiant DL170e G6 <4>RIP: 0010:[<ffffffffa02a52c5>] [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs] <4>RSP: 0018:ffff88081458bb98 EFLAGS: 00010292 <4>RAX: ffffffffa02a52b0 RBX: 0000000000000000 RCX: 0000000000000003 <4>RDX: ffffffffa02e45a0 RSI: ffff88081440b300 RDI: ffff88082d5f5760 <4>RBP: ffff88081458bba8 R08: 0000000000000000 R09: 0000000000000000 <4>R10: 0000000000000772 R11: 0000000000400004 R12: 0000000040000008 <4>R13: ffff88082d5f5760 R14: ffff88082d6e8800 R15: ffff88082f12d780 <4>FS: 00007f728f37e700(0000) GS:ffff8800456c0000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b <4>CR2: 0000000000000000 CR3: 0000000831279000 CR4: 00000000000007e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process dd (pid: 6332, threadinfo ffff88081458a000, task ffff88082fa0e040) <4>Stack: <4> 0000000040000008 ffff88081440b300 ffff88081458bbf8 ffffffff81182745 <4><d> ffff88082d5f5760 ffff88082d6e8800 ffff88081458bbf8 ffffffffffffffea <4><d> ffff88082f12d780 ffff88082d6e8800 ffffffffa02a50a0 ffff88082d5f5760 <4>Call Trace: <4> [<ffffffff81182745>] __fput+0xf5/0x210 <4> [<ffffffffa02a50a0>] ? do_open+0x0/0x20 [nfs] <4> [<ffffffff81182885>] fput+0x25/0x30 <4> [<ffffffff8117e23e>] __dentry_open+0x27e/0x360 <4> [<ffffffff811c397a>] ? inotify_d_instantiate+0x2a/0x60 <4> [<ffffffff8117e4b9>] lookup_instantiate_filp+0x69/0x90 <4> [<ffffffffa02a6679>] nfs_intent_set_file+0x59/0x90 [nfs] <4> [<ffffffffa02a686b>] nfs_atomic_lookup+0x1bb/0x310 [nfs] <4> [<ffffffff8118e0c2>] __lookup_hash+0x102/0x160 <4> [<ffffffff81225052>] ? selinux_inode_permission+0x72/0xb0 <4> [<ffffffff8118e76a>] lookup_hash+0x3a/0x50 <4> [<ffffffff81192a4b>] do_filp_open+0x2eb/0xdd0 <4> [<ffffffff8104757c>] ? __do_page_fault+0x1ec/0x480 <4> [<ffffffff8119f562>] ? alloc_fd+0x92/0x160 <4> [<ffffffff8117de79>] do_sys_open+0x69/0x140 <4> [<ffffffff811811f6>] ? sys_lseek+0x66/0x80 <4> [<ffffffff8117df90>] sys_open+0x20/0x30 <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b <4>Code: 65 48 8b 04 25 c8 cb 00 00 83 a8 44 e0 ff ff 01 5b 41 5c c9 c3 90 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 48 8b 9e a0 00 00 00 <48> 8b 3b e8 13 0c f7 ff 48 89 df e8 ab 3d ec e0 48 83 c4 08 31 <1>RIP [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs] <4> RSP <ffff88081458bb98> <4>CR2: 0000000000000000 I think this is ultimately due to a bug on the server. The client had previously found a directory dentry. It then later tried to do an atomic open on a new (regular file) dentry. The attributes it got back had the same filehandle as the previously found directory inode. It then tried to put the filp because it failed the aops tests for O_DIRECT opens, and oopsed here because the ctx was still NULL. Obviously the root cause here is a server issue, but we can take steps to mitigate this on the client. When nfs_fhget is called, we always know what type of inode it is. In the event that there's a broken or malicious server on the other end of the wire, the client can end up crashing because the wrong ops are set on it. Have nfs_find_actor check that the inode type is correct after checking the fileid. The fileid check should rarely ever match, so it should only rarely ever get to this check. In the case where we have a broken server, we may see two different inodes with the same i_ino, but the client should be able to cope with them without crashing. This should fix the oops reported here: https://bugzilla.redhat.com/show_bug.cgi?id=913660 Reported-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
2013-02-26vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry opJeff Layton
The following set of operations on a NFS client and server will cause server# mkdir a client# cd a server# mv a a.bak client# sleep 30 # (or whatever the dir attrcache timeout is) client# stat . stat: cannot stat `.': Stale NFS file handle Obviously, we should not be getting an ESTALE error back there since the inode still exists on the server. The problem is that the lookup code will call d_revalidate on the dentry that "." refers to, because NFS has FS_REVAL_DOT set. nfs_lookup_revalidate will see that the parent directory has changed and will try to reverify the dentry by redoing a LOOKUP. That of course fails, so the lookup code returns ESTALE. The problem here is that d_revalidate is really a bad fit for this case. What we really want to know at this point is whether the inode is still good or not, but we don't really care what name it goes by or whether the dcache is still valid. Add a new d_op->d_weak_revalidate operation and have complete_walk call that instead of d_revalidate. The intent there is to allow for a "weaker" d_revalidate that just checks to see whether the inode is still good. This is also gives us an opportunity to kill off the FS_REVAL_DOT special casing. [AV: changed method name, added note in porting, fixed confusion re having it possibly called from RCU mode (it won't be)] Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-25NFSv4.1: Hold reference to layout hdr in layoutgetWeston Andros Adamson
This fixes an oops where a LAYOUTGET is in still in the rpciod queue, but the requesting processes has been killed. Without this, killing the process does the final pnfs_put_layout_hdr() and sets NFS_I(inode)->layout to NULL while the LAYOUTGET rpc task still references it. Example oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 IP: [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4] PGD 7365b067 PUD 7365d067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: nfs_layout_nfsv41_files nfsv4 auth_rpcgss nfs lockd sunrpc ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ip6table_filter ip6_tables ppdev e1000 i2c_piix4 i2c_core shpchp parport_pc parport crc32c_intel aesni_intel xts aes_x86_64 lrw gf128mul ablk_helper cryptd mptspi scsi_transport_spi mptscsih mptbase floppy autofs4 CPU 0 Pid: 27, comm: kworker/0:1 Not tainted 3.8.0-dros_cthon2013+ #4 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform RIP: 0010:[<ffffffffa01bd586>] [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4] RSP: 0018:ffff88007b0c1c88 EFLAGS: 00010246 RAX: ffff88006ed36678 RBX: 0000000000000000 RCX: 0000000ea877e3bc RDX: ffff88007a729da8 RSI: 0000000000000000 RDI: ffff88007a72b958 RBP: ffff88007b0c1ca8 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a72b958 R13: ffff88007a729da8 R14: 0000000000000000 R15: ffffffffa011077e FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000080 CR3: 00000000735f8000 CR4: 00000000001407f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/0:1 (pid: 27, threadinfo ffff88007b0c0000, task ffff88007c2fa0c0) Stack: ffff88006fc05388 ffff88007a72b908 ffff88007b240900 ffff88006fc05388 ffff88007b0c1cd8 ffffffffa01a2170 ffff88007b240900 ffff88007b240900 ffff88007b240970 ffffffffa011077e ffff88007b0c1ce8 ffffffffa0110791 Call Trace: [<ffffffffa01a2170>] nfs4_layoutget_prepare+0x7b/0x92 [nfsv4] [<ffffffffa011077e>] ? __rpc_atrun+0x15/0x15 [sunrpc] [<ffffffffa0110791>] rpc_prepare_task+0x13/0x15 [sunrpc] Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Weston Andros Adamson <dros@netapp.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...