From 7af1364ffa64db61e386628594836e13d2ef04b5 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Fri, 4 Oct 2013 19:15:13 -0700
Subject: vfs: Don't allow overwriting mounts in the current mount namespace

In preparation for allowing mountpoints to be renamed and unlinked
in remote filesystems and in other mount namespaces test if on a dentry
there is a mount in the local mount namespace before allowing it to
be renamed or unlinked.

The primary motivation here are old versions of fusermount unmount
which is not safe if the a path can be renamed or unlinked while it is
verifying the mount is safe to unmount.  More recent versions are simpler
and safer by simply using UMOUNT_NOFOLLOW when unmounting a mount
in a directory owned by an arbitrary user.

Miklos Szeredi <miklos@szeredi.hu> reports this is approach is good
enough to remove concerns about new kernels mixed with old versions
of fusermount.

A secondary motivation for restrictions here is that it removing empty
directories that have non-empty mount points on them appears to
violate the rule that rmdir can not remove empty directories.  As
Linus Torvalds pointed out this is useful for programs (like git) that
test if a directory is empty with rmdir.

Therefore this patch arranges to enforce the existing mount point
semantics for local mount namespace.

v2: Rewrote the test to be a drop in replacement for d_mountpoint
v3: Use bool instead of int as the return type of is_local_mountpoint

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/namei.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

(limited to 'fs/namei.c')

diff --git a/fs/namei.c b/fs/namei.c
index a7b05bf82d31..a3a14b033b0d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3565,6 +3565,8 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 	mutex_lock(&dentry->d_inode->i_mutex);
 
 	error = -EBUSY;
+	if (is_local_mountpoint(dentry))
+		goto out;
 	if (d_mountpoint(dentry))
 		goto out;
 
@@ -3681,7 +3683,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 		return -EPERM;
 
 	mutex_lock(&target->i_mutex);
-	if (d_mountpoint(dentry))
+	if (is_local_mountpoint(dentry) || d_mountpoint(dentry))
 		error = -EBUSY;
 	else {
 		error = security_inode_unlink(dir, dentry);
@@ -4126,6 +4128,8 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		mutex_lock(&target->i_mutex);
 
 	error = -EBUSY;
+	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
+		goto out;
 	if (d_mountpoint(old_dentry) || d_mountpoint(new_dentry))
 		goto out;
 
-- 
cgit v1.2.3


From 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederman@twitter.com>
Date: Tue, 1 Oct 2013 18:33:48 -0700
Subject: vfs: Lazily remove mounts on unlinked files and directories.

With the introduction of mount namespaces and bind mounts it became
possible to access files and directories that on some paths are mount
points but are not mount points on other paths.  It is very confusing
when rm -rf somedir returns -EBUSY simply because somedir is mounted
somewhere else.  With the addition of user namespaces allowing
unprivileged mounts this condition has gone from annoying to allowing
a DOS attack on other users in the system.

The possibility for mischief is removed by updating the vfs to support
rename, unlink and rmdir on a dentry that is a mountpoint and by
lazily unmounting mountpoints on deleted dentries.

In particular this change allows rename, unlink and rmdir system calls
on a dentry without a mountpoint in the current mount namespace to
succeed, and it allows rename, unlink, and rmdir performed on a
distributed filesystem to update the vfs cache even if when there is a
mount in some namespace on the original dentry.

There are two common patterns of maintaining mounts: Mounts on trusted
paths with the parent directory of the mount point and all ancestory
directories up to / owned by root and modifiable only by root
(i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu,
cpuacct, ...}, /usr, /usr/local).  Mounts on unprivileged directories
maintained by fusermount.

In the case of mounts in trusted directories owned by root and
modifiable only by root the current parent directory permissions are
sufficient to ensure a mount point on a trusted path is not removed
or renamed by anyone other than root, even if there is a context
where the there are no mount points to prevent this.

In the case of mounts in directories owned by less privileged users
races with users modifying the path of a mount point are already a
danger.  fusermount already uses a combination of chdir,
/proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races.  The
removable of global rename, unlink, and rmdir protection really adds
nothing new to consider only a widening of the attack window, and
fusermount is already safe against unprivileged users modifying the
directory simultaneously.

In principle for perfect userspace programs returning -EBUSY for
unlink, rmdir, and rename of dentires that have mounts in the local
namespace is actually unnecessary.  Unfortunately not all userspace
programs are perfect so retaining -EBUSY for unlink, rmdir and rename
of dentries that have mounts in the current mount namespace plays an
important role of maintaining consistency with historical behavior and
making imperfect userspace applications hard to exploit.

v2: Remove spurious old_dentry.
v3: Optimized shrink_submounts_and_drop
    Removed unsued afs label
v4: Simplified the changes to check_submounts_and_drop
    Do not rename check_submounts_and_drop shrink_submounts_and_drop
    Document what why we need atomicity in check_submounts_and_drop
    Rely on the parent inode mutex to make d_revalidate and d_invalidate
    an atomic unit.
v5: Refcount the mountpoint to detach in case of simultaneous
    renames.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/namei.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

(limited to 'fs/namei.c')

diff --git a/fs/namei.c b/fs/namei.c
index a3a14b033b0d..2ba10904dba0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3567,8 +3567,6 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 	error = -EBUSY;
 	if (is_local_mountpoint(dentry))
 		goto out;
-	if (d_mountpoint(dentry))
-		goto out;
 
 	error = security_inode_rmdir(dir, dentry);
 	if (error)
@@ -3581,6 +3579,7 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 
 	dentry->d_inode->i_flags |= S_DEAD;
 	dont_mount(dentry);
+	detach_mounts(dentry);
 
 out:
 	mutex_unlock(&dentry->d_inode->i_mutex);
@@ -3683,7 +3682,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 		return -EPERM;
 
 	mutex_lock(&target->i_mutex);
-	if (is_local_mountpoint(dentry) || d_mountpoint(dentry))
+	if (is_local_mountpoint(dentry))
 		error = -EBUSY;
 	else {
 		error = security_inode_unlink(dir, dentry);
@@ -3692,8 +3691,10 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 			if (error)
 				goto out;
 			error = dir->i_op->unlink(dir, dentry);
-			if (!error)
+			if (!error) {
 				dont_mount(dentry);
+				detach_mounts(dentry);
+			}
 		}
 	}
 out:
@@ -4130,8 +4131,6 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	error = -EBUSY;
 	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
 		goto out;
-	if (d_mountpoint(old_dentry) || d_mountpoint(new_dentry))
-		goto out;
 
 	if (max_links && new_dir != old_dir) {
 		error = -EMLINK;
@@ -4168,6 +4167,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		if (is_dir)
 			target->i_flags |= S_DEAD;
 		dont_mount(new_dentry);
+		detach_mounts(new_dentry);
 	}
 	if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE)) {
 		if (!(flags & RENAME_EXCHANGE))
-- 
cgit v1.2.3


From 5542aa2fa7f6cddb03c4ac3135e390adffda98ca Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm@xmission.com>
Date: Thu, 13 Feb 2014 09:46:25 -0800
Subject: vfs: Make d_invalidate return void

Now that d_invalidate can no longer fail, stop returning a useless
return code.  For the few callers that checked the return code update
remove the handling of d_invalidate failure.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/namei.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

(limited to 'fs/namei.c')

diff --git a/fs/namei.c b/fs/namei.c
index 2ba10904dba0..d20d579a022e 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1306,7 +1306,8 @@ static struct dentry *lookup_dcache(struct qstr *name, struct dentry *dir,
 				if (error < 0) {
 					dput(dentry);
 					return ERR_PTR(error);
-				} else if (!d_invalidate(dentry)) {
+				} else {
+					d_invalidate(dentry);
 					dput(dentry);
 					dentry = NULL;
 				}
@@ -1435,10 +1436,9 @@ unlazy:
 			dput(dentry);
 			return status;
 		}
-		if (!d_invalidate(dentry)) {
-			dput(dentry);
-			goto need_lookup;
-		}
+		d_invalidate(dentry);
+		dput(dentry);
+		goto need_lookup;
 	}
 
 	path->mnt = mnt;
-- 
cgit v1.2.3


From 115cbfdc609702a131c51281864c08f5d27b459a Mon Sep 17 00:00:00 2001
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Sat, 11 Oct 2014 23:05:52 -0400
Subject: let path_init() failures treated the same way as subsequent
 link_path_walk()

As it is, path_lookupat() and path_mounpoint() might end up leaking struct file
reference in some cases.

Spotted-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/namei.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

(limited to 'fs/namei.c')

diff --git a/fs/namei.c b/fs/namei.c
index d20d579a022e..0f64aa412617 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1950,7 +1950,7 @@ static int path_lookupat(int dfd, const char *name,
 	err = path_init(dfd, name, flags | LOOKUP_PARENT, nd, &base);
 
 	if (unlikely(err))
-		return err;
+		goto out;
 
 	current->total_link_count = 0;
 	err = link_path_walk(name, nd);
@@ -1982,6 +1982,7 @@ static int path_lookupat(int dfd, const char *name,
 		}
 	}
 
+out:
 	if (base)
 		fput(base);
 
@@ -2301,7 +2302,7 @@ path_mountpoint(int dfd, const char *name, struct path *path, unsigned int flags
 
 	err = path_init(dfd, name, flags | LOOKUP_PARENT, &nd, &base);
 	if (unlikely(err))
-		return err;
+		goto out;
 
 	current->total_link_count = 0;
 	err = link_path_walk(name, &nd);
-- 
cgit v1.2.3