aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2014-06-18sysfs: make sure read buffer is zeroedTejun Heo
commit f5c16f29bf5e57ba4051fc7785ba7f035f798c71 upstream. 13c589d5b0ac ("sysfs: use seq_file when reading regular files") switched sysfs from custom read implementation to seq_file to enable later transition to kernfs. After the change, the buffer passed to ->show() is acquired through seq_get_buf(); unfortunately, this introduces a subtle behavior change. Before the commit, the buffer passed to ->show() was always zero as it was allocated using get_zeroed_page(). Because seq_file doesn't clear buffers on allocation and neither does seq_get_buf(), after the commit, depending on the behavior of ->show(), we may end up exposing uninitialized data to userland thus possibly altering userland visible behavior and leaking information. Fix it by explicitly clearing the buffer. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ron <ron@debian.org> Fixes: 13c589d5b0ac ("sysfs: use seq_file when reading regular files") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18metag: Reduce maximum stack size to 256MBJames Hogan
commit d71f290b4e98a39f49f2595a13be3b4d5ce8e1f1 upstream. Specify the maximum stack size for arches where the stack grows upward (parisc and metag) in asm/processor.h rather than hard coding in fs/exec.c so that metag can specify a smaller value of 256MB rather than 1GB. This fixes a BUG on metag if the RLIMIT_STACK hard limit is increased beyond a safe value by root. E.g. when starting a process after running "ulimit -H -s unlimited" it will then attempt to use a stack size of the maximum 1GB which is far too big for metag's limited user virtual address space (stack_top is usually 0x3ffff000): BUG: failure at fs/exec.c:589/shift_arg_pages()! Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: linux-parisc@vger.kernel.org Cc: linux-metag@vger.kernel.org Cc: John David Anglin <dave.anglin@bell.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: remove lockowner when removing lock stateidJ. Bruce Fields
commit a1b8ff4c97b4375d21b6d6c45d75877303f61b3b upstream. The nfsv4 state code has always assumed a one-to-one correspondance between lock stateid's and lockowners even if it appears not to in some places. We may actually change that, but for now when FREE_STATEID releases a lock stateid it also needs to release the parent lockowner. Symptoms were a subsequent LOCK crashing in find_lockowner_str when it calls same_lockowner_ino on a lockowner that unexpectedly has an empty so_stateids list. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: warn on finding lockowner without stateid'sJ. Bruce Fields
commit 27b11428b7de097c42f205beabb1764f4365443b upstream. The current code assumes a one-to-one lockowner<->lock stateid correspondance. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18NFSD: Call ->set_acl with a NULL ACL structure if no entriesKinglong Mee
commit aa07c713ecfc0522916f3cd57ac628ea6127c0ec upstream. After setting ACL for directory, I got two problems that caused by the cached zero-length default posix acl. This patch make sure nfsd4_set_nfs4_acl calls ->set_acl with a NULL ACL structure if there are no entries. Thanks for Christoph Hellwig's advice. First problem: ............ hang ........... Second problem: [ 1610.167668] ------------[ cut here ]------------ [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239! [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE) rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi [last unloaded: nfsd] [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE 3.15.0-rc1+ #15 [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti: ffff88005a944000 [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>] [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293 [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX: 0000000000000000 [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI: ffff880068233300 [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09: 0000000000000000 [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12: ffff880068233300 [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15: ffff880068233300 [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000) knlGS:0000000000000000 [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4: 00000000000006f0 [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1610.168320] Stack: [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0 0000000000000000 [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08 0000000000000002 [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5 000000065a945b68 [ 1610.168320] Call Trace: [ 1610.168320] [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd] [ 1610.168320] [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd] [ 1610.168320] [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0 [ 1610.168320] [<ffffffffa0327962>] ? nfsd_setuser_and_check_port+0x52/0x80 [nfsd] [ 1610.168320] [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30 [ 1610.168320] [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd] [ 1610.168320] [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110 [nfsd] [ 1610.168320] [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd] [ 1610.168320] [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd] [ 1610.168320] [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc] [ 1610.168320] [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc] [ 1610.168320] [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd] [ 1610.168320] [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 1610.168320] [<ffffffff810a5202>] kthread+0xd2/0xf0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce 41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c [ 1610.168320] RIP [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP <ffff88005a945b00> [ 1610.257313] ---[ end trace 838254e3e352285b ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18NFSd: call rpc_destroy_wait_queue() from free_client()Trond Myklebust
commit 4cb57e3032d4e4bf5e97780e9907da7282b02b0c upstream. Mainly to ensure that we don't leave any hanging timers. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18NFSd: Move default initialisers from create_client() to alloc_client()Trond Myklebust
commit 5694c93e6c4954fa9424c215f75eeb919bddad64 upstream. Aside from making it clearer what is non-trivial in create_client(), it also fixes a bug whereby we can call free_client() before idr_init() has been called. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18fix races between __d_instantiate() and checks of dentry flagsAl Viro
commit 22213318af7ae265bc6cd8aef2febbc2d69a2440 upstream. in non-lazy walk we need to be careful about dentry switching from negative to positive - both ->d_flags and ->d_inode are updated, and in some places we might see only one store. The cases where dentry has been obtained by dcache lookup with ->i_mutex held on parent are safe - ->d_lock and ->i_mutex provide all the barriers we need. However, there are several places where we run into trouble: * do_last() fetches ->d_inode, then checks ->d_flags and assumes that inode won't be NULL unless d_is_negative() is true. Race with e.g. creat() - we might have fetched the old value of ->d_inode (still NULL) and new value of ->d_flags (already not DCACHE_MISS_TYPE). Lin Ming has observed and reported the resulting oops. * a bunch of places checks ->d_inode for being non-NULL, then checks ->d_flags for "is it a symlink". Race with symlink(2) in case if our CPU sees ->d_inode update first - we see non-NULL there, but ->d_flags still contains DCACHE_MISS_TYPE instead of DCACHE_SYMLINK_TYPE. Result: false negative on "should we follow link here?", with subsequent unpleasantness. Reported-and-tested-by: Lin Ming <minggr@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18autofs: fix lockref lookupIan Kent
commit 6b6751f7feba68d8f5c72b72cc69a1c5a625529c upstream. autofs needs to be able to see private data dentry flags for its dentrys that are being created but not yet hashed and for its dentrys that have been rmdir()ed but not yet freed. It needs to do this so it can block processes in these states until a status has been returned to indicate the given operation is complete. It does this by keeping two lists, active and expring, of dentrys in this state and uses ->d_release() to keep them stable while it checks the reference count to determine if they should be used. But with the recent lockref changes dentrys being freed sometimes don't transition to a reference count of 0 before being freed so autofs can occassionally use a dentry that is invalid which can lead to a panic. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ceph: clear directory's completeness when creating fileYan, Zheng
commit 0a8a70f96fe1bd3e07c15bb86fd247e76102398a upstream. When creating a file, ceph_set_dentry_offset() puts the new dentry at the end of directory's d_subdirs, then set the dentry's offset based on directory's max offset. The offset does not reflect the real postion of the dentry in directory. Later readdir reply from MDS may change the dentry's position/offset. This inconsistency can cause missing/duplicate entries in readdir result if readdir is partly satisfied by dcache_readdir(). The fix is clear directory's completeness after creating/renaming file. It prevents later readdir from using dcache_readdir(). Fixes: http://tracker.ceph.com/issues/8025 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18fs/affs/super.c: bugfix / double freeFabian Frederick
commit d353efd02357a74753cd45f367a2d3d357fd6904 upstream. Commit 842a859db26b ("affs: use ->kill_sb() to simplify ->put_super() and failure exits of ->mount()") adds .kill_sb which frees sbi but doesn't remove sbi free in case of parse_options error causing double free+random crash. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18posix_acl: handle NULL ACL in posix_acl_equiv_modeChristoph Hellwig
commit 50c6e282bdf5e8dabf8d7cf7b162545a55645fd9 upstream. Various filesystems don't bother checking for a NULL ACL in posix_acl_equiv_mode, and thus can dereference a NULL pointer when it gets passed one. This usually happens from the NFS server, as the ACL tools never pass a NULL ACL, but instead of one representing the mode bits. Instead of adding boilerplat to all filesystems put this check into one place, which will allow us to remove the check from other filesystems as well later on. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Ben Greear <greearb@candelatech.com> Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>, Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18aio: fix potential leak in aio_run_iocb().Leon Yu
commit 754320d6e166d3a12cb4810a452bde00afbd4e9a upstream. iovec should be reclaimed whenever caller of rw_copy_check_uvector() returns, but it doesn't hold when failure happens right after aio_setup_vectored_rw(). Fix that in a such way to avoid hairy goto. Signed-off-by: Leon Yu <chianglungyu@gmail.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18coredump: fix va_list corruptionEric Dumazet
commit 404ca80eb5c2727d78cd517d12108b040c522e12 upstream. A va_list needs to be copied in case it needs to be used twice. Thanks to Hugh for debugging this issue, leading to various panics. Tested: lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern 'produce_core' is simply : main() { *(int *)0 = 1;} lpq84:~# ./produce_core Segmentation fault (core dumped) lpq84:~# dmesg | tail -1 [ 614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed Notice the last argument was replaced by a NULL (we were lucky enough to not crash, but do not try this on your production machine !) After fix : lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern lpq83:~# ./produce_core Segmentation fault lpq83:~# dmesg | tail -1 [ 740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed Fixes: 5fe9d8ca21cc ("coredump: cn_vprintf() has no reason to call vsnprintf() twice") Signed-off-by: Eric Dumazet <edumazet@google.com> Diagnosed-by: Hugh Dickins <hughd@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18kernfs: add back missing error check in kernfs_fop_mmap()Tejun Heo
commit b44b2140265ddfde03acbe809336111d31adb0d1 upstream. While updating how mmap enabled kernfs files are handled by lockdep, 9b2db6e18945 ("sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warning") inadvertently dropped error return check from kernfs_file_mmap(). The intention was just dropping "if (ops->mmap)" check as the control won't reach the point if the mmap callback isn't implemented, but I mistakenly removed the error return check together with it. This led to Xorg crash on i810 which was reported and bisected to the commit and then to the specific change by Tobias. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-bisected-by: Tobias Powalowski <tobias.powalowski@googlemail.com> Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com> References: http://lkml.kernel.org/g/533D01BD.1010200@googlemail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18fs: Don't return 0 from get_anon_bdevThomas Bächler
commit a2a4dc494a7b7135f460e38e788c4a58f65e4ac3 upstream. Commit 9e30cc9595303b27b48 removed an internal mount. This has the side-effect that rootfs now has FSID 0. Many userspace utilities assume that st_dev in struct stat is never 0, so this change breaks a number of tools in early userspace. Since we don't know how many userspace programs are affected, make sure that FSID is at least 1. References: http://article.gmane.org/gmane.linux.kernel/1666905 References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557 Signed-off-by: Thomas Bächler <thomas@archlinux.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18cifs: Wait for writebacks to complete before attempting write.Sachin Prabhu
commit c11f1df5003d534fd067f0168bfad7befffb3b5c upstream. Problem reported in Red Hat bz 1040329 for strict writes where we cache only when we hold oplock and write direct to the server when we don't. When we receive an oplock break, we first change the oplock value for the inode in cifsInodeInfo->oplock to indicate that we no longer hold the oplock before we enqueue a task to flush changes to the backing device. Once we have completed flushing the changes, we return the oplock to the server. There are 2 ways here where we can have data corruption 1) While we flush changes to the backing device as part of the oplock break, we can have processes write to the file. These writes check for the oplock, find none and attempt to write directly to the server. These direct writes made while we are flushing from cache could be overwritten by data being flushed from the cache causing data corruption. 2) While a thread runs in cifs_strict_writev, the machine could receive and process an oplock break after the thread has checked the oplock and found that it allows us to cache and before we have made changes to the cache. In that case, we end up with a dirty page in cache when we shouldn't have any. This will be flushed later and will overwrite all subsequent writes to the part of the file represented by this page. Before making any writes to the server, we need to confirm that we are not in the process of flushing data to the server and if we are, we should wait until the process is complete before we attempt the write. We should also wait for existing writes to complete before we process an oplock break request which changes oplock values. We add a version specific downgrade_oplock() operation to allow for differences in the oplock values set for the different smb versions. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18don't bother with {get,put}_write_access() on non-regular filesAl Viro
commit dd20908a8a06b22c171f6c3fcdbdbd65bed07505 upstream. it's pointless and actually leads to wrong behaviour in at least one moderately convoluted case (pipe(), close one end, try to get to another via /proc/*/fd and run into ETXTBUSY). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18lockd: ensure we tear down any live sockets when socket creation fails ↵Jeff Layton
during lockd_up commit 679b033df48422191c4cac52b610d9980e019f9b upstream. We had a Fedora ABRT report with a stack trace like this: kernel BUG at net/sunrpc/svc.c:550! invalid opcode: 0000 [#1] SMP [...] CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1 Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013 task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000 RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206 RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286 RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360 R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600 R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000 FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0 Stack: ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000 ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60 ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008 Call Trace: [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd] [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd] [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50 [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd] [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd] [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50 [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20 [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0 [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd] [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0 [<ffffffff811c3f99>] ? putname+0x29/0x40 [<ffffffff811b9569>] SyS_write+0x49/0xa0 [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP <ffff88003f9b9de0> Evidently, we created some lockd sockets and then failed to create others. make_socks then returned an error and we tried to tear down the svc, but svc->sv_permsocks was not empty so we ended up tripping over the BUG() in svc_destroy(). Fix this by ensuring that we tear down any live sockets we created when socket creation is going to return an error. Fixes: 786185b5f8abefa (SUNRPC: move per-net operations from...) Reported-by: Raphos <raphoszap@laposte.net> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18aio: v4 ensure access to ctx->ring_pages is correctly serialised for migrationBenjamin LaHaise
commit fa8a53c39f3fdde98c9eace6a9b412143f0f6ed6 upstream. As reported by Tang Chen, Gu Zheng and Yasuaki Isimatsu, the following issues exist in the aio ring page migration support. As a result, for example, we have the following problem: thread 1 | thread 2 | aio_migratepage() | |-> take ctx->completion_lock | |-> migrate_page_copy(new, old) | | *NOW*, ctx->ring_pages[idx] == old | | | *NOW*, ctx->ring_pages[idx] == old | aio_read_events_ring() | |-> ring = kmap_atomic(ctx->ring_pages[0]) | |-> ring->head = head; *HERE, write to the old ring page* | |-> kunmap_atomic(ring); | |-> ctx->ring_pages[idx] = new | | *BUT NOW*, the content of | | ring_pages[idx] is old. | |-> release ctx->completion_lock | As above, the new ring page will not be updated. Fix this issue, as well as prevent races in aio_ring_setup() by holding the ring_lock mutex during kioctx setup and page migration. This avoids the overhead of taking another spinlock in aio_read_events_ring() as Tang's and Gu's original fix did, pushing the overhead into the migration code. Note that to handle the nesting of ring_lock inside of mmap_sem, the migratepage operation uses mutex_trylock(). Page migration is not a 100% critical operation in this case, so the ocassional failure can be tolerated. This issue was reported by Sasha Levin. Based on feedback from Linus, avoid the extra taking of ctx->completion_lock. Instead, make page migration fully serialised by mapping->private_lock, and have aio_free_ring() simply disconnect the kioctx from the mapping by calling put_aio_ring_file() before touching ctx->ring_pages[]. This simplifies the error handling logic in aio_migratepage(), and should improve robustness. v4: always do mutex_unlock() in cases when kioctx setup fails. Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18locks: allow __break_lease to sleep even when break_time is 0Jeff Layton
commit 4991a628a789dc5954e98e79476d9808812292ec upstream. A fl->fl_break_time of 0 has a special meaning to the lease break code that basically means "never break the lease". knfsd uses this to ensure that leases don't disappear out from under it. Unfortunately, the code in __break_lease can end up passing this value to wait_event_interruptible as a timeout, which prevents it from going to sleep at all. This causes __break_lease to spin in a tight loop and causes soft lockups. Fix this by ensuring that we pass a minimum value of 1 as a timeout instead. Cc: J. Bruce Fields <bfields@fieldses.org> Reported-by: Terry Barnaby <terry1@beam.ltd.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: use i_size_read in ext4_unaligned_aio()Theodore Ts'o
commit 6e6358fc3c3c862bfe9a5bc029d3f8ce43dc9765 upstream. We haven't taken i_mutex yet, so we need to use i_size_read(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent()Theodore Ts'o
commit 622cad1325e404598fe3b148c3fa640dbaabc235 upstream. The function ext4_update_i_disksize() is used in only one place, in the function mpage_map_and_submit_extent(). Move its code to simplify the code paths, and also move the call to ext4_mark_inode_dirty() into the i_data_sem's critical region, to be consistent with all of the other places where we update i_disksize. That way, we also keep the raw_inode's i_disksize protected, to avoid the following race: CPU #1 CPU #2 down_write(&i_data_sem) Modify i_disk_size up_write(&i_data_sem) down_write(&i_data_sem) Modify i_disk_size Copy i_disk_size to on-disk inode up_write(&i_data_sem) Copy i_disk_size to on-disk inode Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: fix jbd2 warning under heavy xattr loadJan Kara
commit ec4cb1aa2b7bae18dd8164f2e9c7c51abcf61280 upstream. When heavily exercising xattr code the assertion that jbd2_journal_dirty_metadata() shouldn't return error was triggered: WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237 jbd2_journal_dirty_metadata+0x1ba/0x260() CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G W 3.10.0-ceph-00049-g68d04c9 #1 Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011 ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968 ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80 0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978 Call Trace: [<ffffffff816311b0>] dump_stack+0x19/0x1b [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0 [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260 [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140 [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0 [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910 [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0 [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140 [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50 [<ffffffff811935ce>] generic_setxattr+0x6e/0x90 [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0 [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0 [<ffffffff8119421e>] setxattr+0x13e/0x1e0 [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118c65c>] ? fget_light+0x3c/0x130 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70 [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100 [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b The reason for the warning is that buffer_head passed into jbd2_journal_dirty_metadata() didn't have journal_head attached. This is caused by the following race of two ext4_xattr_release_block() calls: CPU1 CPU2 ext4_xattr_release_block() ext4_xattr_release_block() lock_buffer(bh); /* False */ if (BHDR(bh)->h_refcount == cpu_to_le32(1)) } else { le32_add_cpu(&BHDR(bh)->h_refcount, -1); unlock_buffer(bh); lock_buffer(bh); /* True */ if (BHDR(bh)->h_refcount == cpu_to_le32(1)) get_bh(bh); ext4_free_blocks() ... jbd2_journal_forget() jbd2_journal_unfile_buffer() -> JH is gone error = ext4_handle_dirty_xattr_block(handle, inode, bh); -> triggers the warning We fix the problem by moving ext4_handle_dirty_xattr_block() under the buffer lock. Sadly this cannot be done in nojournal mode as that function can call sync_dirty_buffer() which would deadlock. Luckily in nojournal mode the race is harmless (we only dirty already freed buffer) and thus for nojournal mode we leave the dirtying outside of the buffer lock. Reported-by: Sage Weil <sage@inktank.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: note the error in ext4_end_bio()Matthew Wilcox
commit 9503c67c93ed0b95ba62d12d1fd09da6245dbdd6 upstream. ext4_end_bio() currently throws away the error that it receives. Chances are this is part of a spate of errors, one of which will end up getting the error returned to userspace somehow, but we shouldn't take that risk. Also print out the errno to aid in debug. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: FIBMAP ioctl causes BUG_ON due to handle EXT_MAX_BLOCKSKazuya Mio
commit 4adb6ab3e0fa71363a5ef229544b2d17de6600d7 upstream. When we try to get 2^32-1 block of the file which has the extent (ee_block=2^32-2, ee_len=1) with FIBMAP ioctl, it causes BUG_ON in ext4_ext_put_gap_in_cache(). To avoid the problem, ext4_map_blocks() needs to check the file logical block number. ext4_ext_put_gap_in_cache() called via ext4_map_blocks() cannot handle 2^32-1 because the maximum file logical block number is 2^32-2. Note that ext4_ind_map_blocks() returns -EIO when the block number is invalid. So ext4_map_blocks() should also return the same errno. Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18smarter propagate_mnt()Al Viro
commit f2ebb3a921c1ca1e2ddd9242e95a1989a50c4c68 upstream. The current mainline has copies propagated to *all* nodes, then tears down the copies we made for nodes that do not contain counterparts of the desired mountpoint. That sets the right propagation graph for the copies (at teardown time we move the slaves of removed node to a surviving peer or directly to master), but we end up paying a fairly steep price in useless allocations. It's fairly easy to create a situation where N calls of mount(2) create exactly N bindings, with O(N^2) vfsmounts allocated and freed in process. Fortunately, it is possible to avoid those allocations/freeings. The trick is to create copies in the right order and find which one would've eventually become a master with the current algorithm. It turns out to be possible in O(nodes getting propagation) time and with no extra allocations at all. One part is that we need to make sure that eventual master will be created before its slaves, so we need to walk the propagation tree in a different order - by peer groups. And iterate through the peers before dealing with the next group. Another thing is finding the (earlier) copy that will be a master of one we are about to create; to do that we are (temporary) marking the masters of mountpoints we are attaching the copies to. Either we are in a peer of the last mountpoint we'd dealt with, or we have the following situation: we are attaching to mountpoint M, the last copy S_0 had been attached to M_0 and there are sequences S_0...S_n, M_0...M_n such that S_{i+1} is a master of S_{i}, S_{i} mounted on M{i} and we need to create a slave of the first S_{k} such that M is getting propagation from M_{k}. It means that the master of M_{k} will be among the sequence of masters of M. On the other hand, the nearest marked node in that sequence will either be the master of M_{k} or the master of M_{k-1} (the latter - in the case if M_{k-1} is a slave of something M gets propagation from, but in a wrong peer group). So we go through the sequence of masters of M until we find a marked one (P). Let N be the one before it. Then we go through the sequence of masters of S_0 until we find one (say, S) mounted on a node D that has P as master and check if D is a peer of N. If it is, S will be the master of new copy, if not - the master of S will be. That's it for the hard part; the rest is fairly simple. Iterator is in next_group(), handling of one prospective mountpoint is propagate_one(). It seems to survive all tests and gives a noticably better performance than the current mainline for setups that are seriously using shared subtrees. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ocfs2: fix panic on kfree(xattr->name)Tetsuo Handa
commit f81c20158f8d5f7938d5eb86ecc42ecc09273ce6 upstream. Commit 9548906b2bb7 ('xattr: Constify ->name member of "struct xattr"') missed that ocfs2 is calling kfree(xattr->name). As a result, kernel panic occurs upon calling kfree(xattr->name) because xattr->name refers static constant names. This patch removes kfree(xattr->name) from ocfs2_mknod() and ocfs2_symlink(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Tariq Saeed <tariq.x.saeed@oracle.com> Tested-by: Tariq Saeed <tariq.x.saeed@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ocfs2: do not put bh when buffer_uptodate failedalex chen
commit f7cf4f5bfe073ad792ab49c04f247626b3e38db6 upstream. Do not put bh when buffer_uptodate failed in ocfs2_write_block and ocfs2_write_super_or_backup, because it will put bh in b_end_io. Otherwise it will hit a warning "VFS: brelse: Trying to free free buffer". Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ocfs2: dlm: fix recovery hungJunxiao Bi
commit ded2cf71419b9353060e633b59e446c42a6a2a09 upstream. There is a race window in dlm_do_recovery() between dlm_remaster_locks() and dlm_reset_recovery() when the recovery master nearly finish the recovery process for a dead node. After the master sends FINALIZE_RECO message in dlm_remaster_locks(), another node may become the recovery master for another dead node, and then send the BEGIN_RECO message to all the nodes included the old master, in the handler of this message dlm_begin_reco_handler() of old master, dlm->reco.dead_node and dlm->reco.new_master will be set to the second dead node and the new master, then in dlm_reset_recovery(), these two variables will be reset to default value. This will cause new recovery master can not finish the recovery process and hung, at last the whole cluster will hung for recovery. old recovery master: new recovery master: dlm_remaster_locks() become recovery master for another dead node. dlm_send_begin_reco_message() dlm_begin_reco_handler() { if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) { return -EAGAIN; } dlm_set_reco_master(dlm, br->node_idx); dlm_set_reco_dead_node(dlm, br->dead_node); } dlm_reset_recovery() { dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM); dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM); } will hang in dlm_remaster_locks() for request dlm locks info Before send FINALIZE_RECO message, recovery master should set DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery done, this can break the race windows as the BEGIN_RECO messages will not be handled before DLM_RECO_STATE_FINALIZE flag is cleared. A similar race may happen between new recovery master and normal node which is in dlm_finalize_reco_handler(), also fix it. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ocfs2: dlm: fix lock migration crashJunxiao Bi
commit 34aa8dac482f1358d59110d5e3a12f4351f6acaa upstream. This issue was introduced by commit 800deef3f6f8 ("ocfs2: use list_for_each_entry where benefical") in 2007 where it replaced list_for_each with list_for_each_entry. The variable "lock" will point to invalid data if "tmpq" list is empty and a panic will be triggered due to this. Sunil advised reverting it back, but the old version was also not right. At the end of the outer for loop, that list_for_each_entry will also set "lock" to an invalid data, then in the next loop, if the "tmpq" list is empty, "lock" will be an stale invalid data and cause the panic. So reverting the list_for_each back and reset "lock" to NULL to fix this issue. Another concern is that this seemes can not happen because the "tmpq" list should not be empty. Let me describe how. old lock resource owner(node 1): migratation target(node 2): image there's lockres with a EX lock from node 2 in granted list, a NR lock from node x with convert_type EX in converting list. dlm_empty_lockres() { dlm_pick_migration_target() { pick node 2 as target as its lock is the first one in granted list. } dlm_migrate_lockres() { dlm_mark_lockres_migrating() { res->state |= DLM_LOCK_RES_BLOCK_DIRTY; wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res)); //after the above code, we can not dirty lockres any more, // so dlm_thread shuffle list will not run downconvert lock from EX to NR upconvert lock from NR to EX <<< migration may schedule out here, then <<< node 2 send down convert request to convert type from EX to <<< NR, then send up convert request to convert type from NR to <<< EX, at this time, lockres granted list is empty, and two locks <<< in the converting list, node x up convert lock followed by <<< node 2 up convert lock. // will set lockres RES_MIGRATING flag, the following // lock/unlock can not run dlm_lockres_release_ast(dlm, res); } dlm_send_one_lockres() dlm_process_recovery_data() for (i=0; i<mres->num_locks; i++) if (ml->node == dlm->node_num) for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) { list_for_each_entry(lock, tmpq, list) if (lock) break; <<< lock is invalid as grant list is empty. } if (lock->ml.node != ml->node) BUG() >>> crash here } I see the above locks status from a vmcore of our internal bug. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18reiserfs: fix race in readdirJeff Mahoney
commit 01d8885785a60ae8f4c37b0ed75bdc96d0fc6a44 upstream. jdm-20004 reiserfs_delete_xattrs: Couldn't delete all xattrs (-2) The -ENOENT is due to readdir calling dir_emit on the same entry twice. If the dir_emit callback sleeps and the tree is changed underneath us, we won't be able to trust deh_offset(deh) anymore. We need to save next_pos before we might sleep so we can find the next entry. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd: set timeparms.to_maxval in setup_callback_clientJeff Layton
commit 3758cf7e14b753838fe754ede3862af10b35fdac upstream. ...otherwise the logic in the timeout handling doesn't work correctly. Spotted-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18NFSD: Traverse unconfirmed client through hash-tableKinglong Mee
commit 2b9056359889c78ea5decb5b654a512c2e8a945c upstream. When stopping nfsd, I got BUG messages, and soft lockup messages, The problem is cuased by double rb_erase() in nfs4_state_destroy_net() and destroy_client(). This patch just let nfsd traversing unconfirmed client through hash-table instead of rbtree. [ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0 [ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi [last unloaded: nfsd] [ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF O 3.14.0-rc8+ #2 [ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti: ffff8800797f6000 [ 2325.022982] RIP: 0010:[<ffffffff8133c18c>] [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] RSP: 0018:ffff8800797f7d98 EFLAGS: 00010246 [ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX: 0000000000000000 [ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI: ffff880079f4c810 [ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09: ffff88007964fc70 [ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12: ffff880079f4c800 [ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15: ffff880079f4c860 [ 2325.022982] FS: 0000000000000000(0000) GS:ffff88007f900000(0000) knlGS:0000000000000000 [ 2325.022982] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4: 00000000000006e0 [ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2325.022982] Stack: [ 2325.022982] ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8 ffff8800797f7da8 [ 2325.022982] ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0 ffff880079c1f010 [ 2325.022982] ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2 ffff8800797f7df0 [ 2325.022982] Call Trace: [ 2325.022982] [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd] [ 2325.022982] [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220 [nfsd] [ 2325.022982] [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd] [ 2325.022982] [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd] [ 2325.022982] [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc] [ 2325.022982] [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd] [ 2325.022982] [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd] [ 2325.022982] [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 2325.022982] [<ffffffff810a8232>] kthread+0xd2/0xf0 [ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 2325.022982] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0 [ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10 0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b 50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01 [ 2325.022982] RIP [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] RSP <ffff8800797f7d98> [ 2325.022982] CR2: 0000000000000000 [ 2325.022982] ---[ end trace 28c27ed011655e57 ]--- [ 228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558] [ 228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0 snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore i2c_core ata_generic pata_acpi [ 228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF O 3.14.0-rc8+ #2 [ 228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti: ffff880074616000 [ 228.064539] RIP: 0010:[<ffffffff8133ba17>] [<ffffffff8133ba17>] rb_next+0x27/0x50 [ 228.064539] RSP: 0018:ffff880074617de0 EFLAGS: 00000282 [ 228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX: 0000000000000014 [ 228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI: ffff880074478010 [ 228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09: 0000000000000012 [ 228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12: ffffea0001d11a00 [ 228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15: ffff880074617d50 [ 228.064539] FS: 0000000000000000(0000) GS:ffff88007f800000(0000) knlGS:0000000000000000 [ 228.064539] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4: 00000000000006f0 [ 228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 228.064539] Stack: [ 228.064539] ffff880074617e28 ffffffffa01ab7db ffff880074617df0 ffff880074617df0 [ 228.064539] ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0 0000000000000000 [ 228.064539] 0000000000000000 ffff880074617e48 ffffffffa01840b8 ffffffff81cc26c0 [ 228.064539] Call Trace: [ 228.064539] [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220 [nfsd] [ 228.064539] [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd] [ 228.064539] [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd] [ 228.064539] [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc] [ 228.064539] [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd] [ 228.064539] [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd] [ 228.064539] [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 228.064539] [<ffffffff810a8232>] kthread+0xd2/0xf0 [ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 228.064539] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0 [ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b 50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7 Fixes: ac55fdc408039 (nfsd: move the confirmed and unconfirmed hlists...) Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: fix setclientid encode sizeJ. Bruce Fields
commit 480efaee085235bb848f1063f959bf144103c342 upstream. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: fix memory leak in nfsd4_encode_fattr()Yan, Zheng
commit 18df11d0eacf67bbcd8dda755b568bbbd7264735 upstream. fh_put() does not free the temporary file handle. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd: check passed socket's net matches NFSd superblock's oneStanislav Kinsbursky
commit 3064639423c48d6e0eb9ecc27c512a58e38c6c57 upstream. There could be a case, when NFSd file system is mounted in network, different to socket's one, like below: "ip netns exec" creates new network and mount namespace, which duplicates NFSd mount point, created in init_net context. And thus NFS server stop in nested network context leads to RPCBIND client destruction in init_net. Then, on NFSd start in nested network context, rpc.nfsd process creates socket in nested net and passes it into "write_ports", which leads to RPCBIND sockets creation in init_net context because of the same reason (NFSd monut point was created in init_net context). An attempt to register passed socket in nested net leads to panic, because no RPCBIND client present in nexted network namespace. This patch add check that passed socket's net matches NFSd superblock's one. And returns -EINVAL error to user psace otherwise. v2: Put socket on exit. Reported-by: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd: notify_change needs elevated write countJ. Bruce Fields
commit 9f67f189939eccaa54f3d2c9cf10788abaf2d584 upstream. Looks like this bug has been here since these write counts were introduced, not sure why it was just noticed now. Thanks also to Jan Kara for pointing out the problem. Reported-by: Matthew Rahtz <mrahtz@rapitasystems.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: leave reply buffer space for failed setattrJ. Bruce Fields
commit 04819bf6449094e62cebaf5199d85d68d711e667 upstream. This fixes an ommission from 18032ca062e621e15683cb61c066ef3dc5414a7b "NFSD: Server implementation of MAC Labeling", which increased the size of the setattr error reply without increasing COMPOUND_ERR_SLACK_SPACE. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: fix test_stateid error reply encodingJ. Bruce Fields
commit a11fcce1544df08c723d950ff0edef3adac40405 upstream. If the entire operation fails then there's nothing to encode. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: buffer-length check for SUPPATTR_EXCLCREATJ. Bruce Fields
commit de3997a7eeb9ea286b15879fdf8a95aae065b4f7 upstream. This was an omission from 8c18f2052e756e7d5dea712fc6e7ed70c00e8a39 "nfsd41: SUPPATTR_EXCLCREAT attribute". Cc: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd4: session needs room for following op to error outJ. Bruce Fields
commit 4c69d5855a16f7378648c5733632628fa10431db upstream. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18nfsd: revert v2 half of "nfsd: don't return high mode bits"J. Bruce Fields
commit 082f31a2169bd639785e45bf252f3d5bce0303c6 upstream. This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5 that changes NFSv2 behavior. Mark Lord found that it broke nfs-root for Linux clients, because it broke NFSv2. In fact, from RFC 1094: "Notice that the file type is specified both in the mode bits and in the file type. This is really a bug in the protocol and will be fixed in future versions." So NFSv2 clients really are expected to depend on the high bits of the mode. Reported-by: Mark Lord <mlord@pobox.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18NFSv4: Fix a use-after-free problem in open()Trond Myklebust
commit e911b8158ee1def8153849b1641b736026b036e0 upstream. If we interrupt the nfs4_wait_for_completion_rpc_task() call in nfs4_run_open_task(), then we don't prevent the RPC call from completing. So freeing up the opendata->f_attr.mdsthreshold in the error path in _nfs4_do_open() leads to a use-after-free when the XDR decoder tries to decode the mdsthreshold information from the server. Fixes: 82be417aa37c0 (NFSv4.1 cache mdsthreshold values on OPEN) Tested-by: Steve Dickson <SteveD@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18jffs2: remove from wait queue after schedule()Li Zefan
commit 3ead9578443b66ddb3d50ed4f53af8a0c0298ec5 upstream. @wait is a local variable, so if we don't remove it from the wait queue list, later wake_up() may end up accessing invalid memory. This was spotted by eyes. Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18jffs2: avoid soft-lockup in jffs2_reserve_space_gc()Li Zefan
commit 13b546d96207c131eeae15dc7b26c6e7d0f1cad7 upstream. We triggered soft-lockup under stress test on 2.6.34 kernel. BUG: soft lockup - CPU#1 stuck for 60009ms! [lockf2.test:14488] ... [<bf09a4d4>] (jffs2_do_reserve_space+0x420/0x440 [jffs2]) [<bf09a528>] (jffs2_reserve_space_gc+0x34/0x78 [jffs2]) [<bf0a1350>] (jffs2_garbage_collect_dnode.isra.3+0x264/0x478 [jffs2]) [<bf0a2078>] (jffs2_garbage_collect_pass+0x9c0/0xe4c [jffs2]) [<bf09a670>] (jffs2_reserve_space+0x104/0x2a8 [jffs2]) [<bf09dc48>] (jffs2_write_inode_range+0x5c/0x4d4 [jffs2]) [<bf097d8c>] (jffs2_write_end+0x198/0x2c0 [jffs2]) [<c00e00a4>] (generic_file_buffered_write+0x158/0x200) [<c00e14f4>] (__generic_file_aio_write+0x3a4/0x414) [<c00e15c0>] (generic_file_aio_write+0x5c/0xbc) [<c012334c>] (do_sync_write+0x98/0xd4) [<c0123a84>] (vfs_write+0xa8/0x150) [<c0123d74>] (sys_write+0x3c/0xc0)] Fix this by adding a cond_resched() in the while loop. [akpm@linux-foundation.org: don't initialize `ret'] Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18jffs2: Fix crash due to truncation of csizeAjesh Kunhipurayil Vijayan
commit 41bf1a24c1001f4d0d41a78e1ac575d2f14789d7 upstream. mounting JFFS2 partition sometimes crashes with this call trace: [ 1322.240000] Kernel bug detected[#1]: [ 1322.244000] Cpu 2 [ 1322.244000] $ 0 : 0000000000000000 0000000000000018 000000003ff00070 0000000000000001 [ 1322.252000] $ 4 : 0000000000000000 c0000000f3980150 0000000000000000 0000000000010000 [ 1322.260000] $ 8 : ffffffffc09cd5f8 0000000000000001 0000000000000088 c0000000ed300de8 [ 1322.268000] $12 : e5e19d9c5f613a45 ffffffffc046d464 0000000000000000 66227ba5ea67b74e [ 1322.276000] $16 : c0000000f1769c00 c0000000ed1e0200 c0000000f3980150 0000000000000000 [ 1322.284000] $20 : c0000000f3a80000 00000000fffffffc c0000000ed2cfbd8 c0000000f39818f0 [ 1322.292000] $24 : 0000000000000004 0000000000000000 [ 1322.300000] $28 : c0000000ed2c0000 c0000000ed2cfab8 0000000000010000 ffffffffc039c0b0 [ 1322.308000] Hi : 000000000000023c [ 1322.312000] Lo : 000000000003f802 [ 1322.316000] epc : ffffffffc039a9f8 check_tn_node+0x88/0x3b0 [ 1322.320000] Not tainted [ 1322.324000] ra : ffffffffc039c0b0 jffs2_do_read_inode_internal+0x1250/0x1e48 [ 1322.332000] Status: 5400f8e3 KX SX UX KERNEL EXL IE [ 1322.336000] Cause : 00800034 [ 1322.340000] PrId : 000c1004 (Netlogic XLP) [ 1322.344000] Modules linked in: [ 1322.348000] Process jffs2_gcd_mtd7 (pid: 264, threadinfo=c0000000ed2c0000, task=c0000000f0e68dd8, tls=0000000000000000) [ 1322.356000] Stack : c0000000f1769e30 c0000000ed010780 c0000000ed010780 c0000000ed300000 c0000000f1769c00 c0000000f3980150 c0000000f3a80000 00000000fffffffc c0000000ed2cfbd8 ffffffffc039c0b0 ffffffffc09c6340 0000000000001000 0000000000000dec ffffffffc016c9d8 c0000000f39805a0 c0000000f3980180 0000008600000000 0000000000000000 0000000000000000 0000000000000000 0001000000000dec c0000000f1769d98 c0000000ed2cfb18 0000000000010000 0000000000010000 0000000000000044 c0000000f3a80000 c0000000f1769c00 c0000000f3d207a8 c0000000f1769d98 c0000000f1769de0 ffffffffc076f9c0 0000000000000009 0000000000000000 0000000000000000 ffffffffc039cf90 0000000000000017 ffffffffc013fbdc 0000000000000001 000000010003e61c ... [ 1322.424000] Call Trace: [ 1322.428000] [<ffffffffc039a9f8>] check_tn_node+0x88/0x3b0 [ 1322.432000] [<ffffffffc039c0b0>] jffs2_do_read_inode_internal+0x1250/0x1e48 [ 1322.440000] [<ffffffffc039cf90>] jffs2_do_crccheck_inode+0x70/0xd0 [ 1322.448000] [<ffffffffc03a1b80>] jffs2_garbage_collect_pass+0x160/0x870 [ 1322.452000] [<ffffffffc03a392c>] jffs2_garbage_collect_thread+0xdc/0x1f0 [ 1322.460000] [<ffffffffc01541c8>] kthread+0xb8/0xc0 [ 1322.464000] [<ffffffffc0106d18>] kernel_thread_helper+0x10/0x18 [ 1322.472000] [ 1322.472000] Code: 67bd0050 94a4002c 2c830001 <00038036> de050218 2403fffc 0080a82d 00431824 24630044 [ 1322.480000] ---[ end trace b052bb90e97dfbf5 ]--- The variable csize in structure jffs2_tmp_dnode_info is of type uint16_t, but it is used to hold the compressed data length(csize) which is declared as uint32_t. So, when the value of csize exceeds 16bits, it gets truncated when assigned to tn->csize. This is causing a kernel BUG. Changing the definition of csize in jffs2_tmp_dnode_info to uint32_t fixes the issue. Signed-off-by: Ajesh Kunhipurayil Vijayan <ajesh@broadcom.com> Signed-off-by: Kamlakant Patel <kamlakant.patel@broadcom.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18jffs2: Fix segmentation fault found in stress testKamlakant Patel
commit 3367da5610c50e6b83f86d366d72b41b350b06a2 upstream. Creating a large file on a JFFS2 partition sometimes crashes with this call trace: [ 306.476000] CPU 13 Unable to handle kernel paging request at virtual address c0000000dfff8002, epc == ffffffffc03a80a8, ra == ffffffffc03a8044 [ 306.488000] Oops[#1]: [ 306.488000] Cpu 13 [ 306.492000] $ 0 : 0000000000000000 0000000000000000 0000000000008008 0000000000008007 [ 306.500000] $ 4 : c0000000dfff8002 000000000000009f c0000000e0007cde c0000000ee95fa58 [ 306.508000] $ 8 : 0000000000000001 0000000000008008 0000000000010000 ffffffffffff8002 [ 306.516000] $12 : 0000000000007fa9 000000000000ff0e 000000000000ff0f 80e55930aebb92bb [ 306.524000] $16 : c0000000e0000000 c0000000ee95fa5c c0000000efc80000 ffffffffc09edd70 [ 306.532000] $20 : ffffffffc2b60000 c0000000ee95fa58 0000000000000000 c0000000efc80000 [ 306.540000] $24 : 0000000000000000 0000000000000004 [ 306.548000] $28 : c0000000ee950000 c0000000ee95f738 0000000000000000 ffffffffc03a8044 [ 306.556000] Hi : 00000000000574a5 [ 306.560000] Lo : 6193b7a7e903d8c9 [ 306.564000] epc : ffffffffc03a80a8 jffs2_rtime_compress+0x98/0x198 [ 306.568000] Tainted: G W [ 306.572000] ra : ffffffffc03a8044 jffs2_rtime_compress+0x34/0x198 [ 306.580000] Status: 5000f8e3 KX SX UX KERNEL EXL IE [ 306.584000] Cause : 00800008 [ 306.588000] BadVA : c0000000dfff8002 [ 306.592000] PrId : 000c1100 (Netlogic XLP) [ 306.596000] Modules linked in: [ 306.596000] Process dd (pid: 170, threadinfo=c0000000ee950000, task=c0000000ee6e0858, tls=0000000000c47490) [ 306.608000] Stack : 7c547f377ddc7ee4 7ffc7f967f5d7fae 7f617f507fc37ff4 7e7d7f817f487f5f 7d8e7fec7ee87eb3 7e977ff27eec7f9e 7d677ec67f917f67 7f3d7e457f017ed7 7fd37f517f867eb2 7fed7fd17ca57e1d 7e5f7fe87f257f77 7fd77f0d7ede7fdb 7fba7fef7e197f99 7fde7fe07ee37eb5 7f5c7f8c7fc67f65 7f457fb87f847e93 7f737f3e7d137cd9 7f8e7e9c7fc47d25 7dbb7fac7fb67e52 7ff17f627da97f64 7f6b7df77ffa7ec5 80057ef17f357fb3 7f767fa27dfc7fd5 7fe37e8e7fd07e53 7e227fcf7efb7fa1 7f547e787fa87fcc 7fcb7fc57f5a7ffb 7fc07f6c7ea97e80 7e2d7ed17e587ee0 7fb17f9d7feb7f31 7f607e797e887faa 7f757fdd7c607ff3 7e877e657ef37fbd 7ec17fd67fe67ff7 7ff67f797ff87dc4 7eef7f3a7c337fa6 7fe57fc97ed87f4b 7ebe7f097f0b8003 7fe97e2a7d997cba 7f587f987f3c7fa9 ... [ 306.676000] Call Trace: [ 306.680000] [<ffffffffc03a80a8>] jffs2_rtime_compress+0x98/0x198 [ 306.684000] [<ffffffffc0394f10>] jffs2_selected_compress+0x110/0x230 [ 306.692000] [<ffffffffc039508c>] jffs2_compress+0x5c/0x388 [ 306.696000] [<ffffffffc039dc58>] jffs2_write_inode_range+0xd8/0x388 [ 306.704000] [<ffffffffc03971bc>] jffs2_write_end+0x16c/0x2d0 [ 306.708000] [<ffffffffc01d3d90>] generic_file_buffered_write+0xf8/0x2b8 [ 306.716000] [<ffffffffc01d4e7c>] __generic_file_aio_write+0x1ac/0x350 [ 306.720000] [<ffffffffc01d50a0>] generic_file_aio_write+0x80/0x168 [ 306.728000] [<ffffffffc021f7dc>] do_sync_write+0x94/0xf8 [ 306.732000] [<ffffffffc021ff6c>] vfs_write+0xa4/0x1a0 [ 306.736000] [<ffffffffc02202e8>] SyS_write+0x50/0x90 [ 306.744000] [<ffffffffc0116cc0>] handle_sys+0x180/0x1a0 [ 306.748000] [ 306.748000] Code: 020b202d 0205282d 90a50000 <90840000> 14a40038 00000000 0060602d 0000282d 016c5823 [ 306.760000] ---[ end trace 79dd088435be02d0 ]--- Segmentation fault This crash is caused because the 'positions' is declared as an array of signed short. The value of position is in the range 0..65535, and will be converted to a negative number when the position is greater than 32767 and causes a corruption and crash. Changing the definition to 'unsigned short' fixes this issue Signed-off-by: Jayachandran C <jchandra@broadcom.com> Signed-off-by: Kamlakant Patel <kamlakant.patel@broadcom.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18fs: NULL dereference in posix_acl_to_xattr()Dan Carpenter
commit 47ba9734403770a4c5e685b01f0a72b835dd4fff upstream. This patch moves the dereference of "buffer" after the check for NULL. The only place which passes a NULL parameter is gfs2_set_acl(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-06-18ext4: fix premature freeing of partial clusters split across leaf blocksEric Whitney
commit ad6599ab3ac98a4474544086e048ce86ec15a4d1 upstream. Xfstests generic/311 and shared/298 fail when run on a bigalloc file system. Kernel error messages produced during the tests report that blocks to be freed are already on the to-be-freed list. When e2fsck is run at the end of the tests, it typically reports bad i_blocks and bad free blocks counts. The bug that causes these failures is located in ext4_ext_rm_leaf(). Code at the end of the function frees a partial cluster if it's not shared with an extent remaining in the leaf. However, if all the extents in the leaf have been removed, the code dereferences an invalid extent pointer (off the front of the leaf) when the check for sharing is made. This generally has the effect of unconditionally freeing the partial cluster, which leads to the observed failures when the partial cluster is shared with the last extent in the next leaf. Fix this by attempting to free the cluster only if extents remain in the leaf. Any remaining partial cluster will be freed if possible when the next leaf is processed or when leaf removal is complete. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>