aboutsummaryrefslogtreecommitdiff
path: root/drivers/md
AgeCommit message (Collapse)Author
2016-02-29Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidAlex Shi
2016-02-26 Merge tag 'v3.10.98' into linux-linaro-lsk-v3.10Alex Shi
This is the 3.10.98 stable release
2016-02-19Revert "dm mpath: fix stalls when handling invalid ioctls"Mauricio Faria de Oliveira
commit 47796938c46b943d157ac8a6f9ed4e3b98b83cf4 upstream. This reverts commit a1989b330093578ea5470bea0a00f940c444c466. That commit introduced a regression at least for the case of the SG_IO ioctl() running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there are no active paths: the ioctl() fails with the ENOTTY errno immediately rather than blocking due to queue_if_no_path until a path becomes active, for example. That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]) from multipath devices; which leads to SCSI/filesystem errors in such a guest. More general scenarios can hit that regression too. The following demonstration employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective (some output & user changes omitted for brevity and comments added for clarity). Reverting that commit restores normal operation (queueing) in failing scenarios; tested on linux-next (next-20151022). 1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM) $ cat sg_simple0.c ... see [3] ... $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c $ gcc sgio_inquiry.c -o sgio_inquiry 2) The ioctl() works fine with active paths present. # multipath -l 85ag56 85ag56 (...) dm-19 IBM ,2145 size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='service-time 0' prio=0 status=active | |- 8:0:11:0 sdz 65:144 active undef running | `- 9:0:9:0 sdbf 67:144 active undef running `-+- policy='service-time 0' prio=0 status=enabled |- 8:0:12:0 sdae 65:224 active undef running `- 9:0:12:0 sdbo 68:32 active undef running $ ./sgio_inquiry /dev/mapper/85ag56 Some of the INQUIRY command's response: IBM 2145 0000 INQUIRY duration=0 millisecs, resid=0 3) The ioctl() fails with ENOTTY errno with _no_ active paths present, for unprivileged users (rather than blocking due to queue_if_no_path). # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \ do multipathd -k"fail path $path"; done # multipath -l 85ag56 85ag56 (...) dm-19 IBM ,2145 size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw |-+- policy='service-time 0' prio=0 status=enabled | |- 8:0:11:0 sdz 65:144 failed undef running | `- 9:0:9:0 sdbf 67:144 failed undef running `-+- policy='service-time 0' prio=0 status=enabled |- 8:0:12:0 sdae 65:224 failed undef running `- 9:0:12:0 sdbo 68:32 failed undef running $ ./sgio_inquiry /dev/mapper/85ag56 sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device 4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285); it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl(). $ dmesg <...> [] device-mapper: multipath: Failing path 65:144. [] device-mapper: multipath: Failing path 67:144. [] device-mapper: multipath: Failing path 65:224. [] device-mapper: multipath: Failing path 68:32. [] sgio_inquiry: sending ioctl 2285 to a partition! 5) The ioctl() only works if the SYS_CAP_RAWIO capability is present (then queueing happens -- in this example, queue_if_no_path is set); this is due to a conditional check in scsi_verify_blk_ioctl(). # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56' sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device # ./sgio_inquiry /dev/mapper/85ag56 & [1] 72830 # cat /proc/72830/stack [<c00000171c0df700>] 0xc00000171c0df700 [<c000000000015934>] __switch_to+0x204/0x350 [<c000000000152d4c>] msleep+0x5c/0x80 [<c00000000077dfb0>] dm_blk_ioctl+0x70/0x170 [<c000000000487c40>] blkdev_ioctl+0x2b0/0x9b0 [<c0000000003128e4>] block_ioctl+0x64/0xd0 [<c0000000002dd3b0>] do_vfs_ioctl+0x490/0x780 [<c0000000002dd774>] SyS_ioctl+0xd4/0xf0 [<c000000000009358>] system_call+0x38/0xd0 6) This is the function call chain exercised in this analysis: SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c -> do_vfs_ioctl() -> vfs_ioctl() ... error = filp->f_op->unlocked_ioctl(filp, cmd, arg); ... -> dm_blk_ioctl() @ drivers/md/dm.c -> multipath_ioctl() @ drivers/md/dm-mpath.c ... (bdev = NULL, due to no active paths) ... if (!bdev || <...>) { int err = scsi_verify_blk_ioctl(NULL, cmd); if (err) r = err; } ... -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c ... if (bd && bd == bd->bd_contains) // not taken (bd = NULL) return 0; ... if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user) return 0; ... printk_ratelimited(KERN_WARNING "%s: sending ioctl %x to a partition!\n" <...>); return -ENOIOCTLCMD; <- ... return r ? : <...> <- ... if (error == -ENOIOCTLCMD) error = -ENOTTY; out: return error; ... Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') [3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03) Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-19dm btree: fix leak of bufio-backed block in btree_split_sibling error pathMike Snitzer
commit 30ce6e1cc5a0f781d60227e9096c86e188d2c2bd upstream. The block allocated at the start of btree_split_sibling() is never released if later insert_at() fails. Fix this by releasing the previously allocated bufio block using unlock_block(). Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-30Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidKevin Hilman
2015-11-10Merge tag 'v3.10.93' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into linux-linaro-lsk-v3.10 This is the 3.10.93 stable release # gpg: Signature made Mon Nov 9 10:13:39 2015 PST using RSA key ID 6092693E # gpg: Good signature from "Greg Kroah-Hartman (Linux kernel stable release signing key) <greg@kroah.com>" * tag 'v3.10.93' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable: (24 commits) Linux 3.10.93 xen: fix backport of previous kexec patch IB/cm: Fix rb-tree duplicate free and use-after-free mvsas: Fix NULL pointer dereference in mvs_slot_task_free md/raid10: submit_bio_wait() returns 0 on success md/raid1: submit_bio_wait() returns 0 on success crypto: api - Only abort operations on fatal signal module: Fix locking in symbol_put_addr() xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing) xhci: handle no ping response error properly dm btree: fix leak of bufio-backed block in btree_split_beneath error path dm btree remove: fix a bug when rebalancing nodes after removal Revert "ARM64: unwind: Fix PC calculation" rbd: prevent kernel stack blow up on rbd map rbd: don't leak parent_spec in rbd_dev_probe_parent() rbd: require stable pages if message data CRCs are enabled drm/nouveau/gem: return only valid domain when there's only one mm: make sendfile(2) killable ASoC: wm8904: Correct number of EQ registers powerpc/rtas: Validate rtas.entry before calling enter_rtas() ...
2015-11-09md/raid10: submit_bio_wait() returns 0 on successJes Sorensen
commit 681ab4696062f5aa939c9e04d058732306a97176 upstream. This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b which changed the return value of submit_bio_wait() to return != 0 on error, but didn't update the caller accordingly. Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md") Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-09md/raid1: submit_bio_wait() returns 0 on successJes Sorensen
commit 203d27b0226a05202438ddb39ef0ef1acb14a759 upstream. This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b which changed the return value of submit_bio_wait() to return != 0 on error, but didn't update the caller accordingly. Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md") Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-09dm btree: fix leak of bufio-backed block in btree_split_beneath error pathMike Snitzer
commit 4dcb8b57df3593dcb20481d9d6cf79d1dc1534be upstream. btree_split_beneath()'s error path had an outstanding FIXME that speaks directly to the potential for _not_ cleaning up a previously allocated bufio-backed block. Fix this by releasing the previously allocated bufio block using unlock_block(). Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-09dm btree remove: fix a bug when rebalancing nodes after removalJoe Thornber
commit 2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream. Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't a complete fix for redistribute3(). The redistribute3 function takes 3 btree nodes and shares out the entries evenly between them. If the three nodes in total contained (MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in the center. Fix this issue by being more careful about calculating the target number of entries for the left and right nodes. Unit tested in userspace using this program: https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-11-04Merge tag 'v3.10.92' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into linux-linaro-lsk-v3.10 This is the 3.10.92 stable release # gpg: Signature made Mon Oct 26 17:45:12 2015 PDT using RSA key ID 6092693E # gpg: Good signature from "Greg Kroah-Hartman (Linux kernel stable release signing key) <greg@kroah.com>" * tag 'v3.10.92' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable: (73 commits) Linux 3.10.92 rbd: fix double free on rbd_dev->header_name dm thin: fix missing pool reference count decrement in pool_ctr error path workqueue: make sure delayed work run in local cpu i2c: rcar: enable RuntimePM before registering to the core crypto: ahash - ensure statesize is non-zero crypto: sparc - initialize blkcipher.ivsize m68k/uaccess: Fix asm constraints for userspace access asix: Do full reset during ax88772_bind asix: Don't reset PHY on if_up for ASIX 88772 ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings ppp: don't override sk->sk_state in pppoe_flush_dev() net: add pfmemalloc check in sk_add_backlog() skbuff: Fix skb checksum partial check. skbuff: Fix skb checksum flag on skb pull af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag af_unix: Convert the unix_sk macro to an inline function for type safety l2tp: protect tunnel->del_work by ref_count Linux 3.10.91 3w-9xxx: don't unmap bounce buffered commands ...
2015-10-27dm thin: fix missing pool reference count decrement in pool_ctr error pathMike Snitzer
commit ba30670f4d5292c4e7f7980bbd5071f7c4794cdd upstream. Fixes: ac8c3f3df ("dm thin: generate event when metadata threshold passed") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22dm cache: fix NULL pointer when switching from cleaner policyJoe Thornber
commit 2bffa1503c5c06192eb1459180fac4416575a966 upstream. The cleaner policy doesn't make use of the per cache block hint space in the metadata (unlike the other policies). When switching from the cleaner policy to mq or smq a NULL pointer crash (in dm_tm_new_block) was observed. The crash was caused by bugs in dm-cache-metadata.c when trying to skip creation of the hint btree. The minimal fix is to change hint size for the cleaner policy to 4 bytes (only hint size supported). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22md: flush ->event_work before stopping array.NeilBrown
commit ee5d004fd0591536a061451eba2b187092e9127c upstream. The 'event_work' worker used by dm-raid may still be running when the array is stopped. This can result in an oops. So flush the workqueue on which it is run after detaching and before destroying the device. Reported-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22dm raid: fix round up of default region sizeMikulas Patocka
commit 042745ee53a0a7c1f5aff191a4a24213c6dcfb52 upstream. Commit 3a0f9aaee028 ("dm raid: round region_size to power of two") intended to make sure that the default region size is a power of two. However, the logic in that commit is incorrect and sets the variable region_size to 0 or 1, depending on whether min_region_size is a power of two. Fix this logic, using roundup_pow_of_two(), so that region_size is properly rounded up to the next power of two. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 3a0f9aaee028 ("dm raid: round region_size to power of two") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22dm btree: add ref counting ops for the leaves of top level btreesJoe Thornber
commit b0dc3c8bc157c60b1d470163882be8c13e1950af upstream. When using nested btrees, the top leaves of the top levels contain block addresses for the root of the next tree down. If we shadow a shared leaf node the leaf values (sub tree roots) should be incremented accordingly. This is only an issue if there is metadata sharing in the top levels. Which only occurs if metadata snapshots are being used (as is possible with dm-thinp). And could result in a block from the thinp metadata snap being reused early, thus corrupting the thinp metadata snap. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-14Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidlsk-v3.10-15.11-androidlsk-v3.10-15.10-androidKevin Hilman
2015-10-13Merge branch 'linux-3.10.y' of ↵lsk-v3.10-15.10Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into linux-linaro-lsk-v3.10 * 'linux-3.10.y' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable: (77 commits) Linux 3.10.90 Revert "iio: bmg160: IIO_BUFFER and IIO_TRIGGERED_BUFFER are required" vfs: Remove incorrect debugging WARN in prepend_path fib_rules: fix fib rule dumps across multiple skbs sctp: fix race on protocol/netns initialization net/ipv6: Correct PIM6 mrt_lock handling ipv6: fix exthdrs offload registration in out_rt path usbnet: Get EVENT_NO_RUNTIME_PM bit before it is cleared ip6_gre: release cached dst on tunnel removal rds: fix an integer overflow test in rds_info_getsockopt() netlink: don't hold mutex in rcu callback when releasing mmapd ring inet: frags: fix defragmented packet's IP header for af_packet bonding: fix destruction of bond with devices different from arphrd_ether ipv6: lock socket in ip6_datagram_connect() isdn/gigaset: reset tty->receive_room when attaching ser_gigaset bridge: mdb: fix double add notification net: Fix skb_set_peeked use-after-free bug net: Fix skb csum races when peeking net: Clone skb before setting peeked flag net: call rcu_read_lock early in process_backlog ...
2015-10-01md/raid10: always set reshape_safe when initializing reshape_position.NeilBrown
commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream. 'reshape_position' tracks where in the reshape we have reached. 'reshape_safe' tracks where in the reshape we have safely recorded in the metadata. These are compared to determine when to update the metadata. So it is important that reshape_safe is initialised properly. Currently it isn't. When starting a reshape from the beginning it usually has the correct value by luck. But when reducing the number of devices in a RAID10, it has the wrong value and this leads to the metadata not being updated correctly. This can lead to corruption if the reshape is not allowed to complete. This patch is suitable for any -stable kernel which supports RAID10 reshape, which is 3.5 and later. Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-14Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidlsk-v3.10-15.09-androidKevin Hilman
2015-09-14Merge tag 'v3.10.88' of ↵lsk-v3.10-15.09Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into linux-linaro-lsk-v3.10 This is the 3.10.88 stable release * tag 'v3.10.88' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable: (48 commits) Linux 3.10.88 arm64/mm: Remove hack in mmap randomize layout crypto: caam - fix memory corruption in ahash_final_ctx libfc: Fix fc_fcp_cleanup_each_cmd() drm/radeon: add new OLAND pci id EDAC, ppc4xx: Access mci->csrows array elements properly localmodconfig: Use Kbuild files too dm thin metadata: delete btrees when releasing metadata snapshot perf: Fix fasync handling on inherited events mm/hwpoison: fix page refcount of unknown non LRU page ipc/sem.c: update/correct memory barriers ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits Linux 3.10.87 mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations md/bitmap: return an error when bitmap superblock is corrupt. kvm: x86: fix kvm_apic_has_events to check for NULL pointer signal: fix information leak in copy_siginfo_from_user32 signal: fix information leak in copy_siginfo_to_user signalfd: fix information leak in signalfd_copyinfo ARM: 7819/1: fiq: Cast the first argument of flush_icache_range() ...
2015-09-13dm thin metadata: delete btrees when releasing metadata snapshotJoe Thornber
commit 7f518ad0a212e2a6fd68630e176af1de395070a7 upstream. The device details and mapping trees were just being decremented before. Now btree_del() is called to do a deep delete. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16md/bitmap: return an error when bitmap superblock is corrupt.NeilBrown
commit b97e92574c0bf335db1cd2ec491d8ff5cd5d0b49 upstream Use separate bitmaps for each nodes in the cluster bitmap_read_sb() validates the bitmap superblock that it reads in. If it finds an inconsistency like a bad magic number or out-of-range version number, it prints an error and returns, but it incorrectly returns zero, so the array is still assembled with the (invalid) bitmap. This means it could try to use a bitmap with a new version number which it therefore does not understand. This bug was introduced in 3.5 and fix as part of a larger patch in 4.1. So the patch is suitable for any -stable kernel in that range. Fixes: 27581e5ae01f ("md/bitmap: centralise allocation of bitmap file pages.") Signed-off-by: NeilBrown <neilb@suse.com> Reported-by: GuoQing Jiang <gqjiang@suse.com>
2015-08-16md/raid1: extend spinlock to protect raid1_end_read_request against ↵NeilBrown
inconsistencies commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream. raid1_end_read_request() assumes that the In_sync bits are consistent with the ->degaded count. raid1_spare_active updates the In_sync bit before the ->degraded count and so exposes an inconsistency, as does error() So extend the spinlock in raid1_spare_active() and error() to hide those inconsistencies. This should probably be part of Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from last working device'.") as it addresses the same issue. It fixes the same bug and should go to -stable for same reasons. Fixes: 76073054c95b ("md/raid1: clean up read_balance.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16md: use kzalloc() when bitmap is disabledBenjamin Randazzo
commit b6878d9e03043695dbf3fa1caa6dfc09db225b16 upstream. In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a mdu_bitmap_file_t called "file". 5769 file = kmalloc(sizeof(*file), GFP_NOIO); 5770 if (!file) 5771 return -ENOMEM; This structure is copied to user space at the end of the function. 5786 if (err == 0 && 5787 copy_to_user(arg, file, sizeof(*file))) 5788 err = -EFAULT But if bitmap is disabled only the first byte of "file" is initialized with zero, so it's possible to read some bytes (up to 4095) of kernel space memory from user space. This is an information leak. 5775 /* bitmap disabled, zero the first byte and copy out */ 5776 if (!mddev->bitmap_info.file) 5777 file->pathname[0] = '\0'; Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-14Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidlsk-v3.10-15.08-androidKevin Hilman
Conflicts: fs/exec.c Resolution summary: Conflict between upstream/LTS commit 9eae8ac6ab40 (fs: take i_mutex during prepare_binprm for set[ug]id executables) and android commit 9d0ff694bc22 (sched: move no_new_privs into new atomic flags). Resolution: move task_no_new_privs() usage into new function created by upstream/LTS comit.
2015-08-14Merge tag 'v3.10.86' into linux-linaro-lsk-v3.10lsk-v3.10-15.08Kevin Hilman
This is the 3.10.86 stable release * tag 'v3.10.86': (132 commits) Linux 3.10.86 efi: fix 32bit kernel boot failed problem using efi iscsi-target: Fix iser explicit logout TX kthread leak iscsi-target: Fix use-after-free during TPG session shutdown vhost: actually track log eventfd file rds: rds_ib_device.refcount overflow xhci: prevent bus_suspend if SS port resuming in phase 1 xhci: report U3 when link is in resume state xhci: Calculate old endpoints correctly on device reset usb-storage: ignore ZTE MF 823 card reader in mode 0x1225 ata: pmp: add quirk for Marvell 4140 SATA PMP blkcg: fix gendisk reference leak in blkg_conf_prep() Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen tile: use free_bootmem_late() for initrd md/raid1: fix test for 'was read error from last working device'. mmc: sdhci-pxav3: fix platform_data is not initialized mmc: sdhci-esdhc: Make 8BIT bus work mac80211: clear subdir_stations when removing debugfs st: null pointer dereference panic caused by use after kref_put by st_open ALSA: hda - Fix MacBook Pro 5,2 quirk ...
2015-08-10md/raid1: fix test for 'was read error from last working device'.NeilBrown
commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream. When we get a read error from the last working device, we don't try to repair it, and don't fail the device. We simple report a read error to the caller. However the current test for 'is this the last working device' is wrong. When there is only one fully working device, it assumes that a non-faulty device is that device. However a spare which is rebuilding would be non-faulty but so not the only working device. So change the test from "!Faulty" to "In_sync". If ->degraded says there is only one fully working device and this device is in_sync, this must be the one. This bug has existed since we allowed read_balance to read from a recovering spare in v3.0 Reported-and-tested-by: Alexander Lyakas <alex.bolshoy@gmail.com> Fixes: 76073054c95b ("md/raid1: clean up read_balance.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03md: fix a build warningFiro Yang
commit 4e023612325a9034a542bfab79f78b1fe5ebb841 upstream. Warning like this: drivers/md/md.c: In function "update_array_info": drivers/md/md.c:6394:26: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] !mddev->persistent != info->not_persistent|| Fix it as Neil Brown said: mddev->persistent != !info->not_persistent || Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03dm btree: silence lockdep lock inversion in dm_btree_del()Joe Thornber
commit 1c7518794a3647eb345d59ee52844e8a40405198 upstream. Allocate memory using GFP_NOIO when deleting a btree. dm_btree_del() can be called via an ioctl and we don't want to recurse into the FS or block layer. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03dm btree remove: fix bug in redistribute3Dennis Yang
commit 4c7e309340ff85072e96f529582d159002c36734 upstream. redistribute3() shares entries out across 3 nodes. Some entries were being moved the wrong way, breaking the ordering. This manifested as a BUG() in dm-btree-remove.c:shift() when entries were removed from the btree. For additional context see: https://www.redhat.com/archives/dm-devel/2015-May/msg00113.html Signed-off-by: Dennis Yang <shinrairis@gmail.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-24Merge branch 'linux-linaro-lsk-v3.10' into linux-linaro-lsk-v3.10-androidAlex Shi
2015-06-23Merge branch 'v3.10/topic/dm-crypt' into linux-linaro-lsk-v3.10linux-linaro-lsk-v3.10-testAlex Shi
2015-06-23dm crypt: use per-bio datav3.10/topic/dm-cryptMikulas Patocka
Change dm-crypt so that it uses auxiliary data allocated with the bio. Dm-crypt requires two allocations per request - struct dm_crypt_io and struct ablkcipher_request (with other data appended to it). It previously only used mempool allocations. Some requests may require more dm_crypt_ios and ablkcipher_requests, however most requests need just one of each of these two structures to complete. This patch changes it so that the first dm_crypt_io and ablkcipher_request are allocated with the bio (using target per_bio_data_size option). If the request needs additional values, they are allocated from the mempool. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit 298a9fa08a1577211d42a75e8fc073baef61e0d9) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: drivers/md/dm-crypt.c Solutions: reuse the old bio->sector instead of bvec ioterator
2015-06-23dm crypt: use unbound workqueue for request processingMikulas Patocka
Use unbound workqueue by default so that work is automatically balanced between available CPUs. The original behavior of encrypting using the same cpu that IO was submitted on can still be enabled by setting the optional 'same_cpu_crypt' table argument. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit f3396c58fd8442850e759843457d78b6ec3a9589) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-23dm crypt: use memzero_explicit for on-stack bufferMilan Broz
Use memzero_explicit to cleanup sensitive data allocated on stack to prevent the compiler from optimizing and removing memset() calls. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org (cherry picked from commit 1a71d6ffe18c0d0f03fc8531949cc8ed41d702ee) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-23dm crypt: add TCW IV mode for old CBC TCRYPT containersMilan Broz
dm-crypt can already activate TCRYPT (TrueCrypt compatible) containers in LRW or XTS block encryption mode. TCRYPT containers prior to version 4.1 use CBC mode with some additional tweaks, this patch adds support for these containers. This new mode is implemented using special IV generator named TCW (TrueCrypt IV with whitening). TCW IV only supports containers that are encrypted with one cipher (Tested with AES, Twofish, Serpent, CAST5 and TripleDES). While this mode is legacy and is known to be vulnerable to some watermarking attacks (e.g. revealing of hidden disk existence) it can still be useful to activate old containers without using 3rd party software or for independent forensic analysis of such containers. (Both the userspace and kernel code is an independent implementation based on the format documentation and it completely avoids use of original source code.) The TCW IV generator uses two additional keys: Kw (whitening seed, size is always 16 bytes - TCW_WHITENING_SIZE) and Kiv (IV seed, size is always the IV size of the selected cipher). These keys are concatenated at the end of the main encryption key provided in mapping table. While whitening is completely independent from IV, it is implemented inside IV generator for simplification. The whitening value is always 16 bytes long and is calculated per sector from provided Kw as initial seed, xored with sector number and mixed with CRC32 algorithm. Resulting value is xored with ciphertext sector content. IV is calculated from the provided Kiv as initial IV seed and xored with sector number. Detailed calculation can be found in the Truecrypt documentation for version < 4.1 and will also be described on dm-crypt site, see: http://code.google.com/p/cryptsetup/wiki/DMCrypt The experimental support for activation of these containers is already present in git devel brach of cryptsetup. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit ed04d98169f1c33ebc79f510c855eed83924d97f) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-23dm crypt: properly handle extra key string in initializationMilan Broz
Some encryption modes use extra keys (e.g. loopAES has IV seed) which are not used in block cipher initialization but are part of key string in table constructor. This patch adds an additional field which describes the length of the extra key(s) and substracts it before real key encryption setting. The key_size always includes the size, in bytes, of the key provided in mapping table. The key_parts describes how many parts (usually keys) are contained in the whole key buffer. And key_extra_size contains size in bytes of additional keys part (this number of bytes must be subtracted because it is processed by the IV generator). | K1 | K2 | .... | K64 | Kiv | |----------- key_size ----------------- | | |-key_extra_size-| | [64 keys] | [1 key] | => key_parts = 65 Example where key string contains main key K, whitening key Kw and IV seed Kiv: | K | Kiv | Kw | |--------------- key_size --------------| | |-----key_extra_size------| | [1 key] | [1 key] | [1 key] | => key_parts = 3 Because key_extra_size is calculated during IV mode setting, key initialization is moved after this step. For now, this change has no effect to supported modes (thanks to ilog2 rounding) but it is required by the following patch. Also, fix a sparse warning in crypt_iv_lmk_one(). Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> (cherry picked from commit da31a0787a2ac92dd219ce0d33322160b66d6a01) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-23dm: stop using WQ_NON_REENTRANTTejun Heo
dbf2576e37 ("workqueue: make all workqueues non-reentrant") made WQ_NON_REENTRANT no-op and the flag is going away. Remove its usages. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com> (cherry picked from commit 670368a8ddc5df56437444c33b8089afc547c30a) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-06-05md/raid5: don't record new size if resize_stripes fails.NeilBrown
commit 6e9eac2dcee5e19f125967dd2be3e36558c42fff upstream. If any memory allocation in resize_stripes fails we will return -ENOMEM, but in some cases we update conf->pool_size anyway. This means that if we try again, the allocations will be assumed to be larger than they are, and badness results. So only update pool_size if there is no error. This bug was introduced in 2.6.17 and the patch is suitable for -stable. Fixes: ad01c9e3752f ("[PATCH] md: Allow stripes to be expanded in preparation for expanding an array") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-18Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidAlex Shi
2015-04-13dm: hold suspend_lock while suspending device during device deletionMikulas Patocka
commit ab7c7bb6f4ab95dbca96fcfc4463cd69843e3e24 upstream. __dm_destroy() must take the suspend_lock so that its presuspend and postsuspend calls do not race with an internal suspend. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-19Merge branch 'linux-linaro-lsk' into linux-linaro-lsk-androidlsk-v3.10-15.03-androidAlex Shi
2015-03-18dm snapshot: fix a possible invalid memory access on unloadMikulas Patocka
commit 22aa66a3ee5b61e0f4a0bfeabcaa567861109ec3 upstream. When the snapshot target is unloaded, snapshot_dtr() waits until pending_exceptions_count drops to zero. Then, it destroys the snapshot. Therefore, the function that decrements pending_exceptions_count should not touch the snapshot structure after the decrement. pending_complete() calls free_pending_exception(), which decrements pending_exceptions_count, and then it performs up_write(&s->lock) and it calls retry_origin_bios() which dereferences s->origin. These two memory accesses to the fields of the snapshot may touch the dm_snapshot struture after it is freed. This patch moves the call to free_pending_exception() to the end of pending_complete(), so that the snapshot will not be destroyed while pending_complete() is in progress. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18dm: fix a race condition in dm_get_mdMikulas Patocka
commit 2bec1f4a8832e74ebbe859f176d8a9cb20dd97f4 upstream. The function dm_get_md finds a device mapper device with a given dev_t, increases the reference count and returns the pointer. dm_get_md calls dm_find_md, dm_find_md takes _minor_lock, finds the device, tests that the device doesn't have DMF_DELETING or DMF_FREEING flag, drops _minor_lock and returns pointer to the device. dm_get_md then calls dm_get. dm_get calls BUG if the device has the DMF_FREEING flag, otherwise it increments the reference count. There is a possible race condition - after dm_find_md exits and before dm_get is called, there are no locks held, so the device may disappear or DMF_FREEING flag may be set, which results in BUG. To fix this bug, we need to call dm_get while we hold _minor_lock. This patch renames dm_find_md to dm_get_md and changes it so that it calls dm_get while holding the lock. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18dm io: reject unsupported DISCARD requests with EOPNOTSUPPDarrick J. Wong
commit 37527b869207ad4c208b1e13967d69b8bba1fbf9 upstream. I created a dm-raid1 device backed by a device that supports DISCARD and another device that does NOT support DISCARD with the following dm configuration: # echo '0 2048 mirror core 1 512 2 /dev/sda 0 /dev/sdb 0' | dmsetup create moo # lsblk -D NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO sda 0 4K 1G 0 `-moo (dm-0) 0 4K 1G 0 sdb 0 0B 0B 0 `-moo (dm-0) 0 4K 1G 0 Notice that the mirror device /dev/mapper/moo advertises DISCARD support even though one of the mirror halves doesn't. If I issue a DISCARD request (via fstrim, mount -o discard, or ioctl BLKDISCARD) through the mirror, kmirrord gets stuck in an infinite loop in do_region() when it tries to issue a DISCARD request to sdb. The problem is that when we call do_region() against sdb, num_sectors is set to zero because q->limits.max_discard_sectors is zero. Therefore, "remaining" never decreases and the loop never terminates. To fix this: before entering the loop, check for the combination of REQ_DISCARD and no discard and return -EOPNOTSUPP to avoid hanging up the mirror device. This bug was found by the unfortunate coincidence of pvmove and a discard operation in the RHEL 6.5 kernel; upstream is also affected. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18dm mirror: do not degrade the mirror on discard errorMikulas Patocka
commit f2ed51ac64611d717d1917820a01930174c2f236 upstream. It may be possible that a device claims discard support but it rejects discards with -EOPNOTSUPP. It happens when using loopback on ext2/ext3 filesystem driven by the ext4 driver. It may also happen if the underlying devices are moved from one disk on another. If discard error happens, we reject the bio with -EOPNOTSUPP, but we do not degrade the array. This patch fixes failed test shell/lvconvert-repair-transient.sh in the lvm2 testsuite if the testsuite is extracted on an ext2 or ext3 filesystem and it is being driven by the ext4 driver. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-16Merge branch 'linaro-android-3.10-lsk' ofAlex Shi
git://android.git.linaro.org/kernel/linaro-android into linux-linaro-lsk-android
2015-03-11dm-verity: Add modes and emit uevent on corrupted blocksSami Tolvanen
Add a device specific mode to dm-verity for handling corrupted blocks: DM_VERITY_MODE_EIO is the default behavior, where reading a corrupted block results in -EIO. DM_VERITY_MODE_LOGGING only logs corrupted blocks, but does not block the read. DM_VERITY_MODE_RESTART calls kernel_restart when a corrupted block is discovered. Each mode sends a uevent to notify userspace of corruption and allow further recovery actions. Defaults to previous behavior, other modes can be enabled with an optional parameter added to the verity table. Change-Id: Ib72ae6ccb865594d28f3553bdcc5a40b1d7af390 Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2015-03-06md/raid1: fix read balance when a drive is write-mostly.Tomáš Hodek
commit d1901ef099c38afd11add4cfb3312c02ef21ec4a upstream. When a drive is marked write-mostly it should only be the target of reads if there is no other option. This behaviour was broken by commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc md/raid1: read balance chooses idlest disk for SSD which causes a write-mostly device to be *preferred* is some cases. Restore correct behaviour by checking and setting best_dist_disk and best_pending_disk rather than best_disk. We only need to test one of these as they are both changed from -1 or >=0 at the same time. As we leave min_pending and best_dist unchanged, any non-write-mostly device will appear better than the write-mostly device. Reported-by: Tomáš Hodek <tomas.hodek@volny.cz> Reported-by: Dark Penguin <darkpenguin@yandex.ru> Signed-off-by: NeilBrown <neilb@suse.de> Link: http://marc.info/?l=linux-raid&m=135982797322422 Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>