aboutsummaryrefslogtreecommitdiff
path: root/drivers/md/dm-stripe.c
AgeCommit message (Collapse)Author
2012-08-20workqueue: deprecate flush[_delayed]_work_sync()Tejun Heo
flush[_delayed]_work_sync() are now spurious. Mark them deprecated and convert all users to flush[_delayed]_work(). If you're cc'd and wondering what's going on: Now all workqueues are non-reentrant and the regular flushes guarantee that the work item is not pending or running on any CPU on return, so there's no reason to use the sync flushes at all and they're going away. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mattia Dongili <malattia@linux.it> Cc: Kent Yoder <key@linux.vnet.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: Bryan Wu <bryan.wu@canonical.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-wireless@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: Sangbeom Kim <sbkim73@samsung.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Avi Kivity <avi@redhat.com>
2012-07-27dm thin: commit before gathering statusAlasdair G Kergon
Commit outstanding metadata before returning the status for a dm thin pool so that the numbers reported are as up-to-date as possible. The commit is not performed if the device is suspended or if the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target through a new 'status_flags' parameter in the target's dm_status_fn. The userspace dmsetup tool will support the --noflush flag with the 'dmsetup status' and 'dmsetup wait' commands from version 1.02.76 onwards. Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm stripe: optimize chunk_size calculationsMikulas Patocka
dm-stripe is usually used with a chunk size that is a power of two. Use faster shifts and bit masks in such cases. stripe_width is already optimized in a similar way. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm stripe: remove minimum stripe sizeMikulas Patocka
There is no technical limitation in device mapper that would prevent the dm-stripe target from using a stripe size smaller than page size. This patch removes the limit and makes stripe volumes portable across architectures with different page size. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm stripe: support for non power of 2 chunksizeMike Snitzer
Support non-power-of-2 chunk sizes with dm striping for proper alignment of stripe IO on storage that has non-power-of-2 optimal IO sizes (e.g. RAID6 10+2). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm: support non power of two target max_io_lenMike Snitzer
Remove the restriction that limits a target's specified maximum incoming I/O size to be a power of 2. Rename this setting from 'split_io' to the less-ambiguous 'max_io_len'. Change it from sector_t to uint32_t, which is plenty big enough, and introduce a wrapper function dm_set_target_max_io_len() to set it. Use sector_div() to process it now that it is not necessarily a power of 2. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm stripe: remove stripes_maskMikulas Patocka
The structure stripe_c contains a stripes_mask field. This field is useless because it can be trivially calculated by subtracting one from stripes. It is used only at one place. This patch removes it. The patch also changes ffs(stripes) - 1 to __ffs(stripes). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm stripe: fix size testMikulas Patocka
dm-stripe is supposed to ensure that all the space allocated to the stripes is fully used and that all stripes are the same size. This patch fixes the test. It checks that device length is divisible by the chunk size and checks that the resulting quotient is divisible by the number of stripes (which is equivalent to testing if device length is divisible by chunk_size * stripes). Previously, the code only tested that the number of sectors in the target was divisible by each of the chunk size and the number of stripes separately, which could leave entire stripes unused. (A setup that genuinely needs some stripes to be shorter than others can be created by concatenating striped targets.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm: replace simple_strtoulmajianpeng
Replace obsolete simple_strtoul() with kstrtou8/kstrtouint. Signed-off-by: majianpeng <majianpeng@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-03-28dm: reject trailing characters in sccanf inputMikulas Patocka
Device mapper uses sscanf to convert arguments to numbers. The problem is that the way we use it ignores additional unmatched characters in the scanned string. For example, this `if (sscanf(string, "%d", &number) == 1)' will match a number, but also it will match number with some garbage appended, like "123abc". As a result, device mapper accepts garbage after some numbers. For example the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"' will pass without an error. This patch fixes all sscanf uses in device mapper. It appends "%c" with a pointer to a dummy character variable to every sscanf statement. The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds only if string is a null-terminated number (optionally preceded by some whitespace characters). If there is some character appended after the number, sscanf matches "%c", writes the character to the dummy variable and returns 2. We check the return value for 1 and consequently reject numbers with some garbage appended. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-03-24dm stripe: implement merge methodMustafa Mesanovic
Implement a merge function in the striped target. When the striped target's underlying devices provide a merge_bvec_fn (like all DM devices do via dm_merge_bvec) it is important to call down to them when building a biovec that doesn't span a stripe boundary. Without the merge method, a striped DM device stacked on DM devices causes bios with a single page to be submitted which results in unnecessary overhead that hurts performance. This change really helps filesystems (e.g. XFS and now ext4) which take care to assemble larger bios. By implementing stripe_merge(), DM and the stripe target no longer undermine the filesystem's work by only allowing a single page per bio. Buffered IO sees the biggest improvement (particularly uncached reads, buffered writes to a lesser degree). This is especially so for more capable "enterprise" storage LUNs. The performance improvement has been measured to be ~12-35% -- when a reasonable chunk_size is used (e.g. 64K) in conjunction with a stripe count that is a power of 2. In contrast, the performance penalty is ~5-7% for the pathological worst case stripe configuration (small chunk_size with a stripe count that is not a power of 2). The reason for this is that stripe_map_sector() is now called once for every call to dm_merge_bvec(). stripe_map_sector() will use slower division if stripe count isn't a power of 2. Signed-off-by: Mustafa Mesanovic <mume@linux.vnet.ibm.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-01-13dm stripe: switch from local workqueue to system_wqTejun Heo
kstriped only serves sc->kstriped_ws which runs dm_table_event(). This doesn't need to be executed from an ordered workqueue w/ rescuer. Drop kstriped and use the system_wq instead. While at it, rename kstriped_ws to trigger_event so that it's consistent with other dm modules. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-09-10dm: implement REQ_FLUSH/FUA support for bio-based dmTejun Heo
This patch converts bio-based dm to support REQ_FLUSH/FUA instead of now deprecated REQ_HARDBARRIER. * -EOPNOTSUPP handling logic dropped. * Preflush is handled as before but postflush is dropped and replaced with passing down REQ_FUA to member request_queues. This replaces one array wide cache flush w/ member specific FUA writes. * __split_and_process_bio() now calls __clone_and_map_flush() directly for flushes and guarantees all FLUSH bio's going to targets are zero ` length. * It's now guaranteed that all FLUSH bio's which are passed onto dm targets are zero length. bio_empty_barrier() tests are replaced with REQ_FLUSH tests. * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes. * Dropped unlikely() around REQ_FLUSH tests. Flushes are not unlikely enough to be marked with unlikely(). * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue doesn't support cache flushing. Advertise REQ_FLUSH | REQ_FUA capability. * Request based dm isn't converted yet. dm_init_request_based_queue() resets flush support to 0 for now. To avoid disturbing request based dm code, dm->flush_error is added for bio based dm while requested based dm continues to use dm->barrier_error. Lightly tested linear, stripe, raid1, snap and crypt targets. Please proceed with caution as I'm not familiar with the code base. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: dm-devel@redhat.com Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-12dm stripe: support discardsMikulas Patocka
The DM core will submit a discard bio to the stripe target for each stripe in a striped DM device. The stripe target will determine stripe-specific portions of the supplied bio to be remapped into individual (at most 'num_discard_requests' extents). If a given stripe-specific discard bio doesn't touch a particular stripe the bio will be dropped. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12dm stripe: optimize sector divisionMikulas Patocka
Optimize sector division: If the number of stripes is a power of two, we can do shift and mask instead of division. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12dm stripe: move sector translation to a functionMikulas Patocka
Move sector to stripe translation into a function. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12dm: use dm_target_offset macroAlasdair G Kergon
Use new dm_target_offset() macro to avoid most references to ti->begin in dm targets. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12dm: rename map_info flush_request to target_request_nrMike Snitzer
'target_request_nr' is a more generic name that reflects the fact that it will be used for both flush and discard support. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-07block: unify flags for struct bio and struct requestChristoph Hellwig
Remove the current bio flags and reuse the request flags for the bio, too. This allows to more easily trace the type of I/O from the filesystem down to the block driver. There were two flags in the bio that were missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've renamed two request flags that had a superflous RW in them. Note that the flags are in bio.h despite having the REQ_ name - as blkdev.h includes bio.h that is the only way to go for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-03-06dm table: remove unused dm_get_device range parametersNikanth Karthikesan
Remove unused parameters(start and len) of dm_get_device() and fix the callers. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16dm stripe: avoid divide by zero with invalid stripe countNikanth Karthikesan
If a table containing zero as stripe count is passed into stripe_ctr the code attempts to divide by zero. This patch changes DM_TABLE_LOAD to return -EINVAL if the stripe count is zero. We now get the following error messages: device-mapper: table: 253:0: striped: Invalid stripe count device-mapper: ioctl: error adding target to table Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Cc: stable@kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-14block: Optimal I/O limit wrapperMartin K. Petersen
Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper around it. DM needs this to avoid poking at the queue_limits directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11bio: first step in sanitizing the bio->bi_rw flag testingJens Axboe
Get rid of any functions that test for these bits and make callers use bio_rw_flagged() directly. Then it is at least directly apparent what variable and flag they check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-04dm stripe: expose correct io hintsMike Snitzer
Set sensible I/O hints for striped DM devices in the topology infrastructure added for 2.6.31 for userspace tools to obtain via sysfs. Add .io_hints to 'struct target_type' to allow the I/O hints portion (io_min and io_opt) of the 'struct queue_limits' to be set by each target and implement this for dm-stripe. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-07-23dm table: pass correct dev area size to device_area_is_validMike Snitzer
Incorrect device area lengths are being passed to device_area_is_valid(). The regression appeared in 2.6.31-rc1 through commit 754c5fc7ebb417b23601a6222a6005cc2e7f2913. With the dm-stripe target, the size of the target (ti->len) was used instead of the stripe_width (ti->len/#stripes). An example of a consequent incorrect error message is: device-mapper: table: 254:0: sdb too small for target Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-06-22dm target:s introduce iterate devices fnMike Snitzer
Add .iterate_devices to 'struct target_type' to allow a function to be called for all devices in a DM target. Implemented it for all targets except those in dm-snap.c (origin and snapshot). (The raid1 version number jumps to 1.12 because we originally reserved 1.1 to 1.11 for 'block_on_error' but ended up using 'handle_errors' instead.) Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: martin.petersen@oracle.com
2009-06-22dm: stripe support flushMikulas Patocka
Flush support for the stripe target. This sets ti->num_flush_requests to the number of stripes and remaps individual flush requests to the appropriate stripe devices. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-01-06dm: consolidate target deregistration error handlingMikulas Patocka
Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of programming bug in the target driver. It can't fail because of user's behavior or disk errors. This patch changes unregister_target to return void and use BUG if someone tries to unregister non-registered target or unregister target that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-11-13dm stripe: fix init failureHeinz Mauelshagen
Don't proceed if dm_stripe_init() fails to register itself as a dm target. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-21dm: remove dm header from targetsMikulas Patocka
Change #include "dm.h" to #include <linux/device-mapper.h> in all targets. Targets should not need direct access to internal DM structures. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-21dm: publish array_too_bigMikulas Patocka
Move array_too_big to include/linux/device-mapper.h because it is used by targets. Remove the test from dm-raid1 as the number of mirror legs is limited such that it can never fail. (Even for stripes it seems rather unlikely.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-10-09block: don't depend on consecutive minor spaceTejun Heo
* Implement disk_devt() and part_devt() and use them to directly access devt instead of computing it from ->major and ->first_minor. Note that all references to ->major and ->first_minor outside of block layer is used to determine devt of the disk (the part0) and as ->major and ->first_minor will continue to represent devt for the disk, converting these users aren't strictly necessary. However, convert them for consistency. * Implement disk_max_parts() to avoid directly deferencing genhd->minors. * Update bdget_disk() such that it doesn't assume consecutive minor space. * Move devt computation from register_disk() to add_disk() and make it the only one (all other usages use the initially determined value). These changes clean up the code and will help disk->part dereference fix and extended block device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-02-08dm: stripe enhanced status returnBrian Wood
This patch adds additional information to the status line. It is added at the end of the returned text so it will not interfere with existing implementations using this data. The addition of this information will allow for a common return interface to match that returned with the dm-raid1.c status line (with Jonathan Brassow's patches). Here is a sample of what is returned with a mirror "status" call: isw_eeaaabgfg_mirror: 0 488390920 mirror 2 8:16 8:32 3727/3727 1 AA 1 core Here's what's returned with this patch for a stripe "status" call: isw_dheeijjdej_stripe: 0 976783872 striped 2 8:16 8:32 1 AA Signed-off-by: Brian Wood <brian.j.wood@intel.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-02-08dm: stripe trigger event on failureBrian Wood
This patch adds the stripe_end_io function to process errors that might occur after an IO operation. As part of this there are a number of enhancements made to record and trigger events: - New atomic variable in struct stripe to record the number of errors each stripe volume device has experienced (could be used later with uevents to report back directly to userspace) - New workqueue/work struct setup to process the trigger_event function - New end_io function. It is here that testing for BIO error conditions take place. It determines the exact stripe that cause the error, records this in the new atomic variable, and calls the queue_work() function - New trigger_event function to process failure events. This calls dm_table_event() Signed-off-by: Brian Wood <brian.j.wood@intel.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2007-10-20dm: use is_power_of_2vignesh babu
Replacing n & (n - 1) for power of 2 check by is_power_of_2(n) Signed-off-by: vignesh babu <vignesh.babu@wipro.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2006-12-08[PATCH] dm: map and endio symbolic return codesKiyoshi Ueda
Update existing targets to use the new symbols for return values from target map and end_io functions. There is no effect on behaviour. Test results: Done build test without errors. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] dm: improve error message consistencyAlasdair G Kergon
Tidy device-mapper error messages to include context information automatically. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27[PATCH] dm: remove unnecessary typecastKevin Corry
Signed-off-by: Kevin Corry <kevcorry@us.ibm.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27[PATCH] dm: remove SECTOR_FORMATAndrew Morton
We don't know what type sector_t has. Sometimes it's unsigned long, sometimes it's unsigned long long. For example on ppc64 it's unsigned long with CONFIG_LBD=n and on x86_64 it's unsigned long long with CONFIG_LBD=n. The way to handle all of this is to always use unsigned long long and to always typecast the sector_t when printing it. Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-17[PATCH] dm stripe: Fix boundsKevin Corry
The dm-stripe target currently does not enforce that the size of a stripe device be a multiple of the chunk-size. Under certain conditions, this can lead to I/O requests going off the end of an underlying device. This test-case shows one example. echo "0 100 linear /dev/hdb1 0" | dmsetup create linear0 echo "0 100 linear /dev/hdb1 100" | dmsetup create linear1 echo "0 200 striped 2 32 /dev/mapper/linear0 0 /dev/mapper/linear1 0" | \ dmsetup create stripe0 dd if=/dev/zero of=/dev/mapper/stripe0 bs=1k This will produce the output: dd: writing '/dev/mapper/stripe0': Input/output error 97+0 records in 96+0 records out And in the kernel log will be: attempt to access beyond end of device dm-0: rw=0, want=104, limit=100 The patch will check that the table size is a multiple of the stripe chunk-size when the table is created, which will prevent the above striped device from being created. This should not affect tools like LVM or EVMS, since in all the cases I can think of, striped devices are always created with the sizes being a multiple of the chunk-size. The size of a stripe device must be a multiple of its chunk-size. (akpm: that typecast is quite gratuitous) Signed-off-by: Kevin Corry <kevcorry@us.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16Linux-2.6.12-rc2Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!