aboutsummaryrefslogtreecommitdiff
path: root/drivers/dma/dmaengine.c
AgeCommit message (Collapse)Author
2013-09-10Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull slave-dmaengine updates from Vinod Koul: "This pull brings: - Andy's DW driver updates - Guennadi's sh driver updates - Pl08x driver fixes from Tomasz & Alban - Improvements to mmp_pdma by Daniel - TI EDMA fixes by Joel - New drivers: - Hisilicon k3dma driver - Renesas rcar dma driver - New API for publishing slave driver capablities - Various fixes across the subsystem by Andy, Jingoo, Sachin etc..." * 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (94 commits) dma: edma: Remove limits on number of slots dma: edma: Leave linked to Null slot instead of DUMMY slot dma: edma: Find missed events and issue them ARM: edma: Add function to manually trigger an EDMA channel dma: edma: Write out and handle MAX_NR_SG at a given time dma: edma: Setup parameters to DMA MAX_NR_SG at a time dmaengine: pl330: use dma_set_max_seg_size to set the sg limit dmaengine: dma_slave_caps: remove sg entries dma: replace devm_request_and_ioremap by devm_ioremap_resource dma: ste_dma40: Fix potential null pointer dereference dma: ste_dma40: Remove duplicate const dma: imx-dma: Remove redundant NULL check dma: dmagengine: fix function names in comments dma: add driver for R-Car HPB-DMAC dma: k3dma: use devm_ioremap_resource() instead of devm_request_and_ioremap() dma: imx-sdma: Staticize sdma_driver_data structures pch_dma: Add MODULE_DEVICE_TABLE dmaengine: PL08x: Add cyclic transfer support dmaengine: PL08x: Fix reading the byte count in cctl dmaengine: PL08x: Add support for different maximum transfer size ...
2013-09-09Merge tag 'dmaengine-3.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine update from Dan Williams: "Collection of random updates to the core and some end-driver fixups for ioatdma and mv_xor: - NUMA aware channel allocation - Cleanup dmatest debugfs interface - ioat: make raid-support Atom only - mv_xor: big endian Aside from the top three commits these have all had some soak time in -next. The top commit fixes a recent build breakage. It has been a long while since my last pull request, hopefully it does not show. Thanks to Vinod for keeping an eye on drivers/dma/ this past year" * tag 'dmaengine-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: dmaengine: dma_sync_wait and dma_find_channel undefined MAINTAINERS: update email for Dan Williams dma: mv_xor: Fix incorrect error path ioatdma: silence GCC warnings dmaengine: make dma_channel_rebalance() NUMA aware dmaengine: make dma_submit_error() return an error code ioatdma: disable RAID on non-Atom platforms and reenable unaligned copies mv_xor: support big endian systems using descriptor swap feature mv_xor: use {readl, writel}_relaxed instead of __raw_{readl, writel} dmatest: print message on debug level in case of no error dmatest: remove IS_ERR_OR_NULL checks of debugfs calls dmatest: make module parameters writable
2013-09-02Merge branch 'topic/of' into for-linusVinod Koul
2013-09-02dma: dmagengine: fix function names in commentsDaniel Mack
Trivial fix for function name mismatches I stumbled over. Signed-off-by: Daniel Mack <zonque@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-08-22dmaengine: make dma_channel_rebalance() NUMA awareBrice Goglin
dma_channel_rebalance() currently distributes channels by processor ID. These IDs often change with the BIOS, and the order isn't related to the DMA channel list (related to PCI bus ids). * On my SuperMicro dual E5 machine, first socket has processor IDs [0-7] (and [16-23] for hyperthreads), second socket has [8-15]+[24-31] => channels are properly allocated to local CPUs. * On Dells R720 with same processors, first socket has even processor IDs, second socket has odd numbers => half the processors get channels on the remote socket, causing cross-NUMA traffic and lower DMA performance. Change nth_chan() to return the channel with min table_count and in the NUMA node of the given CPU, if any. If none, the (non-local) channel with min table_count is returned. nth_chan() is therefore renamed into min_chan() since we don't iterate until the nth channel anymore. In practice, the behavior is the same because first channels are taken first and are then ignored because they got an additional reference. The new code has a slightly higher complexity since we always scan the entire list of channels for finding the minimal table_count (instead of stopping after N chans), and because we check whether the CPU is in the DMA device locality mask. Overall we still have time complexity = number of chans x number of processors. This rebalance is rarely used, so this won't hurt. On the above SuperMicro machine, channels are still allocated the same. On the Dells, there are no locality issue anymore (MEMCPY channel X goes to processor X and to its hyperthread sibling). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Dan Williams <djbw@fb.com>
2013-08-19dmaengine: fix - error: potential NULL dereference 'chan'Vinod Koul
commit 7bb587f4 "dmaengine: add interface of dma_get_slave_channel" introduced the above error so fix it Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-08-13dmaengine: add interface of dma_get_slave_channelZhangfei Gao
Suggested by Arnd, add dma_get_slave_channel interface Dma host driver could get specific channel specificied by request line, rather than filter. host example: static struct dma_chan *xx_of_dma_simple_xlate(struct of_phandle_args *dma_spec, struct of_dma *ofdma) { struct xx_dma_dev *d = ofdma->of_dma_data; unsigned int request = dma_spec->args[0]; if (request > d->dma_requests) return NULL; return dma_get_slave_channel(&(d->chans[request].vc.chan)); } probe: of_dma_controller_register((&op->dev)->of_node, xx_of_dma_simple_xlate, d); Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-07-25dma: convert dma_devclass to use dev_groupsGreg Kroah-Hartman
The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the dma dma_devclass code to use the correct field. Cc: Dan Williams <djbw@fb.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-03drivers/dma: remove unused support for MEMSET operationsBartlomiej Zolnierkiewicz
There have never been any real users of MEMSET operations since they have been introduced in January 2007 by commit 7405f74badf4 ("dmaengine: refactor dmaengine around dma_async_tx_descriptor"). Therefore remove support for them for now, it can be always brought back when needed. [sebastian.hesselbarth@gmail.com: fix drivers/dma/mv_xor] Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: Vinod Koul <vinod.koul@intel.com> Acked-by: Dan Williams <djbw@fb.com> Cc: Tomasz Figa <t.figa@samsung.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Olof Johansson <olof@lixom.net> Cc: Kevin Hilman <khilman@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-02Merge branch 'topic/of' into for-linusVinod Koul
Conflicts: include/linux/dmaengine.h Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmaengine: call acpi_dma_request_slave_channel as wellAndy Shevchenko
The slave device could be enumerated by ACPI. In that case the dma_request_slave_channel should use the acpi_dma_request_slave_channel() helper. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15DMA: of: Constant namesMarkus Pargmann
No DMA of-function alters the name, so this patch changes the name arguments to be constant. Most drivers will probably request DMA channels using a constant name. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: Make the 'mask' parameter of __dma_request_channel constLars-Peter Clausen
The 'mask' parameter is not modified in __dma_request_channel and really shouldn't be. Make this explicit by making the parameter const. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-02-27dmaengine: convert to idr_alloc()Tejun Heo
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Dan Williams <djbw@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-07dmaengine: add cpu_relax() to busy-loop in dma_sync_wait()Bartlomiej Zolnierkiewicz
Removal of the busy-loop from dma_sync_wait() is not a trivial task so just add cpu_relax() to the loop for now. Cc: Vinod Koul <vinod.koul@intel.com> Cc: Tomasz Figa <t.figa@samsung.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Dan Williams <djbw@fb.com>
2013-01-06dmaengine: add helper function to request a slave DMA channelJon Hunter
Currently slave DMA channels are requested by calling dma_request_channel() and requires DMA clients to pass various filter parameters to obtain the appropriate channel. With device-tree being used by architectures such as arm and the addition of device-tree helper functions to extract the relevant DMA client information from device-tree, add a new function to request a slave DMA channel using device-tree. This function is currently a simple wrapper that calls the device-tree of_dma_request_slave_channel() function. Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Benoit Cousson <b-cousson@ti.com> Cc: Stephen Warren <swarren@nvidia.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Russell King <linux@arm.linux.org.uk> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <djbw@fb.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jon Hunter <jon-hunter@ti.com> Reviewed-by: Stephen Warren <swarren@wwwdotorg.org> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
2012-10-06drivers/dma/dmaengine.c: lower the priority of 'failed to get' dma channel ↵Fabio Estevam
message Do the same as commit a03a202e95fd ("dmaengine: failure to get a specific DMA channel is not critical") to get rid of the following messages during kernel boot: dmaengine_get: failed to get dma1chan0: (-22) dmaengine_get: failed to get dma1chan1: (-22) dmaengine_get: failed to get dma1chan2: (-22) dmaengine_get: failed to get dma1chan3: (-22) .. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-20dmaengine: Cleanup logging messagesJoe Perches
Use a more current logging style. Add pr_fmt to prefix dmaengine: to messages. Convert printk(KERN_ERR to pr_err(. Convert embedded function name use to "%s: ", __func__ Align arguments. Original-patch-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
2012-04-10Merge tag 'dmaengine-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: 1/ regression fix for Xen as it now trips over a broken assumption about the dma address size on 32-bit builds 2/ new quirk for netdma to ignore dma channels that cannot meet netdma alignment requirements 3/ fixes for two long standing issues in ioatdma (ring size overflow) and iop-adma (potential stack corruption) * tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: netdma: adding alignment check for NETDMA ops ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata ioat: ring size variables need to be 32bit to avoid overflow iop-adma: Corrected array overflow in RAID6 Xscale(R) test. ioat: fix size of 'completion' for Xen
2012-04-05netdma: adding alignment check for NETDMA opsDave Jiang
This is the fallout from adding memcpy alignment workaround for certain IOATDMA hardware. NetDMA will only use DMA engine that can handle byte align ops. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2012-03-06dma: dmaengine: Distinguish between 'dmaengine: failed to get' messagesFabio Estevam
The message "dmaengine: failed to get" can come from two possible locations within dmaengine.c. In order to distinguish between them, replace "dmaengine" with __func__ string so that the source function of the error message can be easily identified. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
2011-11-18DMAEngine: Define interleaved transfer request apiJassi Brar
Define a new api that could be used for doing fancy data transfers like interleaved to contiguous copy and vice-versa. Traditional SG_list based transfers tend to be very inefficient in such cases as where the interleave and chunk are only a few bytes, which call for a very condensed api to convey pattern of the transfer. This api supports all 4 variants of scatter-gather and contiguous transfer. Of course, neither can this api help transfers that don't lend to DMA by nature, i.e, scattered tiny read/writes with no periodic pattern. Also since now we support SLAVE channels that might not provide device_prep_slave_sg callback but device_prep_interleaved_dma, remove the BUG_ON check. Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Acked-by: Barry Song <Baohua.Song@csr.com> [renamed dmaxfer_template to dma_interleaved_template did fixup after the enum dma_transfer_merge] Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
2011-08-04Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: dmaengine: use DEFINE_IDR for static initialization ioat: fix xor_idx_to_desc Avoid section type conflict in dma/ioat/dma_v3.c ioat: Adding PCI IDs for IOAT devices on SandyBridge platforms
2011-08-03dmaengine: use DEFINE_IDR for static initializationAxel Lin
We could use DEFINE_IDR for statically allocated idr that allow us to save a few lines of code. And also remove unneeded mutex_init() for dma_list_mutex, as dma_list_mutex is initialized automatically by DEFINE_MUTEX(). Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-08-01Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (37 commits) Improve slave/cyclic DMA engine documentation dmaengine: pl08x: handle the rest of enums in pl08x_width DMA: PL08x: cleanup selection of burst size DMA: PL08x: avoid recalculating cctl at each prepare DMA: PL08x: cleanup selection of buswidth DMA: PL08x: constify plchan->cd and plat->slave_channels DMA: PL08x: separately store source/destination cctl DMA: PL08x: separately store source/destination slave address DMA: PL08x: clean up LLI debugging DMA: PL08x: select LLI bus only once per LLI setup DMA: PL08x: remove unused constants ARM: mxs-dma: reset after disable channel dma: intel_mid_dma: remove redundant pci_set_drvdata calls dma: mxs-dma: fix unterminated platform_device_id table dmaengine: pl330: make platform data optional dmaengine: imx-sdma: return proper error if kzalloc fails pch_dma: Fix CTL register access issue dmaengine: mxs-dma: skip request_irq for NO_IRQ dmaengine/coh901318: fix slave submission semantics dmaengine/ste_dma40: allow memory buswidth/burst to be configured ... Fix trivial whitespace conflict in drivers/dma/mv_xor.c
2011-06-24dmaengine: failure to get a specific DMA channel is not criticalGuennadi Liakhovetski
There exist systems with multiple DMA controllers with different capabilities. For example, on some sh-mobile / rmobile systems there are DMA controllers, whose channels can be configured to be used with SD- and MMC-host controllers, serial ports etc. Besides there are also DMA controllers, that can only be used for one special function, e.g., for USB. In such cases the DMA client filter function can just choose to specify to the DMA driver, which channel it needs. Then the .device_alloc_chan_resources() method of the DMA driver will check, whether it can provide that dunction. If not, it will fail and the loop in __dma_request_channel() will continue to the next DMA device, until it finds a suitable one. This works fine with just one minor glitch: the kernel logs error messages like dmaengine: failed to get <channel name>: (-<error code>) after each such non-critical failure. This patch lowers priority of this message to the debug level. Reported-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Magnus Damm <damm@opensource.se> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2011-06-21net: remove mm.h inclusion from netdevice.hAlexey Dobriyan
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually). To prevent mm.h inclusion via other channels also extract "enum dma_data_direction" definition into separate header. This tiny piece is what gluing netdevice.h with mm.h via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h". Removal of mm.h from scatterlist.h was tried and was found not feasible on most archs, so the link was cutoff earlier. Hope people are OK with tiny include file. Note, that mm_types.h is still dragged in, but it is a separate story. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-07async_tx: make async_tx channel switching opt-inDan Williams
The majority of drivers in drivers/dma/ will never establish cross channel operation chains and do not need the extra overhead in struct dma_async_tx_descriptor. Make channel switching opt-in by default. Cc: Anatolij Gustschin <agust@denx.de> Cc: Ira Snyder <iws@ovro.caltech.edu> Cc: Linus Walleij <linus.walleij@stericsson.com> Cc: Saeed Bishara <saeed@marvell.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-10-07Merge branches 'dma40', 'pl08x', 'fsldma', 'imx' and 'intel-mid' into dmaengineDan Williams
2010-10-07dma: add support for scatterlist to scatterlist copyIra Snyder
This adds support for scatterlist to scatterlist DMA transfers. A similar interface is exposed by the fsldma driver (through the DMA_SLAVE API) and by the ste_dma40 driver (through an exported function). This patch paves the way for making this type of copy operation a part of the generic DMAEngine API. Futher patches will add support in individual drivers. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-10-05dmaengine: add possibility for cyclic transfersSascha Hauer
Cyclic transfers are useful for audio where a single buffer divided in periods has to be transfered endlessly until stopped. After being prepared the transfer is started using the dma_async_descriptor->tx_submit function. dma_async_descriptor->callback is called after each period. The transfer is stopped using the DMA_TERMINATE_ALL callback. While being used for cyclic transfers the channel cannot be used for other transfer types. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17Merge branch 'ioat' into dmaengineDan Williams
2010-05-17async_tx: trim dma_async_tx_descriptor in 'no channel switch' caseDan Williams
Saves 24 bytes per descriptor (64-bit) when the channel-switching capabilities of async_tx are not required. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17DMA ENGINE: Do not reset 'private' of channelJassi Brar
The member 'private' of 'struct dma_chan' is meant for passing data between client and the controller driver. The DMA client driver may point it to platform specific stuff after acquiring the channel. So, it is the responsiblity of the same code to reset it, if it must. The DMA engine doesn't set it and hence, shouldn't reset it either. This reseting of private by DMA Engine comes in the way of implementing default channel settings during DMAC probe. That capability is useful for not having the clients to always provide platform specific data, like Rx/Tx FIFO addresses, which usually doesn't change across channel requests. Signed-off-by: Jassi Brar <jassi.brar@samsung.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-26DMAENGINE: generic channel status v2Linus Walleij
Convert the device_is_tx_complete() operation on the DMA engine to a generic device_tx_status()operation which can return three states, DMA_TX_RUNNING, DMA_TX_COMPLETE, DMA_TX_PAUSED. [dan.j.williams@intel.com: update for timberdale] Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Li Yang <leoli@freescale.com> Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Cc: Magnus Damm <damm@opensource.se> Cc: Liam Girdwood <lrg@slimlogic.co.uk> Cc: Joe Perches <joe@perches.com> Cc: Roland Dreier <rdreier@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-26DMAENGINE: generic slave control v2Linus Walleij
Convert the device_terminate_all() operation on the DMA engine to a generic device_control() operation which can now optionally support also pausing and resuming DMA on a certain channel. Implemented for the COH 901 318 DMAC as an example. [dan.j.williams@intel.com: update for timberdale] Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Li Yang <leoli@freescale.com> Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Cc: Magnus Damm <damm@opensource.se> Cc: Liam Girdwood <lrg@slimlogic.co.uk> Cc: Joe Perches <joe@perches.com> Cc: Roland Dreier <rdreier@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-03-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: add __percpu sparse annotations to what's left percpu: add __percpu sparse annotations to fs percpu: add __percpu sparse annotations to core kernel subsystems local_t: Remove leftover local.h this_cpu: Remove pageset_notifier this_cpu: Page allocator conversion percpu, x86: Generic inc / dec percpu instructions local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c module: Use this_cpu_xx to dynamically allocate counters local_t: Remove cpu_local_xx macros percpu: refactor the code in pcpu_[de]populate_chunk() percpu: remove compile warnings caused by __verify_pcpu_ptr() percpu: make accessors check for percpu pointer in sparse percpu: add __percpu for sparse. percpu: make access macros universal percpu: remove per_cpu__ prefix.
2010-02-17percpu: add __percpu sparse annotations to what's leftTejun Heo
Add __percpu sparse annotations to places which didn't make it in one of the previous patches. All converions are trivial. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Neil Brown <neilb@suse.de>
2010-02-02dmaengine: fix memleak in dma_async_device_unregisterAnatolij Gustschin
While debugging a dma driver I noticed a memleak after unloading the driver module. Caught by kmemleak. Signed-off-by: Anatolij Gustschin <agust@denx.de> Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-12-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c
2009-11-19async_tx: build-time toggling of async_{syndrome,xor}_val dma supportDan Williams
ioat3.2 does not support asynchronous error notifications which makes the driver experience latencies when non-zero pq validate results are expected. Provide a mechanism for turning off async_xor_val and async_syndrome_val via Kconfig. This approach is generally useful for any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like to force the async_tx api to fall back to the synchronous path for certain operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-11-19dmaengine: include xor/pq validate in device_has_all_tx_types()Dan Williams
A channel must include these capabilities to satisfy ASYNC_TX_DISABLE_CHANNEL_SWITCH. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-10-03this_cpu: Eliminate get/put_cpuChristoph Lameter
There are cases where we can use this_cpu_ptr and as the result of using this_cpu_ptr() we no longer need to determine the currently executing cpu. In those places no get/put_cpu combination is needed anymore. The local cpu variable can be eliminated. Preemption still needs to be disabled and enabled since the modifications of the per cpu variables is not atomic. There may be multiple per cpu variables modified and those must all be from the same processor. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Tejun Heo <tj@kernel.org> cc: Eric Biederman <ebiederm@aristanetworks.com> cc: Stephen Hemminger <shemminger@vyatta.com> cc: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2009-09-08Merge branch 'dmaengine' into async-tx-nextDan Williams
Conflicts: crypto/async_tx/async_xor.c drivers/dma/ioat/dma_v2.h drivers/dma/ioat/pci.c drivers/md/raid5.c
2009-09-08dmaengine: kill tx_listDan Williams
The tx_list attribute of struct dma_async_tx_descriptor is common to most, but not all dma driver implementations. None of the upper level code (dmaengine/async_tx) uses it, so allow drivers to implement it locally if they need it. This saves sizeof(struct list_head) bytes for drivers that do not manage descriptors with a linked list (e.g.: ioatdma v2,3). Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08dmaengine, async_tx: add a "no channel switch" allocatorDan Williams
Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-09-08Merge branch 'md-raid6-accel' into ioat3.2Dan Williams
Conflicts: include/linux/dmaengine.h
2009-08-29async_tx: add support for asynchronous GF multiplicationDan Williams
[ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API: async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources. async_syndrome_val() validates the given source buffers against known P and Q values. When a request is made to run async_pq against more than the hardware maximum number of supported sources we need to reuse the previous generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example to perform a 5 source pq op with hardware that only supports 4 sources at a time the following approach is taken: p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08})) p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10})) p' = p + q + q + src4 = p + src4 q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4 Note: 4 is the minimum acceptable maxpq otherwise we punt to synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients. Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag. Note2: The api supports disabling the generation of P when generating Q, this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source. Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Ilya Yanok <yanok@emcraft.com> Reviewed-by: Andre Noll <maan@systemlinux.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2009-08-29async_tx: remove walk of tx->parent chain in dma_wait_for_async_txDan Williams
We currently walk the parent chain when waiting for a given tx to complete however this walk may race with the driver cleanup routine. The routines in async_raid6_recov.c may fall back to the synchronous path at any point so we need to be prepared to call async_tx_quiesce() (which calls dma_wait_for_async_tx). To remove the ->parent walk we guarantee that every time a dependency is attached ->issue_pending() is invoked, then we can simply poll the initial descriptor until completion. This also allows for a lighter weight 'issue pending' implementation as there is no longer a requirement to iterate through all the channels' ->issue_pending() routines as long as operations have been submitted in an ordered chain. async_tx_issue_pending() is added for this case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>