path: root/drivers
Age  Commit message  Author
2015-05-11  zram: factor out single stream compression  Sergey Senozhatsky

This is a preparation patch to add multi stream support to zcomp.

Introduce struct zcomp_strm_single and a set of functions to manage zcomp_strm stream access. zcomp_strm_single implements a single compression stream, the same way as the current zcomp implementation. This moves zcomp_strm stream control and locking out of zcomp, so the compressing backend zcomp is not aware of the required locking.

Single and multi streams require different locking schemes. Minchan Kim reported that the spinlock-based locking scheme (which is used in the multi stream implementation) has demonstrated a severe performance regression for the single compression stream case, compared to the mutex-based one. See https://lkml.org/lkml/2014/2/18/16

The following set of functions is added:
- zcomp_strm_single_find()/zcomp_strm_single_release()
  find and release a compression stream, implement required locking
- zcomp_strm_single_create()/zcomp_strm_single_destroy()
  create and destroy zcomp_strm_single

New ->strm_find() and ->strm_release() callbacks are added to zcomp, which are set to zcomp_strm_single_find() and zcomp_strm_single_release() during initialisation. Instead of direct locking and zcomp_strm access from zcomp_strm_find() and zcomp_strm_release(), zcomp now calls ->strm_find() and ->strm_release() correspondingly.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 9cc97529a180b369fcb7e5265771b6ba7e01f05b)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
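The callback indirection described in this commit can be sketched in user space roughly as follows. This is an illustration only, not the kernel code: `struct comp`, `struct strm`, `struct strm_single`, `single_find()`, `single_release()` and `comp_init_single()` are hypothetical stand-ins, and a pthread mutex is assumed in place of the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a compression stream (buffer only). */
struct strm {
	char buffer[64];
};

/* Hypothetical stand-in for the backend: it only knows two callbacks
 * and an opaque pointer to the stream policy's private state. */
struct comp {
	void *stream;
	struct strm *(*strm_find)(struct comp *c);
	void (*strm_release)(struct comp *c, struct strm *s);
};

/* Single-stream policy: one stream guarded by one mutex. */
struct strm_single {
	pthread_mutex_t lock;
	struct strm zstrm;
};

/* Take the mutex; the caller becomes the exclusive user of the stream. */
static struct strm *single_find(struct comp *c)
{
	struct strm_single *zs = c->stream;

	pthread_mutex_lock(&zs->lock);
	return &zs->zstrm;
}

/* Drop the mutex; the stream becomes available again. */
static void single_release(struct comp *c, struct strm *s)
{
	struct strm_single *zs = c->stream;

	(void)s;
	pthread_mutex_unlock(&zs->lock);
}

/* Wire the single-stream policy into the backend at initialisation. */
static void comp_init_single(struct comp *c, struct strm_single *zs)
{
	pthread_mutex_init(&zs->lock, NULL);
	c->stream = zs;
	c->strm_find = single_find;
	c->strm_release = single_release;
}
```

A caller only ever goes through `c->strm_find(c)` and `c->strm_release(c, s)`, so a different policy (for example a multi-stream one with its own locking) could be installed without touching the callers.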
2015-05-11  zram: use zcomp compressing backends  Sergey Senozhatsky

Do not perform direct LZO compress/decompress calls, initialise and use zcomp LZO backend (single compression stream) instead.

[akpm@linux-foundation.org: resolve conflicts with zram-delete-zram_init_device-fix.patch]
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b7ca232ee7e85ed3b18e39eb20a7f458ee1d6047)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: introduce compressing backend abstraction  Sergey Senozhatsky

ZRAM performs direct LZO compression algorithm calls, making it the one and only option. While LZO generally performs well, the LZ4 algorithm tends to have faster decompression (see http://code.google.com/p/lz4/ for the full report):

    Name            Ratio   C.speed  D.speed
                             MB/s     MB/s
    LZ4 (r101)      2.084    422      1820
    LZO 2.06        2.106    414       600

Thus, users who have mostly read (decompress) usage scenarios or a mixed workflow (writes with a relatively high number of read ops) will benefit from using the LZ4 compression backend.

Introduce the compressing backend abstraction zcomp in order to support multiple compression algorithms with the following set of operations:
    .create
    .destroy
    .compress
    .decompress

Schematically, zram write() usually contains the following steps:
0) preparation (decompression of partial IO, etc.)
1) lock buffer_lock mutex (protects meta compress buffers)
2) compress (using meta compress buffers)
3) alloc and map zs_pool object
4) copy compressed data (from meta compress buffers) to the object allocated by 3)
5) free previous pool page, assign a new one
6) unlock buffer_lock mutex

As we can see, the compressing buffers must remain untouched from 1) to 4), because otherwise a concurrent write() can overwrite the data. At the same time, zram_meta must be aware of a) specific compression algorithm memory requirements and b) the locking necessary to protect compression buffers. To remove requirement a), a new struct zcomp_strm is introduced, which contains a compress/decompress `buffer' and a compression algorithm `private' part. Struct zcomp, in turn, implements zcomp_strm stream handling and locking and removes requirement b) from zram meta. zcomp ->create() and ->destroy(), respectively, allocate and deallocate the algorithm specific zcomp_strm `private' part.

Every zcomp has a zcomp stream and a mutex to protect its compression stream. Stream usage semantics remain the same -- only one write can hold the stream lock and use its buffers. zcomp_strm_find() turns the caller into the exclusive user of a stream (holding the stream mutex until zram releases the stream), and zcomp_strm_release() makes the zcomp stream available (unlocks the stream mutex). Hence no concurrent write (compression) operations are possible at the moment.

iozone -t 3 -R -r 16K -s 60M -I +Z

    test              base        patched
    --------------------------------------------------
    Initial write      597992.91   591660.58
    Rewrite            609674.34   616054.97
    Read              2404771.75  2452909.12
    Re-read           2459216.81  2470074.44
    Reverse Read      1652769.66  1589128.66
    Stride read       2202441.81  2202173.31
    Random read       2236311.47  2276565.31
    Mixed workload    1423760.41  1709760.06
    Random write       579584.08   615933.86
    Pwrite             597550.02   594933.70
    Pread             1703672.53  1718126.72
    Fwrite            1330497.06  1461054.00
    Fread             3922851.00  3957242.62

Usage examples:

    comp = zcomp_create(NAME) /* NAME e.g. "lzo" */

which initialises the compressing backend if the requested algorithm is supported.

Compress:
    zstrm = zcomp_strm_find(comp)
    zcomp_compress(comp, zstrm, src, &dst_len)
    [..] /* copy compressed data */
    zcomp_strm_release(comp, zstrm)

Decompress:
    zcomp_decompress(comp, src, src_len, dst);

Free the compressing backend and its zcomp stream:
    zcomp_destroy(comp)

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e7e1ef439d18f9a21521116ea9f2b976d7230e54)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
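The idea of selecting a compression backend by name behind a small operations table can be sketched as below. This is a user-space illustration under loud assumptions: `struct backend`, `backend_find()` and the toy run-length "rle" codec are hypothetical and stand in for a real algorithm such as LZO or LZ4; the `.create`/`.destroy` operations for per-algorithm private state are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operations table; dst_len is capacity on entry and the
 * number of bytes written on successful return. */
struct backend {
	const char *name;
	int (*compress)(const unsigned char *src, size_t src_len,
			unsigned char *dst, size_t *dst_len);
	int (*decompress)(const unsigned char *src, size_t src_len,
			  unsigned char *dst, size_t *dst_len);
};

/* Toy codec: emit (run length, byte) pairs, runs capped at 255. */
static int rle_compress(const unsigned char *src, size_t src_len,
			unsigned char *dst, size_t *dst_len)
{
	size_t in = 0, out = 0;

	while (in < src_len) {
		size_t run = 1;

		while (in + run < src_len && src[in + run] == src[in] && run < 255)
			run++;
		if (out + 2 > *dst_len)
			return -1;	/* destination too small */
		dst[out++] = (unsigned char)run;
		dst[out++] = src[in];
		in += run;
	}
	*dst_len = out;
	return 0;
}

static int rle_decompress(const unsigned char *src, size_t src_len,
			  unsigned char *dst, size_t *dst_len)
{
	size_t out = 0;

	if (src_len % 2)
		return -1;		/* malformed input */
	for (size_t in = 0; in < src_len; in += 2) {
		size_t run = src[in];

		if (out + run > *dst_len)
			return -1;
		memset(dst + out, src[in + 1], run);
		out += run;
	}
	*dst_len = out;
	return 0;
}

static const struct backend backends[] = {
	{ "rle", rle_compress, rle_decompress },
};

/* Look a backend up by name; NULL means the algorithm is unsupported. */
static const struct backend *backend_find(const char *name)
{
	for (size_t i = 0; i < sizeof(backends) / sizeof(backends[0]); i++)
		if (strcmp(backends[i].name, name) == 0)
			return &backends[i];
	return NULL;
}
```

Callers only hold a `const struct backend *`, so adding a second algorithm would mean adding one more table entry rather than changing the write path.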
2015-05-11  zram: delete zram_init_device()  Sergey Senozhatsky

Allocate a new `zram_meta' in disksize_store() only for an uninitialised zram device, saving a number of allocations and deallocations when disksize_store() is called on a device that is currently in use. At the same time, the zram_meta stack variable is not necessary, because we can set ->meta directly. There is also no need to set the QUEUE_FLAG_NONROT queue flag on every disksize_store(); set it once during device creation.

[minchan@kernel.org: handle zram->meta alloc fail case]
[minchan@kernel.org: prevent lockdep spew of init_lock]
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b67d1ec189ffb92cdad9b2bd29475fb1e0166983)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: move zram size warning to documentation  Sergey Senozhatsky

Move zram warning about disksize and size of memory correlation to zram documentation.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e64cd51d2fa87733176246101df871a8ac5c7c20)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: drop not used table `count' member  Sergey Senozhatsky

struct table `count' member is not used.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 59fc86a4922f1a1c0f69eac758a7e2b2b138aab4)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: report failed read and write stats  Sergey Senozhatsky

zram accounted but did not report the numbers of failed read and write queries. Make these stats available as the failed_reads and failed_writes attrs.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 6444724939db5de7390c90f7b4a657159b3b4465)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: remove zram stats code duplication  Sergey Senozhatsky

Introduce ZRAM_ATTR_RO macro that generates device_attribute and default ATTR show() function for existing atomic64_t zram stats.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit a68eb3b65e658406d386bebef02277f4007b2f45)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
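How a single macro can stamp out one show() function per counter may be easier to see in a user-space analogue. The sketch below is not the kernel's ZRAM_ATTR_RO: `STAT_ATTR_RO`, `struct stats` and the generated `<name>_show()` functions are hypothetical, C11 atomics are assumed in place of atomic64_t, and no device_attribute is generated.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stats block with 64-bit atomic counters. */
struct stats {
	_Atomic uint64_t num_reads;
	_Atomic uint64_t num_writes;
};

/* Generate <name>_show(), which formats one counter as "<value>\n"
 * and returns the number of characters written. */
#define STAT_ATTR_RO(name)						\
static int name##_show(struct stats *st, char *buf, size_t len)	\
{									\
	return snprintf(buf, len, "%" PRIu64 "\n",			\
			(uint64_t)atomic_load(&st->name));		\
}

STAT_ATTR_RO(num_reads)
STAT_ATTR_RO(num_writes)
```

Adding a new read-only statistic then costs one struct member and one `STAT_ATTR_RO(...)` line instead of a hand-written function.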
2015-05-11  zram: use atomic64_t for all zram stats  Sergey Senozhatsky

This is a preparation patch for stats code duplication removal.

1) use atomic64_t for `pages_zero' and `pages_stored' zram stats.

2) `compr_size' and `pages_zero' struct zram_stats members did not follow the existing device attr naming scheme: zram_stats.ATTR has ATTR_show() function. Rename them:
   -- compr_size -> compr_data_size
   -- pages_zero -> zero_pages

Minchan Kim's note:
 If we really have trouble with atomic stat operation, we could change it with percpu_counter so that it could solve atomic overhead and unnecessary memory space by introducing unsigned long instead of 64bit atomic_t.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 90a7806ea9b9f7cb4751859cc2506e2d80e36ef1)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: remove good and bad compress stats  Sergey Senozhatsky

Remove `good' and `bad' compressed sub-requests stats. A RW request may cause a number of RW sub-requests. zram used to account `good' compressed sub-queries (with compressed size less than 50% of original size) and `bad' compressed sub-queries (with compressed size greater than 75% of original size), leaving sub-requests with compression size between 50% and 75% of original size not accounted and not reported. zram already accounts each sub-request's compression size, so we can calculate the real device compression ratio.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b7cccf8b4009bf74df61f3c9d86b95fabd807c11)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
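The claim that a real compression ratio can be derived from already-accounted sizes can be illustrated with a little arithmetic. The helper below is a hypothetical sketch, not zram code: `compr_ratio_x100()` and `SKETCH_PAGE_SIZE` are invented names, a 4096-byte page is assumed, and the inputs are assumed to correspond to a count of stored pages and the total compressed byte size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the sketch; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096u

/* Return the original:compressed ratio in hundredths (250 means 2.50:1),
 * or 0 when nothing has been stored yet. */
static uint64_t compr_ratio_x100(uint64_t pages_stored, uint64_t compr_data_size)
{
	if (compr_data_size == 0)
		return 0;
	return pages_stored * SKETCH_PAGE_SIZE * 100 / compr_data_size;
}
```

For example, 1000 stored pages (4,096,000 bytes) that compress to 1,638,400 bytes give 250, i.e. 2.50:1, with no need for separate `good'/`bad' buckets.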
2015-05-11  zram: do not pass rw argument to __zram_make_request()  Sergey Senozhatsky

Do not pass rw argument down the __zram_make_request() -> zram_bvec_rw() chain, decode it in zram_bvec_rw() instead. Besides, this is the place where we distinguish READ and WRITE bio data directions, so account zram RW stats here, instead of __zram_make_request(). This also allows to account a real number of zram READ/WRITE operations, not just requests (single RW request may cause a number of zram RW ops with separate locking, compression/decompression, etc).

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit be257c61306750d11c20d2ac567bf63304c696a3)
Signed-off-by: Alex Shi <alex.shi@linaro.org>

Conflicts:
    drivers/block/zram/zram_drv.c

Conflicts solution:
    keep bio struct as old before commit 4f024f3797 'block: Abstract out bvec iterator'
2015-05-11  zram: drop `init_done' struct zram member  Sergey Senozhatsky

Introduce an init_done() helper function which allows us to drop the `init_done' struct zram member. init_done() uses the fact that ->init_done == 1 is equivalent to ->meta != NULL.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit be2d1d56c82d8cf20e6c77515eb499f8e86eb5be)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
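Deriving the "initialised" state from a pointer instead of storing a separate flag can be sketched as follows. The types here are hypothetical user-space stand-ins (`struct dev`, `struct meta`), and the helper's exact kernel signature is not taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and its metadata. */
struct meta {
	int placeholder;
};

struct dev {
	struct meta *meta;	/* NULL until the device is initialised */
};

/* The device counts as initialised exactly when its metadata exists,
 * so no separate flag can drift out of sync with the pointer. */
static inline bool init_done(const struct dev *d)
{
	return d->meta != NULL;
}
```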
2015-05-11  zram: avoid null access when fail to alloc meta  Minchan Kim

zram_meta_alloc could fail, so the caller should check its return value. Otherwise, your system will hang.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit db5d711e2db776f18219b033e5dc4fb7e4264dd7)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: remove zram->lock in read path and change it with mutex  Minchan Kim

Finally, we separated the zram->lock dependency from 32bit stat/table handling, so there is no reason to use an rw_semaphore between the read and write paths. This patch removes the lock from the read path entirely and replaces the rw_semaphore with a mutex. So, we could do

old:
    read-read: OK
    read-write: NO
    write-write: NO

Now:
    read-read: OK
    read-write: OK
    write-write: NO

The data below shows that the mixed workload performs about 11 times better, and there is also an enhancement on the write-write path because the current rw-semaphore doesn't support SPIN_ON_OWNER. It's a side effect, but anyway a good thing for us.

Write-related tests perform better (from 61% to 1058%), while the read path results are mixed (from -2.22% to 1.45%), but those are all marginal, within stddev.

CPU 12
iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

    ==Initial write                ==Initial write
    records: 10                    records: 10
    avg:  516189.16                avg:  839907.96
    std:   22486.53 (4.36%)        std:   47902.17 (5.70%)
    max:  546970.60                max:  909910.35
    min:  481131.54                min:  751148.38
    ==Rewrite                      ==Rewrite
    records: 10                    records: 10
    avg:  509527.98                avg: 1050156.37
    std:   45799.94 (8.99%)        std:   40695.44 (3.88%)
    max:  611574.27                max: 1111929.26
    min:  443679.95                min:  980409.62
    ==Read                         ==Read
    records: 10                    records: 10
    avg: 4408624.17                avg: 4472546.76
    std:  281152.61 (6.38%)        std:  163662.78 (3.66%)
    max: 4867888.66                max: 4727351.03
    min: 4058347.69                min: 4126520.88
    ==Re-read                      ==Re-read
    records: 10                    records: 10
    avg: 4462147.53                avg: 4363257.75
    std:  283546.11 (6.35%)        std:  247292.63 (5.67%)
    max: 4912894.44                max: 4677241.75
    min: 4131386.50                min: 4035235.84
    ==Reverse Read                 ==Reverse Read
    records: 10                    records: 10
    avg: 4565865.97                avg: 4485818.08
    std:  313395.63 (6.86%)        std:  248470.10 (5.54%)
    max: 5232749.16                max: 4789749.94
    min: 4185809.62                min: 3963081.34
    ==Stride read                  ==Stride read
    records: 10                    records: 10
    avg: 4515981.80                avg: 4418806.01
    std:  211192.32 (4.68%)        std:  212837.97 (4.82%)
    max: 4889287.28                max: 4686967.22
    min: 4210362.00                min: 4083041.84
    ==Random read                  ==Random read
    records: 10                    records: 10
    avg: 4410525.23                avg: 4387093.18
    std:  236693.22 (5.37%)        std:  235285.23 (5.36%)
    max: 4713698.47                max: 4669760.62
    min: 4057163.62                min: 3952002.16
    ==Mixed workload               ==Mixed workload
    records: 10                    records: 10
    avg:  243234.25                avg: 2818677.27
    std:   28505.07 (11.72%)       std:  195569.70 (6.94%)
    max:  288905.23                max: 3126478.11
    min:  212473.16                min: 2484150.69
    ==Random write                 ==Random write
    records: 10                    records: 10
    avg:  555887.07                avg: 1053057.79
    std:   70841.98 (12.74%)       std:   35195.36 (3.34%)
    max:  683188.28                max: 1096125.73
    min:  437299.57                min:  992481.93
    ==Pwrite                       ==Pwrite
    records: 10                    records: 10
    avg:  501745.93                avg:  810363.09
    std:   16373.54 (3.26%)        std:   19245.01 (2.37%)
    max:  518724.52                max:  833359.70
    min:  464208.73                min:  765501.87
    ==Pread                        ==Pread
    records: 10                    records: 10
    avg: 4539894.60                avg: 4457680.58
    std:  197094.66 (4.34%)        std:  188965.60 (4.24%)
    max: 4877170.38                max: 4689905.53
    min: 4226326.03                min: 4095739.72

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e46e33152eb82b8e2db7ffb3790a2a2653c34513)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: remove workqueue for freeing removed pending slot  Minchan Kim

Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity") introduced free request pending code to avoid scheduling by mutex under spinlock, and it was a mess which made the code lengthy and increased overhead.

Now we don't need zram->lock any more to free a slot, so this patch reverts that commit; tb_lock should then protect the slot.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f614a9f48dedd2b80d1dc8bae8094842fcdb39dd)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: introduce zram->tb_lock  Minchan Kim

Currently, the zram table is protected by zram->lock, but it's a rather coarse-grained lock and it hurts scalability.

Let's use a dedicated rwlock instead of depending on zram->lock. This patch adds new locking so, obviously, it makes things slower, but it is just preparation for removing the coarse-grained rw_semaphore (i.e. zram->lock), which is the hurdle for zram scalability.

The final patch in this patchset will remove the lock from the read path and replace the rw_semaphore with a mutex in the write path. As a bonus, we can drop the pending slot free mess in the next patch.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 92967471b67163bb1654e9b7fe99449ab70a4aaa)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: use atomic operation for stat  Minchan Kim

Some of the fields in zram->stats are protected by zram->lock, which is rather coarse-grained, so let's use atomic operations without explicit locking.

This patch prepares for removing the dependency on zram->lock in the read path, which is a very coarse-grained rw_semaphore. Of course, this patch adds new atomic operations, so it might make things slower, but my 12-CPU test couldn't spot any regression. All gains and losses are marginal, within stddev.

iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0

    ==Initial write                ==Initial write
    records: 50                    records: 50
    avg:  412875.17                avg:  415638.23
    std:   38543.12 (9.34%)        std:   36601.11 (8.81%)
    max:  521262.03                max:  502976.72
    min:  343263.13                min:  351389.12
    ==Rewrite                      ==Rewrite
    records: 50                    records: 50
    avg:  416640.34                avg:  397914.33
    std:   60798.92 (14.59%)       std:   46150.42 (11.60%)
    max:  543057.07                max:  522669.17
    min:  304071.67                min:  316588.77
    ==Read                         ==Read
    records: 50                    records: 50
    avg: 4147338.63                avg: 4070736.51
    std:  179333.25 (4.32%)        std:  223499.89 (5.49%)
    max: 4459295.28                max: 4539514.44
    min: 3753057.53                min: 3444686.31
    ==Re-read                      ==Re-read
    records: 50                    records: 50
    avg: 4096706.71                avg: 4117218.57
    std:  229735.04 (5.61%)        std:  171676.25 (4.17%)
    max: 4430012.09                max: 4459263.94
    min: 2987217.80                min: 3666904.28
    ==Reverse Read                 ==Reverse Read
    records: 50                    records: 50
    avg: 4062763.83                avg: 4078508.32
    std:  186208.46 (4.58%)        std:  172684.34 (4.23%)
    max: 4401358.78                max: 4424757.22
    min: 3381625.00                min: 3679359.94
    ==Stride read                  ==Stride read
    records: 50                    records: 50
    avg: 4094933.49                avg: 4082170.22
    std:  185710.52 (4.54%)        std:  196346.68 (4.81%)
    max: 4478241.25                max: 4460060.97
    min: 3732593.23                min: 3584125.78
    ==Random read                  ==Random read
    records: 50                    records: 50
    avg: 4031070.04                avg: 4074847.49
    std:  192065.51 (4.76%)        std:  206911.33 (5.08%)
    max: 4356931.16                max: 4399442.56
    min: 3481619.62                min: 3548372.44
    ==Mixed workload               ==Mixed workload
    records: 50                    records: 50
    avg:  149925.73                avg:  149675.54
    std:    7701.26 (5.14%)        std:    6902.09 (4.61%)
    max:  191301.56                max:  175162.05
    min:  133566.28                min:  137762.87
    ==Random write                 ==Random write
    records: 50                    records: 50
    avg:  404050.11                avg:  393021.47
    std:   58887.57 (14.57%)       std:   42813.70 (10.89%)
    max:  601798.09                max:  524533.43
    min:  325176.99                min:  313255.34
    ==Pwrite                       ==Pwrite
    records: 50                    records: 50
    avg:  411217.70                avg:  411237.96
    std:   43114.99 (10.48%)       std:   33136.29 (8.06%)
    max:  530766.79                max:  471899.76
    min:  320786.84                min:  317906.94
    ==Pread                        ==Pread
    records: 50                    records: 50
    avg: 4154908.65                avg: 4087121.92
    std:  151272.08 (3.64%)        std:  219505.04 (5.37%)
    max: 4459478.12                max: 4435857.38
    min: 3730512.41                min: 3101101.67

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit deb0bdeb2f3d6b81d37fc778316dae46b6daab56)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
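Updating statistics with atomic operations rather than under a coarse lock can be sketched in user space with C11 atomics. This is an assumption-laden illustration: `struct io_stats`, `account_read()` and `stat_read()` are hypothetical names, and `<stdatomic.h>` stands in for the kernel's atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stats block; field names are illustrative only. */
struct io_stats {
	_Atomic uint64_t num_reads;
	_Atomic uint64_t failed_reads;
};

static void io_stats_init(struct io_stats *st)
{
	atomic_init(&st->num_reads, 0);
	atomic_init(&st->failed_reads, 0);
}

/* Account one read. Each increment is an independent atomic
 * read-modify-write, so no external lock is needed for the counters. */
static void account_read(struct io_stats *st, int err)
{
	atomic_fetch_add_explicit(&st->num_reads, 1, memory_order_relaxed);
	if (err)
		atomic_fetch_add_explicit(&st->failed_reads, 1,
					  memory_order_relaxed);
}

/* Read a single counter without taking any lock. */
static uint64_t stat_read(_Atomic uint64_t *v)
{
	return atomic_load_explicit(v, memory_order_relaxed);
}
```

Relaxed ordering is assumed to be enough here because the counters are only reported, never used to synchronise other data; two counters read back to back may therefore be momentarily inconsistent with each other.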
2015-05-11  zram: remove unnecessary free  Minchan Kim

Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity") introduced a pending zram slot free in zram's write path, to cover a slot free missed because of a memory allocation failure in zram_slot_free_notify. It is not necessary, because we have already freed the slot right before overwriting it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 874e3cddc33f0c0f9cc08ad2b73fa0cbe7dfaa63)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: delay pending free request in read path  Minchan Kim

Sergey reported that we don't need to handle pending free requests on every I/O, so this patch removes that handling from the read path while keeping it in the write path.

Let's consider the example below.

The swap subsystem asks zram to free block "A" via swap_slot_free_notify, but zram leaves the request pending without actually freeing the block. The swap subsystem then allocates block "A" for new data, but the request that was pending for a long time is handled only now, and zram blindly frees the new data in block "A". :(

That's why we couldn't remove the handling of pending free requests right before a zram write.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 9b353db16d18f87242337e3e61a948c023505a65)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: fix race between reset and flushing pending work  Minchan Kim

Dan and Sergey reported that there is a race between reset and flushing of pending work: it could cause an oops, because reset frees zram->meta while zram_slot_free can still access zram->meta if a new request is added during the race window.

This patch moves the flush to after init_lock is taken, which prevents new requests and so closes the race.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit da4a04126baa3be03bc566d4a2ee0944c5e783d0)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: add copyright  Minchan Kim

Add my copyright to the zram source code which I maintain.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 7bfb3de8a1b3bebc2dc68d381efe27448c0584c5)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: remove old private project comment  Minchan Kim

Remove the old private compcache project address so that upcoming patches are sent to LKML, because we, the Linux kernel community, will take care of them.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 49061236a9c2e18b31617cef10d27ba136068bac)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zram: promote zram from staging  Minchan Kim

Zram has lived in staging for a LONG LONG time and has been fixed/improved by many contributors, so the code is clean and stable now. Of course, there are lots of products using zram in real practice.

The major TV companies have used zram as swap for two years, and recently our production team released an Android smartphone with zram used as swap, too. Recently Android KitKat started to use zram for small-memory smartphones. There was also a report that Google released ChromeOS with zram, and CyanogenMod has been using zram for a long time. And I heard some distros have used a zram block device for tmpfs. In addition, I saw many reports from many other people. For example, Lubuntu started to use it.

The benefit of zram is very clear. In my experience, one of the benefits was removing jitter in a video application under background memory pressure. That would be an effect of efficient memory usage through compression, but the bigger issue is whether swap is present in the system at all. Recent mobile platforms use Java, so there are many anonymous pages. But embedded systems are normally reluctant to use eMMC or an SD card as swap because of wear-leveling and latency issues, so if we do not use swap, we can't reclaim anonymous pages and, in the end, we could encounter an OOM kill. :(

Even when we have real storage as swap, it is a problem, too, because it sometimes ends up making the system very unresponsive due to slow swap storage performance.

Quote from Luigi at Google:

"Since Chrome OS was mentioned: the main reason why we don't use swap to a disk (rotating or SSD) is because it doesn't degrade gracefully and leads to a bad interactive experience. Generally we prefer to manage RAM at a higher level, by transparently killing and restarting processes. But we noticed that zram is fast enough to be competitive with the latter, and it lets us make more efficient use of the available RAM."

and he announced it: http://www.spinics.net/lists/linux-mm/msg57717.html

Another use case is to use zram as a block device. Zram is a block device, so anyone can format it and mount it, and some people on the internet have started using zram as /var/tmp. http://forums.gentoo.org/viewtopic-t-838198-start-0.html

Let's promote zram and enhance/maintain it instead of removing it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit cd67e10ac6997c6d1e1504e3c111b693bfdbc148)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11  zsmalloc: move it under mm  Minchan Kim

This patch moves zsmalloc under the mm directory.

Before that, this description explains why we have needed a custom allocator.

Zsmalloc is a new slab-based memory allocator for storing compressed pages. It is designed for low fragmentation and a high allocation success rate on large, but <= PAGE_SIZE, allocations.

zsmalloc differs from the kernel slab allocator in two primary ways to achieve these design goals.

zsmalloc never requires high order page allocations to back slabs, or "size classes" in zsmalloc terms. Instead it allows multiple single-order pages to be stitched together into a "zspage" which backs the slab. This allows for a higher allocation success rate under memory pressure.

Also, zsmalloc allows objects to span page boundaries within the zspage. This allows for lower fragmentation than could be had with the kernel slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the kernel slab allocator, if a page compresses to 60% of its original size, the memory savings gained through compression are lost in fragmentation because another object of the same size can't be stored in the leftover space.

This ability to span pages results in zsmalloc allocations not being directly addressable by the user. The user is given a non-dereferenceable handle in response to an allocation request. That handle must be mapped, using zs_map_object(), which returns a pointer to the mapped region that can be used. The mapping is necessary since the object data may reside in two different noncontiguous pages.

zsmalloc fulfills the allocation needs of zram perfectly.

[sjenning@linux.vnet.ibm.com: borrow Seth's quote]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit bcf1647d0899666f0fb90d176abf63bae22abb7c)
Signed-off-by: Alex Shi <alex.shi@linaro.org>

Conflicts:
    drivers/staging/zsmalloc/Kconfig
    mm/Kconfig
    mm/Makefile

Conflicts solutions:
    only move zsmalloc to mm/, skip unrelated cma/zbud/zswap
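The fragmentation argument above (a page that compresses to about 60% of its size) can be checked with a few lines of arithmetic. The sketch below is a simplified model, not zsmalloc itself: `objs_per_span()` and `wasted_bytes()` are hypothetical helpers, a 4096-byte page is assumed, and real zsmalloc size classes and zspage sizing rules are not modeled.

```c
#include <assert.h>

/* Assumed page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* How many objects of obj_size fit into `pages` pages treated as one
 * contiguous span, i.e. when objects may cross page boundaries. */
static unsigned int objs_per_span(unsigned int pages, unsigned int obj_size)
{
	return pages * SKETCH_PAGE_SIZE / obj_size;
}

/* Bytes left unused in that span after packing as many objects as fit. */
static unsigned int wasted_bytes(unsigned int pages, unsigned int obj_size)
{
	return pages * SKETCH_PAGE_SIZE
	       - objs_per_span(pages, obj_size) * obj_size;
}
```

With a 2458-byte object (roughly 60% of 4096), a single page holds one object and leaves 1638 bytes unused, so two separate pages hold two objects. Treating two pages as one span holds three objects and leaves 818 bytes unused, which is the saving that page-spanning objects are meant to recover.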
2015-05-11  Staging: zram: Fix memory leak by refcount mismatch  Rashika Kheria

As suggested by Minchan Kim and Jerome Marchand "The code in reset_store get the block device (bdget_disk()) but it does not put it (bdput()) when it's done using it. The usage count is therefore incremented but never decremented."

This patch also puts bdput() for all error cases.

Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1b672224d128ec2570eb37572ff803cfe452b4f7)
Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11Staging: zram: Fix access of NULL pointerRashika Kheria
This patch fixes the bug in reset_store caused by accessing NULL pointer. The bdev gets its value from bdget_disk() which could fail when memory pressure is severe and hence can return NULL because allocation of inode in bdget could fail. Hence, this patch introduces a check for bdev to prevent reference to a NULL pointer in the later part of the code. It also removes unnecessary check of bdev for fsync_bdev(). Cc: stable <stable@vger.kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 46a51c80216cb891f271ad021f59009f34677499) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11Staging: zram: Fix variable dereferenced before checkRashika Kheria
This patch fixes the following Smatch warning in zram_drv.c- drivers/staging/zram/zram_drv.c:899 destroy_device() warn: variable dereferenced before check 'zram->disk' (see line 896) Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 59d3fe540454dd8fc48d4eda44e200f9c98bef10) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11Revert "staging: zram: Add auto loading of module if user opens /dev/zram."Greg Kroah-Hartman
This reverts commit c70bda992c12e593e411c02a52e4bd6985407539. It's incorrect, Kay writes: Please just remove it. "devname" is meant to be used for single-instance devices with a static dev_t, never for things like zramX. It will not do anything useful here, it does nothing really without a statically assigned dev_t, and it should not be used for devices of this kind anyway. Reported-by: Tom Gundersen <teg@jklm.no> Reported-by: Kay Sievers <kay@vrfy.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f0f65a95de2840db3fa61c953dca267e7b773168) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: don't grab mutex in zram_slot_free_notifyMinchan Kim
[1] introduced down_write in zram_slot_free_notify to prevent a race between zram_slot_free_notify and zram_bvec_[read|write]. The race could happen if somebody with permission to open the swap device reads it while it is in use by swap in parallel. However, zram_slot_free_notify is called while holding the swap layer's spin_lock, so we should avoid taking a mutex there. Otherwise, lockdep warns about it. This patch adds a new list to track slots to be freed, plus a workqueue, so zram_slot_free_notify just registers the slot index to be freed and queues the request to the workqueue. When the work runs, it takes the mutex, so there is no problem any more. If any I/O is issued, zram handles pending slot-free requests caused by zram_slot_free_notify right before handling the issued request, because the work may not have run yet and the zram I/O request handling function could otherwise miss them. Lastly, when zram is reset, flush_work handles all pending free requests so we shouldn't have a memory leak. NOTE: If zram_slot_free_notify's kmalloc with GFP_ATOMIC fails, the slot will be freed when the next write I/O writes the slot. [1] [57ab0485, zram: use zram->lock to protect zram_free_page() in swap free notify path] * from v2 * refactoring * from v1 * totally redesign Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a0c516cbfc7452c8cbd564525fef66d9f20b46d1) Signed-off-by: Alex Shi <alex.shi@linaro.org>
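The deferral scheme in this message can be sketched as a single-threaded userspace model. All names below (`struct fake_zram`, `slot_free_notify()`, `handle_pending()`, `slot_write()`) are hypothetical, and the locking and the workqueue are deliberately left out, so this only shows the shape of the idea: the notify path records an index, and a path that is allowed to block does the actual freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NSLOTS 8

/* One deferred "free this slot" request (hypothetical). */
struct free_rq {
    size_t index;
    struct free_rq *next;
};

/* Hypothetical device model: slot usage flags plus a pending list. */
struct fake_zram {
    bool used[NSLOTS];
    struct free_rq *pending;
};

/* Notify path: must not block, so it only records the slot index. */
static void slot_free_notify(struct fake_zram *z, size_t index)
{
    struct free_rq *rq = malloc(sizeof(*rq));

    if (!rq)
        return; /* request dropped; the slot is reused on a later write */
    rq->index = index;
    rq->next = z->pending;
    z->pending = rq;
}

/* Worker or I/O path: may block, so the real freeing happens here. */
static size_t handle_pending(struct fake_zram *z)
{
    size_t handled = 0;

    while (z->pending) {
        struct free_rq *rq = z->pending;

        z->pending = rq->next;
        assert(rq->index < NSLOTS);
        z->used[rq->index] = false;
        free(rq);
        handled++;
    }
    return handled;
}

/* I/O path: drain pending frees first so none of them is missed. */
static void slot_write(struct fake_zram *z, size_t index)
{
    handle_pending(z);
    assert(index < NSLOTS);
    z->used[index] = true;
}
```

A reset path in this model would call `handle_pending()` once more before tearing down, mirroring the role the message gives to flush_work.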
2015-05-11zram: fix invalid memory accessMinchan Kim
[1] tried to fix invalid memory access on zram->disk but it didn't fix it properly because get_disk failed during the module exit path. Actually, we don't need to reset zram->disk's capacity to zero in the module exit path, so this patch introduces a new argument "reset_capacity" to zram_reset_device and only resets the capacity when reset_store is called. [1] 6030ea9b, zram: avoid invalid memory access in zram_exit() Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 2b86ab9cc29fcd435cde9378c3b9ffe8b5c76128) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11Staging: zram: zram_drv.c: Fixed Error of trailing whitespaceKumar Gaurav
Fixed by removing trailing whitespace Signed-off-by: Kumar Gaurav <kumargauravgupta3@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a539c72a195c081d950475c2945cb82d80be9b66) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: prevent data loss in error cases of function zram_bvec_write()Sunghan Suh
In zram_bvec_write(), the previous data at the index is freed up front by zram_free_page(). If compression or zs_malloc then fails, there is no way to restore the old data. Therefore, free the previous data only when it is about to be updated. Also, there is no need to check whether the table entry is empty outside zram_free_page(), because the function checks that internally. Signed-off-by: Sunghan Suh <sunghan.suh@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit f40ac2ae1b506484dd9261a24bbf3e86b2206ff8) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11staging: zram: Add auto loading of module if user opens /dev/zram.Konrad Rzeszutek Wilk
Greg spotted that said driver is not subscribing to the automagic mechanism of auto-loading if a user tries to open /dev/zram. This fixes it. CC: Minchan Kim <minchan@kernel.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit c70bda992c12e593e411c02a52e4bd6985407539) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11staging: zram: protect zram_reset_device() callSergey Senozhatsky
Commit 9b3bb7abcdf2df0f1b2657e6cbc9d06bc2b3b36f (remove zram_sysfs file (v2)) accidentally made zram_reset_device() racy. Protect zram_reset_device() call with zram->lock. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Jerome Marchand <jmarchand@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 644d478793c6594277f8ae76954da4ace7ac6f96) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: remove zram_sysfs file (v2)Sergey Senozhatsky
Move zram sysfs code to zram drv and remove zram_sysfs.c file. This gives ability to make static a number of previously exported zram functions, used from zram sysfs, e.g. internal zram zram_meta_alloc/free(). We also can drop zram_drv wrapper functions, used from zram sysfs: e.g. zram_reset_device()/__zram_reset_device() pair. v2: as suggested by Greg K-H, move MODULE description to the bottom of the file. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 9b3bb7abcdf2df0f1b2657e6cbc9d06bc2b3b36f) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: use atomic64_xxx() to replace zram_stat64_xxx()Jiang Liu
Use atomic64_xxx() to replace open-coded zram_stat64_xxx(). Some architectures have native support for atomic64 operations, so we can get rid of the spin_lock() in zram_stat64_xxx(). On the other hand, on platforms that use the generic atomic64 implementation, it may cause an extra save/restore of the interrupt flag. So it's a tradeoff. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit da5cc7d338f97886ebf35be92995460289379b73) Signed-off-by: Alex Shi <alex.shi@linaro.org>
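As a rough userspace analogue of replacing a lock-protected 64-bit counter with atomic operations, the sketch below uses C11 `<stdatomic.h>`. It is not the kernel atomic64 API: `struct fake_zram_stats` and the `stat64_*()` helpers are hypothetical names, and relaxed ordering is an assumption that suits a plain statistics counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block with one 64-bit counter. */
struct fake_zram_stats {
    _Atomic uint64_t num_reads;
};

/* Add to the counter without taking any lock. */
static void stat64_add(_Atomic uint64_t *v, uint64_t n)
{
    atomic_fetch_add_explicit(v, n, memory_order_relaxed);
}

/* Subtract from the counter; the caller must not go below zero. */
static void stat64_sub(_Atomic uint64_t *v, uint64_t n)
{
    uint64_t old = atomic_fetch_sub_explicit(v, n, memory_order_relaxed);

    assert(old >= n);
    (void)old; /* silence unused warning when NDEBUG is defined */
}

/* Read the current value of the counter. */
static uint64_t stat64_read(_Atomic uint64_t *v)
{
    return atomic_load_explicit(v, memory_order_relaxed);
}
```

Whether these operations compile to single instructions or fall back to a lock depends on the target, which is the same tradeoff the commit message describes for the kernel's generic atomic64 implementation.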
2015-05-11zram: optimize memory operations with clear_page()/copy_page()Jiang Liu
Some architectures provide architecture-specific, optimized versions of clear_page()/copy_page(), which may perform better than memset()/memcpy(). So use clear_page()/copy_page() to optimize zram performance where possible. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 42e99bd975fdd24d2bf1a24ebb8b0b42bab8ba65) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: kill unused zram_get_num_devices()Jiang Liu
Now there's no caller of zram_get_num_devices(), so kill it. And change zram_devices to static because it's only used in zram_drv.c. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 0f0e3ba346c8d8d2cb409b157df79805931a1c2c) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11zram: simplify and optimize dev_to_zram()Jiang Liu
Simplify and optimize dev_to_zram() without walking the zram_devices array. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 80de574dca050b734d8413a98a983fba3d06240b) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-11Staging: Fixes string split across lines in zramMarlies Ruck
Fixes the following checkpatch warning in zram_drv.c: WARNING: quoted string split across lines Signed-off-by: Marlies Ruck <marlies.ruck@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 596b3dd4c8e172db7806372c9d0347a4e7d28bc5) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2015-05-08gator: Add config for building the module in-treeJon Medhurst
Signed-off-by: Jon Medhurst <tixy@linaro.org>
2015-05-08gator: Version 5.21.1Jon Medhurst
Signed-off-by: Drew Richardson <drew.richardson@arm.com> Signed-off-by: Jon Medhurst <tixy@linaro.org>
2015-05-06memstick: mspro_block: add missing curly bracesDan Carpenter
commit 13f6b191aaa11c7fd718d35a0c565f3c16bc1d99 upstream. Using the indenting we can see the curly braces were obviously intended. This is a static checker fix, but my guess is that we don't read enough bytes, because we don't calculate "t_len" correctly. Fixes: f1d82698029b ('memstick: use fully asynchronous request processing') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06wl18xx: show rx_frames_per_rates as an array as it really isNicolas Iooss
commit a3fa71c40f1853d0c27e8f5bc01a722a705d9682 upstream. In struct wl18xx_acx_rx_rate_stat, rx_frames_per_rates field is an array, not a number. This means WL18XX_DEBUGFS_FWSTATS_FILE can't be used to display this field in debugfs (it would display a pointer, not the actual data). Use WL18XX_DEBUGFS_FWSTATS_FILE_ARRAY instead. This bug has been found by adding a __printf attribute to wl1271_format_buffer. gcc complained about "format '%u' expects argument of type 'unsigned int', but argument 5 has type 'u32 *'". Fixes: c5d94169e818 ("wl18xx: use new fw stats structures") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06e1000: add dummy allocator to fix race condition between mtu change and netpollSabrina Dubroca
commit 08e8331654d1d7b2c58045e549005bc356aa7810 upstream. There is a race condition between e1000_change_mtu's cleanups and netpoll, when we change the MTU across jumbo size: Changing MTU frees all the rx buffers: e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings -> e1000_clean_rx_ring Then, close to the end of e1000_change_mtu: pr_info -> ... -> netpoll_poll_dev -> e1000_clean -> e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag And when we come back to do the rest of the MTU change: e1000_up -> e1000_configure -> e1000_configure_rx -> e1000_alloc_jumbo_rx_buffers alloc_jumbo finds the buffers already != NULL, since data (shared with page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage, or at least not what is expected when in jumbo state. This results in an unusable adapter (packets don't get through), and a NULL pointer dereference on the next call to e1000_clean_rx_ring (other mtu change, link down, shutdown): BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330 [...] Call Trace: [<ffffffff81195445>] put_page+0x55/0x60 [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200 [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60 [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0 [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840 [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170 [<ffffffff81647050>] dev_set_mtu+0xa0/0x140 [<ffffffff81664218>] do_setlink+0x218/0xac0 [<ffffffff814459e9>] ? nla_parse+0xb9/0x120 [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890 [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40 [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100 [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260 By setting the allocator to a dummy version, netpoll can't mess up our rx buffers. The allocator is set back to a sane value in e1000_configure_rx. 
Fixes: edbbb3ca1077 ("e1000: implement jumbo receive with partial descriptors") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06drm/i915: cope with large i2c transfersDmitry Torokhov
commit 9535c4757b881e06fae72a857485ad57c422b8d2 upstream. The hardware, according to the specs, is limited to 256 byte transfers, and the current driver has no protection in case users attempt larger transfers. The code will just stomp over the status register and mayhem ensues. Let's split larger transfers into digestible chunks. Doing this allows the Atmel MXT driver on Pixel 1 to function properly (it hasn't since commit 9d8dc3e529a19e427fd379118acd132520935c5d "Input: atmel_mxt_ts - implement T44 message handling" which tries to consume multiple touchscreen/touchpad reports in a single transaction). Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
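Splitting a large transfer into chunks no larger than a hardware limit can be sketched generically as follows. This is not the i915 GMBUS code: `xfer_chunked()`, the `xfer_fn` callback type, `struct xfer_log` and `log_xfer()` are hypothetical, and only the 256-byte limit is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_XFER 256u   /* per-transfer limit quoted in the commit message */

/* Hypothetical backend: transfers len bytes, returns 0 on success. */
typedef int (*xfer_fn)(const unsigned char *buf, size_t len, void *ctx);

/* Issue one backend transfer per chunk of at most MAX_XFER bytes. */
static int xfer_chunked(const unsigned char *buf, size_t len,
                        xfer_fn xfer, void *ctx)
{
    assert(xfer != NULL);

    while (len > 0) {
        size_t n = len < MAX_XFER ? len : MAX_XFER;
        int ret = xfer(buf, n, ctx);

        if (ret)
            return ret;     /* stop at the first failing chunk */
        buf += n;
        len -= n;
    }
    return 0;
}

/* Demonstration backend that only records what it was asked to do. */
struct xfer_log {
    size_t calls;
    size_t max_len;
    size_t total;
};

static int log_xfer(const unsigned char *buf, size_t len, void *ctx)
{
    struct xfer_log *log = ctx;

    (void)buf;
    log->calls++;
    log->total += len;
    if (len > log->max_len)
        log->max_len = len;
    return 0;
}
```

Passing a 600-byte buffer with `log_xfer` as the backend results in three calls (256, 256 and 88 bytes), so no single transfer exceeds the limit.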
2015-05-06drm/radeon: fix doublescan modes (v2)Alex Deucher
commit fd99a0943ffaa0320ea4f69d09ed188f950c0432 upstream. Use the correct flags for atom. v2: handle DRM_MODE_FLAG_DBLCLK Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06i2c: core: Export bus recovery functionsMark Brown
commit c1c21f4e60ed4523292f1a89ff45a208bddd3849 upstream. Current -next fails to link an ARM allmodconfig because drivers that use the core recovery functions can be built as modules but those functions are not exported: ERROR: "i2c_generic_gpio_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined! ERROR: "i2c_generic_scl_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined! ERROR: "i2c_recover_bus" [drivers/i2c/busses/i2c-davinci.ko] undefined! Add exports to fix this. Fixes: 5f9296ba21b3c (i2c: Add bus recovery infrastructure) Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-06IB/mlx4: Fix WQE LSO segment calculationErez Shitrit
commit ca9b590caa17bcbbea119594992666e96cde9c2f upstream. The current code subtracts the size of the packet headers from the mss (which is the gso_size from the kernel skb). It shouldn't do that because the mss that comes from the stack (e.g. IPoIB) includes only the TCP payload without the headers. The result is an indication to the HW that each packet it sends is smaller than it could be, so too many packets are sent for big messages. An easy way to demonstrate one more aspect of the problem is to configure the ipoib mtu to be less than 2*hlen (2*56) and then run an app sending big TCP messages. This tells the HW to send packets with a giant length (a negative value which under unsigned arithmetic becomes a huge positive one) and the QP moves to the SQE state. Fixes: b832be1e4007 ('IB/mlx4: Add IPoIB LSO support') Reported-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
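The "negative value becomes a huge positive one" effect is ordinary unsigned wraparound. The sketch below is an illustration only: the function names are hypothetical, 32-bit unsigned arithmetic is an assumption, and reading the example as mss = mtu - hlen (so an mtu of 100 with 56-byte headers gives an mss of 44) is an interpretation of the commit message rather than something it states.

```c
#include <assert.h>
#include <stdint.h>

/* Calculation as the message describes the old code: headers subtracted. */
static uint32_t lso_mss_old(uint32_t mss, uint32_t hlen)
{
    return mss - hlen;  /* wraps around when hlen > mss */
}

/* Calculation after the fix: the mss from the stack is used as-is. */
static uint32_t lso_mss_fixed(uint32_t mss, uint32_t hlen)
{
    (void)hlen;         /* headers are not part of the mss */
    return mss;
}
```

With an mss of 44 and a header length of 56, `lso_mss_old()` returns 4294967284 (2^32 - 12) while `lso_mss_fixed()` returns 44.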
2015-05-06IB/core: don't disallow registering region starting at 0x0Yann Droneaud
commit 66578b0b2f69659f00b6169e6fe7377c4b100d18 upstream. In a call to ib_umem_get(), if the address is 0x0 and the size is already page aligned, the check added in commit 8494057ab5e4 ("IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic") will refuse to register a memory region that could otherwise be valid (provided the vm.mmap_min_addr sysctl and mmap_low_allowed SELinux knobs allow userspace to map something at address 0x0). This patch allows such registrations again: ib_umem_get() should probably not care about the base address provided it can be pinned with get_user_pages(). There are two possible overflows, in (addr + size) and in PAGE_ALIGN(addr + size); this patch keeps ensuring neither of them happens while allowing memory to be pinned at address 0x0. The case of size equal to 0 is no longer (partially) handled here, as 0-length memory regions are disallowed by an earlier check. Link: http://mid.gmane.org/cover.1428929103.git.ydroneaud@opteya.com Cc: Shachar Raindel <raindel@mellanox.com> Cc: Jack Morgenstein <jackm@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
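The two overflow conditions named in this message can be written out in a small userspace sketch. It is not the ib_umem_get() code: `region_ok()` and `FAKE_PAGE_SIZE` are hypothetical, addresses are modeled as 64-bit values, and the zero-length rejection stands in for the "earlier check" the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_PAGE_SIZE 4096ull  /* assumed page size for illustration */

/*
 * Accept a region only if neither (addr + size) nor the page-aligned
 * end of the region wraps around. A base address of 0 is allowed.
 */
static bool region_ok(uint64_t addr, uint64_t size)
{
    if (size == 0)
        return false;   /* stands in for the earlier 0-length check */

    if (addr + size < addr)
        return false;   /* (addr + size) wrapped */

    /* Rounding (addr + size) up to a page boundary must not wrap. */
    if (addr + size > UINT64_MAX - (FAKE_PAGE_SIZE - 1))
        return false;

    return true;
}
```

Note that the test is on the end of the region, not on its start, which is why a page-aligned region beginning at address 0x0 passes.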