aboutsummaryrefslogtreecommitdiff
path: root/crypto
AgeCommit message (Collapse)Author
2013-05-13crypto: algif - suppress sending source address information in recvmsgMathias Krause
commit 72a763d805a48ac8c0bf48fdb510e84c12de51fe upstream. The current code does not set the msg_namelen member to 0 and therefore makes net/socket.c leak the local sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. Fix that. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-25crypto: gcm - fix assumption that assoc has one segmentJussi Kivilinna
commit d3dde52209ab571e4e2ec26c66f85ad1355f7475 upstream. rfc4543(gcm(*)) code for GMAC assumes that assoc scatterlist always contains only one segment and only makes use of this first segment. However ipsec passes assoc with three segments when using 'extended sequence number' thus in this case rfc4543(gcm(*)) fails to function correctly. Patch fixes this issue. Reported-by: Chaoxing Lin <Chaoxing.Lin@ultra-3eti.com> Tested-by: Chaoxing Lin <Chaoxing.Lin@ultra-3eti.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-03-20crypto: user - fix info leaks in report APIMathias Krause
commit 9a5467bf7b6e9e02ec9c3da4e23747c05faeaac6 upstream. Three errors resulting in kernel memory disclosure: 1/ The structures used for the netlink based crypto algorithm report API are located on the stack. As snprintf() does not fill the remainder of the buffer with null bytes, those stack bytes will be disclosed to users of the API. Switch to strncpy() to fix this. 2/ crypto_report_one() does not initialize all field of struct crypto_user_alg. Fix this to fix the heap info leak. 3/ For the module name we should copy only as many bytes as module_name() returns -- not as much as the destination buffer could hold. But the current code does not and therefore copies random data from behind the end of the module name, as the module name is always shorter than CRYPTO_MAX_ALG_NAME. Also switch to use strncpy() to copy the algorithm's name and driver_name. They are strings, after all. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16crypto: cryptd - disable softirqs in cryptd_queue_worker to prevent data ↵Jussi Kivilinna
corruption commit 9efade1b3e981f5064f9db9ca971b4dc7557ae42 upstream. cryptd_queue_worker attempts to prevent simultaneous accesses to crypto workqueue by cryptd_enqueue_request using preempt_disable/preempt_enable. However cryptd_enqueue_request might be called from softirq context, so add local_bh_disable/local_bh_enable to prevent data corruption and panics. Bug report at http://marc.info/?l=linux-crypto-vger&m=134858649616319&w=2 v2: - Disable software interrupts instead of hardware interrupts Reported-by: Gurucharan Shetty <gurucharan.shetty@gmail.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-05-11crypto: sha512 - Fix byte counter overflow in SHA-512Kent Yoder
commit 25c3d30c918207556ae1d6e663150ebdf902186b upstream. The current code only increments the upper 64 bits of the SHA-512 byte counter when the number of bytes hashed happens to hit 2^64 exactly. This patch increments the upper 64 bits whenever the lower 64 bits overflows. Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-02-20crypto: sha512 - use standard ror64()Alexey Dobriyan
commit f2ea0f5f04c97b48c88edccba52b0682fbe45087 upstream. Use standard ror64() instead of hand-written. There is no standard ror64, so create it. The difference is shift value being "unsigned int" instead of uint64_t (for which there is no reason). gcc starts to emit native ROR instructions which it doesn't do for some reason currently. This should make the code faster. Patch survives in-tree crypto test and ping flood with hmac(sha512) on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-20crypto: sha512 - Avoid stack bloat on i386Herbert Xu
commit 3a92d687c8015860a19213e3c102cad6b722f83c upstream. Unfortunately in reducing W from 80 to 16 we ended up unrolling the loop twice. As gcc has issues dealing with 64-bit ops on i386 this means that we end up using even more stack space (>1K). This patch solves the W reduction by moving LOAD_OP/BLEND_OP into the loop itself, thus avoiding the need to duplicate it. While the stack space still isn't great (>0.5K) it is at least in the same ball park as the amount of stack used for our C sha1 implementation. Note that this patch basically reverts to the original code so the diff looks bigger than it really is. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-20crypto: sha512 - Use binary and instead of modulusHerbert Xu
commit 58d7d18b5268febb8b1391c6dffc8e2aaa751fcd upstream. The previous patch used the modulus operator over a power of 2 unnecessarily which may produce suboptimal binary code. This patch changes changes them to binary ands instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03crypto: sha512 - reduce stack usage to safe numberAlexey Dobriyan
commit 51fc6dc8f948047364f7d42a4ed89b416c6cc0a3 upstream. For rounds 16--79, W[i] only depends on W[i - 2], W[i - 7], W[i - 15] and W[i - 16]. Consequently, keeping all W[80] array on stack is unnecessary, only 16 values are really needed. Using W[16] instead of W[80] greatly reduces stack usage (~750 bytes to ~340 bytes on x86_64). Line by line explanation: * BLEND_OP array is "circular" now, all indexes have to be modulo 16. Round number is positive, so remainder operation should be without surprises. * initial full message scheduling is trimmed to first 16 values which come from data block, the rest is calculated before it's needed. * original loop body is unrolled version of new SHA512_0_15 and SHA512_16_79 macros, unrolling was done to not do explicit variable renaming. Otherwise it's the very same code after preprocessing. See sha1_transform() code which does the same trick. Patch survives in-tree crypto test and original bugreport test (ping flood with hmac(sha512). See FIPS 180-2 for SHA-512 definition http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-03crypto: sha512 - make it work, undo percpu message scheduleAlexey Dobriyan
commit 84e31fdb7c797a7303e0cc295cb9bc8b73fb872d upstream. commit f9e2bca6c22d75a289a349f869701214d63b5060 aka "crypto: sha512 - Move message schedule W[80] to static percpu area" created global message schedule area. If sha512_update will ever be entered twice, hash will be silently calculated incorrectly. Probably the easiest way to notice incorrect hashes being calculated is to run 2 ping floods over AH with hmac(sha512): #!/usr/sbin/setkey -f flush; spdflush; add IP1 IP2 ah 25 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000025; add IP2 IP1 ah 52 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000052; spdadd IP1 IP2 any -P out ipsec ah/transport//require; spdadd IP2 IP1 any -P in ipsec ah/transport//require; XfrmInStateProtoError will start ticking with -EBADMSG being returned from ah_input(). This never happens with, say, hmac(sha1). With patch applied (on BOTH sides), XfrmInStateProtoError does not tick with multiple bidirectional ping flood streams like it doesn't tick with SHA-1. After this patch sha512_transform() will start using ~750 bytes of stack on x86_64. This is OK for simple loads, for something more heavy, stack reduction will be done separatedly. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2011-11-11Merge git://github.com/herbertx/cryptoLinus Torvalds
* git://github.com/herbertx/crypto: crypto: algapi - Fix build problem with NET disabled crypto: user - Fix rwsem leak in crypto_user
2011-11-11crypto: algapi - Fix build problem with NET disabledHerbert Xu
The report functions use NLA_PUT so we need to ensure that NET is enabled. Reported-by: Luis Henriques <henrix@camandro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-11-06Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
2011-11-02crypto: user - Fix rwsem leak in crypto_userJonathan Corbet
The list_empty case in crypto_alg_match() will return without calling up_read() on crypto_alg_sem. We could do the "goto out" routine, but the function will clearly do the right thing with that test simply removed. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-11-01Merge git://github.com/herbertx/cryptoLinus Torvalds
* git://github.com/herbertx/crypto: (48 commits) crypto: user - Depend on NET instead of selecting it crypto: user - Add dependency on NET crypto: talitos - handle descriptor not found in error path crypto: user - Initialise match in crypto_alg_match crypto: testmgr - add twofish tests crypto: testmgr - add blowfish test-vectors crypto: Make hifn_795x build depend on !ARCH_DMA_ADDR_T_64BIT crypto: twofish-x86_64-3way - fix ctr blocksize to 1 crypto: blowfish-x86_64 - fix ctr blocksize to 1 crypto: whirlpool - count rounds from 0 crypto: Add userspace report for compress type algorithms crypto: Add userspace report for cipher type algorithms crypto: Add userspace report for rng type algorithms crypto: Add userspace report for pcompress type algorithms crypto: Add userspace report for nivaead type algorithms crypto: Add userspace report for aead type algorithms crypto: Add userspace report for givcipher type algorithms crypto: Add userspace report for ablkcipher type algorithms crypto: Add userspace report for blkcipher type algorithms crypto: Add userspace report for ahash type algorithms ...
2011-11-01crypto: user - Depend on NET instead of selecting itHerbert Xu
Selecting NET causes all sorts of issues, including a dependency loop involving bluetooth. This patch makes it a dependency instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-31crypto: add module.h to those files that are explicitly using itPaul Gortmaker
Part of the include cleanups means that the implicit inclusion of module.h via device.h is going away. So fix things up in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-26crypto: user - Add dependency on NETHerbert Xu
Since the configuration interface relies on netlink we need to select NET. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: user - Initialise match in crypto_alg_matchHerbert Xu
We need to default match to 0 as otherwise it may lead to a false positive. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: testmgr - add twofish testsJussi Kivilinna
Add tests for parallel twofish-x86_64-3way code paths. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: testmgr - add blowfish test-vectorsJussi Kivilinna
Add tests for parallel blowfish-x86_64 code paths. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: whirlpool - count rounds from 0Alexey Dobriyan
rc[0] is unused because rounds are counted from 1. Save an u64! Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for compress type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for cipher type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for rng type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for pcompress type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for nivaead type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for aead type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for givcipher type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for ablkcipher type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for blkcipher type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for ahash type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for shash type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace report for larval type algorithmsSteffen Klassert
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add a report function pointer to crypto_typeSteffen Klassert
We add a report function pointer to struct crypto_type. This function pointer is used from the crypto userspace configuration API to report crypto algorithms to userspace. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add userspace configuration APISteffen Klassert
This patch adds a basic userspace configuration API for the crypto layer. With this it is possible to instantiate, remove and to show crypto algorithms from userspace. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Export crypto_remove_finalSteffen Klassert
The upcomming crypto usrerspace configuration api needs to remove the spawns on top on an algorithm, so export crypto_remove_final. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Export crypto_remove_spawnsSteffen Klassert
The upcomming crypto usrerspace configuration api needs to remove the spawns on top on an algorithm, so export crypto_remove_spawns. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: Add a flag to identify crypto instancesSteffen Klassert
The upcomming crypto user configuration api needs to identify crypto instances. This patch adds a flag that is set if the algorithm is an instance that is build from templates. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: twofish - add 3-way parallel x86_64 assembler implementionJussi Kivilinna
Patch adds 3-way parallel x86_64 assembly implementation of twofish as new module. New assembler functions crypt data in three blocks chunks, improving cipher performance on out-of-order CPUs. Patch has been tested with tcrypt and automated filesystem tests. Summary of the tcrypt benchmarks: Twofish 3-way-asm vs twofish asm (128bit 8kb block ECB) encrypt: 1.3x speed decrypt: 1.3x speed Twofish 3-way-asm vs twofish asm (128bit 8kb block CBC) encrypt: 1.07x speed decrypt: 1.4x speed Twofish 3-way-asm vs twofish asm (128bit 8kb block CTR) encrypt: 1.4x speed Twofish 3-way-asm vs AES asm (128bit 8kb block ECB) encrypt: 1.0x speed decrypt: 1.0x speed Twofish 3-way-asm vs AES asm (128bit 8kb block CBC) encrypt: 0.84x speed decrypt: 1.09x speed Twofish 3-way-asm vs AES asm (128bit 8kb block CTR) encrypt: 1.15x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-3way-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-twofish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-aes-asm-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor Also userspace test were run on: vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU E7330 @ 2.40GHz stepping : 11 Userspace test results: Encryption/decryption of twofish 3-way vs x86_64-asm on AMD Phenom II: encrypt: 1.27x decrypt: 1.25x Encryption/decryption of twofish 3-way vs x86_64-asm on Intel Xeon E7330: encrypt: 1.36x decrypt: 1.36x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: tcrypt - add ctr(twofish) speed testJussi Kivilinna
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21crypto: ghash - Avoid null pointer dereference if no key is setNick Bowler
The ghash_update function passes a pointer to gf128mul_4k_lle which will be NULL if ghash_setkey is not called or if the most recent call to ghash_setkey failed to allocate memory. This causes an oops. Fix this up by returning an error code in the null case. This is trivially triggered from unprivileged userspace through the AF_ALG interface by simply writing to the socket without setting a key. The ghash_final function has a similar issue, but triggering it requires a memory allocation failure in ghash_setkey _after_ at least one successful call to ghash_update. BUG: unable to handle kernel NULL pointer dereference at 00000670 IP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul] *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ghash_generic gf128mul algif_hash af_alg nfs lockd nfs_acl sunrpc bridge ipv6 stp llc Pid: 1502, comm: hashatron Tainted: G W 3.1.0-rc9-00085-ge9308cf #32 Bochs Bochs EIP: 0060:[<d88c92d4>] EFLAGS: 00000202 CPU: 0 EIP is at gf128mul_4k_lle+0x23/0x60 [gf128mul] EAX: d69db1f0 EBX: d6b8ddac ECX: 00000004 EDX: 00000000 ESI: 00000670 EDI: d6b8ddac EBP: d6b8ddc8 ESP: d6b8dda4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process hashatron (pid: 1502, ti=d6b8c000 task=d6810000 task.ti=d6b8c000) Stack: 00000000 d69db1f0 00000163 00000000 d6b8ddc8 c101a520 d69db1f0 d52aa000 00000ff0 d6b8dde8 d88d310f d6b8a3f8 d52aa000 00001000 d88d502c d6b8ddfc 00001000 d6b8ddf4 c11676ed d69db1e8 d6b8de24 c11679ad d52aa000 00000000 Call Trace: [<c101a520>] ? kmap_atomic_prot+0x37/0xa6 [<d88d310f>] ghash_update+0x85/0xbe [ghash_generic] [<c11676ed>] crypto_shash_update+0x18/0x1b [<c11679ad>] shash_ahash_update+0x22/0x36 [<c11679cc>] shash_async_update+0xb/0xd [<d88ce0ba>] hash_sendpage+0xba/0xf2 [algif_hash] [<c121b24c>] kernel_sendpage+0x39/0x4e [<d88ce000>] ? 0xd88cdfff [<c121b298>] sock_sendpage+0x37/0x3e [<c121b261>] ? kernel_sendpage+0x4e/0x4e [<c10b4dbc>] pipe_to_sendpage+0x56/0x61 [<c10b4e1f>] splice_from_pipe_feed+0x58/0xcd [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10 [<c10b51f5>] __splice_from_pipe+0x36/0x55 [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10 [<c10b6383>] splice_from_pipe+0x51/0x64 [<c10b63c2>] ? default_file_splice_write+0x2c/0x2c [<c10b63d5>] generic_splice_sendpage+0x13/0x15 [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10 [<c10b527f>] do_splice_from+0x5d/0x67 [<c10b6865>] sys_splice+0x2bf/0x363 [<c129373b>] ? sysenter_exit+0xf/0x16 [<c104dc1e>] ? trace_hardirqs_on_caller+0x10e/0x13f [<c129370c>] sysenter_do_call+0x12/0x32 Code: 83 c4 0c 5b 5e 5f c9 c3 55 b9 04 00 00 00 89 e5 57 8d 7d e4 56 53 8d 5d e4 83 ec 18 89 45 e0 89 55 dc 0f b6 70 0f c1 e6 04 01 d6 <f3> a5 be 0f 00 00 00 4e 89 d8 e8 48 ff ff ff 8b 45 e0 89 da 0f EIP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul] SS:ESP 0068:d6b8dda4 CR2: 0000000000000670 ---[ end trace 4eaa2a86a8e2da24 ]--- note: hashatron[1502] exited with preempt_count 1 BUG: scheduling while atomic: hashatron/1502/0x10000002 INFO: lockdep is turned off. [...] Signed-off-by: Nick Bowler <nbowler@elliptictech.com> Cc: stable@kernel.org [2.6.37+] Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-09-22crypto: blowfish - add x86_64 assembly implementationJussi Kivilinna
Patch adds x86_64 assembly implementation of blowfish. Two set of assembler functions are provided. First set is regular 'one-block at time' encrypt/decrypt functions. Second is 'four-block at time' functions that gain performance increase on out-of-order CPUs. Performance of 4-way functions should be equal to 1-way functions with in-order CPUs. Summary of the tcrypt benchmarks: Blowfish assembler vs blowfish C (256bit 8kb block ECB) encrypt: 2.2x speed decrypt: 2.3x speed Blowfish assembler vs blowfish C (256bit 8kb block CBC) encrypt: 1.12x speed decrypt: 2.5x speed Blowfish assembler vs blowfish C (256bit 8kb block CTR) encrypt: 2.5x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor stepping : 0 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-09-22crypto: tcrypt - add ctr(blowfish) speed testJussi Kivilinna
Add ctr(blowfish) speed test to receive results for blowfish x86_64 assembly patch. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-09-22crypto: blowfish - rename C-version to blowfish_genericJussi Kivilinna
Rename blowfish to blowfish_generic so that assembler versions of blowfish cipher can autoload. Module alias 'blowfish' is added. Also fix checkpatch warnings. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-09-22crypto: blowfish - split generic and common c codeJussi Kivilinna
Patch splits up the blowfish crypto routine into a common part (key setup) which will be used by blowfish crypto modules (x86_64 assembly and generic-c). Also fixes errors/warnings reported by checkpatch. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-20crypto: cryptd - Use subsys_initcall to prevent races with aesniHerbert Xu
As cryptd is depeneded on by other algorithms such as aesni-intel, it needs to be registered before them. When everything is built as modules, this occurs naturally. However, for this to work when they are built-in, we need to use subsys_initcall in cryptd. Tested-by: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-10crypto: sha1 - SSSE3 based SHA1 implementation for x86-64Mathias Krause
This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX). Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25% faster. Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant, if using the SSE/YMM registers is not possible. With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%. Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU intensive userland applications running. For example, meassuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 MBit/s). Using this algorithm on a IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with a EXFO Packet Blazer FTB-8510: frame size sha1-generic sha1-ssse3 delta 64 byte 37.5 MBit/s 37.5 MBit/s 0.0% 128 byte 56.3 MBit/s 62.5 MBit/s +11.0% 256 byte 87.5 MBit/s 100.0 MBit/s +14.3% 512 byte 131.3 MBit/s 150.0 MBit/s +14.2% 1024 byte 162.5 MBit/s 193.8 MBit/s +19.3% 1280 byte 175.0 MBit/s 212.5 MBit/s +21.4% 1420 byte 175.0 MBit/s 218.7 MBit/s +25.0% 1518 byte 150.0 MBit/s 181.2 MBit/s +20.8% The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make there way through the IPsec tunnel. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-10crypto: sha1 - export sha1_update for reuseMathias Krause
Export the update function as crypto_sha1_update() to not have the need to reimplement the same algorithm for each SHA-1 implementation. This way the generic SHA-1 implementation can be used as fallback for other implementations that fail to run under certain circumstances, like the need for an FPU context while executing in IRQ context. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-06crypto: Move md5_transform to lib/md5.cDavid S. Miller
We are going to use this for TCP/IP sequence number and fragment ID generation. Signed-off-by: David S. Miller <davem@davemloft.net>