From dfb09f9b7ab03fd367740e541a5caf830ed56726 Mon Sep 17 00:00:00 2001 From: Borislav Petkov Date: Fri, 5 Aug 2011 15:15:08 +0200 Subject: x86, amd: Avoid cache aliasing penalties on AMD family 15h This patch provides performance tuning for the "Bulldozer" CPU. With its shared instruction cache there is a chance of generating an excessive number of cache cross-invalidates when running specific workloads on the cores of a compute module. This excessive amount of cross-invalidations can be observed if cache lines backed by shared physical memory alias in bits [14:12] of their virtual addresses, as those bits are used for the index generation. This patch addresses the issue by clearing all the bits in the [14:12] slice of the file mapping's virtual address at generation time, thus forcing those bits the same for all mappings of a single shared library across processes and, in doing so, avoids instruction cache aliases. It also adds the command line option "align_va_addr=(32|64|on|off)" with which virtual address alignment can be enabled for 32-bit or 64-bit x86 individually, or both, or be completely disabled. This change leaves virtual region address allocation on other families and/or vendors unaffected. Signed-off-by: Borislav Petkov Link: http://lkml.kernel.org/r/1312550110-24160-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin --- Documentation/kernel-parameters.txt | 13 +++++++++++++ 1 file changed, 13 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index aa47be71df4c..af73c036b7e6 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -299,6 +299,19 @@ bytes respectively. Such letter suffixes can also be entirely omitted. behaviour to be specified. Bit 0 enables warnings, bit 1 enables fixups, and bit 2 sends a segfault. + align_va_addr= [X86-64] + Align virtual addresses by clearing slice [14:12] when + allocating a VMA at process creation time. This option + gives you up to 3% performance improvement on AMD F15h + machines (where it is enabled by default) for a + CPU-intensive style benchmark, and it can vary highly in + a microbenchmark depending on workload and compiler. + + 1: only for 32-bit processes + 2: only for 64-bit processes + on: enable for both 32- and 64-bit processes + off: disable for both 32- and 64-bit processes + amd_iommu= [HW,X86-84] Pass parameters to the AMD IOMMU driver in the system. Possible values are: -- cgit v1.2.3