summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2013-03-08Merge branch 'tracking-ste-tb-ethernet' into merge-linux-linaro-core-trackingAndrey Konovalov
2013-03-08Merge branch 'tracking-hugetlb' into merge-linux-linaro-core-trackingAndrey Konovalov
2013-03-08Merge branch 'tracking-kvm-arm-v17' into merge-linux-linaro-core-trackingAndrey Konovalov
2013-03-08Merge branch 'tracking-big-LITTLE-MP-master-v15' into ↵Andrey Konovalov
merge-linux-linaro-core-tracking
2013-03-08Merge branch 'tracking-linaro-android-3.8' into merge-linux-linaro-core-trackingAndrey Konovalov
2013-02-21LAVA: two Arndale defconfigs added.tracking-hugetlb-llct-20130417.0tracking-hugetlb-llct-20130415.0tracking-hugetlb-llct-20130411.0tracking-hugetlb-llct-20130408.0tracking-hugetlb-llct-20130404.0tracking-hugetlb-llct-20130402.0tracking-hugetlb-llct-20130324.0tracking-hugetlb-llct-20130321.0tracking-hugetlb-llct-20130318.0tracking-hugetlb-llct-20130315.0tracking-hugetlb-llct-20130314.1tracking-hugetlb-llct-20130314.0tracking-hugetlb-llct-20130311.0tracking-hugetlb-llct-20130308.0newhugewtests-20130221Steve Capper
The defconfigs: arndale_hugenuma_test_lpae_defconfig arndale_hugenuma_test_nonlpae_defconfig Should be used to test the Huge pages and NUMA support for LPAE and non-LPAE kernels.
2013-02-21ARM: mm: Transparent huge page support for non-LPAE systems.Steve Capper
Much of the required code for THP has been implemented in the earlier non-LPAE HugeTLB patch. One more domain bit is used (to store whether or not the THP is splitting). Some THP helper functions are defined; and we have to re-define pmd_page such that it distinguishes between page tables and sections. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-21ARM: mm: Transparent huge page support for LPAE systems.Catalin Marinas
The patch adds support for THP (transparent huge pages) to LPAE systems. When this feature is enabled, the kernel tries to map anonymous pages as 2MB sections where possible. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@arm.com: symbolic constants used, value of PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-21ARM: mm: HugeTLB support for non-LPAE systems.Steve Capper
Based on Bill Carson's HugeTLB patch, with the big difference being in the way PTEs are passed back to the memory manager. Rather than store a "Linux Huge PTE" separately; we make one up on the fly in huge_ptep_get. Also rather than consider 16M supersections, we focus solely on 2x1M sections. To construct a huge PTE on the fly we need additional information (such as the accessed flag and dirty bit) which we choose to store in the domain bits of the short section descriptor. In order to use these domain bits for storage, we need to make ourselves a client for all 16 domains and this is done in head.S. Storing extra information in the domain bits also makes it a lot easier to implement Transparent Huge Pages, and some of the code in pgtable-2level.h is arranged to facilitate THP support in a later patch. Non-LPAE HugeTLB pages are incompatible with the huge page migration code (enabled when CONFIG_MEMORY_FAILURE is selected) as that code dereferences PTEs directly, rather than calling huge_ptep_get and set_huge_pte_at. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-21ARM: mm: HugeTLB support for LPAE systems.Catalin Marinas
This patch adds support for hugetlbfs based on the x86 implementation. It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt for usage). The 64K pages configuration is not supported (section size is 512MB in this case). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@arm.com: symbolic constants replace numbers in places. Split up into multiple files, to simplify future non-LPAE support, removed huge_pmd_share code, as this is very rarely executed]. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-21ARM: mm: Add support for flushing HugeTLB pages.Steve Capper
On ARM we use the __flush_dcache_page function to flush the dcache of pages when needed; usually when the PG_dcache_clean bit is unset and we are setting a PTE. A HugeTLB page is represented as a compound page consisting of an array of pages. Thus to flush the dcache of a HugeTLB page, one must flush more than a single page. This patch modifies __flush_dcache_page such that all constituent pages of a HugeTLB page are flushed. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-21ARM: mm: correct pte_same behaviour for LPAE.Steve Capper
For 3 levels of paging the PTE_EXT_NG bit will be set for user address ptes that are written to a page table but not for ptes created with mk_pte. This can cause some comparison tests made by pte_same to fail spuriously and lead to other problems. To correct this behaviour, we mask off PTE_EXT_NG for any pte that is present before running the comparison. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com>
2013-02-15Merge tag 'stable/for-linus-3.8-rc7-tag-two' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two fixes: - A simple bug-fix for redundant NULL check. - CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS and two reverts: - Revert the PVonHVM kexec. The patch introduces a regression with older hypervisor stacks, such as Xen 4.1." * tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: Revert "xen PVonHVM: use E820_Reserved area for shared_info" Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info" xen: remove redundant NULL check before unregister_and_remove_pcpu(). x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
2013-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller: "A couple small fixes for sparc including some THP brown-paper-bag material: 1) During the merging of all the THP support for various architectures, sparc missed adding a HAVE_ARCH_TRANSPARENT_HUGEPAGE to it's Kconfig, oops. 2) Sparc needs to be mindful of hugepages in get_user_pages_fast(). 3) Fix memory leak in SBUS probe, from Cong Ding. 4) The sunvdc virtual disk client driver has a test of the bitmask of vdisk server supported operations which was off by one bit" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sunvdc: Fix off-by-one in generic_request(). sparc64: Fix get_user_pages_fast() wrt. THP. sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE. sparc: kernel/sbus.c: fix memory leakage
2013-02-15Merge branches 'arm-multi_pmu_v2', 'cpufreq-fixes-v3', ↵tracking-big-LITTLE-MP-master-v15-llct-20130308.0tracking-big-LITTLE-MP-master-v15-llct-20130220.0tracking-big-LITTLE-MP-master-v15-llct-20130219.0tracking-big-LITTLE-MP-master-v15-llct-20130217.0big-LITTLE-MP-master-v15Viresh Kumar
'hw-bkp-v7.1-debug-v2', 'task-placement-v2', 'task-placement-v2-sysfs', 'misc-patches' and 'config-fragments' into big-LITTLE-MP-master-v15 Updates: ------- - Rebased over 3.8-rc7 - Stats: Total distinct patches: 37 - Dropped Patches: Two patches from cpufreq-fixes-v2 (3): causing some failures. - New Patches: One new patch into task-placement-v2 branch from Sudeep Commands used for merge: ----------------------- $ git checkout -b big-LITTLE-MP-master-v15 v3.8-rc7 $ git merge arm-multi_pmu_v2 cpufreq-fixes-v3 hw-bkp-v7.1-debug-v2 task-placement-v2 task-placement-v2-sysfs misc-patches config-fragments
2013-02-15sched: fix arch_get_fast_and_slow_cpus to get logical cpumask correctlySudeep KarkadaNagesha
The patch "sched: Use device-tree to provide fast/slow CPU list for HMP" depends on the ordering of CPU's in the device tree. It breaks to determine the logical mask correctly if the logical mask of the CPUs differ from physical ordering in the device tree. This patch fix the logic by depending on the mpidr in the device tree and mapping that mpidr to the logical cpu. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2013-02-14Revert "xen PVonHVM: use E820_Reserved area for shared_info"Konrad Rzeszutek Wilk
This reverts commit 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f. We are doing this b/c on 32-bit PVonHVM with older hypervisors (Xen 4.1) it ends up bothing up the start_info. This is bad b/c we use it for the time keeping, and the timekeeping code loops forever - as the version field never changes. Olaf says to revert it, so lets do that. Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-14Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info"Konrad Rzeszutek Wilk
This reverts commit a7be94ac8d69c037d08f0fd94b45a593f1d45176. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-13efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameterSatoru Takeuchi
There was a serious problem in samsung-laptop that its platform driver is designed to run under BIOS and running under EFI can cause the machine to become bricked or can cause Machine Check Exceptions. Discussion about this problem: https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557 https://bugzilla.kernel.org/show_bug.cgi?id=47121 The patches to fix this problem: efi: Make 'efi_enabled' a function to query EFI facilities 83e68189745ad931c2afd45d8ee3303929233e7f samsung-laptop: Disable on EFI hardware e0094244e41c4d0c7ad69920681972fc45d8ce34 Unfortunately this problem comes back again if users specify "noefi" option. This parameter clears EFI_BOOT and that driver continues to run even if running under EFI. Refer to the document, this parameter should clear EFI_RUNTIME_SERVICES instead. Documentation/kernel-parameters.txt: =============================================================================== ... noefi [X86] Disable EFI runtime services support. ... =============================================================================== Documentation/x86/x86_64/uefi.txt: =============================================================================== ... - If some or all EFI runtime services don't work, you can try following kernel command line parameters to turn off some or all EFI runtime services. noefi turn off all EFI runtime services ... =============================================================================== Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Link: http://lkml.kernel.org/r/511C2C04.2070108@jp.fujitsu.com Cc: Matt Fleming <matt.fleming@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-13x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.Jan Beulich
This fixes CVE-2013-0228 / XSA-42 Drew Jones while working on CVE-2013-0190 found that that unprivileged guest user in 32bit PV guest can use to crash the > guest with the panic like this: ------------- general protection fault: 0000 [#1] SMP last sysfs file: /sys/devices/vbd-51712/block/xvda/dev Modules linked in: sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan] Pid: 1250, comm: r Not tainted 2.6.32-356.el6.i686 #1 EIP: 0061:[<c0407462>] EFLAGS: 00010086 CPU: 0 EIP is at xen_iret+0x12/0x2b EAX: eb8d0000 EBX: 00000001 ECX: 08049860 EDX: 00000010 ESI: 00000000 EDI: 003d0f00 EBP: b77f8388 ESP: eb8d1fe0 DS: 0000 ES: 007b FS: 0000 GS: 00e0 SS: 0069 Process r (pid: 1250, ti=eb8d0000 task=c2953550 task.ti=eb8d0000) Stack: 00000000 0027f416 00000073 00000206 b77f8364 0000007b 00000000 00000000 Call Trace: Code: c3 8b 44 24 18 81 4c 24 38 00 02 00 00 8d 64 24 30 e9 03 00 00 00 8d 76 00 f7 44 24 08 00 00 02 80 75 33 50 b8 00 e0 ff ff 21 e0 <8b> 40 10 8b 04 85 a0 f6 ab c0 8b 80 0c b0 b3 c0 f6 44 24 0d 02 EIP: [<c0407462>] xen_iret+0x12/0x2b SS:ESP 0069:eb8d1fe0 general protection fault: 0000 [#2] ---[ end trace ab0d29a492dcd330 ]--- Kernel panic - not syncing: Fatal exception Pid: 1250, comm: r Tainted: G D --------------- 2.6.32-356.el6.i686 #1 Call Trace: [<c08476df>] ? panic+0x6e/0x122 [<c084b63c>] ? oops_end+0xbc/0xd0 [<c084b260>] ? do_general_protection+0x0/0x210 [<c084a9b7>] ? error_code+0x73/ ------------- Petr says: " I've analysed the bug and I think that xen_iret() cannot cope with mangled DS, in this case zeroed out (null selector/descriptor) by either xen_failsafe_callback() or RESTORE_REGS because the corresponding LDT entry was invalidated by the reproducer. " Jan took a look at the preliminary patch and came up a fix that solves this problem: "This code gets called after all registers other than those handled by IRET got already restored, hence a null selector in %ds or a non-null one that got loaded from a code or read-only data descriptor would cause a kernel mode fault (with the potential of crashing the kernel as a whole, if panic_on_oops is set)." The way to fix this is to realize that the we can only relay on the registers that IRET restores. The two that are guaranteed are the %cs and %ss as they are always fixed GDT selectors. Also they are inaccessible from user mode - so they cannot be altered. This is the approach taken in this patch. Another alternative option suggested by Jan would be to relay on the subtle realization that using the %ebp or %esp relative references uses the %ss segment. In which case we could switch from using %eax to %ebp and would not need the %ss over-rides. That would also require one extra instruction to compensate for the one place where the register is used as scaled index. However Andrew pointed out that is too subtle and if further work was to be done in this code-path it could escape folks attention and lead to accidents. Reviewed-by: Petr Matousek <pmatouse@redhat.com> Reported-by: Petr Matousek <pmatouse@redhat.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-13sparc64: Fix get_user_pages_fast() wrt. THP.David S. Miller
Mostly mirrors the s390 logic, as unlike x86 we don't need the SetPageReferenced() bits. On sparc64 we also lack a user/privileged bit in the huge PMDs. In order to make this work for THP and non-THP builds, some header file adjustments were necessary. Namely, provide the PMD_HUGE_* bit defines and the pmd_large() inline unconditionally rather than protected by TRANSPARENT_HUGEPAGE. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE.David S. Miller
This got missed in the cleanups done for the S390 THP support. CC: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-13Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "One (hopefully) last batch of x86 fixes. You asked for the patch by patch justifications, so here they are: x86, MCE: Retract most UAPI exports This one unexports from userspace a bunch of definitions which should never have been exported. We really don't want to create an accidental legacy here. x86, doc: Add a bootloader ID for OVMF This is a documentation-only patch, just recording the official assignment of a boot loader ID. x86: Do not leak kernel page mapping locations Security: avoid making it needlessly easy for user space to probe the kernel memory layout. x86/mm: Check if PUD is large when validating a kernel address Prevent failures using /proc/kcore when using 1G pages. x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems Works around a BIOS problem causing boot failures on affected hardware." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Check if PUD is large when validating a kernel address x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systems x86, doc: Add a bootloader ID for OVMF x86: Do not leak kernel page mapping locations x86, MCE: Retract most UAPI exports
2013-02-13x86/mm: Check if PUD is large when validating a kernel addressMel Gorman
A user reported the following oops when a backup process reads /proc/kcore: BUG: unable to handle kernel paging request at ffffbb00ff33b000 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110 [...] Call Trace: [<ffffffff811b8aaa>] read_kcore+0x17a/0x370 [<ffffffff811ad847>] proc_reg_read+0x77/0xc0 [<ffffffff81151687>] vfs_read+0xc7/0x130 [<ffffffff811517f3>] sys_read+0x53/0xa0 [<ffffffff81449692>] system_call_fastpath+0x16/0x1b Investigation determined that the bug triggered when reading system RAM at the 4G mark. On this system, that was the first address using 1G pages for the virt->phys direct mapping so the PUD is pointing to a physical address, not a PMD page. The problem is that the page table walker in kern_addr_valid() is not checking pud_large() and treats the physical address as if it was a PMD. If it happens to look like pmd_none then it'll silently fail, probably returning zeros instead of real data. If the data happens to look like a present PMD though, it will be walked resulting in the oops above. This patch adds the necessary pud_large() check. Unfortunately the problem was not readily reproducible and now they are running the backup program without accessing /proc/kcore so the patch has not been validated but I think it makes sense. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.coM> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: stable@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into akpm Pull s390 regression fix from Martin Schwidefsky: "The recent fix for the s390 sched_clock() function uncovered yet another bug in s390_next_ktime which causes an endless loop in KVM. This regression should be fixed before v3.8. I keep the fingers crossed that this is the last one for v3.8." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/timer: avoid overflow when programming clock comparator
2013-02-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu into akpm Pull m68knommu fix from Greg Ungerer: "This contains a single critical fix for the non-MMU m68k platforms. The change of the kernel exec code path has revealed a problem in the start thread code that causes crashing on boot. This is the fix for it." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: fix trap on execing /bin/init
2013-02-12Merge branch 'stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile into akpm Pull tile bugfixes from Chris Metcalf: "This includes a variety of minor bug fixes, mostly to do with testing "make allyesconfig", "make allmodconfig", "make allnoconfig", inspired to Tejun Heo's observation about Kconfig.freezer not being included. The largest changes are just syntax changes removing the tile-specific use of a macro named INT_MASK, which is way too commonly redefined throughout driver code" * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: tag some code with #ifdef CONFIG_COMPAT tile: fix memcpy_*io functions for allnoconfig tile: export a handful of symbols appropriately drm: fix compile failure by including <linux/swiotlb.h> tile: avoid defining INT_MASK macro in <arch/interrupts.h> tile: provide "screen_info" when enabling VT drivers/input/joystick/analog.c: enable precise timer tile: include kernel/Kconfig.freezer in tile Kconfig tile: remove an unused variable in copy_thread()
2013-02-11x86/apic: Work around boot failure on HP ProLiant DL980 G7 Server systemsStoney Wang
When a HP ProLiant DL980 G7 Server boots a regular kernel, there will be intermittent lost interrupts which could result in a hang or (in extreme cases) data loss. The reason is that this system only supports x2apic physical mode, while the kernel boots with a logical-cluster default setting. This bug can be worked around by specifying the "x2apic_phys" or "nox2apic" boot option, but we want to handle this system without requiring manual workarounds. The BIOS sets ACPI_FADT_APIC_PHYSICAL in FADT table. As all apicids are smaller than 255, BIOS need to pass the control to the OS with xapic mode, according to x2apic-spec, chapter 2.9. Current code handle x2apic when BIOS pass with xapic mode enabled: When user specifies x2apic_phys, or FADT indicates PHYSICAL: 1. During madt oem check, apic driver is set with xapic logical or xapic phys driver at first. 2. enable_IR_x2apic() will enable x2apic_mode. 3. if user specifies x2apic_phys on the boot line, x2apic_phys_probe() will install the correct x2apic phys driver and use x2apic phys mode. Otherwise it will skip the driver will let x2apic_cluster_probe to take over to install x2apic cluster driver (wrong one) even though FADT indicates PHYSICAL, because x2apic_phys_probe does not check FADT PHYSICAL. Add checking x2apic_fadt_phys in x2apic_phys_probe() to fix the problem. Signed-off-by: Stoney Wang <song-bo.wang@hp.com> [ updated the changelog and simplified the code ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1360263182-16226-1-git-send-email-yinghai@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-11ARM: Experimental Frequency-Invariant Load Scaling PatchChris Redpath
Evaluation Patch to investigate using load as a representation of the amount of POTENTIAL cpu compute capacity used rather than a representation of the CURRENT cpu compute capacity. If CPUFreq is enabled, scales load in accordance with frequency. Powersave/performance CPUFreq governors are detected and scaling is disabled while these governors are in use. This is because when a single-frequency governor is in use, potential CPU capacity is static. So long as the governors and CPUFreq subsystem correctly report the frequencies available, the scaling should self tune. Adds an additional file to sysfs to allow this feature to be disabled for experimentation. /sys/kernel/hmp/frequency_invariant_load_scale write 0 to disable, 1 to enable. Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2013-02-11ARM: Change load tracking scale using sysfsOlivier Cozette
These functions allow to change the load average period used in the task load average computation through /sys/kernel/hmp/load_avg_period_ms. This period is the time in ms to go from 0 to 0.5 load average while running or the time from 1 to 0.5 while sleeping. The default one used is 32 and gives the same load_avg_ratio computation than without this patch. These functions also allow to change the up and down threshold of HMP using /sys/kernel/hmp/{up,down}_threshold. Both must be between 0 and 1024. The thresholds are divided by 1024 before being compared to the load_avg_ratio. If /sys/kernel/hmp/load_avg_period_ms is 128 and /sys/kernel/hmp/up_threshold is 512, a task will be migrated to a bigger cluster after running for 128ms. Because after load_avg_period_ms the load average is 0.5 and real up_threshold us 512 / 1024 = 0.5. Signed-off-by: Olivier Cozette <olivier.cozette@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com>
2013-02-11ARM: sched: Avoid empty 'slow' HMP domainJon Medhurst
On homogeneous (non-heterogeneous) systems all CPUs will be declared 'fast' and the slow cpu list will be empty. In this situation we need to avoid adding an empty slow HMP domain otherwise the scheduler code will blow up when it attempts to move a task to the slow domain. Signed-off-by: Jon Medhurst <tixy@linaro.org>
2013-02-11sched: Enable HMP priority filter by defaultMorten Rasmussen
This updates the ARM Kconfig to enable the HMP priority filter by default. Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
2013-02-11ARM: sched: Setup SCHED_HMP domainsMorten Rasmussen
SCHED_HMP requires the different cpu types to be represented by an ordered list of hmp_domains. Each hmp_domain represents all cpus of a particular type using a cpumask. The list is platform specific and therefore must be generated by platform code by implementing arch_get_hmp_domains(). Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
2013-02-11ARM: sched: Use device-tree to provide fast/slow CPU list for HMPMorten Rasmussen
We can't rely on Kconfig options to set the fast and slow CPU lists for HMP scheduling if we want a single kernel binary to support multiple devices with different CPU topology. E.g. TC2 (ARM's Test-Chip-2 big.LITTLE system), Fast Models, or even non big.LITTLE devices. This patch adds the function arch_get_fast_and_slow_cpus() to generate the lists at run-time by parsing the CPU nodes in device-tree; it assumes slow cores are A7s and everything else is fast. The function still supports the old Kconfig options as this is useful for testing the HMP scheduler on devices without big.LITTLE. This patch is reuse of a patch by Jon Medhurst <tixy@linaro.org> with a few bits left out. Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
2013-02-11ARM: Add HMP scheduling support for ARM architectureMorten Rasmussen
Adds Kconfig entries to enable HMP scheduling on ARM platforms. Currently, it disables CPU level sched_domain load-balacing in order to simplify things. This needs fixing in a later revision. HMP scheduling will do the load-balancing at this level instead. Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
2013-02-11sched: Introduce priority-based task migration filterMorten Rasmussen
Introduces a priority threshold which prevents low priority task from migrating to faster hmp_domains (cpus). This is useful for user-space software which assigns lower task priority to background task. Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
2013-02-11sched: Task placement for heterogeneous systems based on task load-trackingMorten Rasmussen
This patch introduces the basic SCHED_HMP infrastructure. Each class of cpus is represented by a hmp_domain and tasks will only be moved between these domains when their load profiles suggest it is beneficial. SCHED_HMP relies heavily on the task load-tracking introduced in Paul Turners fair group scheduling patch set: <https://lkml.org/lkml/2012/8/23/267> SCHED_HMP requires that the platform implements arch_get_hmp_domains() which should set up the platform specific list of hmp_domains. It is also assumed that the platform disables SD_LOAD_BALANCE for the appropriate sched_domains. Tasks placement takes place every time a task is to be inserted into a runqueue based on its load history. The task placement decision is based on load thresholds. There are no restrictions on the number of hmp_domains, however, multiple (>2) has not been tested and the up/down migration policy is rather simple. Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
2013-02-11ARM: hw_breakpoint: Debug powerdown support for self-hosted debugDietmar Eggemann
This patch introduces debug powerdown support for self-hosted debug for v7 and v7.1 debug architecture for a SinglePower system, i.e. a system without a separate core and debug power domain. On a SinglePower system the OS Lock is lost over a powerdown. If CONFIG_CPU_PM is set the new function pm_init() registers hw_breakpoint with CPU PM for a system supporting OS Save and Restore. Receiving a CPU PM EXIT notifier indicates that a single CPU has exited a low power state. A call to reset_ctrl_regs() is hooked into the CPU PM EXIT notifier chain. This function makes sure that the sticky power-down is clear (only v7 debug), the OS Double Lock is clear (only v7.1 debug) and it clears the OS Lock for v7 debug (for a system supporting OS Save and Restore) and v7.1 debug. Furthermore, it clears any vector-catch events and all breakpoint/watchpoint control/value registers for v7 and v7.1 debug. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-02-11ARM: hw_breakpoint: Check function for OS Save and Restore mechanismDietmar Eggemann
v7 debug introduced OS Save and Restore mechanism. On a v7 debug SinglePower system, i.e a system without a separate core and debug power domain, which does not support external debug over powerdown, it is implementation defined whether OS Save and Restore is implemented. v7.1 debug requires OS Save and Restore mechanism. v6 debug and v6.1 debug do not implement it. A new global variable bool has_ossr is introduced and is determined in arch_hw_breakpoint_init() like debug_arch or the number of BRPs/WRPs. The logic how to check if OS Save and Restore is supported has changed with this patch. In reset_ctrl_regs() a mask consisting of OSLM[1] (OSLSR.3) and OSLM[0] (OSLSR.0) was used to check if the system supports OS Save and Restore. In the new function core_has_os_save_restore() only OSLM[0] is used. It is not necessary to check OSLM[1] too since it is v7.1 debug specific and v7.1 debug requires OS Save and Restore and thus OS Lock. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-02-11ARM: perf: save/restore pmu registers in pm notifierSudeep KarkadaNagesha
This adds core support for saving and restoring CPU PMU registers for suspend/resume support i.e. deeper C-states in cpuidle terms. This patch adds support only to ARMv7 PMU registers save/restore. It needs to be extended to xscale and ARMv6 if needed. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-02-11ARM: perf: remove spaces in CPU PMU namesSudeep KarkadaNagesha
The userspace perf tool provides options to specify PMU names from command line for the event. An example of pmu event syntax would be (<pmu_name>/<config>/<modifier>) However the parser in the perf tool breaks the tokens at spacesand fails to identify the PMU name with spaces correctly. This patch removes spaces in the ARMv7 CPU PMU names. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-02-11ARM: perf: set cpu affinity for the irqs correctlySudeep KarkadaNagesha
This patch sets the cpu affinity for the perf IRQs in the logical order within the cluster. However interupts are assumed to be specified in the same logical order within the cluster. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-02-11ARM: perf: set cpu affinity to support multiple PMUsSudeep KarkadaNagesha
In a system with multiple heterogeneous CPU PMUs and each PMUs can handle events on a subset of CPUs, probably belonging a the same cluster. This patch introduces a cpumask to track which CPUs each PMU supports. It also updates armpmu_event_init to reject cpu-specific events being initialised for unsupported CPUs. Since process-specific events can be initialised for all the CPU PMUs,armpmu_start/stop/add are modified to prevent from being added on unsupported CPUs. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-02-11ARM: perf: register CPU PMUs with idr typesSudeep KarkadaNagesha
In order to support multiple, heterogeneous CPU PMUs and distinguish them, they cannot be registered as PERF_TYPE_RAW type. Instead we can get perf core to allocate a new idr type id for each PMU. Userspace applications can refer sysfs entried to find a PMU's type, which can then be used in tracking events on individual PMUs. Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
2013-02-11ARM: perf: replace global CPU PMU pointer with per-cpu pointersSudeep KarkadaNagesha
A single global CPU PMU pointer is not useful in a system with multiple, heterogeneous CPU PMUs as we need to access the relevant PMU depending on the current CPU. This patch replaces the single global CPU PMU pointer with per-cpu pointers and changes the OProfile accessors to refer to the PMU affine to CPU0. Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-02-11ARM: kernel: provide cluster to logical cpu mask mapping APILorenzo Pieralisi
Some device drivers like PMU require to retrieve the logical cpu mask that corresponds to a given cluster id. This patch provides a hook in the topology code that, given an existing cluster id as input, initializes the corresponding cpumask passed as a pointer, reusing all existing topology information required by sched domains in the kernel. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2013-02-09Merge tag 'highbank-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux ↵Olof Johansson
into fixes From Rob Herring: highbank fixes for 3.8 -Compile fix for !SMP -More cpu cluster id related fixes * tag 'highbank-fixes-for-3.8' of git://sources.calxeda.com/kernel/linux: ARM: highbank: mask cluster id from cpu_logical_map ARM: scu: mask cluster id from cpu_logical_map ARM: scu: add empty scu_enable for !CONFIG_SMP
2013-02-09Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "I was going to hold these off until v3.8 was out, and send them with a stable tag, but as everyone else is pushing much bigger fixes which Linus is accepting, let's save people from the hastle of having to patch v3.8 back into working or use a stable kernel. Looking at the diffstat, this really is high value for its size; this is miniscule compared to how the -rc6 to tip diffstat currently looks. So, four patches in this set: - Punit Agrawal reports that the kernel no longer boots on MPCore due to a new assumption made in the GIC code which isn't true of earlier GIC designs. This is the biggest change in this set. - Punit's boot log also revealed a bunch of WARN_ON() dumps caused by the DT-ification of the GIC support without fixing up non-DT Realview - which now sees a greater number of interrupts than it did before. - A fix for the DMA coherent code from Marek which uses the wrong check for atomic allocations; this can result in spinlock lockups or other nasty effects. - A fix from Will, which will affect all Android based platforms if not applied (which use the 2G:2G VM split) - this causes particularly 'make' to misbehave unless this bug is fixed." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7641/1: memory: fix broken mmap by ensuring TASK_UNMAPPED_BASE is aligned ARM: DMA mapping: fix bad atomic test ARM: realview: ensure that we have sufficient IRQs available ARM: GIC: fix GIC cpumask initialization
2013-02-08tile: tag some code with #ifdef CONFIG_COMPATChris Metcalf
This allows us to disable COMPAT mode without a link error. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-02-08tile: fix memcpy_*io functions for allnoconfigChris Metcalf
On tilepro without CONFIG_PCI, we can't provide inlines of these functions, as we don't have readl/writel. In addition, fix memset_io() signature to take a volatile void *. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>