aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel/head_64.S
AgeCommit message (Collapse)Author
2013-04-26powerpc: Add isync to copy_and_flushMichael Neuling
In __after_prom_start we copy the kernel down to zero in two calls to copy_and_flush. After the first call (copy from 0 to copy_to_here:) we jump to the newly copied code soon after. Unfortunately there's no isync between the copy of this code and the jump to it. Hence it's possible that stale instructions could still be in the icache or pipeline before we branch to it. We've seen this on real machines and it's results in no console output after: calling quiesce... returning from prom_init The below adds an isync to ensure that the copy and flushing has completed before any branching to the new instructions occurs. Signed-off-by: Michael Neuling <mikey@neuling.org> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc/kexec: Add kexec "hold" support for Book3e processorsJimi Xenidis
Motivation: IBM Blue Gene/Q comes with some very strange firmware that I'm trying to get out of using in the kernel. So instead I spin all the threads in the boot wrapper (using the firmware) and have them enter the kexec stub, pre-translated at the virtual "linear" address, never touching firmware again. This works strategy works wonderfully, but I need the following patch in the kexec stub. I believe it should not effect Book3S and Book3E does not appear to be here yet so I'd love to get any criticisms up front. This patch adds two items: 1) Book3e requires that GPR4 survive the "hold" process, so we make sure that happens. 2) Book3e has no real mode, and the hold code exploits this. Since these processors ares always translated, we arrange for the kexeced threads to enter the hold code using the normal kernel linear mapping. Signed-off-by: Jimi Xenidis <jimix@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Build kernel with -mcmodel=mediumAnton Blanchard
Finally remove the two level TOC and build with -mcmodel=medium. Unfortunately we can't build modules with -mcmodel=medium due to the tricks the kernel module loader plays with percpu data: # -mcmodel=medium breaks modules because it uses 32bit offsets from # the TOC pointer to create pointers where possible. Pointers into the # percpu data area are created by this method. # # The kernel module loader relocates the percpu data section from the # original location (starting with 0xd...) to somewhere in the base # kernel percpu data space (starting with 0xc...). We need a full # 64bit relocation for this to work, hence -mcmodel=large. On older kernels we fall back to the two level TOC (-mminimal-toc) Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15powerpc: Add relocation on exception vector handlersMichael Neuling
POWER8/v2.07 allows exceptions to be taken with the MMU still on. A new set of exception vectors is added at 0xc000_0000_0000_4xxx. When the HW takes us here, MSR IR/DR will be set already and we no longer need a costly RFID to turn the MMU back on again. The original 0x0 based exception vectors remain for when the HW can't leave the MMU on. Examples of this are when we can't trust the current MMU mappings, like when we are changing from guest to hypervisor (HV 0 -> 1) or when the MMU was off already. In these cases the HW will take us to the original 0x0 based exception vectors with the MMU off as before. This uses the new macros added previously too implement these new execption vectors at 0xc000_0000_0000_4xxx. We exit these exception vectors using mflr/blr (rather than mtspr SSR0/RFID), since we don't need the costly MMU switch anymore. This moves the __end_interrupts marker down past these new 0x4000 vectors since they will need to be copied down to 0x0 when the kernel is not at 0x0. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15powerpc: Fix CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n buildAnton Blanchard
If we build a kernel with CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n, the kernel fails when we run at a non zero offset. It turns out we were incorrectly wrapping some of the relocatable kernel code with CONFIG_CRASH_DUMP. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15powerpc/powernv: Fix OPAL debug entryBenjamin Herrenschmidt
OPAL provides the firmware base/entry in registers at boot time for debugging purposes. We had a bug in the code trying to stash these into the appropriate kernel globals (a line of code was probably dropped by accident back when this was merged) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09powerpc: Rework lazy-interrupt handlingBenjamin Herrenschmidt
The current implementation of lazy interrupts handling has some issues that this tries to address. We don't do the various workarounds we need to do when re-enabling interrupts in some cases such as when returning from an interrupt and thus we may still lose or get delayed decrementer or doorbell interrupts. The current scheme also makes it much harder to handle the external "edge" interrupts provided by some BookE processors when using the EPR facility (External Proxy) and the Freescale Hypervisor. Additionally, we tend to keep interrupts hard disabled in a number of cases, such as decrementer interrupts, external interrupts, or when a masked decrementer interrupt is pending. This is sub-optimal. This is an attempt at fixing it all in one go by reworking the way we do the lazy interrupt disabling from the ground up. The base idea is to replace the "hard_enabled" field with a "irq_happened" field in which we store a bit mask of what interrupt occurred while soft-disabled. When re-enabling, either via arch_local_irq_restore() or when returning from an interrupt, we can now decide what to do by testing bits in that field. We then implement replaying of the missed interrupts either by re-using the existing exception frame (in exception exit case) or via the creation of a new one from an assembly trampoline (in the arch_local_irq_enable case). This removes the need to play with the decrementer to try to create fake interrupts, among others. In addition, this adds a few refinements: - We no longer hard disable decrementer interrupts that occur while soft-disabled. We now simply bump the decrementer back to max (on BookS) or leave it stopped (on BookE) and continue with hard interrupts enabled, which means that we'll potentially get better sample quality from performance monitor interrupts. - Timer, decrementer and doorbell interrupts now hard-enable shortly after removing the source of the interrupt, which means they no longer run entirely hard disabled. Again, this will improve perf sample quality. - On Book3E 64-bit, we now make the performance monitor interrupt act as an NMI like Book3S (the necessary C code for that to work appear to already be present in the FSL perf code, notably calling nmi_enter instead of irq_enter). (This also fixes a bug where BookE perfmon interrupts could clobber r14 ... oops) - We could make "masked" decrementer interrupts act as NMIs when doing timer-based perf sampling to improve the sample quality. Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- v2: - Add hard-enable to decrementer, timer and doorbells - Fix CR clobber in masked irq handling on BookE - Make embedded perf interrupt act as an NMI - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want to retrigger an interrupt without preventing hard-enable v3: - Fix or vs. ori bug on Book3E - Fix enabling of interrupts for some exceptions on Book3E v4: - Fix resend of doorbells on return from interrupt on Book3E v5: - Rebased on top of my latest series, which involves some significant rework of some aspects of the patch. v6: - 32-bit compile fix - more compile fixes with various .config combos - factor out the asm code to soft-disable interrupts - remove the C wrapper around preempt_schedule_irq v7: - Fix a bug with hard irq state tracking on native power7
2012-03-09powerpc: Remove legacy iSeries bits from assembly filesBenjamin Herrenschmidt
This removes the various bits of assembly in the kernel entry, exception handling and SLB management code that were specific to running under the legacy iSeries hypervisor which is no longer supported. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20powerpc/powernv: Support for OPAL consoleBenjamin Herrenschmidt
This adds a udbg and an hvc console backend for supporting a console using the OPAL console interfaces. On OPAL v1 we have hvc0 mapped to whatever console the system was configured for (network or hvsi serial port) via the service processor. On OPAL v2 we have hvcN mapped to the Nth console provided by OPAL which generally corresponds to: hvc0 : network console (raw protocol) hvc1 : serial port S1 (hvsi) hvc2 : serial port S2 (hvsi) Note: At this point, early debug console only works with OPAL v1 and shouldn't be enabled in a normal kernel. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20powerpc/powernv: Add OPAL takeover from PowerVMBenjamin Herrenschmidt
On machines supporting the OPAL firmware version 1, the system is initially booted under pHyp. We then use a special hypercall to verify if OPAL is available and if it is, we then trigger a "takeover" which disables pHyp and loads the OPAL runtime firmware, giving control to the kernel in hypervisor mode. This patch add the necessary code to detect that the OPAL takeover capability is present when running under PowerVM (aka pHyp) and perform said takeover to get hypervisor control of the processor. To perform the takeover, we must first use RTAS (within Open Firmware runtime environment) to start all processors & threads, in order to give control to OPAL on all of them. We then call the takeover hypercall on everybody, OPAL will re-enter the kernel main entry point passing it a flat device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20powerpc/powernv: Don't clobber r9 in relative_toc()Benjamin Herrenschmidt
With OPAL, r8 and r9 will be used to pass the OPAL base and entry for debugging purposes (those informations are also in the device-tree). We don't want to clobber those registers that early. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-17powerpc: Fix early boot accounting of CPUsMatt Evans
smp_release_cpus() waits for all cpus (including the bootcpu) due to an off-by-one count on boot_cpu_count (which is all CPUs). This patch replaces that with spinning_secondaries (which is all secondary CPUs). Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19powerpc: Don't search for paca in freed memoryMilton Miller
Starting with 1426d5a3bd07589534286375998c0c8c6fdc5260 (powerpc: Dynamically allocate pacas) we free the memory for pacas beyond cpu_possible, but we failed to update the loop the secondary cpus use to find their paca. If the system has running cpu threads for which the kernel did not allocate a paca for they will search the memory that was freed. For instance this could happen when the device tree for a kdump kernel was not updated after a cpu hotplug, or the kernel is running with more cpus than the kernel was configured. Since c1854e00727f50f7ac99e98d26ece04c087ef785 (powerpc: Set nr_cpu_ids early and use it to free PACAs) we set nr_cpu_ids before telling the cpus to advance, so use that to limit the search. We can't reference nr_cpu_ids without CONFIG_SMP because it is defined as 1 instead of a memory location, but any extra threads should be sent to kexec_wait in that case anyways, so make that explicit and remove the search loop for UP. Note to stable: The fix also requires c1854e00727f50f7ac99e98d26ece04c087ef785 (powerpc: Set nr_cpu_ids early and use it to free PACAs) to function. Also 9d07bc841c9779b4d7902e417f4e509996ce805d (Properly handshake CPUs going out of boot spin loop) affects the second chunk, specifically the branch target was 3b before and is 4b after that patch, and there was a blank line before the #ifdef CONFIG_SMP that was removed Cc: <stable@kernel.org> # .34.x: c1854e0072 powerpc: Set nr_cpu_ids early Cc: <stable@kernel.org> # .34.x Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27powerpc: Use MSR_64BIT in placesMichael Ellerman
Use the new MSR_64BIT in a few places. Some of these are already ifdef'ed for BOOKE vs BOOKS, but it's still clearer, MSR_SF does not immediately parse as "MSR bit for 64bit". Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc: Perform an isync to synchronize CPUs coming out of secondary_holdBenjamin Herrenschmidt
We need to do that to guarantee they see any code change done by dynamic patching during boot. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc: Properly handshake CPUs going out of boot spin loopBenjamin Herrenschmidt
We need to wait a bit for them to have done their CPU setup or we might end up with translation and EE on with different LPCR values between threads Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc: Call CPU ->restore callback earlier on secondary CPUsBenjamin Herrenschmidt
We do it before we loop on the PACA start flag. This way, we get a chance to set critical SPRs on all CPUs before Linux tries to start them up, which avoids problems when changing some bits such as LPCR bits that need to be identical on all threads of a core or similar things like that. Ideally, some of that should also be done before the MMU is enabled, but that's a separate issue which would require moving some of the SMP startup code earlier, let's not get there for now, it works with that change alone. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc: In HV mode, use HSPRG0 for PACABenjamin Herrenschmidt
When running in Hypervisor mode (arch 2.06 or later), we store the PACA in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be lost during a "nap" power management operation (though they aren't currently on POWER7) and this enables use of SPRG1 by KVM guests. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-07Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
2011-04-01powerpc/pmac/smp: Properly NAP offlined CPU on G5Benjamin Herrenschmidt
The current code soft-disables, and then goes to NAP mode which turns interrupts on. That means that if an interrupt occurs, we will hit the masked interrupt code path which isn't what we want, as it will return with EE off, which will either get us out of NAP mode, or fail to enter it (according to spec). Instead, let's just rely on the fact that it is safe to take decrementer interrupts on an offline CPU and leave interrupts enabled. We can also get rid of the special case in asm for power4_cpu_offline_powersave() and just use power4_idle(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-31Fix common misspellingsLucas De Marchi
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2010-12-09Powerpc: separate CONFIG_RELOCATABLE from CONFIG_CRASHDUMP in boot codeSonny Rao
Fix head_64.S so that we can build a relocatable kernel that isn't necessarily a crash-dump kernel Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29powerpc: Remove second definition of STACK_FRAME_OVERHEADStephen Rothwell
Since STACK_FRAME_OVERHEAD is defined in asm/ptrace.h and that is ASSEMBER safe, we can just include that instead of going via asm-offsets.h. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-24KVM: PPC: Move KVM trampolines before __end_interruptsAlexander Graf
When using a relocatable kernel we need to make sure that the trampline code and the interrupt handlers are both copied to low memory. The only way to do this reliably is to put them in the copied section. This patch should make relocated kernels work with KVM. KVM-Stable-Tag Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-08-31powerpc: Don't use kernel stack with translation offMichael Neuling
In f761622e59433130bc33ad086ce219feee9eb961 we changed early_setup_secondary so it's called using the proper kernel stack rather than the emergency one. Unfortunately, this stack pointer can't be used when translation is off on PHYP as this stack pointer might be outside the RMO. This results in the following on all non zero cpus: cpu 0x1: Vector: 300 (Data Access) at [c00000001639fd10] pc: 000000000001c50c lr: 000000000000821c sp: c00000001639ff90 msr: 8000000000001000 dar: c00000001639ffa0 dsisr: 42000000 current = 0xc000000016393540 paca = 0xc000000006e00200 pid = 0, comm = swapper The original patch was only tested on bare metal system, so it never caught this problem. This changes __secondary_start so that we calculate the new stack pointer but only start using it after we've called early_setup_secondary. With this patch, the above problem goes away. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Initialise paca->kstack before early_setup_secondaryMatt Evans
As early setup calls down to slb_initialize(), we must have kstack initialised before checking "should we add a bolted SLB entry for our kstack?" Failing to do so means stack access requires an SLB miss exception to refill an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text & static data). It's not always allowable to take such a miss, and intermittent crashes will result. Primary CPUs don't have this issue; an SLB entry is not bolted for their stack anyway (as that lives within SLB(0)). This patch therefore only affects the init of secondaries. Signed-off-by: Matt Evans <matt@ozlabs.org> Cc: stable <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-17KVM: PPC: Name generic 64-bit code genericAlexander Graf
We have quite some code that can be used by Book3S_32 and Book3S_64 alike, so let's call it "Book3S" instead of "Book3S_64", so we can later on use it from the 32 bit port too. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-09powerpc: Reset kernel stack on cpu online from cede stateVaidyanathan Srinivasan
Cpu hotplug (offline) without dlpar operation will place cpu in cede state and the extended_cede_processor() function will return when resumed. Kernel stack pointer needs to be reset before start_secondary() is called to continue the online operation. Added new function start_secondary_resume() to do the above steps. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09powerpc: Dynamically allocate pacasMichael Ellerman
On 64-bit kernels we currently have a 512 byte struct paca_struct for each cpu (usually just called "the paca"). Currently they are statically allocated, which means a kernel built for a large number of cpus will waste a lot of space if it's booted on a machine with few cpus. We can avoid that by only allocating the number of pacas we need at boot. However this is complicated by the fact that we need to access the paca before we know how many cpus there are in the system. The solution is to dynamically allocate enough space for NR_CPUS pacas, but then later in boot when we know how many cpus we have, we free any unused pacas. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05Make head_64.S aware of KVM real mode codeAlexander Graf
We need to run some KVM trampoline code in real mode. Unfortunately, real mode only covers 8MB on Cell so we need to squeeze ourselves as low as possible. Also, we need to trap interrupts to get us back from guest state to host state without telling Linux about it. This patch adds interrupt traps and includes the KVM code that requires real mode in the real mode parts of Linux. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20powerpc: Remaining 64-bit Book3E supportBenjamin Herrenschmidt
This contains all the bits that didn't fit in previous patches :-) This includes the actual exception handlers assembly, the changes to the kernel entry, other misc bits and wiring it all up in Kconfig. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20powerpc: Use names rather than numbers for SPRGs (v2)Benjamin Herrenschmidt
The kernel uses SPRG registers for various purposes, typically in low level assembly code as scratch registers or to hold per-cpu global infos such as the PACA or the current thread_info pointer. We want to be able to easily shuffle the usage of those registers as some implementations have specific constraints realted to some of them, for example, some have userspace readable aliases, etc.. and the current choice isn't always the best. This patch should not change any code generation, and replaces the usage of SPRN_SPRGn everywhere in the kernel with a named replacement and adds documentation next to the definition of the names as to what those are used for on each processor family. The only parts that still use the original numbers are bits of KVM or suspend/resume code that just blindly needs to save/restore all the SPRGs. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20powerpc: Rename exception.h to exception-64s.hBenjamin Herrenschmidt
The file include/asm/exception.h contains definitions that are specific to exception handling on 64-bit server type processors. This renames the file to exception-64s.h to reflect that fact and avoid confusion. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09powerpc: Split exception handling out of head_64.SBenjamin Herrenschmidt
To prepare for future support of Book3E 64-bit PowerPC processors, which use a completely different exception handling, we move that code to a new exceptions-64s.S file. This file is #included from head_64.S due to some of the absolute address requirements which can currently only be fulfilled from within that file. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09powerpc: Move VMX and VSX asm code to vector.SBenjamin Herrenschmidt
Currently, load_up_altivec and give_up_altivec are duplicated in 32-bit and 64-bit. This creates a common implementation that is moved away from head_32.S, head_64.S and misc_64.S and into vector.S, using the same macros we already use for our common implementation of load_up_fpu. I also moved the VSX code over to vector.S though in that case I didn't make it build on 32-bit (yet). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11powerpc/kconfig: Kill PPC_MULTIPLATFORMBenjamin Herrenschmidt
CONFIG_PPC_MULTIPLATFORM is a remain of the pre-powerpc days and isn't really meaningful anymore. It was basically equivalent to PPC64 || 6xx. This removes it along with the following changes: - 32-bit platforms that relied on PPC32 && PPC_MULTIPLATFORM now rely on 6xx which is what they want anyway. - A new symbol, PPC_BOOK3S, is defined that represent compliance with the "Server" variant of the architecture. This is set when either 6xx or PPC64 is set and open the door for future BOOK3E 64-bit. - 64-bit platforms that relied on PPC64 && PPC_MULTIPLATFORM now use PPC64 && PPC_BOOK3S - A separate and selectable CONFIG_PPC_OF_BOOT_TRAMPOLINE option is now used to control the use of prom_init.c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13powerpc/powermac: Fix occasional SMP boot failureBenjamin Herrenschmidt
The PowerMac kernel occasionally fails to bring up the secondary CPUs on SMP, the trigger factor seem to be fairly random and related to location of code and data. This appears to be due to the initial loading of the TOC value by the secondary processor which now happens before we clear HID4:RM_CI (Real Mode Cache Invalidate). This bit should really be cleared before we do any load or store other than fetching code. This fix works based on the assumption that all SMP 64-bit PowerMacs use variants of the 970, which fortunately is true, by explicitely clearing that bit, adding an slbia for good measure as RM_CI mode is known to create bogus ERAT entries. I also removed some spurrious debug output that was left enabled by mistake while at it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-10-31powerpc/ppc64/kdump: Better flag for running relocatableMilton Miller
The __kdump_flag ABI is overly constraining for future development. As of 2.6.27, the kernel entry point has 4 constraints: Offset 0 is the starting point for the master (boot) cpu (entered with r3 pointing to the device tree structure), offset 0x60 is code for the slave cpus (entered with r3 set to their device tree physical id), offset 0x20 is used by the iseries hypervisor, and secondary cpus must be well behaved when the first 256 bytes are copied to address 0. Placing the __kdump_flag at 0x18 is bad because: - It was taking the last 8 bytes before the iseries hypervisor data. - It was 8 bytes for a boolean flag - It had no way of identifying that the flag was present - It does leave any room for the master to add any additional code before branching, which hurts debug. - It will be unnecessarily hard for 32 bit code to be common (8 bytes) Now that we have eliminated the use of __kdump_flag in favor of the standard is_kdump_kernel(), this flag only controls run without relocating the kernel to PHYSICAL_START (0), so rename it __run_at_load. Move the flag to 0x5c, 1 word before the secondary cpu entry point at 0x60. Initialize it with "run0" to say it will run at 0 unless it is set to 1. It only exists if we are relocatable. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-10-22powerpc: Support for relocatable kdump kernelMohan Kumar M
This adds relocatable kernel support for kdump. With this one can use the same regular kernel to capture the kdump. A signature (0xfeed1234) is passed in r6 from panic code to the next kernel through kexec_sequence and purgatory code. The signature is used to differentiate between kdump kernel and non-kdump kernels. The purgatory code compares the signature and sets the __kdump_flag in head_64.S. During the boot up, kernel code checks __kdump_flag and if it is set, the kernel will behave as relocatable kdump kernel. This kernel will boot at the address where it was loaded by kexec-tools ie. at the address reserved through crashkernel boot parameter. CONFIG_CRASH_DUMP depends on CONFIG_RELOCATABLE option to build kdump kernel as relocatable. So the same kernel can be used as production and kdump kernel. This patch incorporates the changes suggested by Paul Mackerras to avoid GOT use and to avoid two copies of the code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Mohan Kumar M <mohan@in.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-09-15powerpc: Make the 64-bit kernel as a position-independent executablePaul Mackerras
This implements CONFIG_RELOCATABLE for 64-bit by making the kernel as a position-independent executable (PIE) when it is set. This involves processing the dynamic relocations in the image in the early stages of booting, even if the kernel is being run at the address it is linked at, since the linker does not necessarily fill in words in the image for which there are dynamic relocations. (In fact the linker does fill in such words for 64-bit executables, though not for 32-bit executables, so in principle we could avoid calling relocate() entirely when we're running a 64-bit kernel at the linked address.) The dynamic relocations are processed by a new function relocate(addr), where the addr parameter is the virtual address where the image will be run. In fact we call it twice; once before calling prom_init, and again when starting the main kernel. This means that reloc_offset() returns 0 in prom_init (since it has been relocated to the address it is running at), which necessitated a few adjustments. This also changes __va and __pa to use an equivalent definition that is simpler. With the relocatable kernel, PAGE_OFFSET and MEMORY_START are constants (for 64-bit) whereas PHYSICAL_START is a variable (and KERNELBASE ideally should be too, but isn't yet). With this, relocatable kernels still copy themselves down to physical address 0 and run there. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-09-15powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bitPaul Mackerras
Using LOAD_REG_IMMEDIATE to get the address of kernel symbols generates 5 instructions where LOAD_REG_ADDR can do it in one, and will generate R_PPC64_ADDR16_* relocations in the output when we get to making the kernel as a position-independent executable, which we'd rather not have to handle. This changes various bits of assembly code to use LOAD_REG_ADDR when we need to get the address of a symbol, or to use suitable position-independent code for cases where we can't access the TOC for various reasons, or if we're not running at the address we were linked at. It also cleans up a few minor things; there's no reason to save and restore SRR0/1 around RTAS calls, __mmu_off can get the return address from LR more conveniently than the caller can supply it in R4 (and we already assume elsewhere that EA == RA if the MMU is on in early boot), and enable_64b_mode was using 5 instructions where 2 would do. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-09-15powerpc: Make it possible to move the interrupt handlers away from the kernelPaul Mackerras
This changes the way that the exception prologs transfer control to the handlers in 64-bit kernels with the aim of making it possible to have the prologs separate from the main body of the kernel. Now, instead of computing the address of the handler by taking the top 32 bits of the paca address (to get the 0xc0000000........ part) and ORing in something in the bottom 16 bits, we get the base address of the kernel by doing a load from the paca and add an offset. This also replaces an mfmsr and an ori to compute the MSR value for the handler with a load from the paca. That makes it unnecessary to have a separate version of EXCEPTION_PROLOG_PSERIES that forces 64-bit mode. We can no longer use a direct branches in the exception prolog code, which means that the SLB miss handlers can't branch directly to .slb_miss_realmode any more. Instead we have to compute the address and do an indirect branch. This is conditional on CONFIG_RELOCATABLE; for non-relocatable kernels we use a direct branch as before. (A later change will allow CONFIG_RELOCATABLE to be set on 64-bit powerpc.) Since the secondary CPUs on pSeries start execution in the first 0x100 bytes of real memory and then have to get to wherever the kernel is, we can't use a direct branch to get there. Instead this changes __secondary_hold_spinloop from a flag to a function pointer. When it is set to a non-NULL value, the secondary CPUs jump to the function pointed to by that value. Finally this eliminates one code difference between 32-bit and 64-bit by making __secondary_hold be the text address of the secondary CPU spinloop rather than a function descriptor for it. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-09-15powerpc: Rearrange head_64.S to move interrupt handler code to the beginningPaul Mackerras
This rearranges head_64.S so that we have all the first-level exception prologs together starting at 0x100, followed by all the second-level handlers that are invoked from the first-level prologs, followed by other code. This doesn't make any functional change but will make following changes for relocatable kernel support easier. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-15powerpc: Don't spin on sync instruction at boot timeSonny Rao
Push the sync below the secondary smp init hold loop and comment its purpose. This should speed up boot by reducing global traffic during the single-threaded portion of boot. Signed-off-by: Sonny Rao <sonnyrao@us.ibm.com> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-01powerpc: Add VSX context save/restore, ptrace and signal supportMichael Neuling
This patch extends the floating point save and restore code to use the VSX load/stores when VSX is available. This will make FP context save/restore marginally slower on FP only code, when VSX is available, as it has to load/store 128bits rather than just 64bits. Mixing FP, VMX and VSX code will get constant architected state. The signals interface is extended to enable access to VSR 0-31 doubleword 1 after discussions with tool chain maintainers. Backward compatibility is maintained. The ptrace interface is also extended to allow access to VSR 0-31 full registers. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Make load_up_fpu and load_up_altivec callableMichael Neuling
Make load_up_fpu and load_up_altivec callable so they can be reused by the VSX code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-07-01powerpc: Move altivec_unavailableMichael Neuling
Move the altivec_unavailable code, to make room at 0xf40 where the vsx_unavailable exception will be. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-05-09[POWERPC] Fix bogus paca->_current initializationBenjamin Herrenschmidt
When doing lockdep, I had two patches to initialize paca->_current early, one bogus, and one correct. Unfortunately both got merged as the bad one ended up being part of the main lockdep patch by mistake. This causes memory corruption at boot. This removes the offending code. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-29[POWERPC] Add fast little-endian switch system callPaul Mackerras
This adds a system call on 64-bit platforms for switching between little-endian and big-endian modes that is much faster than doing a prctl call. This system call is handled as a special case right at the start of the system call entry code, and because it is a special case, it uses a system call number which is out of the range of normal system calls, namely 0x1ebe. Measurements with lmbench on a 4.2GHz POWER6 showed no measurable change in the speed of normal system calls with this patch. Switching endianness with this new system call takes around 60ns on a 4.2GHz POWER6, compared with around 300ns to switch endian mode with a prctl. This can provide a significant performance advantage for emulators for little-endian architectures that want to switch between big-endian and little-endian mode frequently, e.g. because they are generating instructions sequences on the fly and they want to run those sequences in little-endian mode. The other thing about this system call is that it doesn't clobber as many registers as a normal system call. It only clobbers r12. Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-04-18[POWERPC] irqtrace support for 64-bit powerpcBenjamin Herrenschmidt
This adds the low level irq tracing hooks to the powerpc architecture needed to enable full lockdep functionality. This is partly based on Johannes Berg's initial version. I removed the asm trampoline that isn't needed (thus improving performance) and modified all sorts of bits and pieces, reworking most of the assembly, etc... Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>