Diffstat (limited to 'docs')
-rw-r--r-- | docs/change-log.md | 76 | ||||
-rw-r--r-- | docs/porting-guide.md | 939 | ||||
-rw-r--r-- | docs/user-guide.md | 961 |
3 files changed, 1976 insertions, 0 deletions
diff --git a/docs/change-log.md b/docs/change-log.md new file mode 100644 index 0000000..3a9e5cd --- /dev/null +++ b/docs/change-log.md @@ -0,0 +1,76 @@ +ARM Trusted Firmware - version 0.2 +================================== + +New features +------------ + +* First source release. + +* Code for the PSCI suspend feature is supplied, although this is not enabled + by default since there are known issues (see below). + + +Issues resolved since last release +---------------------------------- + +* The "psci" nodes in the FDTs provided in this release now fully comply + with the recommendations made in the PSCI specification. + + +Known issues +------------ + +The following is a list of issues which are expected to be fixed in the future +releases of the ARM Trusted Firmware. + +* The TrustZone Address Space Controller (TZC-400) is not being programmed + yet. Use of model parameter `-C bp.secure_memory=1` is not supported. + +* No support yet for secure world interrupt handling or for switching context + between secure and normal worlds in EL3. + +* GICv3 support is experimental. The Linux kernel patches to support this are + not widely available. There are known issues with GICv3 initialization in + the ARM Trusted Firmware. + +* Dynamic image loading is not available yet. The current image loader + implementation (used to load BL2 and all subsequent images) has some + limitations. Changing BL2 or BL3-1 load addresses in certain ways can lead + to loading errors, even if the images should theoretically fit in memory. + +* Although support for PSCI `CPU_SUSPEND` is present, it is not yet stable + and ready for use. + +* PSCI api calls `AFFINITY_INFO` & `PSCI_VERSION` are implemented but have not + been tested. + +* The ARM Trusted Firmware make files result in all build artifacts being + placed in the root of the project. These should be placed in appropriate + sub-directories. + +* The compilation of ARM Trusted Firmware is not free from compilation + warnings. 
Some of these warnings have not been investigated yet so they + could mask real bugs. + +* The ARM Trusted Firmware currently uses toolchain/system include files like + stdio.h. It should provide versions of these within the project to maintain + compatibility between toolchains/systems. + +* The PSCI code takes some locks in an incorrect sequence. This may cause + problems with suspend and hotplug in certain conditions. + +* The Linux kernel used in this release is based on version 3.12-rc4. Using + this kernel with the ARM Trusted Firmware fails to start the file-system as + a RAM-disk. It fails to execute user-space `init` from the RAM-disk. As an + alternative, the VirtioBlock mechanism can be used to provide a file-system + to the kernel. + + +Detailed changes since last release +----------------------------------- + +First source release – not applicable. + +- - - - - - - - - - - - - - - - - - - - - - - - - - + +_Copyright (c) 2013 ARM Ltd. All rights reserved._ diff --git a/docs/porting-guide.md b/docs/porting-guide.md new file mode 100644 index 0000000..ae77c55 --- /dev/null +++ b/docs/porting-guide.md @@ -0,0 +1,939 @@ +ARM Trusted Firmware Porting Guide +================================== + +Contents +-------- + +1. Introduction +2. Common Modifications + * Common mandatory modifications + * Common optional modifications +3. Boot Loader stage specific modifications + * Boot Loader stage 1 (BL1) + * Boot Loader stage 2 (BL2) + * Boot Loader stage 3-1 (BL3-1) + * PSCI implementation (in BL3-1) + +- - - - - - - - - - - - - - - - - - + +1. Introduction +---------------- + +Porting the ARM Trusted Firmware to a new platform involves making some +mandatory and optional modifications for both the cold and warm boot paths. +Modifications consist of: + +* Implementing a platform-specific function or variable, +* Setting up the execution context in a certain way, or +* Defining certain constants (for example #defines). 
+ +The firmware provides a default implementation of variables and functions to +fulfill the optional requirements. These implementations are all weakly defined; +they are provided to ease the porting effort. Each platform port can override +them with its own implementation if the default implementation is inadequate. + +Some modifications are common to all Boot Loader (BL) stages. Section 2 +discusses these in detail. The subsequent sections discuss the remaining +modifications for each BL stage in detail. + +This document should be read in conjunction with the ARM Trusted Firmware +[User Guide]. + + +2. Common modifications +------------------------ + +This section covers the modifications that should be made by the platform for +each BL stage to correctly port the firmware stack. They are categorized as +either mandatory or optional. + + +2.1 Common mandatory modifications +---------------------------------- +A platform port must enable the Memory Management Unit (MMU) with identity +mapped page tables, and enable both the instruction and data caches for each BL +stage. In the ARM FVP port, each BL stage configures the MMU in its platform- +specific architecture setup function, for example `blX_plat_arch_setup()`. + +Each platform must allocate a block of identity mapped secure memory with +Device-nGnRE attributes aligned to a page boundary (4K) for each BL stage. This +memory is identified by the section name `tzfw_coherent_mem` so that it is +possible for the firmware to place variables in it using the following C code +directive: + + __attribute__ ((section("tzfw_coherent_mem"))) + +Or alternatively the following assembler code directive: + + .section tzfw_coherent_mem + +The `tzfw_coherent_mem` section is used to allocate any data structures that are +accessed both when a CPU is executing with its MMU and caches enabled, and when +it's running with its MMU and caches disabled. Examples are given below. 
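As a minimal sketch of the C directive above, the following places a per-CPU data structure in the `tzfw_coherent_mem` section. The `example_mailbox_t` type, the `example_mailboxes` array, the accessor functions and the value chosen for `PLATFORM_CORE_COUNT` are assumptions made for illustration and are not taken from the firmware sources. The attribute only names the section; the linker script and page tables of a real port would still have to map it with the required memory attributes.

```c
/* Illustrative value only; a real port defines this in platform.h. */
#define PLATFORM_CORE_COUNT 8

/* Hypothetical per-CPU record that is read with the MMU and caches on or off. */
typedef struct {
    unsigned long entrypoint;
} example_mailbox_t;

/* Ask the toolchain to place the array in the coherent memory section. */
static example_mailbox_t example_mailboxes[PLATFORM_CORE_COUNT]
    __attribute__ ((section("tzfw_coherent_mem")));

/* Record the address a CPU should jump to; returns -1 on a bad index. */
int example_set_entrypoint(unsigned int core_pos, unsigned long entrypoint)
{
    if (core_pos >= PLATFORM_CORE_COUNT)
        return -1;
    example_mailboxes[core_pos].entrypoint = entrypoint;
    return 0;
}

/* Read back the recorded address; returns 0 if none is set or the index is bad. */
unsigned long example_get_entrypoint(unsigned int core_pos)
{
    if (core_pos >= PLATFORM_CORE_COUNT)
        return 0;
    return example_mailboxes[core_pos].entrypoint;
}
```

A caller would index the array with a linear CPU index, for example `example_set_entrypoint(1, 0x1000UL)`.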
+ +The following variables, functions and constants must be defined by the platform +for the firmware to work correctly. + + +### File : platform.h [mandatory] + +Each platform must export a header file of this name with the following +constants defined. In the ARM FVP port, this file is found in +[../plat/fvp/platform.h]. + +* ** #define : PLATFORM_LINKER_FORMAT ** + + Defines the linker format used by the platform, for example + `elf64-littleaarch64` used by the FVP. + +* ** #define : PLATFORM_LINKER_ARCH ** + + Defines the processor architecture for the linker by the platform, for + example `aarch64` used by the FVP. + +* ** #define : PLATFORM_STACK_SIZE ** + + Defines the normal stack memory available to each CPU. This constant is used + by `platform_set_stack()`. + +* ** #define : FIRMWARE_WELCOME_STR ** + + Defines the character string printed by BL1 upon entry into the `bl1_main()` + function. + +* ** #define : BL2_IMAGE_NAME ** + + Name of the BL2 binary image on the host file-system. This name is used by + BL1 to load BL2 into secure memory using semi-hosting. + +* ** #define : PLATFORM_CACHE_LINE_SIZE ** + + Defines the size (in bytes) of the largest cache line across all the cache + levels in the platform. + +* ** #define : PLATFORM_CLUSTER_COUNT ** + + Defines the total number of clusters implemented by the platform in the + system. + +* ** #define : PLATFORM_CORE_COUNT ** + + Defines the total number of CPUs implemented by the platform across all + clusters in the system. + +* ** #define : PLATFORM_MAX_CPUS_PER_CLUSTER ** + + Defines the maximum number of CPUs that can be implemented within a cluster + on the platform. + +* ** #define : PRIMARY_CPU ** + + Defines the `MPIDR` of the primary CPU on the platform. This value is used + after a cold boot to distinguish between primary and secondary CPUs. + +* ** #define : TZROM_BASE ** + + Defines the base address of secure ROM on the platform, where the BL1 binary + is loaded. 
This constant is used by the linker scripts to ensure that the + BL1 image fits into the available memory. + +* ** #define : TZROM_SIZE ** + + Defines the size of secure ROM on the platform. This constant is used by the + linker scripts to ensure that the BL1 image fits into the available memory. + +* ** #define : TZRAM_BASE ** + + Defines the base address of the secure RAM on the platform, where the data + section of the BL1 binary is loaded. The BL2 and BL3-1 images are also + loaded in this secure RAM region. This constant is used by the linker + scripts to ensure that the BL1 data section and BL2/BL3-1 binary images fit + into the available memory. + +* ** #define : TZRAM_SIZE ** + + Defines the size of the secure RAM on the platform. This constant is used by + the linker scripts to ensure that the BL1 data section and BL2/BL3-1 binary + images fit into the available memory. + +* ** #define : SYS_CNTCTL_BASE ** + + Defines the base address of the `CNTCTLBase` frame of the memory mapped + counter and timer in the system level implementation of the generic timer. + +* ** #define : BL2_BASE ** + + Defines the base address in secure RAM where BL1 loads the BL2 binary image. + +* ** #define : BL31_BASE ** + + Defines the base address in secure RAM where BL2 loads the BL3-1 binary + image. + + +### Other mandatory modifications + +The following mandatory modifications may be implemented in any file +the implementer chooses. In the ARM FVP port, they are implemented in +[../plat/fvp/aarch64/fvp_common.c]. + +* ** Variable : unsigned char platform_normal_stacks[X][Y] ** + + where X = PLATFORM_STACK_SIZE + and Y = PLATFORM_CORE_COUNT + + Each platform must allocate a block of memory with Normal Cacheable, Write + back, Write allocate and Inner Shareable attributes aligned to the size (in + bytes) of the largest cache line amongst all caches implemented in the + system. A pointer to this memory should be exported with the name + `platform_normal_stacks`. 
This pointer is used by the common platform helper + function `platform_set_stack()` to allocate a stack to each CPU in the + platform (see [../plat/common/aarch64/platform_helpers.S]). + + +2.2 Common optional modifications +--------------------------------- + +The following are helper functions implemented by the firmware that perform +common platform-specific tasks. A platform may choose to override these +definitions. + + +### Function : platform_get_core_pos() + + Argument : unsigned long + Return : int + +A platform may need to convert the `MPIDR` of a CPU to an absolute number, which +can be used as a CPU-specific linear index into blocks of memory (for example +while allocating per-CPU stacks). This routine contains a simple mechanism +to perform this conversion, using the assumption that each cluster contains a +maximum of 4 CPUs: + + linear index = cpu_id + (cluster_id * 4) + + cpu_id = 8-bit value in MPIDR at affinity level 0 + cluster_id = 8-bit value in MPIDR at affinity level 1 + + +### Function : platform_set_coherent_stack() + + Argument : unsigned long + Return : void + +A platform may need stack memory that is coherent with main memory to perform +certain operations like: + +* Turning the MMU on, or +* Flushing caches prior to powering down a CPU or cluster. + +Each BL stage allocates this coherent stack memory for each CPU in the +`tzfw_coherent_mem` section. A pointer to this memory (`pcpu_dv_mem_stack`) is +used by this function to allocate a coherent stack for each CPU. A CPU is +identified by its `MPIDR`, which is passed as an argument to this function. + +The size of the stack allocated to each CPU is specified by the constant +`PCPU_DV_MEM_STACK_SIZE`. + + +### Function : platform_is_primary_cpu() + + Argument : unsigned long + Return : unsigned int + +This function identifies a CPU by its `MPIDR`, which is passed as the argument, +to determine whether this CPU is the primary CPU or a secondary CPU. 
A return +value of zero indicates that the CPU is not the primary CPU, while a non-zero +return value indicates that the CPU is the primary CPU. + + +### Function : platform_set_stack() + + Argument : unsigned long + Return : void + +This function uses the `platform_normal_stacks` pointer variable to allocate +stacks to each CPU. Further details are given in the description of the +`platform_normal_stacks` variable above. A CPU is identified by its `MPIDR`, +which is passed as the argument. + +The size of the stack allocated to each CPU is specified by the platform defined +constant `PLATFORM_STACK_SIZE`. + + +### Function : plat_report_exception() + + Argument : unsigned int + Return : void + +A platform may need to report various information about its status when an +exception is taken, for example the current exception level, the CPU security +state (secure/non-secure), the exception type, and so on. This function is +called in the following circumstances: + +* In BL1, whenever an exception is taken. +* In BL2, whenever an exception is taken. +* In BL3-1, whenever an asynchronous exception or a synchronous exception + other than an SMC32/SMC64 exception is taken. + +The default implementation doesn't do anything, to avoid making assumptions +about the way the platform displays its status information. + +This function receives the exception type as its argument. Possible values for +exception types are listed in the [../include/runtime_svc.h] header file. Note +that these constants are not related to any architectural exception code; they +are just an ARM Trusted Firmware convention. + + +3. Modifications specific to a Boot Loader stage +------------------------------------------------- + +3.1 Boot Loader Stage 1 (BL1) +----------------------------- + +BL1 implements the reset vector where execution starts after a cold or +warm boot. For each CPU, BL1 is responsible for the following tasks: + +1. Distinguishing between a cold boot and a warm boot. + +2. 
In the case of a cold boot and the CPU being the primary CPU, ensuring that + only this CPU executes the remaining BL1 code, including loading and passing + control to the BL2 stage. + +3. In the case of a cold boot and the CPU being a secondary CPU, ensuring that + the CPU is placed in a platform-specific state until the primary CPU + performs the necessary steps to remove it from this state. + +4. In the case of a warm boot, ensuring that the CPU jumps to a platform- + specific address in the BL3-1 image in the same processor mode as it was + when released from reset. + +5. Loading the BL2 image in secure memory using semi-hosting at the + address specified by the platform defined constant `BL2_BASE`. + +6. Populating a `meminfo` structure with the following information in memory, + accessible by BL2 immediately upon entry. + + meminfo.total_base = Base address of secure RAM visible to BL2 + meminfo.total_size = Size of secure RAM visible to BL2 + meminfo.free_base = Base address of secure RAM available for + allocation to BL2 + meminfo.free_size = Size of secure RAM available for allocation to BL2 + + BL1 places this `meminfo` structure at the beginning of the free memory + available for its use. Since BL1 cannot allocate memory dynamically at the + moment, its free memory will be available for BL2's use as-is. However, this + means that BL2 must read the `meminfo` structure before it starts using its + free memory (this is discussed in Section 3.2). + + In future releases of the ARM Trusted Firmware it will be possible for + the platform to decide where it wants to place the `meminfo` structure for + BL2. + + BL1 implements the `init_bl2_mem_layout()` function to populate the + BL2 `meminfo` structure. The platform may override this implementation, for + example if the platform wants to restrict the amount of memory visible to + BL2. Details of how to do this are given below. 
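The `meminfo` hand-off in step 6 can be sketched as follows. This is an illustration under stated assumptions, not the firmware's `init_bl2_mem_layout()`: the `example_meminfo` type carries only the four fields listed above with an assumed `unsigned long` type, and `example_restrict_bl2_mem()` and its `limit` parameter are hypothetical. The sketch shows one way a port might hide BL1's own memory from BL2 and cap how much BL2 is told about.

```c
/* Assumed layout: only the four fields described in the text. */
typedef struct {
    unsigned long total_base;
    unsigned long total_size;
    unsigned long free_base;
    unsigned long free_size;
} example_meminfo;

/*
 * Hypothetical helper: tell BL2 only about the memory BL1 has left free,
 * and never about more than `limit` bytes of it.
 */
void example_restrict_bl2_mem(const example_meminfo *bl1_mem,
                              example_meminfo *bl2_mem,
                              unsigned long limit)
{
    unsigned long visible = bl1_mem->free_size;

    if (visible > limit)
        visible = limit;

    /* BL2 sees a region that starts where BL1's free memory starts. */
    bl2_mem->total_base = bl1_mem->free_base;
    bl2_mem->total_size = visible;

    /* Nothing in the visible region is allocated yet, so all of it is free. */
    bl2_mem->free_base = bl1_mem->free_base;
    bl2_mem->free_size = visible;
}
```

The addresses used to exercise such a helper are arbitrary; a real port would derive them from `TZRAM_BASE`, `TZRAM_SIZE` and the linker-defined extents of BL1's data.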
+ +The following functions need to be implemented by the platform port to enable +BL1 to perform the above tasks. + + +### Function : platform_get_entrypoint() [mandatory] + + Argument : unsigned long + Return : unsigned int + +This function is called with the `SCTLR.M` and `SCTLR.C` bits disabled. The CPU +is identified by its `MPIDR`, which is passed as the argument. The function is +responsible for distinguishing between a warm and cold reset using platform- +specific means. If it's a warm reset then it returns the entrypoint into the +BL3-1 image that the CPU must jump to. If it's a cold reset then this function +must return zero. + +This function is also responsible for implementing a platform-specific mechanism +to handle the condition where the CPU has been warm reset but there is no +entrypoint to jump to. + +This function does not follow the Procedure Call Standard used by the +Application Binary Interface for the ARM 64-bit architecture. The caller should +not assume that callee saved registers are preserved across a call to this +function. + +This function fulfills requirement 1 listed above. + + +### Function : plat_secondary_cold_boot_setup() [mandatory] + + Argument : void + Return : void + +This function is called with the MMU and data caches disabled. It is responsible +for placing the executing secondary CPU in a platform-specific state until the +primary CPU performs the necessary actions to bring it out of that state and +allow entry into the OS. + +In the ARM FVP port, each secondary CPU powers itself off. The primary CPU is +responsible for powering up the secondary CPU when normal world software +requires them. + +This function fulfills requirement 3 above. + + +### Function : platform_cold_boot_init() [mandatory] + + Argument : unsigned long + Return : unsigned int + +This function executes with the MMU and data caches disabled. It is only called +by the primary CPU. 
The argument to this function is the address of the +`bl1_main()` routine where the generic BL1-specific actions are performed. +This function performs any platform-specific and architectural setup that the +platform requires to make execution of `bl1_main()` possible. + +The platform must enable the MMU with identity mapped page tables and enable +caches by setting the `SCTLR.I` and `SCTLR.C` bits. + +Platform-specific setup might include configuration of memory controllers, +configuration of the interconnect to allow the cluster to service cache snoop +requests from another cluster, zeroing of the ZI section, and so on. + +In the ARM FVP port, this function enables CCI snoops into the cluster that the +primary CPU is part of. It also enables the MMU and initializes the ZI section +in the BL1 image through the use of linker defined symbols. + +This function helps fulfill requirement 2 above. + + +### Function : bl1_platform_setup() [mandatory] + + Argument : void + Return : void + +This function executes with the MMU and data caches enabled. It is responsible +for performing any remaining platform-specific setup that can occur after the +MMU and data cache have been enabled. + +In the ARM FVP port, it zeros out the ZI section, enables the system level +implementation of the generic timer counter and initializes the console. + +This function helps fulfill requirement 5 above. + + +### Function : bl1_get_sec_mem_layout() [mandatory] + + Argument : void + Return : meminfo + +This function executes with the MMU and data caches enabled. The `meminfo` +structure returned by this function must contain the extents and availability of +secure RAM for the BL1 stage. 
+ + meminfo.total_base = Base address of secure RAM visible to BL1 + meminfo.total_size = Size of secure RAM visible to BL1 + meminfo.free_base = Base address of secure RAM available for allocation + to BL1 + meminfo.free_size = Size of secure RAM available for allocation to BL1 + +This information is used by BL1 to load the BL2 image in secure RAM. BL1 also +populates a similar structure to tell BL2 the extents of memory available for +its own use. + +This function helps fulfill requirement 5 above. + + +### Function : init_bl2_mem_layout() [optional] + + Argument : meminfo *, meminfo *, unsigned int, unsigned long + Return : void + +Each BL stage needs to tell the next stage the amount of secure RAM available +for it to use. For example, as part of handing control to BL2, BL1 informs BL2 +of the extents of secure RAM available for BL2 to use. BL2 must do the same when +passing control to BL3-1. This information is populated in a `meminfo` +structure. + +Depending upon where BL2 has been loaded in secure RAM (determined by +`BL2_BASE`), BL1 calculates the amount of free memory available for BL2 to use. +BL1 also ensures that its data sections resident in secure RAM are not visible +to BL2. An illustration of how this is done in the ARM FVP port is given in the +[User Guide], in the section "Memory layout on Base FVP". + + +3.2 Boot Loader Stage 2 (BL2) +----------------------------- + +The BL2 stage is executed only by the primary CPU, which is determined in BL1 +using the `platform_is_primary_cpu()` function. BL1 passes control to BL2 at +`BL2_BASE`. BL2 executes in Secure EL1 and is responsible for: + +1. Loading the BL3-1 binary image in secure RAM using semi-hosting. To load the + BL3-1 image, BL2 makes use of the `meminfo` structure passed to it by BL1. + This structure allows BL2 to calculate how much secure RAM is available for + its use. The platform also defines the address in secure RAM where BL3-1 is + loaded through the constant `BL31_BASE`. 
BL2 uses this information to + determine if there is enough memory to load the BL3-1 image. + +2. Arranging to pass control to a normal world BL image that has been + pre-loaded at a platform-specific address. This address is determined using + the `plat_get_ns_image_entrypoint()` function described below. + + BL2 populates an `el_change_info` structure in memory provided by the + platform with information about how BL3-1 should pass control to the normal + world BL image. + +3. Populating a `meminfo` structure with the following information in + memory that is accessible by BL3-1 immediately upon entry. + + meminfo.total_base = Base address of secure RAM visible to BL3-1 + meminfo.total_size = Size of secure RAM visible to BL3-1 + meminfo.free_base = Base address of secure RAM available for allocation + to BL3-1 + meminfo.free_size = Size of secure RAM available for allocation to + BL3-1 + + BL2 places this `meminfo` structure in memory provided by the + platform (`bl2_el_change_mem_ptr`). BL2 implements the + `init_bl31_mem_layout()` function to populate the BL3-1 meminfo structure + described above. The platform may override this implementation, for example + if the platform wants to restrict the amount of memory visible to BL3-1. + Details of this function are given below. + +The following functions must be implemented by the platform port to enable BL2 +to perform the above tasks. + + +### Function : bl2_early_platform_setup() [mandatory] + + Argument : meminfo *, void * + Return : void + +This function executes with the MMU and data caches disabled. It is only called +by the primary CPU. The arguments to this function are: + +* The address of the `meminfo` structure populated by BL1 +* An opaque pointer that the platform may use as needed. + +The platform must copy the contents of the `meminfo` structure into a private +variable as the original memory may be subsequently overwritten by BL2. 
The +copied structure is made available to all BL2 code through the +`bl2_get_sec_mem_layout()` function. + + +### Function : bl2_plat_arch_setup() [mandatory] + + Argument : void + Return : void + +This function executes with the MMU and data caches disabled. It is only called +by the primary CPU. + +The purpose of this function is to perform any architectural initialization +that varies across platforms, for example enabling the MMU (since the memory +map differs across platforms). + + +### Function : bl2_platform_setup() [mandatory] + + Argument : void + Return : void + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initialization in `bl2_plat_arch_setup()`. It is only +called by the primary CPU. + +The purpose of this function is to perform any platform initialization specific +to BL2. This function must initialize a pointer to memory +(`bl2_el_change_mem_ptr`), which can then be used to populate an +`el_change_info` structure. The underlying requirement is that the platform must +initialize this pointer before the `get_el_change_mem_ptr()` function +accesses it in `bl2_main()`. + +The ARM FVP port initializes this pointer to the base address of Secure DRAM +(`0x06000000`). + + +### Variable : unsigned char bl2_el_change_mem_ptr[EL_CHANGE_MEM_SIZE] [mandatory] + +As mentioned in the description of `bl2_platform_setup()`, this pointer is +initialized by the platform to point to memory where an `el_change_info` +structure can be populated. + + +### Function : bl2_get_sec_mem_layout() [mandatory] + + Argument : void + Return : meminfo + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initialization in `bl2_plat_arch_setup()`. It is only +called by the primary CPU. + +The purpose of this function is to return a `meminfo` structure populated with +the extents of secure RAM available for BL2 to use. See +`bl2_early_platform_setup()` above. 
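The relationship between `bl2_early_platform_setup()` and `bl2_get_sec_mem_layout()` described above might be sketched as follows. The function names and argument types come from this guide; the `meminfo` field types and the `example_bl2_mem_layout` variable are assumptions for illustration, and the real structure may contain more than these four fields.

```c
/* Assumed definition: the four fields described earlier in this guide. */
typedef struct {
    unsigned long total_base;
    unsigned long total_size;
    unsigned long free_base;
    unsigned long free_size;
} meminfo;

/* Hypothetical private copy owned by the platform port. */
static meminfo example_bl2_mem_layout;

void bl2_early_platform_setup(meminfo *mem_layout, void *data)
{
    (void)data; /* opaque pointer, unused in this sketch */

    /*
     * Copy by value: the memory that mem_layout points to may be
     * overwritten once BL2 starts using its free memory.
     */
    example_bl2_mem_layout = *mem_layout;
}

meminfo bl2_get_sec_mem_layout(void)
{
    /* Return the private copy, never the pointer received from BL1. */
    return example_bl2_mem_layout;
}
```

Storing the pointer instead of the contents would defeat the purpose, because later allocations by BL2 could silently corrupt the layout it relies on.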
+ + +### Function : init_bl31_mem_layout() [optional] + + Argument : meminfo *, meminfo *, unsigned int + Return : void + +Each BL stage needs to tell the next stage the amount of secure RAM that is +available for it to use. For example, as part of handing control to BL2, BL1 +must inform BL2 about the extents of secure RAM that is available for BL2 to +use. BL2 must do the same when passing control to BL3-1. This information is +populated in a `meminfo` structure. + +Depending upon where BL3-1 has been loaded in secure RAM (determined by +`BL31_BASE`), BL2 calculates the amount of free memory available for BL3-1 to +use. BL2 also ensures that BL3-1 is able to reclaim memory occupied by BL2. This +is done because BL2 never executes again after passing control to BL3-1. +An illustration of how this is done in the ARM FVP port is given in the +[User Guide], in the section "Memory layout on Base FVP". + + +### Function : plat_get_ns_image_entrypoint() [mandatory] + + Argument : void + Return : unsigned long + +As previously described, BL2 is responsible for arranging for control to be +passed to a normal world BL image through BL3-1. This function returns the +entrypoint of that image, which BL3-1 uses to jump to it. + +The ARM FVP port assumes that flash memory has been pre-loaded with the UEFI +image, and so returns the base address of flash memory. + + +3.2 Boot Loader Stage 3-1 (BL3-1) +--------------------------------- + +During cold boot, the BL3-1 stage is executed only by the primary CPU. This is +determined in BL1 using the `platform_is_primary_cpu()` function. BL1 passes +control to BL3-1 at `BL31_BASE`. During warm boot, BL3-1 is executed by all +CPUs. BL3-1 executes at EL3 and is responsible for: + +1. Re-initializing all architectural and platform state. Although BL1 performs + some of this initialization, BL3-1 remains resident in EL3 and must ensure + that EL3 architectural and platform state is completely initialized. 
It + should make no assumptions about the system state when it receives control. + +2. Passing control to a normal world BL image, pre-loaded at a platform- + specific address by BL2. BL3-1 uses the `el_change_info` structure that BL2 + populated in memory to do this. + +3. Providing runtime firmware services. Currently, BL3-1 only implements a + subset of the Power State Coordination Interface (PSCI) API as a runtime + service. See Section 3.3 below for details of porting the PSCI + implementation. + +The following functions must be implemented by the platform port to enable BL3-1 +to perform the above tasks. + + +### Function : bl31_early_platform_setup() [mandatory] + + Argument : meminfo *, void *, unsigned long + Return : void + +This function executes with the MMU and data caches disabled. It is only called +by the primary CPU. The arguments to this function are: + +* The address of the `meminfo` structure populated by BL2. +* An opaque pointer that the platform may use as needed. +* The `MPIDR` of the primary CPU. + +The platform must copy the contents of the `meminfo` structure into a private +variable as the original memory may be subsequently overwritten by BL3-1. The +copied structure is made available to all BL3-1 code through the +`bl31_get_sec_mem_layout()` function. + + +### Function : bl31_plat_arch_setup() [mandatory] + + Argument : void + Return : void + +This function executes with the MMU and data caches disabled. It is only called +by the primary CPU. + +The purpose of this function is to perform any architectural initialization +that varies across platforms, for example enabling the MMU (since the memory +map differs across platforms). + + +### Function : bl31_platform_setup() [mandatory] + + Argument : void + Return : void + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initialization in `bl31_plat_arch_setup()`. It is only +called by the primary CPU. 
+ +The purpose of this function is to complete platform initialization so that both +BL3-1 runtime services and normal world software can function correctly. + +The ARM FVP port does the following: +* Initializes the generic interrupt controller. +* Configures the CLCD controller. +* Grants access to the system counter timer module +* Initializes the FVP power controller device +* Detects the system topology. + + +### Function : bl31_get_next_image_info() [mandatory] + + Argument : unsigned long + Return : el_change_info * + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. + +This function is called by `bl31_main()` to retrieve information provided by +BL2, so that BL3-1 can pass control to the normal world software image. This +function must return a pointer to the `el_change_info` structure (that was +copied during `bl31_early_platform_setup()`). + + +### Function : bl31_get_sec_mem_layout() [mandatory] + + Argument : void + Return : meminfo + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. It is only +called by the primary CPU. + +The purpose of this function is to return a `meminfo` structure populated with +the extents of secure RAM available for BL3-1 to use. See +`bl31_early_platform_setup()` above. + + +3.3 Power State Coordination Interface (in BL3-1) +------------------------------------------------ + +The ARM Trusted Firmware's implementation of the PSCI API is based around the +concept of an _affinity instance_. Each _affinity instance_ can be uniquely +identified in a system by a CPU ID (the processor `MPIDR` is used in the PSCI +interface) and an _affinity level_. A processing element (for example, a +CPU) is at level 0. 
If the CPUs in the system are described in a tree where the +node above a CPU is a logical grouping of CPUs that share some state, then +affinity level 1 is that group of CPUs (for example, a cluster), and affinity +level 2 is a group of clusters (for example, the system). The implementation +assumes that the affinity level 1 ID can be computed from the affinity level 0 +ID (for example, a unique cluster ID can be computed from the CPU ID). The +current implementation computes this on the basis of the recommended use of +`MPIDR` affinity fields in the ARM Architecture Reference Manual. + +BL3-1's platform initialization code exports a pointer to the platform-specific +power management operations required for the PSCI implementation to function +correctly. This information is populated in the `plat_pm_ops` structure. The +PSCI implementation calls members of the `plat_pm_ops` structure for performing +power management operations for each affinity instance. For example, the target +CPU is specified by its `MPIDR` in a PSCI `CPU_ON` call. The `affinst_on()` +handler (if present) is called for each affinity instance as the PSCI +implementation powers up each affinity level implemented in the `MPIDR` (for +example, CPU, cluster and system). + +The following functions must be implemented to initialize PSCI functionality in +the ARM Trusted Firmware. + + +### Function : plat_get_aff_count() [mandatory] + + Argument : unsigned int, unsigned long + Return : unsigned int + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. It is only +called by the primary CPU. + +This function is called by the PSCI initialization code to detect the system +topology. Its purpose is to return the number of affinity instances implemented +at a given `affinity level` (specified by the first argument) and a given +`MPIDR` (specified by the second argument). 
For example, on a dual-cluster +system where the first cluster implements 2 CPUs and the second cluster implements 4 +CPUs, a call to this function with an `MPIDR` corresponding to the first cluster +(`0x0`) and affinity level 0 would return 2. A call to this function with an +`MPIDR` corresponding to the second cluster (`0x100`) and affinity level 0 +would return 4. + + +### Function : plat_get_aff_state() [mandatory] + + Argument : unsigned int, unsigned long + Return : unsigned int + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. It is only +called by the primary CPU. + +This function is called by the PSCI initialization code. Its purpose is to +return the state of an affinity instance. The affinity instance is determined by +the affinity ID at a given `affinity level` (specified by the first argument) +and an `MPIDR` (specified by the second argument). The state can be one of +`PSCI_AFF_PRESENT` or `PSCI_AFF_ABSENT`. The latter state is used to cater for +system topologies where certain affinity instances are unimplemented. For +example, consider a platform that implements a single cluster with 4 CPUs and +another CPU implemented directly on the interconnect with the cluster. The +`MPIDR`s of the CPUs in the cluster would range from `0x0` to `0x3`. The `MPIDR` of the single +CPU would be `0x100` to indicate that it does not belong to cluster 0. Cluster 1 +is missing but needs to be accounted for to reach this single CPU in the +topology tree. Hence it is marked as `PSCI_AFF_ABSENT`. + + +### Function : plat_get_max_afflvl() [mandatory] + + Argument : void + Return : int + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. It is only +called by the primary CPU.
+ +This function is called by the PSCI implementation during both cold and warm +boot, to determine the maximum affinity level that the power management +operations should apply to. ARMv8 has support for 4 affinity levels. It is +likely that hardware will implement fewer affinity levels. This function allows +the PSCI implementation to consider only those affinity levels in the system +that the platform implements. For example, the Base AEM FVP implements two +clusters with a configurable number of CPUs. It reports the maximum affinity +level as 1, resulting in PSCI power control up to the cluster level. + + +### Function : platform_setup_pm() [mandatory] + + Argument : plat_pm_ops ** + Return : int + +This function may execute with the MMU and data caches enabled if the platform +port does the necessary initializations in `bl31_plat_arch_setup()`. It is only +called by the primary CPU. + +This function is called by PSCI initialization code. Its purpose is to export +handler routines for platform-specific power management actions by populating +the passed pointer with a pointer to BL3-1's private `plat_pm_ops` structure. + +A description of each member of this structure is given below. Please refer to +the ARM FVP specific implementation of these handlers in [../plat/fvp/fvp_pm.c] +as an example. A platform port may choose not to implement some of the power +management operations. For example, the ARM FVP port does not implement the +`affinst_standby()` function. + +#### plat_pm_ops.affinst_standby() + +Perform the platform-specific setup to enter the standby state indicated by the +passed argument. + +#### plat_pm_ops.affinst_on() + +Perform the platform-specific setup to power on an affinity instance, specified +by the `MPIDR` (first argument) and `affinity level` (fourth argument). The +`state` (fifth argument) contains the current state of that affinity instance +(ON or OFF). This is useful to determine whether any action must be taken.
For +example, while powering on a CPU, the cluster that contains this CPU might +already be in the ON state. The platform decides what actions must be taken to +transition from the current state to the target state (indicated by the power +management operation). + +#### plat_pm_ops.affinst_off() + +Perform the platform specific setup to power off an affinity instance in the +`MPIDR` of the calling CPU. It is called by the PSCI `CPU_OFF` API +implementation. + +The `MPIDR` (first argument), `affinity level` (second argument) and `state` +(third argument) have a similar meaning as described in the `affinst_on()` +operation. They are used to identify the affinity instance on which the call +is made and its current state. This gives the platform port an indication of the +state transition it must make to perform the requested action. For example, if +the calling CPU is the last powered on CPU in the cluster, after powering down +affinity level 0 (CPU), the platform port should power down affinity level 1 +(the cluster) as well. + +This function is called with coherent stacks. This allows the PSCI +implementation to flush caches at a given affinity level without running into +stale stack state after turning off the caches. On ARMv8 cache hits do not occur +after the cache has been turned off. + +#### plat_pm_ops.affinst_suspend() + +Perform the platform specific setup to power off an affinity instance in the +`MPIDR` of the calling CPU. It is called by the PSCI `CPU_SUSPEND` API +implementation. + +The `MPIDR` (first argument), `affinity level` (third argument) and `state` +(fifth argument) have a similar meaning as described in the `affinst_on()` +operation. They are used to identify the affinity instance on which the call +is made and its current state. This gives the platform port an indication of the +state transition it must make to perform the requested action. 
For example, if +the calling CPU is the last powered on CPU in the cluster, after powering down +affinity level 0 (CPU), the platform port should power down affinity level 1 +(the cluster) as well. + +The difference between turning an affinity instance off versus suspending it +is that in the former case, the affinity instance is expected to re-initialize +its state when it is next powered on (see `affinst_on_finish()`). In the latter +case, the affinity instance is expected to save enough state so that it can +resume execution by restoring this state when it is powered on (see +`affinst_suspend_finish()`). + +This function is called with coherent stacks. This allows the PSCI +implementation to flush caches at a given affinity level without running into +stale stack state after turning off the caches. On ARMv8 cache hits do not occur +after the cache has been turned off. + +#### plat_pm_ops.affinst_on_finish() + +This function is called by the PSCI implementation after the calling CPU is +powered on and released from reset in response to an earlier PSCI `CPU_ON` call. +It performs the platform-specific setup required to initialize enough state for +this CPU to enter the normal world and also provide secure runtime firmware +services. + +The `MPIDR` (first argument), `affinity level` (second argument) and `state` +(third argument) have a similar meaning as described in the previous operations. + +This function is called with coherent stacks. This allows the PSCI +implementation to flush caches at a given affinity level without running into +stale stack state after turning off the caches. On ARMv8 cache hits do not occur +after the cache has been turned off. + +#### plat_pm_ops.affinst_suspend_finish() + +This function is called by the PSCI implementation after the calling CPU is +powered on and released from reset in response to an asynchronous wakeup +event, for example a timer interrupt that was programmed by the CPU during the +`CPU_SUSPEND` call.
It performs the platform-specific setup required to +restore the saved state for this CPU to resume execution in the normal world +and also provide secure runtime firmware services. + +The `MPIDR` (first argument), `affinity level` (second argument) and `state` +(third argument) have a similar meaning as described in the previous operations. + +This function is called with coherent stacks. This allows the PSCI +implementation to flush caches at a given affinity level without running into +stale stack state after turning off the caches. On ARMv8 cache hits do not occur +after the cache has been turned off. + +BL3-1 platform initialization code must also detect the system topology and +the state of each affinity instance in the topology. This information is +critical for the PSCI runtime service to function correctly. More details are +provided in the description of the `plat_get_aff_count()` and +`plat_get_aff_state()` functions above. + + +- - - - - - - - - - - - - - - - - - - - - - - - - - + +_Copyright (c) 2013 ARM Ltd. All rights reserved._ + + +[User Guide]: user-guide.md + +[../plat/common/aarch64/platform_helpers.S]: ../plat/common/aarch64/platform_helpers.S +[../plat/fvp/platform.h]: ../plat/fvp/platform.h +[../plat/fvp/aarch64/fvp_common.c]: ../plat/fvp/aarch64/fvp_common.c +[../plat/fvp/fvp_pm.c]: ../plat/fvp/fvp_pm.c +[../include/runtime_svc.h]: ../include/runtime_svc.h diff --git a/docs/user-guide.md b/docs/user-guide.md new file mode 100644 index 0000000..20483e4 --- /dev/null +++ b/docs/user-guide.md @@ -0,0 +1,961 @@ +ARM Trusted Firmware User Guide +=============================== + +Contents : + +1. Introduction +2. Using the Software +3. Firmware Design +4. References + + +1. Introduction +---------------- + +The ARM Trusted Firmware implements a subset of the Trusted Board Boot +Requirements (TBBR) Platform Design Document (PDD) [1] for ARM reference +platforms. 
The TBB sequence starts when the platform is powered on and runs up +to the stage where it hands-off control to firmware running in the normal +world in DRAM. This is the cold boot path. + +The ARM Trusted Firmware also implements the Power State Coordination Interface +([PSCI]) PDD [2] as a runtime service. PSCI is the interface from normal world +software to firmware implementing power management use-cases (for example, +secondary CPU boot, hotplug and idle). Normal world software can access ARM +Trusted Firmware runtime services via the ARM SMC (Secure Monitor Call) +instruction. The SMC instruction must be used as mandated by the [SMC Calling +Convention PDD][SMCCC] [3]. + + +2. Using the Software +---------------------- + +### Host machine requirements + +The minimum recommended machine specification is an Intel Core2Duo clocking at +2.6GHz or above, and 12GB RAM. For best performance, use a machine with Intel +Core i7 (SandyBridge) and 16GB of RAM. + + +### Tools + +The following tools are required to use the ARM Trusted Firmware: + +* Ubuntu desktop OS. The software has been tested on Ubuntu 12.04.02 (64-bit). + The following packages are also needed: + +* `ia32-libs` package. + +* `make` and `uuid-dev` packages for building UEFI. + +* `bc` and `ncurses-dev` packages for building Linux. + +* Baremetal GNU GCC tools. Verified packages can be downloaded from [Linaro] + [Linaro Toolchain]. The rest of this document assumes that the + `gcc-linaro-aarch64-none-elf-4.8-2013.09-01_linux.tar.xz` tools are used. + + wget http://releases.linaro.org/13.09/components/toolchain/binaries/gcc-linaro-aarch64-none-elf-4.8-2013.09-01_linux.tar.xz + tar -xf gcc-linaro-aarch64-none-elf-4.8-2013.09-01_linux.tar.xz + +* The Device Tree Compiler (DTC) included with Linux kernel 3.12-rc4 is used + to build the Flattened Device Tree (FDT) source files (`.dts` files) + provided with this release. + +* (Optional) For debugging, ARM [Development Studio 5 (DS-5)][DS-5] v5.16. 
+ + +### Building the Trusted Firmware + +To build the software for the Base FVPs, follow these steps: + +1. Clone the ARM Trusted Firmware repository from GitHub: + + git clone https://github.com/ARM-software/arm-trusted-firmware.git + +2. Change to the trusted firmware directory: + + cd arm-trusted-firmware + +3. Set the compiler path and build: + + CROSS_COMPILE=<path/to>/aarch64-none-elf- make + + By default this produces a release version of the build. To produce a debug + version instead, refer to the "Debugging options" section below. + + The build creates ELF and raw binary files in the current directory. It + generates the following boot loader binary files from the ELF files: + + * `bl1.bin` + * `bl2.bin` + * `bl31.bin` + +4. Copy the above 3 boot loader binary files to the directory where the FVPs + are launched from. Symbolic links of the same names may be created instead. + +5. (Optional) To clean the build directory, use: + + make distclean + + +#### Debugging options + +To compile a debug version and make the build more verbose, use: + + CROSS_COMPILE=<path/to>/aarch64-none-elf- make DEBUG=1 V=1 + +AArch64 GCC uses DWARF version 4 debugging symbols by default. Some tools (for +example DS-5) might not support this and may need an older version of DWARF +symbols to be emitted by GCC. This can be achieved by using the +`-gdwarf-<version>` flag, with the version being set to 2 or 3. Setting the +version to 2 is recommended for DS-5 versions older than 5.16. + +When debugging logic problems it might also be useful to disable all compiler +optimizations by using `-O0`. + +NOTE: Using `-O0` could cause output images to be larger and base addresses +might need to be recalculated (see the later memory layout section).
+ +Extra debug options can be passed to the build system by setting `CFLAGS`: + + CFLAGS='-O0 -gdwarf-2' CROSS_COMPILE=<path/to>/aarch64-none-elf- make DEBUG=1 V=1 + + +### Obtaining the normal world software + +#### Obtaining UEFI + +Download an archive of the [EDK2 (EFI Development Kit 2) source code][EDK2] +supporting the Base FVPs. EDK2 is an open source implementation of the UEFI +specification: + + wget http://sourceforge.net/projects/edk2/files/ARM/aarch64-uefi-rev14582.tgz/download -O aarch64-uefi-rev14582.tgz + tar -xf aarch64-uefi-rev14582.tgz + +To build the software for the Base FVPs, follow these steps: + +1. Change into the unpacked EDK2 source directory + + cd uefi + +2. Copy build config templates to local workspace + + export EDK_TOOLS_PATH=$(pwd)/BaseTools + . edksetup.sh $(pwd)/BaseTools/ + +3. Rebuild EDK2 host tools + + make -C "$EDK_TOOLS_PATH" clean + make -C "$EDK_TOOLS_PATH" + +4. Build the software + + AARCH64GCC_TOOLS_PATH=<full-path-to-aarch64-gcc>/bin/ \ + build -v -d3 -a AARCH64 -t ARMGCC \ + -p ArmPlatformPkg/ArmVExpressPkg/ArmVExpress-FVP-AArch64.dsc + + The EDK2 binary for use with the ARM Trusted Firmware can then be found + here: + + Build/ArmVExpress-FVP-AArch64/DEBUG_ARMGCC/FV/FVP_AARCH64_EFI.fd + +This will build EDK2 for the default settings as used by the FVPs. + +To boot Linux using a VirtioBlock file-system, the command line passed from EDK2 +to the Linux kernel must be modified as described in the "Obtaining a +File-system" section below. + +If legacy GICv2 locations are used, the EDK2 platform description must be +updated. This is required as EDK2 does not support probing for the GIC location. +To do this, open the `ArmPlatformPkg/ArmVExpressPkg/ArmVExpress-FVP-AArch64.dsc` +file for editing and make the modifications as below. Rebuild EDK2 after doing a +`clean`. 
+ + gArmTokenSpaceGuid.PcdGicDistributorBase|0x2C001000 + gArmTokenSpaceGuid.PcdGicInterruptInterfaceBase|0x2C002000 + +The EDK2 binary `FVP_AARCH64_EFI.fd` should be loaded into FVP FLASH0 via model +parameters as described in the "Running the Software" section below. + +#### Obtaining a Linux kernel + +The software has been verified using Linux kernel version 3.12-rc4. Patches +have been applied to the kernel in order to enable CPU hotplug. + +Preparing a Linux kernel for use on the FVPs with hotplug support can +be done as follows (GICv2 support only): + +1. Clone Linux: + + git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git + + The CPU hotplug features are not yet included in the mainline kernel. To use + these, add the patches from Mark Rutland's kernel, based on Linux 3.12-rc4: + + cd linux + git remote add -f --tags markr git://linux-arm.org/linux-mr.git + git checkout -b hotplug arm64-cpu-hotplug-20131023 + +2. Build with the Linaro GCC tools. + + # in linux/ + make mrproper + make ARCH=arm64 defconfig + + # Enable Hotplug + make ARCH=arm64 menuconfig + # Kernel Features ---> [*] Support for hot-pluggable CPUs + + CROSS_COMPILE=/path/to/aarch64-none-elf- make -j6 ARCH=arm64 + +3. Copy the Linux image `arch/arm64/boot/Image` to the working directory from + where the FVP is launched. A symbolic link may also be created instead. + +#### Obtaining the Flattened Device Trees + +Depending on the FVP configuration and Linux configuration used, different +FDT files are required. FDTs for the Base FVP can be found in the Trusted +Firmware source directory under `fdts`. + +* `fvp-base-gicv2-psci.dtb` + + (Default) For use with both AEMv8 and Cortex-A57-A53 Base FVPs with + default memory map configuration. + +* `fvp-base-gicv2legacy-psci.dtb` + + For use with both AEMv8 and Cortex-A57-A53 Base FVPs with legacy GICv2 + memory map configuration. 
+ +* `fvp-base-gicv3-psci.dtb` + + For use with AEMv8 Base FVP with default memory map configuration and + Linux GICv3 support. + +Copy the chosen FDT blob as `fdt.dtb` to the directory from which the FVP +is launched. A symbolic link may also be created instead. + +#### Obtaining a File-system + +To prepare a Linaro LAMP based Open Embedded file-system, the following +instructions can be used as a guide. The file-system can be provided to Linux +via VirtioBlock or as a RAM-disk. Both methods are described below. + +##### Prepare VirtioBlock + +To prepare a VirtioBlock file-system, do the following: + +1. Download and unpack the disk image. + + NOTE: The unpacked disk image grows to 2 GiB in size. + + wget http://releases.linaro.org/13.09/openembedded/aarch64/vexpress64-openembedded_lamp-armv8_20130927-7.img.gz + gunzip vexpress64-openembedded_lamp-armv8_20130927-7.img.gz + +2. Make sure the Linux kernel has Virtio support enabled using + `make ARCH=arm64 menuconfig`. + + Device Drivers ---> Virtio drivers ---> <*> Platform bus driver for memory mapped virtio devices + Device Drivers ---> [*] Block devices ---> <*> Virtio block driver + File systems ---> <*> The Extended 4 (ext4) filesystem + + If some of these configurations are missing, enable them, save the kernel + configuration, then rebuild the kernel image using the instructions provided + in the section "Obtaining a Linux kernel". + +3. Change the Kernel command line to include `root=/dev/vda2`. This can either + be done in the EDK2 boot menu or in the platform file. Editing the platform + file and rebuilding EDK2 will make the change persist. To do this: + + 1. In EDK, edit the following file: + + ArmPlatformPkg/ArmVExpressPkg/ArmVExpress-FVP-AArch64.dsc + + 2. Add `root=/dev/vda2` to: + + gArmPlatformTokenSpaceGuid.PcdDefaultBootArgument|"<Other default options>" + + 3. Remove the entry: + + gArmPlatformTokenSpaceGuid.PcdDefaultBootInitrdPath|"" + + 4. Rebuild EDK2 (see "Obtaining UEFI" section above). 
+ +4. The file-system image file should be provided to the model environment by + passing it the correct command line option. In the Base FVP the following + option should be provided in addition to the ones described in the + "Running the software" section below. + + NOTE: A symbolic link to this file cannot be used with the FVP; the path + to the real file must be provided. + + -C bp.virtioblockdevice.image_path="<path/to/>vexpress64-openembedded_lamp-armv8_20130927-7.img" + +5. Ensure that the FVP doesn't output any error messages. If the following + error message is displayed: + + ERROR: BlockDevice: Failed to open "vexpress64-openembedded_lamp-armv8_20130927-7.img"! + + then make sure the path to the file-system image in the model parameter is + correct and that read permission is correctly set on the file-system image + file. + +##### Prepare RAM-disk + +NOTE: The RAM-disk option does not currently work with the Linux kernel version +described above; use the VirtioBlock method instead. For further information +please see the "Known issues" section in the [Change Log]. + +To prepare a RAM-disk file-system, do the following: + +1. Download the file-system image: + + wget http://releases.linaro.org/13.09/openembedded/aarch64/linaro-image-lamp-genericarmv8-20130912-487.rootfs.tar.gz + +2. Modify the Linaro image: + + # Prepare for use as RAM-disk. Normally use MMC, NFS or VirtioBlock. + # Be careful, otherwise you could damage your host file-system. + mkdir tmp; cd tmp + sudo sh -c "zcat ../linaro-image-lamp-genericarmv8-20130912-487.rootfs.tar.gz | cpio -id" + sudo ln -s sbin/init . + sudo ln -s S35mountall.sh etc/rcS.d/S03mountall.sh + sudo sh -c "echo 'devtmpfs /dev devtmpfs mode=0755,nosuid 0 0' >> etc/fstab" + sudo sh -c "find . | cpio --quiet -H newc -o | gzip -3 -n > ../filesystem.cpio.gz" + cd .. + +3. Copy the resultant `filesystem.cpio.gz` to the directory where the FVP is + launched from. A symbolic link may also be created instead.
+ + +### Running the software + +This release of the ARM Trusted Firmware has been tested on the following ARM +FVPs (64-bit versions only). + +* `FVP_Base_AEMv8A-AEMv8A` (Version 5.1 build 8) +* `FVP_Base_Cortex-A57x4-A53x4` (Version 5.1 build 8) + +Please refer to the FVP documentation for a detailed description of the model +parameter options. A brief description of the important ones that affect the +ARM Trusted Firmware and normal world software behavior is provided below. + +#### Running on the AEMv8 Base FVP + +The following `FVP_Base_AEMv8A-AEMv8A` parameters should be used to boot Linux +with 8 CPUs using the ARM Trusted Firmware. + +NOTE: Using `cache_state_modelled=1` makes booting very slow. The software will +still work (and run much faster) without this option but this will hide any +cache maintenance defects in the software. + +NOTE: Using the `-C bp.virtioblockdevice.image_path` parameter is not necessary +if a Linux RAM-disk file-system is used (see the "Obtaining a File-system" +section above). + + FVP_Base_AEMv8A-AEMv8A \ + -C pctl.startup=0.0.0.0 \ + -C bp.secure_memory=0 \ + -C cluster0.NUM_CORES=4 \ + -C cluster1.NUM_CORES=4 \ + -C cache_state_modelled=1 \ + -C bp.pl011_uart0.untimed_fifos=1 \ + -C bp.secureflashloader.fname=<path to bl1.bin> \ + -C bp.flashloader0.fname=<path to UEFI binary> \ + -C bp.virtioblockdevice.image_path="<path/to/>vexpress64-openembedded_lamp-armv8_20130927-7.img" + +#### Running on the Cortex-A57-A53 Base FVP + +The following `FVP_Base_Cortex-A57x4-A53x4` model parameters should be used to +boot Linux with 8 CPUs using the ARM Trusted Firmware. + +NOTE: Using `cache_state_modelled=1` makes booting very slow. The software will +still work (and run much faster) without this option but this will hide any +cache maintenance defects in the software. 
+ +NOTE: Using the `-C bp.virtioblockdevice.image_path` parameter is not necessary +if a Linux RAM-disk file-system is used (see the "Obtaining a File-system" +section above). + + FVP_Base_Cortex-A57x4-A53x4 \ + -C pctl.startup=0.0.0.0 \ + -C bp.secure_memory=0 \ + -C cache_state_modelled=1 \ + -C bp.pl011_uart0.untimed_fifos=1 \ + -C bp.secureflashloader.fname=<path to bl1.bin> \ + -C bp.flashloader0.fname=<path to UEFI binary> \ + -C bp.virtioblockdevice.image_path="<path/to/>vexpress64-openembedded_lamp-armv8_20130927-7.img" + +### Configuring the GICv2 memory map + +The Base FVP models support GICv2 with the default model parameters at the +following addresses. + + GICv2 Distributor Interface 0x2f000000 + GICv2 CPU Interface 0x2c000000 + GICv2 Hypervisor Interface 0x2c010000 + GICv2 Virtual CPU Interface 0x2c02f000 + +The models can be configured to support GICv2 at addresses corresponding to the +legacy (Versatile Express) memory map as follows. + + GICv2 Distributor Interface 0x2c001000 + GICv2 CPU Interface 0x2c002000 + GICv2 Hypervisor Interface 0x2c004000 + GICv2 Virtual CPU Interface 0x2c006000 + +The choice of memory map is reflected in the build field (bits[15:12]) in the +`SYS_ID` register (Offset `0x0`) in the Versatile Express System registers +memory map (`0x1c010000`). + +* `SYS_ID.Build[15:12]` + + `0x1` corresponds to the presence of the default GICv2 memory map. This is + the default value. + +* `SYS_ID.Build[15:12]` + + `0x0` corresponds to the presence of the Legacy VE GICv2 memory map. This + value can be configured as described in the next section. + +NOTE: If the legacy VE GICv2 memory map is used, then the corresponding FDT and +UEFI images should be used.
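The `SYS_ID` build field check described above can be sketched in C as follows. This is a minimal illustration, not the code used by the ARM Trusted Firmware: the macro, enum and function names are hypothetical, and only the bit positions, the two build values and the two distributor base addresses are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Build variant field position, from the text: SYS_ID bits [15:12]. */
#define SYS_ID_BUILD_SHIFT 12u
#define SYS_ID_BUILD_MASK  0xfu

/* Build values from the text. The enumerator names are hypothetical. */
enum gicv2_map {
    GICV2_MAP_LEGACY_VE = 0x0, /* legacy Versatile Express memory map */
    GICV2_MAP_DEFAULT   = 0x1  /* default Base FVP memory map */
};

/* Extract the build variant field from a raw SYS_ID register value. */
static inline unsigned int sys_id_build(uint32_t sys_id)
{
    return (sys_id >> SYS_ID_BUILD_SHIFT) & SYS_ID_BUILD_MASK;
}

/* Pick the GICv2 distributor base that matches the reported memory map. */
static inline uint32_t gicv2_dist_base(uint32_t sys_id)
{
    unsigned int build = sys_id_build(sys_id);

    /* Only the two variants described in the text are handled here. */
    assert(build == GICV2_MAP_DEFAULT || build == GICV2_MAP_LEGACY_VE);

    return (build == GICV2_MAP_DEFAULT) ? 0x2f000000u : 0x2c001000u;
}
```

On real hardware or a model the argument would come from a read of the `SYS_ID` register at `0x1c010000`; the sketch takes the raw value as a parameter so that it can be exercised on a host machine.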
+ +#### Configuring AEMv8 Base FVP for legacy VE memory map + +The following parameters configure the GICv2 memory map in legacy VE mode: + +NOTE: Using the `-C bp.virtioblockdevice.image_path` parameter is not necessary +if a Linux RAM-disk file-system is used (see the "Obtaining a File-system" +section above). + + FVP_Base_AEMv8A-AEMv8A \ + -C cluster0.gic.GICD-offset=0x1000 \ + -C cluster0.gic.GICC-offset=0x2000 \ + -C cluster0.gic.GICH-offset=0x4000 \ + -C cluster0.gic.GICH-other-CPU-offset=0x5000 \ + -C cluster0.gic.GICV-offset=0x6000 \ + -C cluster0.gic.PERIPH-size=0x8000 \ + -C cluster1.gic.GICD-offset=0x1000 \ + -C cluster1.gic.GICC-offset=0x2000 \ + -C cluster1.gic.GICH-offset=0x4000 \ + -C cluster1.gic.GICH-other-CPU-offset=0x5000 \ + -C cluster1.gic.GICV-offset=0x6000 \ + -C cluster1.gic.PERIPH-size=0x8000 \ + -C gic_distributor.GICD-alias=0x2c001000 \ + -C bp.variant=0x0 \ + -C bp.virtioblockdevice.image_path="<path/to/>vexpress64-openembedded_lamp-armv8_20130927-7.img" + +The last parameter sets the build variant field of the `SYS_ID` register to +`0x0`. This allows the ARM Trusted Firmware to detect the legacy VE memory map +while configuring the GIC. + +#### Configuring Cortex-A57-A53 Base FVP for legacy VE memory map + +Configuration of the GICv2 as per the legacy VE memory map is controlled by +the following parameter. In this case, separate configuration of the `SYS_ID` +register is not required. + +NOTE: Using the `-C bp.virtioblockdevice.image_path` parameter is not necessary +if a Linux RAM-disk file-system is used (see the "Obtaining a File-system" +section above). + + FVP_Base_Cortex-A57x4-A53x4 \ + -C legacy_gicv2_map=1 \ + -C bp.virtioblockdevice.image_path="<path/to/>vexpress64-openembedded_lamp-armv8_20130927-7.img" + +3. Firmware Design +------------------- + +The cold boot path starts when the platform is physically turned on. 
One of +the CPUs released from reset is chosen as the primary CPU, and the remaining +CPUs are considered secondary CPUs. The primary CPU is chosen through +platform-specific means. The cold boot path is mainly executed by the primary +CPU, other than essential CPU initialization executed by all CPUs. The +secondary CPUs are kept in a safe platform-specific state until the primary +CPU has performed enough initialization to boot them. + +The cold boot path in this implementation of the ARM Trusted Firmware is divided +into three stages (in order of execution): + +* Boot Loader stage 1 (BL1) +* Boot Loader stage 2 (BL2) +* Boot Loader stage 3 (BL3-1). The '1' distinguishes this from other 3rd level + boot loader stages. + +The ARM Fixed Virtual Platforms (FVPs) provide trusted ROM, trusted SRAM and +trusted DRAM regions. Each boot loader stage uses one or more of these +memories for its code and data. + + +### BL1 + +This stage begins execution from the platform's reset vector in trusted ROM at +EL3. BL1 code starts at `0x00000000` (trusted ROM) in the FVP memory map. The +BL1 data section is placed at the start of trusted SRAM, `0x04000000`. The +functionality implemented by this stage is as follows. + +#### Determination of boot path + +Whenever a CPU is released from reset, BL1 needs to distinguish between a warm +boot and a cold boot. This is done using a platform-specific mechanism. The +ARM FVPs implement a simple power controller at `0x1c100000`. The `PSYS` +register (`0x10`) is used to distinguish between a cold and warm boot. This +information is contained in the `PSYS.WK[25:24]` field. Additionally, a +per-CPU mailbox is maintained in trusted DRAM (`0x00600000`), to which BL1 +writes an entrypoint. Each CPU jumps to this entrypoint upon warm boot. During +cold boot, BL1 places the secondary CPUs in a safe platform-specific state while +the primary CPU executes the remaining cold boot path as described in the +following sections. 
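The boot path determination above can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not BL1's implementation: all names are hypothetical, the encoding of the `PSYS.WK` values is not given in this document and is therefore left to the caller, and treating a zero mailbox value as "no entrypoint published" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* From the text: FVP power controller base and PSYS register offset. */
#define PWRC_BASE     0x1c100000u
#define PSYS_OFFSET   0x10u

/* From the text: the wakeup information is held in PSYS.WK[25:24]. */
#define PSYS_WK_SHIFT 24u
#define PSYS_WK_MASK  0x3u

/* Address of the PSYS register in the FVP memory map. */
static inline uintptr_t psys_reg_addr(void)
{
    return (uintptr_t)(PWRC_BASE + PSYS_OFFSET);
}

/* Extract the WK field from a raw PSYS register value. */
static inline unsigned int psys_wk(uint32_t psys)
{
    return (psys >> PSYS_WK_SHIFT) & PSYS_WK_MASK;
}

/*
 * Hypothetical helper: choose where a CPU continues after reset.
 * 'warm_boot' is the caller's interpretation of PSYS.WK (the value
 * encoding is not described here). 'mailbox_entry' is the value read
 * from this CPU's mailbox and 'cold_entry' is the cold boot path.
 */
static inline uint64_t boot_entrypoint(int warm_boot,
                                       uint64_t mailbox_entry,
                                       uint64_t cold_entry)
{
    if (warm_boot) {
        /* Assumption: zero means that no entrypoint was published. */
        assert(mailbox_entry != 0);
        return mailbox_entry;
    }
    return cold_entry;
}
```

Keeping the register decode separate from the entrypoint choice lets the platform-specific part (how `PSYS.WK` maps to a warm or cold boot) change without touching the generic decision.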
+ +#### Architectural initialization + +BL1 performs minimal architectural initialization as follows. + +* Exception vectors + + BL1 sets up simple exception vectors for both synchronous and asynchronous + exceptions. The default behavior upon receiving an exception is to set a + status code. In the case of the FVP this code is written to the Versatile + Express System LED register in the following format: + + SYS_LED[0] - Security state (Secure=0/Non-Secure=1) + SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0) + SYS_LED[7:3] - Exception Class (Sync/Async & origin). The values for + each exception class are: + + 0x0 : Synchronous exception from Current EL with SP_EL0 + 0x1 : IRQ exception from Current EL with SP_EL0 + 0x2 : FIQ exception from Current EL with SP_EL0 + 0x3 : System Error exception from Current EL with SP_EL0 + 0x4 : Synchronous exception from Current EL with SP_ELx + 0x5 : IRQ exception from Current EL with SP_ELx + 0x6 : FIQ exception from Current EL with SP_ELx + 0x7 : System Error exception from Current EL with SP_ELx + 0x8 : Synchronous exception from Lower EL using aarch64 + 0x9 : IRQ exception from Lower EL using aarch64 + 0xa : FIQ exception from Lower EL using aarch64 + 0xb : System Error exception from Lower EL using aarch64 + 0xc : Synchronous exception from Lower EL using aarch32 + 0xd : IRQ exception from Lower EL using aarch32 + 0xe : FIQ exception from Lower EL using aarch32 + 0xf : System Error exception from Lower EL using aarch32 + + A write to the LED register reflects in the System LEDs (S6LED0..7) in the + CLCD window of the FVP. This behavior is because this boot loader stage + does not expect to receive any exceptions other than the SMC exception. + For the latter, BL1 installs a simple stub. The stub expects to receive + only a single type of SMC (determined by its function ID in the general + purpose register `X0`). This SMC is raised by BL2 to make BL1 pass control + to BL3-1 (loaded by BL2) at EL3. 
Any other SMC leads to an assertion + failure. + +* MMU setup + + BL1 sets up EL3 memory translation by creating page tables to cover the + first 4GB of physical address space. This covers all the memories and + peripherals needed by BL1. + +* Control register setup + - `SCTLR_EL3`. Instruction cache is enabled by setting the `SCTLR_EL3.I` + bit. Alignment and stack alignment checking is enabled by setting the + `SCTLR_EL3.A` and `SCTLR_EL3.SA` bits. Exception endianness is set to + little-endian by clearing the `SCTLR_EL3.EE` bit. + + - `CPUECTLR`. When the FVP includes a model of a specific ARM processor + implementation (for example A57 or A53), then intra-cluster coherency is + enabled by setting the `CPUECTLR.SMPEN` bit. The AEMv8 Base FVP is + inherently coherent so does not implement `CPUECTLR`. + + - `SCR`. Use of the HVC instruction from EL1 is enabled by setting the + `SCR.HCE` bit. FIQ exceptions are configured to be taken in EL3 by + setting the `SCR.FIQ` bit. The register width of the next lower + exception level is set to AArch64 by setting the `SCR.RW` bit. + + - `CPTR_EL3`. Accesses to the `CPACR` from EL1 or EL2, or the `CPTR_EL2` + from EL2 are configured to not trap to EL3 by clearing the + `CPTR_EL3.TCPAC` bit. Instructions that access the registers associated + with Floating Point and Advanced SIMD execution are configured to not + trap to EL3 by clearing the `CPTR_EL3.TFP` bit. + + - `CNTFRQ_EL0`. The `CNTFRQ_EL0` register is programmed with the base + frequency of the system counter, which is retrieved from the first entry + in the frequency modes table. + + - Generic Timer. The system level implementation of the generic timer is + enabled through the memory mapped interface. + +#### Platform initialization + +BL1 enables issuing of snoop and DVM (Distributed Virtual Memory) requests from +the CCI-400 slave interface corresponding to the cluster that includes the +primary CPU. 
BL1 also initializes UART0 (PL011 console), which enables access to +the `printf` family of functions. + +#### BL2 image load and execution + +BL1 execution continues as follows: + +1. BL1 determines the amount of free trusted SRAM memory available by + calculating the extent of its own data section, which also resides in + trusted SRAM. BL1 loads a BL2 raw binary image through semi-hosting, at a + platform-specific base address. The filename of the BL2 raw binary image on + the host file system must be `bl2.bin`. If the BL2 image file is not present + or if there is not enough free trusted SRAM the following error message + is printed: + + "Failed to load boot loader stage 2 (BL2) firmware." + + If the load is successful, BL1 updates the limits of the remaining free + trusted SRAM. It also populates information about the amount of trusted + SRAM used by the BL2 image. The exact load location of the image is + provided as a base address in the platform header. Further description of + the memory layout can be found later in this document. + +2. BL1 prints the following string from the primary CPU to indicate successful + execution of the BL1 stage: + + "Booting trusted firmware boot loader stage 1" + +3. BL1 passes control to the BL2 image at Secure EL1, starting from its load + address. + +4. BL1 also passes information about the amount of trusted SRAM used and + available for use. This information is populated at a platform-specific + memory address. + + +### BL2 + +BL1 loads and passes control to BL2 at Secure EL1. BL2 is linked against and +loaded at a platform-specific base address (more information can be found later +in this document). The functionality implemented by BL2 is as follows. + +#### Architectural initialization + +BL2 performs minimal architectural initialization required for subsequent +stages of the ARM Trusted Firmware and normal world software.
It sets up +Secure EL1 memory translation by creating page tables to address the first 4GB +of the physical address space in a similar way to BL1. EL1 and EL0 are given +access to Floating Point & Advanced SIMD registers by setting the `CPACR.FPEN` +bits. + +#### Platform initialization + +BL2 does not perform any platform initialization that affects subsequent +stages of the ARM Trusted Firmware or normal world software. It copies the +information regarding the trusted SRAM populated by BL1 using a +platform-specific mechanism. It also calculates the limits of DRAM (main memory) +to determine whether there is enough space to load the normal world software +images. A platform defined base address is used to specify the load address for +the BL3-1 image. + +#### Normal world image load + +BL2 loads a rich boot firmware image (UEFI). The image executes in the normal +world. BL2 relies on BL3-1 to pass control to the normal world software image it +loads. Hence, BL2 populates a platform-specific area of memory with the +entrypoint and Current Program Status Register (`CPSR`) of the normal world +software image. The entrypoint is the load address of the normal world software +image. The `CPSR` is determined as specified in Section 5.13 of the [PSCI PDD] +[PSCI]. This information is passed to BL3-1. + +##### UEFI firmware load + +By default, BL2 assumes the UEFI image is present at the base of NOR flash0 +(`0x08000000`), and arranges for BL3-1 to pass control to that location. As +mentioned earlier, BL2 populates platform-specific memory with the entrypoint +and `CPSR` of the UEFI image. + +#### BL3-1 image load and execution + +BL2 execution continues as follows: + +1. BL2 loads the BL3-1 image into a platform-specific address in trusted SRAM. + This is done using semi-hosting. The image is identified by the file + `bl31.bin` on the host file-system. If there is not enough memory to load + the image or the image is missing it leads to an assertion failure.
If the + BL3-1 image loads successfully, BL2 updates the amount of trusted SRAM used + and available for use by BL3-1. This information is populated at a + platform-specific memory address. + +2. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the + BL3-1 entrypoint. The exception is handled by the SMC exception handler + installed by BL1. + +3. BL1 turns off the MMU and flushes the caches. It clears the + `SCTLR_EL3.M/I/C` bits, flushes the data cache to the point of coherency + and invalidates the TLBs. + +4. BL1 passes control to BL3-1 at the specified entrypoint at EL3. + + +### BL3-1 + +The image for this stage is loaded by BL2 and BL1 passes control to BL3-1 at +EL3. BL3-1 executes solely in trusted SRAM. BL3-1 is linked against and +loaded at a platform-specific base address (more information can be found later +in this document). The functionality implemented by BL3-1 is as follows. + +#### Architectural initialization + +Currently, BL3-1 performs a similar architectural initialization to BL1 as +far as system register settings are concerned. Since BL1 code resides in ROM, +architectural initialization in BL3-1 allows override of any previous +initialization done by BL1. BL3-1 creates page tables to address the first +4GB of physical address space and initializes the MMU accordingly. It replaces +the exception vectors populated by BL1 with its own. BL3-1 exception vectors +signal error conditions in the same way as BL1 does if an unexpected +exception is raised. They implement more elaborate support for handling SMCs +since this is the only mechanism to access the runtime services implemented by +BL3-1 (PSCI for example). BL3-1 checks each SMC for validity as specified by +the [SMC calling convention PDD][SMCCC] before passing control to the required +SMC handler routine. + +#### Platform initialization + +BL3-1 performs detailed platform initialization, which enables normal world +software to function correctly.
It also retrieves entrypoint information for +the normal world software image loaded by BL2 from the platform defined +memory address populated by BL2. + +* GICv2 initialization: + + - Enable group0 interrupts in the GIC CPU interface. + - Configure group0 interrupts to be asserted as FIQs. + - Disable the legacy interrupt bypass mechanism. + - Configure the priority mask register to allow interrupts of all + priorities to be signaled to the CPU interface. + - Mark SGIs 8-15, the secure physical timer interrupt (#29) and the + trusted watchdog interrupt (#56) as group0 (secure). + - Target the trusted watchdog interrupt to CPU0. + - Enable these group0 interrupts in the GIC distributor. + - Configure all other interrupts as group1 (non-secure). + - Enable signaling of group0 interrupts in the GIC distributor. + +* GICv3 initialization: + + If a GICv3 implementation is available in the platform, BL3-1 initializes + the GICv3 in GICv2 emulation mode with settings as described for GICv2 + above. + +* Power management initialization: + + BL3-1 implements a state machine to track CPU and cluster state. The state + can be one of `OFF`, `ON_PENDING`, `SUSPEND` or `ON`. All secondary CPUs are + initially in the `OFF` state. The cluster that the primary CPU belongs to is + `ON`; any other cluster is `OFF`. BL3-1 initializes the data structures that + implement the state machine, including the locks that protect them. BL3-1 + accesses the state of a CPU or cluster immediately after reset and before + the MMU is enabled in the warm boot path. It is not currently possible to + use 'exclusive' based spinlocks, therefore BL3-1 uses locks based on + Lamport's Bakery algorithm instead. BL3-1 allocates these locks in device + memory. They are accessible irrespective of MMU state. + +* Runtime services initialization: + + The only runtime service implemented by BL3-1 is PSCI. The complete PSCI API + is not yet implemented. 
The following functions are currently implemented: + + - `PSCI_VERSION` + - `CPU_OFF` + - `CPU_ON` + - `AFFINITY_INFO` + + The `CPU_ON` and `CPU_OFF` functions implement the warm boot path in ARM + Trusted Firmware. These are the only functions which have been tested. + `AFFINITY_INFO` & `PSCI_VERSION` are present but completely untested in + this release. + + Unsupported PSCI functions that can return, return the `NOT_SUPPORTED` + (`-1`) error code. Other unsupported PSCI functions that don't return, + signal an assertion failure. + + BL3-1 returns the error code `-1` if an SMC is raised for any other runtime + service. This behavior is mandated by the [SMC calling convention PDD] + [SMCCC]. + + +### Normal world software execution + +BL3-1 uses the entrypoint information provided by BL2 to jump to the normal +world software image at the highest available Exception Level (EL2 if +available, otherwise EL1). + + +### Memory layout on Base FVP ### + +The current implementation of the image loader has some limitations. It is +designed to load images dynamically, at a load address chosen to minimize memory +fragmentation. The chosen image location can be either at the top or the bottom +of free memory. However, until this feature is fully functional, the code also +contains support for loading images at a link-time fixed address. The code that +dynamically calculates the load address is bypassed and the load address is +specified statically by the platform. + +BL1 is always loaded at address `0x0`. BL2 and BL3-1 are loaded at specified +locations in Trusted SRAM. The lack of dynamic image loader support means these +load addresses must currently be adjusted as the code grows. The individual +images must be linked against their ultimate runtime locations. + +BL2 is loaded near the top of the Trusted SRAM. BL3-1 is loaded between BL1 +and BL2. As a general rule, the following constraints must always be enforced: + +1. `BL2_MAX_ADDR <= (<Top of Trusted SRAM>)` +2. 
`BL31_BASE >= BL1_MAX_ADDR` +3. `BL2_BASE >= BL31_MAX_ADDR` + +Constraint 1 is enforced by BL2's linker script. If it is violated then the +linker will report an error while building BL2 to indicate that it doesn't +fit. For example: + + aarch64-none-elf-ld: address 0x40400c8 of bl2.elf section `.bss' is not + within region `RAM' + +This error means that the BL2 base address needs to be moved down. Be sure that +the new BL2 load address still obeys constraint 3. + +Constraints 2 & 3 must currently be checked by hand. To ensure they are +enforced, first determine the maximum addresses used by BL1 and BL3-1. This can +be deduced from the link map files of the different images. + +The BL1 link map file (`bl1.map`) gives these 2 values: + +* `FIRMWARE_RAM_COHERENT_START` +* `FIRMWARE_RAM_COHERENT_SIZE` + +The maximum address used by BL1 can then be easily determined: + + BL1_MAX_ADDR = FIRMWARE_RAM_COHERENT_START + FIRMWARE_RAM_COHERENT_SIZE + +The BL3-1 link map file (`bl31.map`) gives the following value: + +* `BL31_DATA_STOP`. This is the maximum address used by BL3-1. + +The current implementation can result in wasted space because a simplified +`meminfo` structure represents the extents of free memory. For example, to load +BL2 at address `0x0402D000`, the resulting memory layout should be as follows: + + ------------ 0x04040000 + | | <- Free space (1) + |----------| + | BL2 | + |----------| BL2_BASE (0x0402D000) + | | <- Free space (2) + |----------| + | BL1 | + ------------ 0x04000000 + +In the current implementation, we need to specify whether BL2 is loaded at the +top or bottom of the free memory. BL2 is top-loaded so in the example above, +the free space (1) above BL2 is hidden, resulting in the following view of +memory: + + ------------ 0x04040000 + | | + | | + | BL2 | + |----------| BL2_BASE (0x0402D000) + | | <- Free space (2) + |----------| + | BL1 | + ------------ 0x04000000 + +BL3-1 is bottom-loaded above BL1.
For example, if BL3-1 is bottom-loaded at +`0x0400E000`, the memory layout should look like this: + + ------------ 0x04040000 + | | + | | + | BL2 | + |----------| BL2_BASE (0x0402D000) + | | <- Free space (2) + | | + |----------| + | | + | BL31 | + |----------| BL31_BASE (0x0400E000) + | | <- Free space (3) + |----------| + | BL1 | + ------------ 0x04000000 + +But the free space (3) between BL1 and BL3-1 is wasted, resulting in the +following view: + + ------------ 0x04040000 + | | + | | + | BL2 | + |----------| BL2_BASE (0x0402D000) + | | <- Free space (2) + | | + |----------| + | | + | | + | BL31 | BL31_BASE (0x0400E000) + | | + |----------| + | BL1 | + ------------ 0x04000000 + + +### Code Structure ### + +Trusted Firmware code is logically divided between the three boot loader +stages mentioned in the previous sections. The code is also divided into the +following categories (present as directories in the source code): + +* **Architecture specific.** This could be AArch32 or AArch64. +* **Platform specific.** Choice of architecture specific code depends upon + the platform. +* **Common code.** This is platform and architecture agnostic code. +* **Library code.** This code comprises functionality commonly used by all + other code. +* **Stage specific.** Code specific to a boot stage. +* **Drivers.** + +Each boot loader stage uses code from one or more of the above mentioned +categories. Based upon the above, the code layout looks like this: + + Directory Used by BL1? Used by BL2? Used by BL3-1? + bl1 Yes No No + bl2 No Yes No + bl31 No No Yes + arch Yes Yes Yes + plat Yes Yes Yes + drivers Yes No Yes + common Yes Yes Yes + lib Yes Yes Yes + +All assembler files have the `.S` extension. The linker files for each boot +stage have the `.ld.S` extension. These are processed by GCC to create the +resultant `.ld` files used for linking. + +FDTs provide a description of the hardware platform and are used by the Linux +kernel at boot time.
These can be found in the `fdts` directory. + + +4. References +-------------- + +1. Trusted Board Boot Requirements CLIENT PDD (ARM DEN 0006B-5). Available + under NDA through your ARM account representative. + +2. [Power State Coordination Interface PDD (ARM DEN 0022B.b)][PSCI]. + +3. [SMC Calling Convention PDD (ARM DEN 0028A)][SMCCC]. + + +- - - - - - - - - - - - - - - - - - - - - - - - - - + +_Copyright (c) 2013 ARM Ltd. All rights reserved._ + + +[Change Log]: change-log.md + +[Linaro Toolchain]: http://releases.linaro.org/13.09/components/toolchain/binaries/ +[EDK2]: http://sourceforge.net/projects/edk2/files/ARM/aarch64-uefi-rev14582.tgz/download +[DS-5]: http://www.arm.com/products/tools/software-tools/ds-5/index.php +[PSCI]: http://infocenter.arm.com/help/topic/com.arm.doc.den0022b/index.html "Power State Coordination Interface PDD (ARM DEN 0022B.b)" +[SMCCC]: http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html "SMC Calling Convention PDD (ARM DEN 0028A)" |