.. _architecture_porting_guide:
|
|
|
|
Architecture Porting Guide
|
|
##########################
|
|
|
|
An architecture port is needed to enable Zephyr to run on an :abbr:`ISA
|
|
(instruction set architecture)` or an :abbr:`ABI (Application Binary
|
|
Interface)` that is not currently supported.
|
|
|
|
The following are examples of ISAs and ABIs that Zephyr supports:
|
|
|
|
* x86_32 ISA with System V ABI
|
|
* ARMv7-M ISA with Thumb2 instruction set and ARM Embedded ABI (aeabi)
|
|
* ARCv2 ISA
|
|
|
|
For information on Kconfig configuration, see
|
|
:ref:`setting_configuration_values`. Architectures use a Kconfig configuration
|
|
scheme similar to boards.
|
|
|
|
An architecture port can be divided into several parts; most are required and
|
|
some are optional:
|
|
|
|
* **The early boot sequence**: each architecture has different steps it must
|
|
take when the CPU comes out of reset (required).
|
|
|
|
* **Interrupt and exception handling**: each architecture handles asynchronous
|
|
and unrequested events in a specific manner (required).
|
|
|
|
* **Thread context switching**: the Zephyr context switch is dependent on the
|
|
ABI and each ISA has a different set of registers to save (required).
|
|
|
|
* **Thread creation and termination**: A thread's initial stack frame is ABI
|
|
and architecture-dependent, and thread abortion possibly as well (required).
|
|
|
|
* **Device drivers**: most often, the system clock timer and the interrupt
|
|
controller are tied to the architecture (some required, some optional).
|
|
|
|
* **Utility libraries**: some common kernel APIs rely on an
|
|
architecture-specific implementation for performance reasons (required).
|
|
|
|
* **CPU idling/power management**: most architectures implement instructions
|
|
for putting the CPU to sleep (partly optional, most likely very desired).
|
|
|
|
* **Fault management**: for implementing architecture-specific debug help and
|
|
handling of fatal errors in threads (partly optional).
|
|
|
|
* **Linker scripts and toolchains**: architecture-specific details will most
|
|
likely be needed in the build system and when linking the image (required).
|
|
|
|
* **Memory Management and Memory Mapping**: for architecture-specific details
|
|
on supporting memory management and memory mapping.
|
|
|
|
* **Stack Objects**: for architecture-specific details on memory protection
|
|
hardware regarding stack objects.
|
|
|
|
* **User Mode Threads**: for supporting threads in user mode.
|
|
|
|
* **GDB Stub**: for supporting GDB stub to enable remote debugging.
|
|
|
|
Early Boot Sequence
|
|
*******************
|
|
|
|
The goal of the early boot sequence is to take the system from the state it is
|
|
after reset to a state where it can run C code and thus the common kernel
|
|
initialization sequence. Most of the time, very few steps are needed, while
|
|
some architectures require a bit more work to be performed.
|
|
|
|
Common steps for all architectures:
|
|
|
|
* Setup an initial stack.
|
|
* If running an :abbr:`XIP (eXecute-In-Place)` kernel, copy initialized data
|
|
from ROM to RAM.
|
|
* If not using an ELF loader, zero the BSS section.
|
|
* Jump to :code:`z_cstart()`, the early kernel initialization
|
|
|
|
* :code:`z_cstart()` is responsible for context switching out of the fake
|
|
context running at startup into the main thread.
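
For illustration only, on many ports these common steps end up in a small C
routine called from the assembly reset vector once an initial stack exists.
This is a hedged sketch for a hypothetical architecture; ``z_data_copy()``,
``z_bss_zero()`` and ``z_cstart()`` come from the kernel's internal headers,
while the entry point name and the rest are placeholders:

.. code-block:: c

   #include <zephyr/toolchain.h>
   #include <kernel_internal.h>

   /* Hypothetical early C entry point, jumped to from the reset vector
    * once an initial stack has been set up in assembly.
    */
   void z_my_arch_prep_c(void)
   {
   #ifdef CONFIG_XIP
           z_data_copy();   /* copy initialized data from ROM to RAM */
   #endif
           z_bss_zero();    /* zero the BSS section */

           /* ...architecture-specific setup (vector table, interrupt
            * controller, ...) goes here...
            */

           z_cstart();      /* hand over to the common kernel init */
           CODE_UNREACHABLE;
   }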
|
|
|
|
Some examples of architecture-specific steps that have to be taken:
|
|
|
|
* If given control in real mode on x86_32, switch to 32-bit protected mode.
|
|
* Setup the segment registers on x86_32 to handle boot loaders that leave them
|
|
in an unknown or broken state.
|
|
* Initialize a board-specific watchdog on Cortex-M3/4.
|
|
* Switch stacks from MSP to PSP on Cortex-M.
|
|
* Use a different approach than calling into _Swap() on Cortex-M to prevent
|
|
race conditions.
|
|
* Setup FIRQ and regular IRQ handling on ARCv2.
|
|
|
|
Interrupt and Exception Handling
|
|
********************************
|
|
|
|
Each architecture defines interrupt and exception handling differently.
|
|
|
|
When a device wants to signal the processor that there is some work to be done
|
|
on its behalf, it raises an interrupt. When a thread does an operation that is
|
|
not handled by the serial flow of the software itself, it raises an exception.
|
|
Both interrupts and exceptions pass control to a handler. The handler is
|
|
known as an :abbr:`ISR (Interrupt Service Routine)` in the case of
|
|
interrupts. The handler performs the work required by the exception or the
|
|
interrupt. For interrupts, that work is device-specific. For exceptions, it
|
|
depends on the exception, but most often the core kernel itself is responsible
|
|
for providing the handler.
|
|
|
|
The kernel has to perform some work in addition to the work the handler itself
|
|
performs. For example:
|
|
|
|
* Prior to handing control to the handler:
|
|
|
|
* Save the currently executing context.
|
|
* Possibly get out of power saving mode, which includes waking up
|
|
devices.
|
|
* Update the kernel uptime if getting out of tickless idle mode.
|
|
|
|
* After getting control back from the handler:
|
|
|
|
* Decide whether to perform a context switch.
|
|
* When performing a context switch, restore the context being context
|
|
switched in.
|
|
|
|
This work is conceptually the same across architectures, but the details are
|
|
completely different:
|
|
|
|
* The registers to save and restore.
|
|
* The processor instructions to perform the work.
|
|
* The numbering of the exceptions.
|
|
* etc.
|
|
|
|
It thus needs an architecture-specific implementation, called the
|
|
interrupt/exception stub.
|
|
|
|
Another issue is that the kernel defines the signature of ISRs as:
|
|
|
|
.. code-block:: C
|
|
|
|
void (*isr)(void *parameter)
|
|
|
|
Architectures do not have a consistent or native way of handling parameters to
|
|
an ISR. As such, there are two commonly used methods for handling the
|
|
parameter.
|
|
|
|
* Using some architecture-defined mechanism, the parameter value is forced into
|
|
the stub. This is commonly found in X86-based architectures.
|
|
|
|
* The parameters to the ISR are inserted and tracked via a separate table
|
|
requiring the architecture to discover at runtime which interrupt is
|
|
executing. A common interrupt handler demuxer is installed for all entries of
|
|
the real interrupt vector table, which then fetches the device's ISR and
|
|
parameter from the separate table. This approach is commonly used in the ARC
|
|
and ARM architectures via the :kconfig:option:`CONFIG_GEN_ISR_TABLES` implementation.
|
|
You can find examples of the stubs by looking at :code:`_interrupt_enter()` in
|
|
x86, :code:`_IntExit()` in ARM, :code:`_isr_wrapper()` in ARM, or the full
|
|
implementation description for ARC in :zephyr_file:`arch/arc/core/isr_wrapper.S`.
|
|
|
|
Each architecture also has to implement primitives for interrupt control:
|
|
|
|
* locking interrupts: :c:macro:`irq_lock()`, :c:macro:`irq_unlock()`.
|
|
* registering interrupts: :c:macro:`IRQ_CONNECT()`.
|
|
* programming the priority, if possible: :c:func:`irq_priority_set`.
|
|
* enabling/disabling interrupts: :c:macro:`irq_enable()`, :c:macro:`irq_disable()`.
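
Once these primitives exist, a typical driver uses them as shown below. The
IRQ number, priority and device data are made-up values for illustration, and
the ISR signature matches the one shown earlier in this section:

.. code-block:: c

   #include <zephyr/irq.h>

   #define MY_DEV_IRQ  24   /* hypothetical interrupt line */
   #define MY_DEV_PRIO 2    /* hypothetical priority */

   static struct my_dev_data {
           int pending;
   } my_dev_data;

   static void my_dev_isr(void *arg)
   {
           struct my_dev_data *data = arg;

           data->pending = 1;   /* acknowledge/handle the device here */
   }

   static void my_dev_irq_config(void)
   {
           /* Connected at build time via the architecture's IRQ_CONNECT
            * implementation, then unmasked at the interrupt controller.
            */
           IRQ_CONNECT(MY_DEV_IRQ, MY_DEV_PRIO, my_dev_isr, &my_dev_data, 0);
           irq_enable(MY_DEV_IRQ);
   }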
|
|
|
|
.. note::
|
|
|
|
:c:macro:`IRQ_CONNECT` is a macro that uses assembler and/or linker script
|
|
tricks to connect interrupts at build time, saving boot time and text size.
|
|
|
|
The vector table should contain a handler for each interrupt and exception that
|
|
can possibly occur. The handler can be as simple as a spinning loop. However,
|
|
we strongly suggest that handlers at least print some debug information. The
|
|
information helps figuring out what went wrong when hitting an exception that
|
|
is a fault, like divide-by-zero or invalid memory access, or an interrupt that
|
|
is not expected (:dfn:`spurious interrupt`). See the ARM implementation in
|
|
:zephyr_file:`arch/arm/core/cortex_m/fault.c` for an example.
|
|
|
|
Thread Context Switching
|
|
************************
|
|
|
|
Multi-threading is the basic reason to have a kernel at all. Zephyr supports
|
|
two types of threads: preemptible and cooperative.
|
|
|
|
Two crucial concepts when writing an architecture port are the following:
|
|
|
|
* Cooperative threads run at a higher priority than preemptible ones, and
|
|
always preempt them.
|
|
|
|
* After handling an interrupt, if a cooperative thread was interrupted, the
|
|
kernel always goes back to running that thread, since it is not preemptible.
|
|
|
|
A context switch can happen in several circumstances:
|
|
|
|
* When a thread executes a blocking operation, such as taking a semaphore that
|
|
is currently unavailable.
|
|
|
|
* When a preemptible thread unblocks a thread of higher priority by releasing
|
|
the object on which it was blocked.
|
|
|
|
* When an interrupt unblocks a thread of higher priority than the one currently
|
|
executing, if the currently executing thread is preemptible.
|
|
|
|
* When a thread runs to completion.
|
|
|
|
* When a thread causes a fatal exception and is removed from the running
|
|
threads, for example by referencing invalid memory.
|
|
|
|
The context switching code must therefore be able to handle all these cases.
|
|
|
|
The kernel keeps the next thread to run in a "cache", and thus the context
|
|
switching code only has to fetch from that cache to select which thread to run.
|
|
|
|
There are two types of context switches: :dfn:`cooperative` and :dfn:`preemptive`.
|
|
|
|
* A *cooperative* context switch happens when a thread willfully gives
|
|
control to another thread. There are two cases where this happens:
|
|
|
|
* When a thread explicitly yields.
|
|
* When a thread tries to take an object that is currently unavailable and is
|
|
willing to wait until the object becomes available.
|
|
|
|
* A *preemptive* context switch happens either because an ISR or a
|
|
thread causes an operation that schedules a thread of higher priority than the
|
|
one currently running, if the currently running thread is preemptible.
|
|
An example of such an operation is releasing an object on which the thread
|
|
of higher priority was waiting.
|
|
|
|
.. note::
|
|
|
|
Control is never taken from a cooperative thread while it is the
|
|
running thread.
|
|
|
|
A cooperative context switch is always done by having a thread call the
|
|
:code:`_Swap()` kernel internal symbol. When :code:`_Swap` is called, the
|
|
kernel logic knows that a context switch has to happen: :code:`_Swap` does not
|
|
check to see if a context switch must happen. Rather, :code:`_Swap` decides
|
|
what thread to context switch in. :code:`_Swap` is called by the kernel logic
|
|
when an object being operated on is unavailable, and by some thread
|
|
yielding/sleeping primitives.
|
|
|
|
.. note::
|
|
|
|
On x86 and Nios2, :code:`_Swap` is generic enough and the architecture
|
|
flexible enough that :code:`_Swap` can be called when exiting an interrupt
|
|
to provoke the context switch. This should not be taken as a rule, since
|
|
neither the ARM Cortex-M nor the ARCv2 port does this.
|
|
|
|
Since :code:`_Swap` is cooperative, the caller-saved registers from the ABI are
|
|
already on the stack. There is no need to save them in the k_thread structure.
|
|
|
|
A context switch can also be performed preemptively. This happens upon exiting
|
|
an ISR, in the kernel interrupt exit stub:
|
|
|
|
* :code:`_interrupt_enter` on x86 after the handler is called.
|
|
* :code:`_IntExit` on ARM.
|
|
* :code:`_firq_exit` and :code:`_rirq_exit` on ARCv2.
|
|
|
|
In this case, the context switch must only be invoked when the interrupted
|
|
thread was preemptible, not when it was a cooperative one, and only when the
|
|
current interrupt is not nested.
|
|
|
|
The kernel also has the concept of "locking the scheduler". This is a concept
|
|
similar to locking the interrupts, but lighter-weight since interrupts can
|
|
still occur. If a thread has locked the scheduler, it is temporarily
|
|
non-preemptible.
|
|
|
|
So, the decision logic to invoke the context switch when exiting an interrupt
|
|
is simple:
|
|
|
|
* If the interrupted thread is not preemptible, do not invoke it.
|
|
* Else, fetch the cached thread from the ready queue, and:
|
|
|
|
* If the cached thread is not the current thread, invoke the context switch.
|
|
* Else, do not invoke it.
|
|
|
|
This is simple, but crucial: if this is not implemented correctly, the kernel
|
|
will not function as intended and will experience bizarre crashes, mostly due
|
|
to stack corruption.
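
In C-like pseudo-code the logic is as follows; real ports implement this in
the assembly interrupt exit stub, and the helper names below are illustrative
only, not actual kernel APIs:

.. code-block:: c

   /* Pseudo-code: interrupt exit path, after the ISR has returned. */
   void pseudo_interrupt_exit(void)
   {
           if (interrupts_are_nested() || !thread_is_preemptible(_current)) {
                   return;   /* resume the interrupted thread as-is */
           }

           struct k_thread *cached = fetch_cached_ready_thread();

           if (cached != _current) {
                   context_switch_to(cached);   /* preemptive switch */
           }
           /* else: the interrupted thread is still the best choice */
   }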
|
|
|
|
.. note::
|
|
|
|
If running a coop-only system, i.e. if :kconfig:option:`CONFIG_NUM_PREEMPT_PRIORITIES`
|
|
is 0, no preemptive context switch ever happens. The interrupt code can be
|
|
optimized to not take any scheduling decision when this is the case.
|
|
|
|
Thread Creation and Termination
|
|
*******************************
|
|
|
|
To start a new thread, a stack frame must be constructed so that the context
|
|
switch can pop it the same way it would pop one from a thread that had been
|
|
context switched out. This is to be implemented in an architecture-specific
|
|
:code:`_new_thread` internal routine.
|
|
|
|
The thread entry point is also not to be called directly, i.e. it should not be
|
|
set as the :abbr:`PC (program counter)` for the new thread. Rather it must be
|
|
wrapped in :code:`_thread_entry`. This means that the PC in the stack
|
|
frame shall be set to :code:`_thread_entry`, and the thread entry point shall
|
|
be passed as the first parameter to :code:`_thread_entry`. The specifics of
|
|
this depend on the ABI.
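
As a purely illustrative sketch (the register names, layout and calling
convention below are invented for an imaginary ABI), :code:`_new_thread`
typically lays down a frame such as:

.. code-block:: c

   /* Imaginary ABI: r0-r3 carry the first four arguments, and the
    * context switch code pops 'pc' last to start executing the thread.
    */
   struct init_stack_frame {
           uint32_t r0;      /* thread entry point: 1st arg to _thread_entry */
           uint32_t r1;      /* entry point's own 1st parameter */
           uint32_t r2;      /* entry point's own 2nd parameter */
           uint32_t r3;      /* entry point's own 3rd parameter */
           uint32_t pc;      /* _thread_entry, never the entry point itself */
           uint32_t status;  /* initial status register: interrupts enabled */
   };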
|
|
|
|
The need for an architecture-specific thread termination implementation depends
|
|
on the architecture. There is a generic implementation, but it might not work
|
|
for a given architecture.
|
|
|
|
One reason that has been encountered for having an architecture-specific
|
|
implementation of thread termination is that aborting a thread might be
|
|
different if aborting because of a graceful exit or because of an exception.
|
|
This is the case for ARM Cortex-M, where the CPU has to be taken out of handler
|
|
mode if the thread triggered a fatal exception, but not if the thread
|
|
gracefully exits its entry point function.
|
|
|
|
This means implementing an architecture-specific version of
|
|
:c:func:`k_thread_abort`, and setting the Kconfig option
|
|
:kconfig:option:`CONFIG_ARCH_HAS_THREAD_ABORT` as needed for the architecture (e.g. see
|
|
:zephyr_file:`arch/arm/core/cortex_m/Kconfig`).
|
|
|
|
Thread Local Storage
|
|
********************
|
|
|
|
To enable thread local storage on a new architecture:
|
|
|
|
#. Implement :c:func:`arch_tls_stack_setup` to set up the TLS storage area in
|
|
the stack (a sketch follows this list). Refer to the toolchain documentation on how the storage area needs
|
|
to be structured. Some helper functions can be used:
|
|
|
|
* Function :c:func:`z_tls_data_size` returns the size
|
|
needed for thread local variables (excluding any extra data required by
|
|
toolchain and architecture).
|
|
* Function :c:func:`z_tls_copy` prepares the TLS storage area for
|
|
thread local variables. This only copies the variables themselves and
|
|
does not set up any architecture- and/or toolchain-specific data.
|
|
|
|
#. In the context switching, grab the ``tls`` field inside the new thread's
|
|
``struct k_thread`` and put it into an appropriate register (or some
|
|
other variable) for access to the TLS storage area. Refer to toolchain
|
|
and architecture documentation on which registers to use.
|
|
#. In Kconfig, add ``select ARCH_HAS_THREAD_LOCAL_STORAGE`` to the
|
|
Kconfig related to the new architecture.
|
|
#. Run the ``tests/kernel/threads/tls`` test to make sure the new code works.
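
A minimal sketch of step 1 follows, assuming a downward-growing stack and a
toolchain whose TLS area can simply sit at the top of the stack; real
implementations must follow the toolchain's TLS variant, which usually needs
extra space and alignment beyond :c:func:`z_tls_data_size`:

.. code-block:: c

   #include <zephyr/sys/util.h>
   #include <kernel_internal.h>

   size_t arch_tls_stack_setup(struct k_thread *new_thread, char *stack_ptr)
   {
           size_t tls_size = z_tls_data_size();

           /* Carve the TLS area out of the top of the stack. */
           stack_ptr -= tls_size;

           /* Copy the initialization image of the thread-local variables. */
           z_tls_copy(stack_ptr);

           /* The context switch code later loads this into whatever
            * register the toolchain uses as the TLS pointer.
            */
           new_thread->tls = POINTER_TO_UINT(stack_ptr);

           return tls_size;
   }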
|
|
|
|
Device Drivers
|
|
**************
|
|
|
|
The kernel requires very few hardware devices to function. In theory, the only
|
|
required device is the interrupt controller, since the kernel can run without a
|
|
system clock. In practice, to get access to most, if not all, of the sanity
|
|
check test suite, a system clock is needed as well. Since these two are usually
|
|
tied to the architecture, they are part of the architecture port.
|
|
|
|
Interrupt Controllers
|
|
=====================
|
|
|
|
There can be significant differences between the interrupt controllers and the
|
|
interrupt concepts across architectures.
|
|
|
|
For example, x86 has the concept of an :abbr:`IDT (Interrupt Descriptor Table)`
|
|
and different interrupt controllers. The position of an interrupt in the IDT
|
|
determines its priority.
|
|
|
|
On the other hand, the ARM Cortex-M has the :abbr:`NVIC (Nested Vectored
|
|
Interrupt Controller)` as part of the architecture definition. There is no need
|
|
for an IDT-like table that is separate from the NVIC vector table. The position
|
|
in the table has nothing to do with priority of an IRQ: priorities are
|
|
programmable per-entry.
|
|
|
|
The ARCv2 has its interrupt unit as part of the architecture definition, which
|
|
is somewhat similar to the NVIC. However, where ARC defines interrupts as
|
|
having a one-to-one mapping between exception and interrupt numbers (i.e.
|
|
exception 1 is IRQ1, and device IRQs start at 16), ARM has IRQ0 being
|
|
equivalent to exception 16 (and weirdly enough, exception 1 can be seen as
|
|
IRQ-15).
|
|
|
|
All these differences mean that very little, if anything, can be shared between
|
|
architectures with regards to interrupt controllers.
|
|
|
|
System Clock
|
|
============
|
|
|
|
x86 has APIC timers and the HPET as part of its architecture definition. ARM
|
|
Cortex-M has the SYSTICK exception. Finally, ARCv2 has the timer0/1 device.
|
|
|
|
Kernel timeouts are handled in the context of the system clock timer driver's
|
|
interrupt handler.
|
|
|
|
|
|
Console Over Serial Line
|
|
========================
|
|
|
|
There is one other device that is almost a requirement for an architecture
|
|
port, since it is so useful for debugging. It is a simple polling, output-only,
|
|
serial port driver on which to send the console (:code:`printk`,
|
|
:code:`printf`) output.
|
|
|
|
It is not required, and a RAM console (:kconfig:option:`CONFIG_RAM_CONSOLE`)
|
|
can be used to send all output to a circular buffer that can be read
|
|
by a debugger instead.
|
|
|
|
Utility Libraries
|
|
*****************
|
|
|
|
The kernel depends on a few functions that can be implemented with very few
|
|
instructions or in a lock-less manner in modern processors. Those are thus
|
|
expected to be implemented as part of an architecture port.
|
|
|
|
* Atomic operators.
|
|
|
|
* If instructions do exist for a given architecture, the implementation is
|
|
configured using the :kconfig:option:`CONFIG_ATOMIC_OPERATIONS_ARCH` Kconfig
|
|
option.
|
|
|
|
* If instructions do not exist for a given architecture,
|
|
a generic version that wraps :c:func:`irq_lock` and :c:func:`irq_unlock`
|
|
around non-atomic operations exists. It is configured using the
|
|
:kconfig:option:`CONFIG_ATOMIC_OPERATIONS_C` Kconfig option.
|
|
|
|
* Find-least-significant-bit-set and find-most-significant-bit-set.
|
|
|
|
* If instructions do not exist for a given architecture, it is always
|
|
possible to implement these functions as generic C functions.
|
|
|
|
It is possible to use compiler built-ins to implement these, but be careful that
|
|
they use the required compiler barriers.
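
For instance, a port with no dedicated instructions might simply provide the
following on top of the GCC/Clang built-ins; Zephyr's convention is a 1-based
bit position, with 0 meaning no bit is set:

.. code-block:: c

   #include <zephyr/types.h>

   /* 1-based bit position of the most significant bit set, 0 if none. */
   static inline unsigned int find_msb_set(uint32_t op)
   {
           if (op == 0) {
                   return 0;
           }
           return 32 - __builtin_clz(op);
   }

   /* 1-based bit position of the least significant bit set, 0 if none. */
   static inline unsigned int find_lsb_set(uint32_t op)
   {
           return __builtin_ffs(op);
   }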
|
|
|
|
CPU Idling/Power Management
|
|
***************************
|
|
|
|
The kernel provides support for CPU power management with two functions:
|
|
:c:func:`arch_cpu_idle` and :c:func:`arch_cpu_atomic_idle`.
|
|
|
|
:c:func:`arch_cpu_idle` can be as simple as calling the power saving
|
|
instruction for the architecture with interrupts unlocked, for example
|
|
:code:`hlt` on x86, :code:`wfi` or :code:`wfe` on ARM, :code:`sleep` on ARC.
|
|
This function can be called in a loop within a context that does not care if it
|
|
gets interrupted or not by an interrupt before going to sleep. There are
|
|
basically two scenarios when it is correct to use this function:
|
|
|
|
* In a single-threaded system, in the only thread, when that thread is not used
|
|
for doing real work after initialization, i.e. it is sitting in a loop doing
|
|
nothing for the duration of the application.
|
|
|
|
* In the idle thread.
|
|
|
|
:c:func:`arch_cpu_atomic_idle`, on the other hand, must be able to atomically
|
|
re-enable interrupts and invoke the power saving instruction. It can thus be
|
|
used in real application code, again in single-threaded systems.
|
|
|
|
Normally, idling the CPU should be left to the idle thread, but in some very
|
|
special scenarios, these APIs can be used by applications.
|
|
|
|
Both functions must exist for a given architecture. However, the implementation
|
|
can be simply the following steps, if desired:
|
|
|
|
#. unlock interrupts
|
|
#. NOP
|
|
|
|
However, a real implementation is strongly recommended.
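
In code, that placeholder fallback amounts to something like the sketch below;
it assumes the architecture treats ``0`` as the "interrupts unlocked" key and
uses a plain ``nop`` where a real port would use its sleep instruction:

.. code-block:: c

   #include <zephyr/irq.h>

   /* Placeholder implementations only: functional, but they never
    * actually put the CPU into a low power state.
    */
   void arch_cpu_idle(void)
   {
           irq_unlock(0);                /* assumed "unlocked" key value */
           __asm__ volatile ("nop");
   }

   void arch_cpu_atomic_idle(unsigned int key)
   {
           irq_unlock(key);              /* not atomic: placeholder only */
           __asm__ volatile ("nop");
   }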
|
|
|
|
Fault Management
|
|
****************
|
|
|
|
In the event of an unhandled CPU exception, the architecture
|
|
code must call into :c:func:`z_fatal_error`. This function dumps
|
|
out architecture-agnostic information and makes a policy
|
|
decision on what to do next by invoking :c:func:`k_sys_fatal_error_handler`.
|
|
This function can be overridden to implement application-specific
|
|
policies that could include locking interrupts and spinning forever
|
|
(the default implementation) or even powering off the
|
|
system (if supported).
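
For reference, an application can override the default policy with something
along these lines; the exact type of the exception stack frame parameter
differs between Zephyr versions, so treat this as a sketch:

.. code-block:: c

   #include <zephyr/fatal.h>
   #include <zephyr/sys/printk.h>
   #include <zephyr/sys/reboot.h>

   /* Application policy override: log the reason and reboot instead of
    * spinning forever (the default).
    */
   void k_sys_fatal_error_handler(unsigned int reason, const struct arch_esf *esf)
   {
           ARG_UNUSED(esf);

           printk("Fatal error %u, rebooting\n", reason);
           sys_reboot(SYS_REBOOT_COLD);
   }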
|
|
|
|
Toolchain and Linking
|
|
*********************
|
|
|
|
Toolchain support has to be added to the build system.
|
|
|
|
Some architecture-specific definitions are needed in :zephyr_file:`include/zephyr/toolchain/gcc.h`.
|
|
See what exists in that file for currently supported architectures.
|
|
|
|
Each architecture also needs its own linker script, even if most sections can
|
|
be derived from the linker scripts of other architectures. Some sections might
|
|
be specific to the new architecture, for example the SCB section on ARM and the
|
|
IDT section on x86.
|
|
|
|
Memory Management and Memory Mapping
|
|
************************************
|
|
|
|
If the target platform enables paging and requires drivers to memory-map
|
|
their I/O regions, :kconfig:option:`CONFIG_MMU` needs to be enabled and the
|
|
following API implemented:
|
|
|
|
- :c:func:`arch_mem_map`
|
|
- :c:func:`arch_mem_unmap`
|
|
- :c:func:`arch_page_phys_get`
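
A skeleton of the first of these is shown below for a hypothetical
architecture; the page-table helper is invented for the example, and only the
overall shape (walk the region page by page and program the translation) is
the point:

.. code-block:: c

   #include <zephyr/kernel.h>

   void arch_mem_map(void *virt, uintptr_t phys, size_t size, uint32_t flags)
   {
           uint8_t *va = virt;

           for (size_t offset = 0; offset < size;
                offset += CONFIG_MMU_PAGE_SIZE) {
                   /* my_arch_set_pte() is a made-up helper that programs
                    * one page table entry with the K_MEM_* flags given.
                    */
                   my_arch_set_pte(va + offset, phys + offset, flags);
           }
   }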
|
|
|
|
Stack Objects
|
|
*************
|
|
|
|
The presence of memory protection hardware affects how stack objects are
|
|
created. All architecture ports must specify the required alignment of the
|
|
stack pointer, which is some combination of CPU and ABI requirements. This
|
|
is defined in architecture headers with :c:macro:`ARCH_STACK_PTR_ALIGN` and
|
|
is typically something small like 4, 8, or 16 bytes.
|
|
|
|
Two types of thread stacks exist:
|
|
|
|
- "kernel" stacks defined with :c:macro:`K_KERNEL_STACK_DEFINE()` and related
|
|
APIs, which can host kernel threads running in supervisor mode or
|
|
used as the stack for interrupt/exception handling. These have significantly
|
|
relaxed alignment requirements and use less reserved data. No memory is
|
|
reserved for privilege elevation stacks.
|
|
|
|
- "thread" stacks which typically use more memory, but are capable of hosting
|
|
threads running in user mode, as well as any use-cases for kernel stacks.
|
|
|
|
If :kconfig:option:`CONFIG_USERSPACE` is not enabled, "thread" and "kernel" stacks are
|
|
equivalent.
|
|
|
|
Additional macros may be defined in the architecture layer to specify
|
|
the alignment of the base of stack objects, any reserved data inside the
|
|
stack object not used for the thread's stack buffer, and how to round up
|
|
stack sizes to support user mode threads. In the absence of definitions
|
|
some defaults are assumed:
|
|
|
|
- :c:macro:`ARCH_KERNEL_STACK_RESERVED`: default no reserved space
|
|
- :c:macro:`ARCH_THREAD_STACK_RESERVED`: default no reserved space
|
|
- :c:macro:`ARCH_KERNEL_STACK_OBJ_ALIGN`: default align to
|
|
:c:macro:`ARCH_STACK_PTR_ALIGN`
|
|
- :c:macro:`ARCH_THREAD_STACK_OBJ_ALIGN`: default align to
|
|
:c:macro:`ARCH_STACK_PTR_ALIGN`
|
|
- :c:macro:`ARCH_THREAD_STACK_SIZE_ALIGN`: default round up to
|
|
:c:macro:`ARCH_STACK_PTR_ALIGN`
|
|
|
|
All stack creation macros are defined in terms of these.
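
For instance, a hypothetical architecture header could override only what it
needs and inherit the remaining defaults (the values are illustrative):

.. code-block:: c

   /* Hypothetical <arch>/thread_stack.h fragment */

   /* Mandatory: stack pointer alignment imposed by the CPU/ABI. */
   #define ARCH_STACK_PTR_ALIGN         8

   /* Optional: reserve room at the start of every thread stack object,
    * e.g. for an MPU guard region.  The other ARCH_*_STACK_* macros
    * keep the defaults listed above.
    */
   #define ARCH_THREAD_STACK_RESERVED   32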
|
|
|
|
Stack objects all have the following layout, with some regions potentially
|
|
zero-sized depending on configuration. There are always two main parts:
|
|
reserved memory at the beginning, and then the stack buffer itself. The
|
|
bounds of some areas can only be determined at runtime in the context of
|
|
its associated thread object. Other areas are entirely computable at build
|
|
time.
|
|
|
|
Some architectures may need to carve out reserved memory at runtime from the
|
|
stack buffer, instead of unconditionally reserving it at build time, or to
|
|
supplement an existing reserved area (as is the case with the ARM FPU).
|
|
Such carve-outs will always be tracked in ``thread.stack_info.start``.
|
|
The region specified by ``thread.stack_info.start`` and
|
|
``thread.stack_info.size`` is always fully accessible by a user mode thread.
|
|
``thread.stack_info.delta`` denotes an offset which can be used to compute
|
|
the initial stack pointer from the very end of the stack object, taking into
|
|
account storage for TLS and ASLR random offsets.
|
|
|
|
.. code-block:: none
|
|
|
|
+---------------------+ <- thread.stack_obj
|
|
| Reserved Memory | } K_(THREAD|KERNEL)_STACK_RESERVED
|
|
+---------------------+
|
|
| Carved-out memory |
|
|
|.....................| <- thread.stack_info.start
|
|
| Unused stack buffer |
|
|
| |
|
|
|.....................| <- thread's current stack pointer
|
|
| Used stack buffer |
|
|
| |
|
|
|.....................| <- Initial stack pointer. Computable
|
|
| ASLR Random offset | with thread.stack_info.delta
|
|
+---------------------| <- thread.userspace_local_data
|
|
| Thread-local data |
|
|
+---------------------+ <- thread.stack_info.start + thread.stack_info.size
|
|
|
|
|
|
At present, Zephyr does not support stacks that grow upward.
|
|
|
|
No Memory Protection
|
|
====================
|
|
|
|
If no memory protection is in use, then the defaults are sufficient.
|
|
|
|
HW-based stack overflow detection
|
|
=================================
|
|
|
|
This option uses hardware features to generate a fatal error if a thread
|
|
in supervisor mode overflows its stack. This is useful for debugging, although
|
|
for a couple of reasons, you can't reliably make any assertions about the state
|
|
of the system after this happens:
|
|
|
|
* The kernel could have been inside a critical section when the overflow
|
|
occurs, leaving important global data structures in a corrupted state.
|
|
|
|
* For systems that implement stack protection using a guard memory region,
|
|
it's possible to overshoot the guard and corrupt adjacent data structures
|
|
before the hardware detects this situation.
|
|
|
|
To enable the :kconfig:option:`CONFIG_HW_STACK_PROTECTION` feature, the system must
|
|
provide some kind of hardware-based stack overflow protection, and enable the
|
|
:kconfig:option:`CONFIG_ARCH_HAS_STACK_PROTECTION` option.
|
|
|
|
Two forms of HW-based stack overflow detection are supported: dedicated
|
|
CPU features for this purpose, or special read-only guard regions immediately
|
|
preceding stack buffers.
|
|
|
|
:kconfig:option:`CONFIG_HW_STACK_PROTECTION` only catches stack overflows for
|
|
supervisor threads. This is not required to catch stack overflow from user
|
|
threads; :kconfig:option:`CONFIG_USERSPACE` is orthogonal.
|
|
|
|
This feature only detects supervisor mode stack overflows, including stack
|
|
overflows when handling system calls. It doesn't guarantee that the kernel has
|
|
not been corrupted. Any stack overflow in supervisor mode should be treated as
|
|
a fatal error, with no assertions about the integrity of the overall system
|
|
possible.
|
|
|
|
Stack overflows in user mode are recoverable (from the kernel's perspective)
|
|
and require no special configuration; :kconfig:option:`CONFIG_HW_STACK_PROTECTION`
|
|
only applies to catching overflows when the CPU is in supervisor mode.
|
|
|
|
CPU-based stack overflow detection
|
|
----------------------------------
|
|
|
|
If we are detecting stack overflows in supervisor mode via special CPU
|
|
registers (like ARM's SPLIM), then the defaults are sufficient.
|
|
|
|
|
|
|
|
Guard-based stack overflow detection
|
|
------------------------------------
|
|
|
|
We are detecting supervisor mode stack overflows via a special memory protection
|
|
region located immediately before the stack buffer that generates an exception
|
|
on write. Reserved memory will be used for the guard region.
|
|
|
|
:c:macro:`ARCH_KERNEL_STACK_RESERVED` should be defined to the minimum size
|
|
of a memory protection region. On most ARM CPUs this is 32 bytes.
|
|
:c:macro:`ARCH_KERNEL_STACK_OBJ_ALIGN` should also be set to the required
|
|
alignment for this region.
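
On a hypothetical MPU with a 32-byte minimum region size and alignment, this
amounts to:

.. code-block:: c

   /* One minimally-sized, minimally-aligned MPU region reserved in
    * front of every kernel stack for the read-only guard.
    */
   #define ARCH_KERNEL_STACK_RESERVED   32
   #define ARCH_KERNEL_STACK_OBJ_ALIGN  32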
|
|
|
|
MMU-based systems should not reserve RAM for the guard region and instead
|
|
simply leave a non-present virtual page below every stack when it is mapped
|
|
into the address space. The stack object will still need to be properly aligned
|
|
and sized to page granularity.
|
|
|
|
.. code-block:: none
|
|
|
|
+-----------------------------+ <- thread.stack_obj
|
|
| Guard reserved memory | } K_KERNEL_STACK_RESERVED
|
|
+-----------------------------+
|
|
| Guard carve-out |
|
|
|.............................| <- thread.stack_info.start
|
|
| Stack buffer |
|
|
. .
|
|
|
|
Guard carve-outs for kernel stacks are uncommon and should be avoided if
|
|
possible. They tend to be needed for two situations:
|
|
|
|
* The same stack may be re-purposed to host a user thread, in which case
|
|
the guard is unnecessary and shouldn't be unconditionally reserved.
|
|
This is the case when privilege elevation stacks are not inside the stack
|
|
object.
|
|
|
|
* The required guard size is variable and depends on context. For example, some
|
|
ARM CPUs have lazy floating point stacking during exceptions and may
|
|
decrement the stack pointer by a large value without writing anything,
|
|
completely overshooting a minimally-sized guard and corrupting adjacent
|
|
memory. Rather than unconditionally reserving a larger guard, the extra
|
|
memory is carved out if the thread uses floating point.
|
|
|
|
User mode enabled
|
|
=================
|
|
|
|
Enabling user mode activates two new requirements:
|
|
|
|
* A separate fixed-sized privilege mode stack, specified by
|
|
:kconfig:option:`CONFIG_PRIVILEGED_STACK_SIZE`, must be allocated that the user
|
|
thread cannot access. It is used as the stack by the kernel when handling
|
|
system calls. If stack guards are implemented, a stack guard region must
|
|
be able to be placed before it, with support for carve-outs if necessary.
|
|
|
|
* The memory protection hardware must be able to program a region that exactly
|
|
covers the thread's stack buffer, tracked in ``thread.stack_info``. This
|
|
implies that :c:macro:`ARCH_THREAD_STACK_SIZE_ADJUST()` will need to round
|
|
up the requested stack size so that a region may cover it, and that
|
|
:c:macro:`ARCH_THREAD_STACK_OBJ_ALIGN()` is also specified per the
|
|
granularity of the memory protection hardware.
|
|
|
|
This becomes more complicated if the memory protection hardware requires that
|
|
all memory regions be sized to a power of two, and aligned to their own size.
|
|
This is common on older MPUs and is known with
|
|
:kconfig:option:`CONFIG_MPU_REQUIRES_POWER_OF_TWO_ALIGNMENT`.
|
|
|
|
``thread.stack_info`` always tracks the user-accessible part of the stack
|
|
object; it must always be correct to program a memory protection region with
|
|
user access using the range stored within.
|
|
|
|
Non power-of-two memory region requirements
|
|
-------------------------------------------
|
|
|
|
On systems without power-of-two region requirements, the reserved memory area
|
|
for thread stacks defined by :c:macro:`K_THREAD_STACK_RESERVED` may be used to
|
|
contain the privilege mode stack. The layout could be something like:
|
|
|
|
.. code-block:: none
|
|
|
|
+------------------------------+ <- thread.stack_obj
|
|
| Other platform data |
|
|
+------------------------------+
|
|
| Guard region (if enabled) |
|
|
+------------------------------+
|
|
| Guard carve-out (if needed) |
|
|
|..............................|
|
|
| Privilege elevation stack |
|
|
+------------------------------| <- thread.stack_obj +
|
|
| Stack buffer | K_THREAD_STACK_RESERVED =
|
|
. . thread.stack_info.start
|
|
|
|
The guard region, and any carve-out (if needed) would be configured as a
|
|
read-only region when the thread is created.
|
|
|
|
* If the thread is a supervisor thread, the privilege elevation region is just
|
|
extra stack memory. An overflow will eventually crash into the guard region.
|
|
|
|
* If the thread is running in user mode, a memory protection region will be
|
|
configured to allow user threads access to the stack buffer, but nothing
|
|
before or after it. An overflow in user mode will crash into the privilege
|
|
elevation stack, which the user thread has no access to. An overflow when
|
|
handling a system call will crash into the guard region.
|
|
|
|
On an MMU system there should be no physical guards; the privilege mode stack
|
|
will be mapped into kernel memory, and the stack buffer in the user part of
|
|
memory, each with non-present virtual guard pages below them to catch runtime
|
|
stack overflows.
|
|
|
|
Other platform data may be stored before the guard region, but this is highly
|
|
discouraged if such data could be stored in ``thread.arch`` somewhere.
|
|
|
|
:c:macro:`ARCH_THREAD_STACK_RESERVED` will need to be defined to capture
|
|
the size of the reserved region containing platform data, privilege elevation
|
|
stacks, and guards. It must be appropriately sized such that an MPU region
|
|
to grant user mode access to the stack buffer can be placed immediately
|
|
after it.
|
|
|
|
Power-of-two memory region requirements
|
|
---------------------------------------
|
|
|
|
Thread stack objects must be sized and aligned to the same power of two,
|
|
without any reserved memory to allow efficient packing in memory. Thus,
|
|
any guards in the thread stack must be completely carved out, and the
|
|
privilege elevation stack must be allocated elsewhere.
|
|
|
|
:c:macro:`ARCH_THREAD_STACK_SIZE_ADJUST()` and
|
|
:c:macro:`ARCH_THREAD_STACK_OBJ_ALIGN()` should both be defined to
|
|
:c:macro:`Z_POW2_CEIL()`. :c:macro:`K_THREAD_STACK_RESERVED` must be 0.
|
|
|
|
For the privilege stacks, :kconfig:option:`CONFIG_GEN_PRIV_STACKS` must be
|
|
enabled. For every thread stack found in the system, a corresponding fixed-
|
|
size kernel stack used for handling system calls is generated. The address
|
|
of the privilege stacks can be looked up quickly at runtime based on the
|
|
thread stack address using :c:func:`z_priv_stack_find()`. These stacks are
|
|
laid out the same way as other kernel-only stacks.
|
|
|
|
.. code-block:: none
|
|
|
|
+-----------------------------+ <- z_priv_stack_find(thread.stack_obj)
|
|
| Reserved memory | } K_KERNEL_STACK_RESERVED
|
|
+-----------------------------+
|
|
| Guard carve-out (if needed) |
|
|
|.............................|
|
|
| Privilege elevation stack |
|
|
| |
|
|
+-----------------------------+ <- z_priv_stack_find(thread.stack_obj) +
|
|
K_KERNEL_STACK_RESERVED +
|
|
CONFIG_PRIVILEGED_STACK_SIZE
|
|
|
|
+-----------------------------+ <- thread.stack_obj
|
|
| MPU guard carve-out |
|
|
| (supervisor mode only) |
|
|
|.............................| <- thread.stack_info.start
|
|
| Stack buffer |
|
|
. .
|
|
|
|
The guard carve-out in the thread stack object is only used if the thread is
|
|
running in supervisor mode. If the thread drops to user mode, there is no guard
|
|
and the entire object is used as the stack buffer, with full access granted to the
|
|
associated user mode thread, and ``thread.stack_info`` is updated appropriately.
|
|
|
|
User Mode Threads
|
|
*****************
|
|
|
|
To support user mode threads, several kernel-to-arch APIs need to be
|
|
implemented, and the system must enable the :kconfig:option:`CONFIG_ARCH_HAS_USERSPACE`
|
|
option. Please see the documentation for each of these functions for more
|
|
details:
|
|
|
|
* :c:func:`arch_buffer_validate` to test whether the current thread has
|
|
access permissions to a particular memory region
|
|
|
|
* :c:func:`arch_user_mode_enter` which will irreversibly drop a supervisor
|
|
thread to user mode privileges. The stack must be wiped.
|
|
|
|
* :c:func:`arch_syscall_oops` which generates a kernel oops when system
|
|
call parameters can't be validated, in such a way that the oops appears to be
|
|
generated from where the system call was invoked in the user thread
|
|
|
|
* :c:func:`arch_syscall_invoke0` through
|
|
:c:func:`arch_syscall_invoke6` invoke a system call with the
|
|
appropriate number of arguments which must all be passed in during the
|
|
privilege elevation via registers.
|
|
|
|
* :c:func:`arch_is_user_context` returns nonzero if the CPU is currently
|
|
running in user mode
|
|
|
|
* :c:func:`arch_mem_domain_max_partitions_get` which indicates the max
|
|
number of regions for a memory domain. MMU systems have an unlimited number;
|
|
MPU systems have constraints on this (see the sketch after this list).
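
The following sketch shows, for a hypothetical MPU-based architecture, how the
two simplest of these could look; the status-register helper, the mode bit and
the region counts are all made up for illustration:

.. code-block:: c

   #include <zephyr/kernel.h>

   /* Made-up constants/helpers for this sketch. */
   #define MY_ARCH_NUM_MPU_REGIONS    8
   #define MY_ARCH_NUM_FIXED_REGIONS  3   /* kernel RAM/flash/peripherals */
   #define MY_ARCH_USER_MODE_BIT      BIT(0)
   extern uint32_t my_arch_read_status_register(void);   /* hypothetical */

   int arch_mem_domain_max_partitions_get(void)
   {
           /* Regions left over for application memory domains. */
           return MY_ARCH_NUM_MPU_REGIONS - MY_ARCH_NUM_FIXED_REGIONS;
   }

   bool arch_is_user_context(void)
   {
           /* e.g. read a CPU mode bit, or a flag set on syscall entry. */
           return (my_arch_read_status_register() & MY_ARCH_USER_MODE_BIT) != 0;
   }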
|
|
|
|
Some architectures may need to update software memory management structures
|
|
or modify hardware registers on another CPU when memory domain APIs are invoked.
|
|
If so, :kconfig:option:`CONFIG_ARCH_MEM_DOMAIN_SYNCHRONOUS_API` must be selected by the
|
|
architecture and some additional APIs must be implemented. This is common
|
|
on MMU systems and uncommon on MPU systems:
|
|
|
|
* :c:func:`arch_mem_domain_thread_add`
|
|
|
|
* :c:func:`arch_mem_domain_thread_remove`
|
|
|
|
* :c:func:`arch_mem_domain_partition_add`
|
|
|
|
* :c:func:`arch_mem_domain_partition_remove`
|
|
|
|
Please see the doxygen documentation of these APIs for details.
|
|
|
|
In addition to implementing these APIs, there are some other tasks as well:
|
|
|
|
* :c:func:`_new_thread` needs to spawn threads with :c:macro:`K_USER` in
|
|
user mode
|
|
|
|
* On context switch, the outgoing thread's stack memory should be marked
|
|
inaccessible to user mode by making the appropriate configuration changes in
|
|
the memory management hardware. The incoming thread's stack memory should
|
|
likewise be marked as accessible. This ensures that threads can't mess with
|
|
other thread stacks.
|
|
|
|
* On context switch, the system needs to switch between memory domains for
|
|
the incoming and outgoing threads.
|
|
|
|
* Thread stack areas must include a kernel stack region. This should be
|
|
inaccessible to user threads at all times. This stack will be used when
|
|
system calls are made. This should be fixed size for all threads, and must
|
|
be large enough to handle any system call.
|
|
|
|
* A software interrupt or some kind of privilege elevation mechanism needs to
|
|
be established. This is closely tied to how the _arch_syscall_invoke macros
|
|
are implemented. On system call, the appropriate handler function needs to
|
|
be looked up in _k_syscall_table. Bad system call IDs should jump to the
|
|
:c:enum:`K_SYSCALL_BAD` handler. Upon completion of the system call, care
|
|
must be taken not to leak any register state back to user mode.
|
|
|
|
GDB Stub
|
|
********
|
|
|
|
To enable GDB stub for remote debugging on a new architecture:
|
|
|
|
#. Create a new ``gdbstub.h`` header file under appropriate architecture
|
|
include directory (``include/arch/<arch>/gdbstub.h``).
|
|
|
|
* Create a new ``struct gdb_ctx`` as the GDB context.
|
|
|
|
* Must define a member named ``exception`` of type ``unsigned int`` to
|
|
store the GDB exception reason. This value needs to be set before
|
|
entering :c:func:`z_gdb_main_loop`.
|
|
|
|
* The architecture can define as many members as needed for the GDB stub to
|
|
function.
|
|
|
|
* Pointer to this struct needs to be passed to :c:func:`z_gdb_main_loop`,
|
|
where this pointer will be passed to other GDB stub functions.
|
|
|
|
#. Functions for entering and exiting GDB stub main loop.
|
|
|
|
* If the architecture relies on interrupts to service breakpoints,
|
|
interrupt service routines (ISRs) need to be implemented, which
|
|
will serve as the entry point to GDB stub main loop.
|
|
|
|
* These functions need to save and restore context so code execution
|
|
can continue as if no breakpoints have been encountered.
|
|
|
|
* These functions need to call :c:func:`z_gdb_main_loop` after saving
|
|
execution context to go into the GDB stub main loop to receive commands
|
|
from GDB.
|
|
|
|
* Before calling :c:func:`z_gdb_main_loop`, :c:member:`gdb_ctx.exception`
|
|
must be set to specify the exception reason.
|
|
|
|
#. Implement necessary functions to support GDB stub functionality:
|
|
|
|
* :c:func:`arch_gdb_init`
|
|
|
|
* This needs to initialize necessary bits to support GDB stub functionality,
|
|
for example, setting up the GDB context and connecting debug interrupts.
|
|
|
|
* This must stop code execution via an architecture-specific method (e.g.
|
|
raising debug interrupts). This allows GDB to connect during boot.
|
|
|
|
* :c:func:`arch_gdb_continue`
|
|
|
|
* This function is called when GDB sends a ``c`` or ``continue`` command
|
|
to continue code execution.
|
|
|
|
* :c:func:`arch_gdb_step`
|
|
|
|
* This function is called when GDB sends a ``si`` or ``stepi`` command
|
|
to execute one machine instruction, before returning to GDB prompt.
|
|
|
|
* Hardware register read/write functions:
|
|
|
|
* Since the GDB stub is running on the target, manipulation of hardware
  registers needs to be cached to avoid affecting the execution of the
  GDB stub itself. Think of it as context switching, where the execution
  context is changed to the GDB stub: the register values of the running
  thread at the time of the switch need to be stored, and any manipulation
  of register values must only be done on this cached copy. The updated
  values are then written back to the hardware registers before switching
  back to the previously running thread.
|
|
|
|
* :c:func:`arch_gdb_reg_readall`
|
|
|
|
* This collects all hardware register values that would appear in
|
|
a ``g``/``G`` packet, which will be sent back to GDB. The format of
|
|
the G-packet is architecture specific. Consult GDB on what is
|
|
expected.
|
|
|
|
* Note that, for most architectures, a valid G-packet must be returned
|
|
and sent to GDB. If a packet with an incorrect length is sent to
|
|
GDB, GDB will abort the debugging session.
|
|
|
|
* :c:func:`arch_gdb_reg_writeall`
|
|
|
|
* This takes a G-packet sent by GDB and populates the hardware
|
|
registers with values from the G-packet.
|
|
|
|
* :c:func:`arch_gdb_reg_readone`
|
|
|
|
* This reads the value of one hardware register and sends
|
|
the result to GDB.
|
|
|
|
* :c:func:`arch_gdb_reg_writeone`
|
|
|
|
* This writes the value of one hardware register received from GDB.
|
|
|
|
* Breakpoints:
|
|
|
|
* :c:func:`arch_gdb_add_breakpoint` and
|
|
:c:func:`arch_gdb_remove_breakpoint`
|
|
|
|
* GDB may decide to use software breakpoints which modifies
|
|
the memory at the breakpoint locations to replace the instruction
|
|
with software breakpoint or trap instructions. GDB will then
|
|
restore the memory content once execution reaches the breakpoints.
|
|
GDB supports this by default and there is usually no need to
|
|
handle software breakpoints in the architecture code (where
|
|
breakpoint type is ``0``).
|
|
|
|
* Hardware breakpoints (type ``1``) are required if the code is
|
|
in ROM or flash that cannot be modified at runtime. Consult
|
|
the architecture datasheet on how to enable hardware breakpoints.
|
|
|
|
* If hardware breakpoints are not supported by the architecture,
|
|
there is no need to implement these in architecture code.
|
|
GDB will then rely on software breakpoints.
|
|
|
|
#. For architectures where certain memory regions are not accessible,
|
|
an array named :c:var:`gdb_mem_region_array` of type
|
|
:c:struct:`gdb_mem_region` needs to be defined to specify regions
|
|
that are accessible (see the sketch after this list). For each array item:
|
|
|
|
* :c:member:`gdb_mem_region.start` specifies the start of a memory
|
|
region.
|
|
|
|
* :c:member:`gdb_mem_region.end` specifies the end of a memory
|
|
region.
|
|
|
|
* :c:member:`gdb_mem_region.attributes` specifies the permission
|
|
of a memory region.
|
|
|
|
* :c:macro:`GDB_MEM_REGION_RO`: region is read-only.
|
|
|
|
* :c:macro:`GDB_MEM_REGION_RW`: region is read-write.
|
|
|
|
* :c:member:`gdb_mem_region.alignment` specifies read/write alignment
|
|
of a memory region. Use ``0`` if there is no alignment requirement
|
|
and read/write can be done byte-by-byte.
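
For example, a port might describe one read-only flash region and one
read-write RAM region as shown below. The addresses are made up, and the name
of the companion variable holding the number of regions should be checked
against the gdbstub headers of the Zephyr version in use:

.. code-block:: c

   #include <zephyr/debug/gdbstub.h>
   #include <zephyr/sys/util.h>

   const struct gdb_mem_region gdb_mem_region_array[] = {
           {
                   /* Flash: readable, not writable, 4-byte accesses */
                   .start = 0x08000000,
                   .end = 0x08080000,
                   .attributes = GDB_MEM_REGION_RO,
                   .alignment = 4,
           },
           {
                   /* RAM: read-write, byte accesses allowed */
                   .start = 0x20000000,
                   .end = 0x20010000,
                   .attributes = GDB_MEM_REGION_RW,
                   .alignment = 0,
           },
   };

   const size_t gdb_mem_num_regions = ARRAY_SIZE(gdb_mem_region_array);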
|
|
|
|
API Reference
|
|
*************
|
|
|
|
Timing
|
|
======
|
|
|
|
.. doxygengroup:: arch-timing
|
|
|
|
Threads
|
|
=======
|
|
|
|
.. doxygengroup:: arch-threads
|
|
|
|
.. doxygengroup:: arch-tls
|
|
|
|
Power Management
|
|
================
|
|
|
|
.. doxygengroup:: arch-pm
|
|
|
|
Symmetric Multi-Processing
|
|
==========================
|
|
|
|
.. doxygengroup:: arch-smp
|
|
|
|
Interrupts
|
|
==========
|
|
|
|
.. doxygengroup:: arch-irq
|
|
|
|
Userspace
|
|
=========
|
|
|
|
.. doxygengroup:: arch-userspace
|
|
|
|
Memory Management
|
|
=================
|
|
|
|
.. doxygengroup:: arch-mmu
|
|
|
|
Miscellaneous Architecture APIs
|
|
===============================
|
|
|
|
.. doxygengroup:: arch-misc
|
|
|
|
GDB Stub APIs
|
|
=============
|
|
|
|
.. doxygengroup:: arch-gdbstub
|