doc: provide error handling documentation
We don't really have docs on how fatal errors are induced or handled.
Provide some documentation that covers:

- Assertions (runtime and build)
- Kernel panic and oops conditions
- Stack overflows
- Other exceptions
- Exception handling policy

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
parent c311aa4675
commit 4ce988ab43
@@ -115,3 +115,4 @@ These pages cover other kernel services.
    other/ring_buffers.rst
    other/cxx_support.rst
    other/version.rst
+   other/fatal.rst
doc/reference/kernel/other/fatal.rst (normal file, 263 lines)
@@ -0,0 +1,263 @@
.. _fatal:

Fatal Errors
############

Software Errors Triggered in Source Code
****************************************

Zephyr provides several methods for inducing fatal error conditions through
either build-time checks, conditionally compiled assertions, or deliberately
invoked panic or oops conditions.

Runtime Assertions
==================

Zephyr provides some macros to perform runtime assertions which may be
conditionally compiled. Their definitions may be found in
:zephyr_file:`include/sys/__assert.h`.

Assertions are enabled by setting the ``__ASSERT_ON`` preprocessor symbol to a
non-zero value. There are two ways to do this:

- Use the :option:`CONFIG_ASSERT` and :option:`CONFIG_ASSERT_LEVEL` kconfig
  options.
- Add ``-D__ASSERT_ON=<level>`` to the project's CFLAGS, either on the
  build command line or in a CMakeLists.txt.

The ``__ASSERT_ON`` method takes precedence over the kconfig option if both are
used.
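
As a minimal sketch of the kconfig method, an application could enable
assertions from its configuration file. The file name ``prj.conf`` and the
chosen level are assumptions for illustration; only the two option names come
from the text above.

.. code-block:: none

   # Assumed to live in the application's prj.conf
   CONFIG_ASSERT=y
   # Level 2 keeps assertions enabled without the reminder warnings
   CONFIG_ASSERT_LEVEL=2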

Specifying an assertion level of 1 causes the compiler to issue warnings that
the kernel contains debug-type ``__ASSERT()`` statements; this reminder is
issued since assertion code is not normally present in a final product.
Specifying assertion level 2 suppresses these warnings.

Assertions are enabled by default when running Zephyr test cases, as
configured by the :option:`CONFIG_TEST` option.

The policy for what to do when encountering a failed assertion is controlled
by the implementation of :c:func:`assert_post_action`. Zephyr provides
a default implementation with weak linkage which invokes a kernel oops if
the thread that failed the assertion was running in user mode, and a kernel
panic otherwise.

__ASSERT()
----------

The ``__ASSERT()`` macro can be used inside kernel and application code to
perform optional runtime checks which will induce a fatal error if the
check does not pass. The macro takes a string message which will be printed
to provide context to the assertion. In addition, the kernel will print
a text representation of the expression that was evaluated, and the
file and line number where the assertion can be found.

For example:

.. code-block:: c

   __ASSERT(foo == 0xF0CACC1A, "Invalid value of foo, got 0x%x", foo);

If at runtime ``foo`` had some unexpected value, the error produced may
look like the following:

.. code-block:: none

   ASSERTION FAIL [foo == 0xF0CACC1A] @ ZEPHYR_BASE/tests/kernel/fatal/src/main.c:367
   Invalid value of foo, got 0xdeadbeef
   [00:00:00.000,000] <err> os: r0/a1: 0x00000004 r1/a2: 0x0000016f r2/a3: 0x00000000
   [00:00:00.000,000] <err> os: r3/a4: 0x00000000 r12/ip: 0x00000000 r14/lr: 0x00000a6d
   [00:00:00.000,000] <err> os: xpsr: 0x61000000
   [00:00:00.000,000] <err> os: Faulting instruction address (r15/pc): 0x00009fe4
   [00:00:00.000,000] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic
   [00:00:00.000,000] <err> os: Current thread: 0x20000414 (main)
   [00:00:00.000,000] <err> os: Halting system

__ASSERT_EVAL()
---------------

The ``__ASSERT_EVAL()`` macro can also be used inside kernel and application
code, with special semantics for the evaluation of its arguments.

It makes use of the ``__ASSERT()`` macro, but has some extra flexibility. It
allows the developer to specify different actions depending on whether the
``__ASSERT()`` macro is enabled or not. This can be particularly useful to
prevent the compiler from emitting diagnostics (errors, warnings or remarks)
about variables that are assigned a value and used only by ``__ASSERT()``,
and are therefore unused when the ``__ASSERT()`` macro is disabled.

Consider the following example:

.. code-block:: c

   int x;
   x = foo();
   __ASSERT(x != 0, "foo() returned zero!");

If ``__ASSERT()`` is disabled, then ``x`` is assigned a value, but never used.
This type of situation can be resolved using the ``__ASSERT_EVAL()`` macro.

.. code-block:: c

   __ASSERT_EVAL((void) foo(),
                 int x = foo(),
                 x != 0,
                 "foo() returned zero!");

The first parameter tells ``__ASSERT_EVAL()`` what to do if ``__ASSERT()`` is
disabled. The second parameter tells ``__ASSERT_EVAL()`` what to do if
``__ASSERT()`` is enabled. The third and fourth parameters are the parameters
it passes to ``__ASSERT()``.
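
The parameter roles described above can be sketched with stand-in macros that
compile outside of a Zephyr tree. Everything prefixed ``DEMO_`` or ``demo_`` is
hypothetical and is not the kernel's definition; in particular, the stand-in
assertion only counts failures instead of raising a fatal error.

.. code-block:: c

   #include <stdbool.h>
   #include <stdio.h>

   /* Hypothetical stand-in for __ASSERT_ON; define as 0 to model disabled assertions. */
   #ifndef DEMO_ASSERT_ON
   #define DEMO_ASSERT_ON 1
   #endif

   /* Failed checks are counted so that the sketch stays runnable on a host. */
   static int demo_assert_failures;

   #if DEMO_ASSERT_ON
   #define DEMO_ASSERT(test, msg)                                  \
       do {                                                        \
           if (!(test)) {                                          \
               demo_assert_failures++;                             \
               printf("ASSERTION FAIL [%s] %s\n", #test, msg);     \
           }                                                       \
       } while (false)

   /* Enabled: run the second parameter, then perform the check. */
   #define DEMO_ASSERT_EVAL(expr1, expr2, test, msg)   \
       do {                                            \
           expr2;                                      \
           DEMO_ASSERT(test, msg);                     \
       } while (false)
   #else
   #define DEMO_ASSERT(test, msg) do { } while (false)
   /* Disabled: only the first parameter is evaluated, so no unused variable remains. */
   #define DEMO_ASSERT_EVAL(expr1, expr2, test, msg) expr1
   #endif

   static int demo_calls;

   /* Hypothetical function under test; returns a non-zero value. */
   static int demo_foo(void)
   {
       demo_calls++;
       return 42;
   }

   /* demo_foo() is called exactly once in either configuration. */
   static void demo_check(void)
   {
       DEMO_ASSERT_EVAL((void)demo_foo(),
                        int x = demo_foo(),
                        x != 0,
                        "demo_foo() returned zero!");
   }

In this sketch, building with ``-DDEMO_ASSERT_ON=0`` leaves only the
``(void)demo_foo()`` call in ``demo_check()``, which is the situation the first
parameter exists for.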

__ASSERT_NO_MSG()
-----------------

The ``__ASSERT_NO_MSG()`` macro can be used to perform an assertion that
reports the failed test and its location, but lacks additional debugging
information provided to assist the user in diagnosing the problem; its use is
discouraged.

Build Assertions
================

Zephyr provides two macros for performing build-time assertion checks.
These are evaluated completely at compile-time, and are always checked.

BUILD_ASSERT_MSG()
------------------

This has the same semantics as C's ``_Static_assert`` or C++'s
``static_assert``. If the evaluation fails, a build error will be generated by
the compiler. If the compiler supports it, the provided message will be printed
to provide further context.

Unlike ``__ASSERT()``, the message must be a static string, without
:c:func:`printf()`-like format codes or extra arguments.

For example, suppose this check fails:

.. code-block:: c

   BUILD_ASSERT_MSG(FOO == 2000,
                    "Invalid value of FOO");

With GCC, the output resembles:

.. code-block:: none

   tests/kernel/fatal/src/main.c: In function 'test_main':
   include/toolchain/gcc.h:28:37: error: static assertion failed: "Invalid value of FOO"
    #define BUILD_ASSERT_MSG(EXPR, MSG) _Static_assert(EXPR, MSG)
                                        ^~~~~~~~~~~~~~
   tests/kernel/fatal/src/main.c:370:2: note: in expansion of macro 'BUILD_ASSERT_MSG'
    BUILD_ASSERT_MSG(FOO == 2000,
    ^~~~~~~~~~~~~~~~
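
For contrast, the following sketch shows checks that are expected to pass and
therefore produce no compiler output. The macro is redefined locally, mirroring
the definition visible in the GCC output above, only so that the sketch
compiles outside of a Zephyr tree with a C11 compiler. The structure and the
helper function are hypothetical.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Local stand-in so the sketch builds without the Zephyr toolchain headers. */
   #ifndef BUILD_ASSERT_MSG
   #define BUILD_ASSERT_MSG(EXPR, MSG) _Static_assert(EXPR, MSG)
   #endif

   /* Hypothetical wire-format header, used only for illustration. */
   struct demo_header {
       uint8_t type;
       uint8_t flags;
       uint16_t length;
   };

   /* Both conditions are constant expressions, checked entirely at compile time. */
   BUILD_ASSERT_MSG(sizeof(struct demo_header) == 4,
                    "demo_header must be exactly 4 bytes");
   BUILD_ASSERT_MSG(offsetof(struct demo_header, length) == 2,
                    "length must start at byte offset 2");

   /* Returns the size that the assertions above guarantee. */
   static size_t demo_header_size(void)
   {
       return sizeof(struct demo_header);
   }

Because the messages are plain string literals, they satisfy the restriction on
format codes noted above.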

BUILD_ASSERT()
--------------

This works just like ``BUILD_ASSERT_MSG()`` except that no supplemental
message is provided. Like ``__ASSERT_NO_MSG()``, its use is discouraged.

Kernel Oops
===========

A kernel oops is a software-triggered fatal error invoked by
:c:func:`k_oops()`. This should be used to indicate an unrecoverable condition
in application logic.

The fatal error reason code generated will be ``K_ERR_KERNEL_OOPS``.

Kernel Panic
============

A kernel panic is a software-triggered fatal error invoked by
:c:func:`k_panic()`. This should be used to indicate that the Zephyr kernel is
in an unrecoverable state. Implementations of
:c:func:`k_sys_fatal_error_handler()` should not return if the kernel
encounters a panic condition, as the entire system needs to be reset.

Threads running in user mode are not permitted to invoke :c:func:`k_panic()`,
and doing so will generate a kernel oops instead. Otherwise, the fatal error
reason code generated will be ``K_ERR_KERNEL_PANIC``.

Exceptions
**********

Spurious Interrupts
===================

If the CPU receives a hardware interrupt on an interrupt line that has not had
a handler installed with ``IRQ_CONNECT()`` or :c:func:`irq_connect_dynamic()`,
then the kernel will generate a fatal error with the reason code
``K_ERR_SPURIOUS_IRQ``.

Stack Overflows
===============

In the event that a thread pushes more data onto its execution stack than its
stack buffer provides, the kernel may be able to detect this situation and
generate a fatal error with a reason code of ``K_ERR_STACK_CHK_FAIL``.

If a thread is running in user mode, then stack overflows are always caught,
as the thread will simply not have permission to write to adjacent memory
addresses outside of the stack buffer. Because this is enforced by the
memory protection hardware, there is no risk of data corruption to memory
that the thread would not otherwise be able to write to.

If a thread is running in supervisor mode, or if :option:`CONFIG_USERSPACE` is
not enabled, stack overflows may or may not be caught, depending on
configuration. :option:`CONFIG_HW_STACK_PROTECTION` is supported on some
architectures and will catch stack overflows in supervisor mode, including
when handling a system call on behalf of a user thread. Typically this is
implemented via dedicated CPU features, or read-only MMU/MPU guard regions
placed immediately adjacent to the stack buffer. This mechanism detects the
overflow but cannot guarantee against data corruption, so an overflow caught
this way should be treated as a very serious condition impacting the health of
the entire system.

For platforms that lack memory management hardware support,
:option:`CONFIG_STACK_SENTINEL` is a software-only stack overflow detection
feature which periodically checks if a sentinel value at the end of the stack
buffer has been corrupted. It does not require hardware support, but provides
no protection against data corruption. Since the checks are typically done at
interrupt exit, the overflow may be detected a nontrivial amount of time after
the stack actually overflowed.
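
The sentinel idea can be illustrated with a small host-side simulation. This is
a sketch of the general technique, not the kernel's implementation: the
sentinel value, the buffer size, the placement of the sentinel at the lowest
address, and all ``demo_`` names are assumptions made for illustration.

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   /* Hypothetical sentinel pattern and buffer size. */
   #define DEMO_SENTINEL   0xF0F0F0F0u
   #define DEMO_STACK_SIZE 64

   /* Simulated stack buffer. The simulated stack grows downward, so the
    * lowest address is the end that an overflow reaches first.
    */
   struct demo_stack {
       uint8_t mem[DEMO_STACK_SIZE];
   };

   /* Write the sentinel at the lowest addresses of the buffer. */
   static void demo_stack_init(struct demo_stack *s)
   {
       uint32_t sentinel = DEMO_SENTINEL;

       memset(s->mem, 0, sizeof(s->mem));
       memcpy(s->mem, &sentinel, sizeof(sentinel));
   }

   /* Periodic check: returns true while the sentinel is still intact. */
   static bool demo_stack_ok(const struct demo_stack *s)
   {
       uint32_t value;

       memcpy(&value, s->mem, sizeof(value));
       return value == DEMO_SENTINEL;
   }

   /* Simulate a thread using `depth` bytes, growing down from the top. */
   static void demo_stack_use(struct demo_stack *s, size_t depth)
   {
       if (depth > DEMO_STACK_SIZE) {
           depth = DEMO_STACK_SIZE; /* keep the simulation inside the buffer */
       }
       memset(s->mem + DEMO_STACK_SIZE - depth, 0xAA, depth);
   }

The simulation also shows the limitation described above: by the time
``demo_stack_ok()`` reports a problem, the overwrite has already happened, so
the check detects corruption rather than preventing it.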

Finally, Zephyr supports GCC compiler stack canaries via
:option:`CONFIG_STACK_CANARIES`. If enabled, the compiler will insert a canary
value randomly generated at boot into function stack frames, checking that the
canary has not been overwritten at function exit. If the check fails, the
compiler invokes :c:func:`__stack_chk_fail()`, whose Zephyr implementation
triggers a fatal stack overflow error. An error in this case does not indicate
that the entire stack buffer has overflowed, but instead that the current
function stack frame has been corrupted. See the compiler documentation for
more details.

Other Exceptions
================

Any other type of unhandled CPU exception will generate a fatal error with the
reason code ``K_ERR_CPU_EXCEPTION``.

Fatal Error Handling
********************

The policy for what to do when encountering a fatal error is determined by the
implementation of the :c:func:`k_sys_fatal_error_handler()` function. This
function has a default implementation with weak linkage that calls
``LOG_PANIC()`` to dump all pending logging messages and then unconditionally
halts the system with :c:func:`k_fatal_halt()`.

Applications are free to implement their own error handling policy by
overriding the implementation of :c:func:`k_sys_fatal_error_handler()`.
If the implementation returns, the faulting thread will be aborted and
the system will otherwise continue to function. See the documentation for
this function for additional details and constraints.
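
One possible shape for such a policy is sketched below as a pure decision
function, so that it can be compiled and exercised on a host. The enum is a
stand-in named after the reason codes mentioned in this document, and the
policy itself is only an assumption; an application's override of
:c:func:`k_sys_fatal_error_handler()` could consult a function like this to
decide whether to return or to halt.

.. code-block:: c

   #include <stdbool.h>

   /* Stand-in reason codes named after the ones in the text; a real
    * application would use the kernel's own definitions instead.
    */
   enum demo_fatal_reason {
       DEMO_ERR_CPU_EXCEPTION,
       DEMO_ERR_SPURIOUS_IRQ,
       DEMO_ERR_STACK_CHK_FAIL,
       DEMO_ERR_KERNEL_OOPS,
       DEMO_ERR_KERNEL_PANIC,
   };

   /* Hypothetical policy: returns true if the system must halt or reset,
    * false if aborting only the faulting thread is acceptable.
    */
   static bool demo_must_halt(unsigned int reason)
   {
       switch (reason) {
       case DEMO_ERR_KERNEL_OOPS:
           /* Unrecoverable condition in application logic only. */
           return false;
       case DEMO_ERR_KERNEL_PANIC:
       case DEMO_ERR_STACK_CHK_FAIL:
       case DEMO_ERR_SPURIOUS_IRQ:
       case DEMO_ERR_CPU_EXCEPTION:
       default:
           /* Kernel state may be compromised; unknown codes are treated alike. */
           return true;
       }
   }

Treating unknown reason codes as fatal is a deliberately conservative choice in
this sketch; a handler that returns after a panic would contradict the guidance
in the Kernel Panic section.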

API Reference
*************

.. doxygengroup:: fatal_apis
   :project: Zephyr
@@ -4,12 +4,22 @@
  * SPDX-License-Identifier: Apache-2.0
  */
 
+/** @file
+ * @brief Fatal error functions
+ */
+
 #ifndef ZEPHYR_INCLUDE_FATAL_H
 #define ZEPHYR_INCLUDE_FATAL_H
 
 #include <arch/cpu.h>
 #include <toolchain.h>
 
+/**
+ * @defgroup fatal_apis Fatal error APIs
+ * @ingroup kernel_apis
+ * @{
+ */
+
 enum k_fatal_error_reason {
 	/** Generic CPU exception, not covered by other codes */
 	K_ERR_CPU_EXCEPTION,
@@ -88,4 +98,6 @@ void k_sys_fatal_error_handler(unsigned int reason, const z_arch_esf_t *esf);
  */
 void z_fatal_error(unsigned int reason, const z_arch_esf_t *esf);
 
+/** @} */
+
 #endif /* ZEPHYR_INCLUDE_FATAL_H */
@@ -4,60 +4,6 @@
  * SPDX-License-Identifier: Apache-2.0
  */
 
-/**
- * @file
- * @brief Debug aid
- *
- *
- * The __ASSERT() macro can be used inside kernel code.
- *
- * Assertions are enabled by setting the __ASSERT_ON symbol to a non-zero value.
- * There are two ways to do this:
- * a) Use the ASSERT and ASSERT_LEVEL kconfig options
- * b) Add "CFLAGS += -D__ASSERT_ON=<level>" at the end of a project's Makefile
- * The Makefile method takes precedence over the kconfig option if both are
- * used.
- *
- * Specifying an assertion level of 1 causes the compiler to issue warnings that
- * the kernel contains debug-type __ASSERT() statements; this reminder is issued
- * since assertion code is not normally present in a final product. Specifying
- * assertion level 2 suppresses these warnings.
- *
- * The __ASSERT_EVAL() macro can also be used inside kernel code.
- *
- * It makes use of the __ASSERT() macro, but has some extra flexibility. It
- * allows the developer to specify different actions depending whether the
- * __ASSERT() macro is enabled or not. This can be particularly useful to
- * prevent the compiler from generating comments (errors, warnings or remarks)
- * about variables that are only used with __ASSERT() being assigned a value,
- * but otherwise unused when the __ASSERT() macro is disabled.
- *
- * Consider the following example:
- *
- * int x;
- *
- * x = foo ();
- * __ASSERT (x != 0, "foo() returned zero!");
- *
- * If __ASSERT() is disabled, then 'x' is assigned a value, but never used.
- * This type of situation can be resolved using the __ASSERT_EVAL() macro.
- *
- * __ASSERT_EVAL ((void) foo(),
- * int x = foo(),
- * x != 0,
- * "foo() returned zero!");
- *
- * The first parameter tells __ASSERT_EVAL() what to do if __ASSERT() is
- * disabled. The second parameter tells __ASSERT_EVAL() what to do if
- * __ASSERT() is enabled. The third and fourth parameters are the parameters
- * it passes to __ASSERT().
- *
- * The __ASSERT_NO_MSG() macro can be used to perform an assertion that reports
- * the failed test and its location, but lacks additional debugging information
- * provided to assist the user in diagnosing the problem; its use is
- * discouraged.
- */
-
 #ifndef ZEPHYR_INCLUDE_SYS___ASSERT_H_
 #define ZEPHYR_INCLUDE_SYS___ASSERT_H_
 
@@ -117,9 +117,12 @@ config ASSERT
 	default y if TEST
 	help
 	  This enables the __ASSERT() macro in the kernel code. If an assertion
-	  fails, the calling thread is put on an infinite tight loop. Since
-	  enabling this adds a significant footprint, it should only be enabled
-	  in a non-production system.
+	  fails, the policy for what to do is controlled by the implementation
+	  of the assert_post_action() function, which by default will trigger
+	  a fatal error.
+
+	  Disabling this option will cause assertions to compile to nothing,
+	  improving performance and system footprint.
 
 config ASSERT_LEVEL
 	int "__ASSERT() level"