If CONFIG_XTENSA_RPO_CACHE is not enabled, it can be assumed
that memory is not double-mapped in hardware for cached and
uncached access. So we can mark those regions as cached via
the TLB.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Simply using __data_start and __data_end is not enough, as it
leaves out the kobject regions which are supposed to be near
the .data section. So use _image_ram_start and _image_ram_end
instead to enclose data, bss and the various kobject regions
(among others).
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
During MMU initialization, we clear TLB way 6 to remove all
identity mappings. Depending on the CPU configuration, there is
a certain number of entries per way, so use the number from
core-isa.h to clear enough entries instead of the hard-coded
number 8. Specifying an entry number outside the permitted
range may cause the CPU to react in unpredictable ways, so it
is better to avoid that.
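To illustrate the idea (a sketch only; WAY6_ENTRIES stands in
for the per-way entry count macro from core-isa.h, whose exact
name depends on the core configuration):
    #include <stdint.h>
    /* 512MB pages put the entry index in the top 3 bits of the
     * virtual address; the low bits select way 6
     */
    static inline void clear_way6_sketch(void)
    {
        for (uint32_t i = 0; i < WAY6_ENTRIES; i++) {
            uint32_t entry = (i << 29) | 6U;
            __asm__ volatile ("idtlb %0\n\tiitlb %0" :: "r" (entry));
        }
        __asm__ volatile ("isync");
    }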
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This removes the identity map of the first 512MB in TLB way 6.
Otherwise it would interfere with mapped entries, resulting in
a double-mapped TLB exception.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The MMU needs to be initialized before going into C, so
z_xtensa_mmu_init() is called in crt1.S before the call to
z_cstart(). Note that this is the default case, and crt1.S can
be disabled if the board and SoC desire to do so.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
It turns out not all MMU-enabled Xtensa cores have vaddrstatus,
vaddr0 and vaddr1, and there does not seem to be a way to
determine whether they are available. So remove them from the
exception printout for now.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Refactor the ESP32 target SoCs together with
all related boards. The most notable breaking changes include:
- changing CONFIG_SOC_ESP32* to refer to
the actual SoC line (esp32, esp32s2, esp32s3, esp32c3)
- replacing CONFIG_SOC with CONFIG_SOC_SERIES
- creating CONFIG_SOC_FAMILY_ESP32 to cover all
the ESP32 SoCs across all used architectures
- introducing CONFIG_SOC_PART_NUMBER_* to
provide an SoC model config
- introducing the 'common' folder to hold all
commonly used configs and files
- updating west.yml to reflect the previous changes in the HAL
Signed-off-by: Marek Matej <marek.matej@espressif.com>
xt-clang likes to remove any run of more than 8 consecutive
NOPs. So we need to build the function without optimization to
avoid this behavior and retain all those NOPs.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds a Kconfig option to introduce the Xtensa-specific
arch_spin_relax(), which can do more NOPs. Some Xtensa SoCs may
need more NOPs after failing to lock a spinlock, especially
under SMP. This gives the bus extra time to propagate the RCW
transactions among CPUs.
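Roughly, the shape is as follows (a sketch; the Kconfig count
symbol and the use of the __no_optimization attribute are
assumptions here, not the exact in-tree code):
    /* built without optimization so the compiler keeps every
     * NOP (see the xt-clang note above)
     */
    static void __no_optimization xtensa_spin_relax_sketch(void)
    {
        uint32_t remaining = CONFIG_XTENSA_NUM_SPIN_RELAX_NOPS;
        while (remaining-- > 0U) {
            __asm__ volatile ("nop");
        }
    }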
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Allow builds which have CONFIG_MULTITHREADING disabled.
This reduces the code footprint, which is handy for
constrained targets such as bootloaders.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
This makes MCUboot build as a Zephyr application,
providing an optional second-stage bootloader to the
IDF bootloader, which is used by default.
This gives more flexibility when building
and loading multiple images, and aims to
bring a better developer experience to users via sysbuild.
MCUboot and applications now have separate
linker scripts.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
This adds code to always map the data TLB for VECBASE so that
we deal with fewer data TLB misses during exception handling.
With VECBASE always mapped, there is no need to pre-load it
anymore.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This moves the TLB miss handling to the C exception handler.
This also allows us to handle page faults (for example,
unmapped pages) during this time, as any further exceptions
raised in the C handler will trigger the same C handler rather
than the double exception handler.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Instead of being able to arbitrarily set the PTEVADDR for the
page table, this provides choices (currently just one). This is
in preparation for handling memory management exceptions in C
code. For that to work, we will need to pre-load the page table
address (PTEVADDR) for the memory page containing the exception
code and data (containing jump addresses), and the various
stacks. This is to preempt any TLB misses while handling the
level 1 interrupt code: if a TLB miss is encountered there, we
would be thrown into the double exception handling code, where
we would get stuck in an infinite loop. However, in order to
pre-load the page table entries, PTEVADDR needs to be
calculated. This requires the PTEVADDR base, which cannot be
loaded via l32r, as that may itself cause a data TLB miss. So
we must be able to construct the PTEVADDR base address strictly
within code, without any data load. Hence, change
CONFIG_XTENSA_MMU_PTEVADDR to be based on a choice, so we can
have a pre-defined bit shift value for the shift operation.
This shift value will be used in the exception handling code.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add a build option to tell if memory should be mapped in cached
and uncachedr regions.
If the memory is neither in cached nor uncached region it is not double
mapped.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Initial support for Xtensa MMU version 3. It uses a two-level
page table, based on the fact that the page table resides in
virtual space. Only the top level (the page directory) is wired
into the TLB, to avoid a second-level page miss.
The mapped memory is completely fragmented across multiple
sections; maybe we will find a better way in the future.
The exception handler is where we effectively map the memory.
The way it works is:
1) SW tries to access some memory address.
2) The address is not mapped, so the MMU will try the
auto-refill, looking up the page table.
3) The needed page table entry is itself not mapped (remember,
only the top-level page is mapped).
4) An exception is triggered; in the exception handler we try
to read the portion of the page table that maps the original
address.
5) That address is not mapped either, so the MMU tries the
auto-refill again. This time, though, the address is covered by
the top-level page, which is properly mapped. (The top-level
page maps the page table itself.)
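For reference, with 4KB pages the PTE address the auto-refill
fetches is a simple function of the faulting address (a sketch):
    #include <stdint.h>
    /* the PTE lives at PTEVADDR + (vaddr >> 12) * 4, itself a
     * virtual address -- hence step 4 above when that page of
     * the table is unmapped
     */
    static inline uint32_t pte_vaddr(uint32_t ptevaddr, uint32_t vaddr)
    {
        return ptevaddr + ((vaddr >> 12U) * 4U);
    }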
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
In z_xtensa_backtrace_print, the parameter depth is checked for
<= 0, so there is no need to check it again later. Also, since
the variable is not used after the while loop, we can use the
parameter directly without an additional variable.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The "cross stack call" mechanism has intermediate states where the
stack frames are not valid for our own interrupt entry code, which
causes corruption if an interrupt races at exactly the right time.
Leave interrupts masked until just before the call.
The fix is mildly complicated by the fact that we RELY on
nested window exception frames to spill registers from the
interruptee, so we have to do the masking with PS.INTLEVEL,
which requires a register to save its contents, which we don't
have since everything needs to happen in one 4-register window.
But thankfully our Zephyr-reserved EPS register is guaranteed
to be available through this process.
Fixes #57009
Signed-off-by: Andy Ross <andyross@google.com>
Use the common exit() provided by libc so we get standard behavior
across all architectures. So only implement a special exit when
XT_SIMULATOR is defined.
Signed-off-by: Kumar Gala <kumar.gala@intel.com>
The backtrace requires a valid stack pointer to start
printing backtraces. So if there is no stack pointer
being passed in, skip printing backtraces.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Commit 408472673e added inline
assembly to lock interrupts. However, XCC doesn't like the
syntax using STRINGIFY, nor an empty clobber section. So
parameterize the second argument to rsil, and remove the last
colon.
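The XCC-friendly shape is roughly (a sketch, not the exact
in-tree code):
    #include <xtensa/config/core-isa.h>
    static inline unsigned int lock_interrupts_sketch(void)
    {
        unsigned int key;
        /* level as an immediate operand instead of STRINGIFY
         * pasting, and no trailing empty clobber list
         */
        __asm__ volatile ("rsil %0, %1"
                          : "=r" (key)
                          : "i" (XCHAL_EXCM_LEVEL));
        return key;
    }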
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds some structs for interrupt stack frames to make it
easier to access individual elements and, ultimately, to get
rid of magic array element indices in the code. Hopefully this
will aid debugging, since you can view the whole struct in a
debugger.
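Illustrative only (the field set and names below are
assumptions, not the in-tree layout):
    #include <stdint.h>
    /* named fields replace magic indices such as frame[1], and
     * a debugger can render the whole struct at once
     */
    struct xtensa_irq_frame_sketch {
        uint32_t pc;  /* interrupted program counter */
        uint32_t ps;  /* processor state */
        uint32_t a0;  /* saved GPRs follow */
        uint32_t a2;
        uint32_t a3;
    };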
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
In case of recoverable fatal errors, execution should switch
to another thread. This ensures the current_cpu nested count is
reset when there is a context switch.
Signed-off-by: Aastha Grover <aastha.grover@intel.com>
If running under the Xtensa simulator, it is good to tell the
simulator to stop execution once we reach the double exception,
as the current double exception handler is simply an endless
loop. If we turn on tracing in the simulator, the output file
would contain an infinite iteration of this endless loop, and
the simulator needs to be stopped manually before the file size
goes out of control. So tell the simulator to stop once we
reach this point instead of looping endlessly.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Save the FP user registers and the FP register file during
context switch. This change enables shared FP registers mode
using CONFIG_FPU_SHARING.
Since there is no lazy stacking, the FPU registers will be
saved regardless of whether floating point calculations are
performed in the threads when CONFIG_FPU_SHARING is enabled.
This requires 72 additional bytes in the stack memory.
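The 72 bytes break down as the 16-entry FP register file plus
the two FP user registers (an illustrative struct, not the
in-tree name):
    #include <stdint.h>
    struct xtensa_fp_ctx_sketch {
        uint32_t fcr;       /* FP control register */
        uint32_t fsr;       /* FP status register */
        uint32_t regs[16];  /* f0..f15 */
    };                      /* (2 + 16) * 4 = 72 bytes */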
Signed-off-by: Lucas Tamborrino <lucas.tamborrino@espressif.com>
1. This header is of no use to assembly files.
2. If xclib is used, this header includes xclib's stdbool, and
bool expands to a typedef.
Signed-off-by: honglin leng <a909204013@gmail.com>
Automated change: add <zephyr/irq.h> to files that use the
IRQ_CONNECT() API but were not including it.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
Align the backtrace output with the style used in the rest of
the codebase. This makes it more convenient to compare the
backtrace to e.g. objdump output.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
The Xtensa arch has historically had state/user register accessor
macros with bare three-byte symbol names. I think this might have
been in the original Cadence-contributed arch integration, but I'm not
sure. In any case they also exist under the same names in
vendor HAL/toolchain code and are causing collisions. We never
should have had these symbols exposed in our headers.
Put them under an XTENSA_ prefix to decollide.
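For example, the read accessor ends up looking something like
this (a sketch):
    #include <stdint.h>
    /* was bare RSR(sr), which collided with vendor HAL symbols */
    #define XTENSA_RSR(sr) \
        ({ \
            uint32_t v; \
            __asm__ volatile ("rsr." sr " %0" : "=r" (v)); \
            v; \
        })
    /* usage: uint32_t ps = XTENSA_RSR("ps"); */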
Signed-off-by: Andy Ross <andyross@google.com>
As of today <zephyr/zephyr.h> is 100% equivalent to <zephyr/kernel.h>.
This patch proposes including <zephyr/kernel.h> instead of
<zephyr/zephyr.h>, since it is clearer that you are including the
Kernel APIs and (probably) nothing else. <zephyr/zephyr.h> sounds like a
catch-all header that may be confusing. Most applications need to
include a bunch of other things to compile, e.g. driver headers or
subsystem headers like BT, logging, etc.
The idea of a catch-all header in Zephyr is probably not feasible
anyway. The reason is that Zephyr is not a library, like it could be for
example `libpython`. Zephyr provides many utilities nowadays: a kernel,
drivers, subsystems, etc., and things will likely grow. A catch-all
header would be massive and difficult to keep up-to-date. It is also
likely that an application will only build a small subset. Note that
subsystem-level headers may use a catch-all approach to make things
easier, though.
NOTE: This patch is **NOT** removing the header, just removing its usage
in-tree. I'd advocate for its deprecation (add a #warning on it), but I
understand many people will have concerns.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
Two issues:
- An unnecessary pair of parentheses caused rounding errors (by
truncating a small value before multiplying it).
- arch_timing_cycles_to_ns_avg() wasn't actually converting the
result to nanoseconds.
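A sketch of the corrected ordering (function and parameter
names here are illustrative, not the in-tree signatures):
    #include <stdint.h>
    #define NSEC_PER_SEC 1000000000ULL
    /* multiply before dividing so small cycle counts don't truncate */
    static uint64_t cycles_to_ns(uint64_t cycles, uint64_t cyc_per_sec)
    {
        return (cycles * NSEC_PER_SEC) / cyc_per_sec;
    }
    /* the avg variant must also convert to ns, then divide by count */
    static uint64_t cycles_to_ns_avg(uint64_t cycles, uint32_t count,
                                     uint64_t cyc_per_sec)
    {
        return cycles_to_ns(cycles, cyc_per_sec) / count;
    }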
Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
The simulator seems to drop garbage addresses (somewhere in the ROM it
looks like) into this SR at arbitrary times. I don't know if this is
a hardware exception handler that we can't turn off, or a simulator
bug, or what. But our code that assumes it will be cleared to zero or
valid is breaking. Set it every time in every context switch for now
pending someone figuring out what's going wrong.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Move those defines and values back to headers. Kconfig is not a good
place for this, later this should move to DTS.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Any project with the Kconfig option CONFIG_LEGACY_INCLUDE_PATH
set to n couldn't be built because some files were missing the
zephyr/ prefix in their includes.
Re-run the migrate_includes.py script to fix all legacy include
paths.
Adds compatibility with Intel ADSP GDB from Zephyr SDK and
from Cadence toolchain to coredump_gdbserver.py.
Adds CAVS 15-25 (APL) register definitions. Implements
handle_register_single_read_packet to serve ADSP GDB
p packets.
Prevents the BSA from changing between the stack dump printout
and the coredump by taking a lock. Observed to be necessary for
accurate results on slower simulated platforms.
Signed-off-by: Lauren Murphy <lauren.murphy@intel.com>
Triggers a CPU exception with an illegal instruction when
z_except_reason is called (e.g. in k_panic, k_oops). Creates an
exception stack frame for use by coredump. Adds a unique cause
code for ARCH_EXCEPT. Disables the test case failure for
qemu_xtensa.
Without an ARCH_EXCEPT implementation, z_except_reason calls
z_fatal_error directly with a null ESF and bypasses
xtensa_excint1_c's error logging. An ESF is required for a
coredump.
Signed-off-by: Lauren Murphy <lauren.murphy@intel.com>
This commit corrects all `extern K_KERNEL_STACK_ARRAY_DEFINE` macro
usages to use the `K_KERNEL_STACK_ARRAY_DECLARE` macro instead.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
Expose the Xtensa CCOUNT timing register (the lowest-level CPU
cycle counter) using the arch_timing_*() API.
This is the simplest possible way to get this working. Future work
might focus on moving the rate configuration into devicetree in a
standard way, integrating with the platform clock driver on intel_adsp
such that the reported cycle rate tracks runtime changes (though IIRC
this is not a SOF requirement), and adding better test coverage to the
timing layer, which right now isn't exercised anywhere but in
benchmarks.
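Reading the register is a single instruction (a sketch):
    #include <stdint.h>
    static inline uint32_t ccount_read(void)
    {
        uint32_t cyc;
        __asm__ volatile ("rsr.ccount %0" : "=r" (cyc));
        return cyc;
    }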
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Adds a few missing zephyr/ prefixes to leftover #include
statements that either got added recently or were using the
double-quote format.
Signed-off-by: Fabio Baltieri <fabiobaltieri@google.com>
The xtensa interrupt return path was forgetting to check the nested
interrupt state and calling into the scheduler to select the context
to which to return, which of course is completely wrong. We MUST
return to the ISR we interrupted.
In fact, in practice this was only visible in the case of a
nested interrupt that causes a context switch; otherwise the
"interrupted" argument just gets returned and things work. In
particular, it can happen when the nested context is a fatal
exception that aborts the current thread, which is how this was
discovered. The timing required to see this on live interrupts
in real applications made it extremely difficult to detect.
Fixes #45779
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Assembler files were not migrated with the new <zephyr/...> prefix.
Note that the conversion has been scripted, refer to #45388 for more
details.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
In order to bring consistency in-tree, migrate all arch code to the new
prefix <zephyr/...>. Note that the conversion has been scripted, refer
to zephyrproject-rtos#45388 for more details.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
When building with CONFIG_SCHED_CPU_MASK_PIN_ONLY we can
assume that a thread will always be executed on the same CPU
and consequently skip the cache invalidation.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Making context switch cache-coherent in SMP is hard. The
KERNEL_COHERENCE handling was conservatively invalidating the stack
region of a thread that was being switched in. This was because it
might have (1) run on this CPU in the past, but (2) run most recently
on a different CPU. In that case we might have stale data still in
our local dcache!
But this has performance impact in the (very common!) case of a thread
being switched out briefly and then back in (e.g. k_sleep() for a
small duration). It will come back having lost all of its cached
stack context, and will have to fetch all that information back from
shared SRAM!
Treat this by tracking a "last_cpu" for each thread in the arch part
of the thread struct. If we're coming back to the same CPU we left,
we know we can skip the invalidate.
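In rough terms the switch-in path becomes (a sketch; the struct
and helper names are illustrative):
    struct thread_sketch {
        int last_cpu;            /* CPU this thread last ran on */
        char *stack_start;
        unsigned int stack_size;
    };
    static void switch_in_coherence(struct thread_sketch *t, int this_cpu,
                                    void (*dcache_inv)(void *, unsigned int))
    {
        if (t->last_cpu != this_cpu) {
            /* stale lines possible: we last ran elsewhere */
            dcache_inv(t->stack_start, t->stack_size);
        }
        t->last_cpu = this_cpu;
    }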
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Platform-specific functions necessary to enable this feature
(z_xtensa_ptr_executable() and z_xtensa_stack_ptr_is_sane())
were implemented for Intel ADSP platforms.
The current implementation just ensures the stack pointer and
program counter are within the relevant areas defined in the
linker scripts, without going too fine-grained.
The `.iram1` section, used by the backtrace code, was also
added to the Intel ADSP linker script.
Finally, update the west manifest to use an up-to-date SOF,
which contains a patch to fix build issues related to the
linker changes.
Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
According to Kconfig guidelines, boolean prompts must not start with
"Enable...". The following command has been used to automate the changes
in this patch:
sed -i "s/bool \"[Ee]nables\? \(\w\)/bool \"\U\1/g" **/Kconfig*
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
The Xtensa implementation of arch_irq_offload() required that the user
select the correct interrupt manually, and would race with itself if
invoked from separate CPUs (it was saved here by the main
irq_offload() function which has a semaphore to serialize access).
Use the new gen_zsr.py script to automatically detect the highest
available software interrupt, and keep a per-CPU set of
callback/parameter pointers.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Some XCC toolchains do not provide atexit(), which results in
an undefined reference error. So add a weak dummy atexit() for
this situation.
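A minimal sketch of such a stub:
    /* weak no-op so the link succeeds when the toolchain lacks
     * atexit()
     */
    int __attribute__((weak)) atexit(void (*fn)(void))
    {
        (void)fn;
        return 0;
    }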
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Turns out that xt-xcc will bail when faced with a real core-isa.h (it
wants you to rely on the builtins in the compiler). Undefine __XCC__
to force it to actually parse and emit declarations for its own
header.
(Also adds a newline to the generated one-line C file to silence a warning)
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
We had a similar sequence for interrupt return, where we were
selecting (actually only for the benefit of qemu) the highest
priority EPCn/EPSn registers for our RFI instruction. That
works much better in python than in the preprocessor.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The kernel coherence cache flush code was using a scratch register to
mark the top of the stack. Likewise a good candidate for ZSR use.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This is actually Cadence-authored code, but its use of EXCSAVE1 as a
sideband input to the exception handler is very much in the same
family of tricks. Use ZSR assignments here too.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Zephyr likes to use the various Xtensa scratch registers for its own
purposes in several places. Unfortunately, owing to the
configurability of the architecture, we have to use different
registers for different platforms. This has been done so far with a
collection of different tricks, some... less elegant than others.
Put it all in one place: a python script that emits a "zsr.h"
header with register assignments for all the existing users.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
For functions returning nothing, there is no need to document
them with @return, as Doxygen complains about "documented empty
return type of ...".
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This trick (mapping RAM twice so you can use alternate Region
Protection Option addresses to control cacheability) is
something any Xtensa hardware designer might productively
choose to do. And as it works really well, we should encourage
that by making this a generic architecture feature for Zephyr.
Now everything works by setting two kconfig values at the soc
level defining the cached and uncached regions. As long as
these are correct, you can then use the new
arch_xtensa_un/cached_ptr() APIs to convert between them, and
an ARCH_XTENSA_SET_RPO_TLB() macro that provides much smaller
initialization code (in C!) than the HAL assembly macros. The
conversion routines have been generalized to support conversion
between any two regions.
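The conversion itself is just swapping the 512MB region bits of
the address (a sketch; selecting the region via the top 3
address bits is the RPO scheme, but the helper name here is
illustrative):
    #include <stdint.h>
    static inline void *region_ptr(void *p, uint32_t region)
    {
        return (void *)(((uintptr_t)p & ~(uintptr_t)0xE0000000) |
                        ((uintptr_t)region << 29));
    }
    /* e.g. arch_xtensa_cached_ptr(p) would amount to
     * region_ptr(p, <cached region number>)
     */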
Note that full KERNEL_COHERENCE still requires support from the
platform linker script, that can't be made generic given the way
Zephyr does linkage.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Startup on these devices was sort of a mess, with multiple variants of
Xtensa and platform initialization code from multiple ancestries being
invoked at different places for different purposes. Just use one code
path for everyone.
Bootloader entry starts with a minimal assembly stub that simply sets
WINDOW{START,BASE}, PS and a stack pointer and then jumps to C code.
That then uses the cpu_early_init() implementation from cAVS 2.5's
secondary cores to finish Xtensa initialization, and then flows
directly into the pre-existing bootloader C code to initialize cache
and memory and copy the HP-SRAM image, then it invokes Zephyr via a
simple C function call to z_cstart().
Likewise, remove the "reset vector" from Zephyr. This was never a
reset vector, reset on these devices goes to a fixed address in a ROM.
CPU initialization is handled explicitly and completely in the
bootloader now, in a way that can be unified between the main and
secondary cores. Entry from the bootloader now goes directly into
z_cstart() via a C call (via a single jump instruction placed at the
entry point address -- that's going away soon too once we're using a
unified link).
Now that vector table initialization happens in a uniform way, there's
no need to copy the VECBASE value during arch_start_cpu().
Finally, note that this also reverts the
CONFIG_RESET_VECTOR_IN_BOOTLOADER kconfig variable added for
these platforms, because it's no longer a tunable and is always
true.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Adds Xtensa as supported architecture for coredump. Fixes
a few typos in documentation, Kconfig and a C file. Dumps
minimal set of registers shown by 'info registers' in GDB
for the sample_controller and ESP32 SOCs. Updates tests.
Signed-off-by: Lauren Murphy <lauren.murphy@intel.com>
This adds basic support for GDB stub on Xtensa. Note that
this only provides the common bits on the architecture side.
SoC support is also required to fully enable GDB stub on
each Xtensa SoC.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Call into z_thread_usage_stop() before ISR entry to avoid including
interrupt handling totals in thread usage stats.
Note that this hook is called after the register save and stack
swap have happened, so it still includes some overhead. But
calling out from
the interrupted stack on Xtensa gets really, really hairy due to the
weird intermediate states we leverage (once we've saved enough context
to make a C call safely, we've lost the ability to use register
windows per the C ABI!).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This reverts commit 67d290540e.
The script is actually used to generate the _soc_inthandlers.h
file when introducing a new Xtensa SoC. So restore it.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Some Xtensa SoCs may not have that many levels of interrupts.
So limit the call to DEF_INT_C_HANDLER() to only supported
levels to avoid calling non-existent functions.
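A sketch of the guard, keyed off the level count in core-isa.h:
    /* only emit handlers for levels this core actually
     * implements
     */
    #if XCHAL_NUM_INTLEVELS >= 2
    DEF_INT_C_HANDLER(2)
    #endif
    #if XCHAL_NUM_INTLEVELS >= 3
    DEF_INT_C_HANDLER(3)
    #endif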
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
For some platforms, like NXP's IMX8 or Mediatek's MT8195,
the size of an interrupt vector table entry is 0x1C bytes,
less than usual (0x30 for Intel's platforms).
So, the interrupt handlers don't fit in the vector table
entries.
I've added a small indirection to bypass this size
constraint and moved the default handlers to the end
of the vector table, renaming them to
_Level\LVL\()VectorHelper.
For this, I've added a generic configuration -
XTENSA_SMALL_VECTOR_TABLE_ENTRY.
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
A simple WAITI isn't sufficient in all cases. The cAVS 2.5
hardware uses WAITI as the entry state for per-core power
gating, which is very difficult to debug. Provide a fallback
that simply spins in the idle loop waiting for interrupts, to
give a stable system while this feature stabilizes.
Also, the SOF code for those platforms references a known bug
with the Xtensa LX6 core IP (or at least some versions), and
will prefix the WAITI instruction with 128 NOP.N's followed by
an ISYNC and EXTW. This bug hasn't been seen under Zephyr yet,
and details are sketchy. But the code is simple enough to
import and works correctly.
Place both workarounds under new kconfig variables and select
them both (even though they're actually mutually exclusive --
if you select both, CPU_IDLE_SPIN overrides) for cavs_v25.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
On CPU startup, when we reach the cache flush code in arch_switch(),
the outgoing thread is a dummy. The behavior of the existing code was
to leave the existing value in the SR unchanged (probably NULL at
startup). Then the context switch would walk from that address up to
the top of the outgoing stack, flushing everything in between. That's
wrong, because the outgoing stack is a real pointer (generally the
interrupt stack of the current CPU), and we're flushing everything in
memory underneath it.
This also reverts commit 29abc8adc0 ("xtensa: fix booting secondary
cores on the dummy thread"), which appears to have been an early
attempt to address this issue. It worked (modulo all the extra and
potentially incorrect flushing) on cavs v1.5/1.8 because of the way
the entry code worked there. But on 2.5 we now hit the first context
switch in a case where those extra lines are in address space already
marked unwritable by the CPU, so the flush explodes.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
For IMX, for the timer interrupt, the interrupt handler
executed was not the correct one, and that's because
the handlers were not at the expected addresses.
For IMX the size constraint of the interrupt vector
table entry is 0x1C bytes of code, less than usual.
I've added a small indirection to bypass this size
constraint and moved the default handlers to the end
of vector table, renaming them to
_Level\LVL\()VectorHelper.
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
When secondary cores are booted, they use the dummy thread and
the IRQ stack until they switch over to a real thread.
Therefore dummy threads shouldn't be skipped when cohering the
outgoing thread stack; only threads with a zero stack size
should be skipped.
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Both operands of an operator in which the usual arithmetic
conversions are performed shall have the same essential
type category.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
When we reach this code in interrupt context, our upper GPRs contain a
cross-stack call that may still include some registers from the
interrupted thread. Those need to go out to memory before we can do
our cache coherence dance here.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Both new thread creation and context switch had the same mistake in
cache management: the bottom of the stack (the "unused" region between
the lower memory bound and the live stack pointer) needs to be
invalidated before we switch, because otherwise any dirty lines we
might have left over can get flushed out on top of the same thread on
another CPU that is putting live data there.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The Xtensa L1 cache layer has straightforward semantics
accessible via single instructions that operate on cache lines
via physical addresses. These are very amenable to inlining.
Unfortunately the Xtensa HAL layer requires function calls to do this,
leading to significant code waste at the calling site, an extra frame
on the stack and needless runtime instructions for situations where
the call is over a constant region that could elide the loop. This is
made even worse because the HAL library is not built with
-ffunction-sections, so pulling in even one of these tiny cache
functions has the effect of importing a 1500-byte object file into the
link!
Add our own tiny cache layer to include/arch/xtensa/cache.h and use
that instead.
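The inline versions reduce to a trivial loop of single
instructions, which the compiler can unroll or elide for
constant regions (a sketch):
    #include <stddef.h>
    #include <xtensa/config/core-isa.h>
    /* dhwb = data cache hit writeback, one line per iteration */
    static inline void dcache_flush_range_sketch(void *addr, size_t bytes)
    {
        for (size_t i = 0; i < bytes; i += XCHAL_DCACHE_LINESIZE) {
            __asm__ volatile ("dhwb %0, 0" :: "r" ((char *)addr + i));
        }
    }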
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Back when I started work on this stuff, I had a set of notes on
register windows that slowly evolved into something that looks like
formal documentation. There really isn't any overview-style
documentation of this stuff on the public internet, so it couldn't
hurt to commit it here for posterity.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Instead of passing the crt1 _start function as the entry code for
auxiliary CPUs, use a tiny assembly stub instead which can avoid the
runtime testing needed to skip the work in _start. All the crt1 code
was doing was clearing BSS (which must not happen on a second CPU) and
setting the stack pointer (which is wrong on the second CPU).
This allows us to clean out the SMP code in crt1.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The kernel passes the CPU's interrupt stack expecting that the
CPU will start on it, so do exactly that. Pass the initial
stack pointer from the SOC layer in the variable
"z_mp_stack_top" and set it in the assembly startup before
calling z_mp_entry().
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The xtensa atomics layer was written with hand-coded assembly that had
to be called as functions. That's needlessly slow, given that the low
level primitives are a two-instruction sequence. Ideally the compiler
should see this as an inline to permit it to better optimize around
the needed barriers.
There was also a bug with the atomic_cas function, which had a loop
internally instead of returning the old value synchronously on a
failed swap. That's benign right now because our existing spin lock
does nothing but retry it in a tight loop anyway, but it's incorrect
per spec and would have caused a contention hang with more elaborate
algorithms (for example a spinlock with backoff semantics).
Remove the old implementation and replace with a much smaller inline C
one based on just two assembly primitives.
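The primitive pair in question is SCOMPARE1 plus S32C1I;
inlined, the compare-and-swap is roughly (a sketch):
    #include <stdint.h>
    /* returns the value that was in memory; it equals cmp iff
     * the swap happened
     */
    static inline int32_t cas_sketch(volatile int32_t *addr,
                                     int32_t cmp, int32_t newval)
    {
        __asm__ volatile ("wsr %1, SCOMPARE1\n\t"
                          "s32c1i %0, %2, 0"
                          : "+r" (newval)
                          : "r" (cmp), "r" (addr)
                          : "memory");
        return newval;
    }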
This patch also contains a little bit of refactoring: each
atomics implementation scheme has been split out into a
separate header, and the ATOMIC_OPERATIONS_CUSTOM kconfig has
been renamed to ATOMIC_OPERATIONS_ARCH to better capture what
it means.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a bunch of dead historical cruft floating around in the
arch/xtensa tree, left over from older code versions. It's time to do
a cleanup pass. This is entirely refactoring and size optimization,
no behavior changes on any in-tree devices should be present.
Among the more notable changes:
+ xtensa_context.h offered an elaborate API to deal with a stack frame
and context layout that we no longer use.
+ xtensa_rtos.h was entirely dead code
+ xtensa_timer.h was a parallel abstraction layer implementing in the
architecture layer what we're already doing in our timer driver.
+ The architecture thread structs (_callee_saved and _thread_arch)
aren't used by current code, and had dead fields that were removed.
Unfortunately for standards compliance and C++ compatibility it's
not possible to leave an empty struct here, so they have a single
byte field.
+ xtensa_api.h was really just some interrupt management inlines used
by irq.h, so fold that code into the outer header.
+ Remove the stale assembly offsets. This architecture doesn't use
that facility.
All told, more than a thousand lines have been removed. Not bad.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
a0 is used as a scratch register. Restore the value of a0 (the
return address) from the stack frame before spilling registers
on the stack.
Signed-off-by: Shubham Kulkarni <shubham.kulkarni@espressif.com>
Only the CAVS 1.5 linker script has full support for the
coherence features; don't advertise it on the other SoCs yet.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
While fixing license headers, identified this script as orphan and not
being used anywhere, so remove.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
XCC doesn't like the "rsr.<reg name>" style assembly,
so fix that to use the other style.
Also, XCC doesn't like _CONCAT() with the EPC/EPS
registers, so we need to spell out all of them.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
There is a hard-coded value of PS_INTLEVEL(15) used to set the
PS register. The correct way is actually to use XCHAL_EXCM_LEVEL
with PS_INTLEVEL() to set up the register. So fix it.
Fixes #31858
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This change uses the stack frame to print a backtrace once an
exception occurs. Printing a backtrace helps to identify the
cause of the exception.
Signed-off-by: Shubham Kulkarni <shubham.kulkarni@espressif.com>
Currently Zephyr links reset-vector.S twice in Xtensa builds:
into the bootloader and into the main image. It is run at the
end of the bootloader execution and then immediately again at
the beginning of the main code. This patch adds a configuration
option to select whether to link the file into the bootloader
or into the application. The default is the application, as
needed e.g. for QEMU; SOF links it into the bootloader as in
native builds.
Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
There may be Xtensa SoCs which don't have interrupt levels
high enough for EPC6/EPS6 to exist in _restore_context. So
change these to registers which should be available according
to the ISA config file.
Fixes #30126
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Most kernel files were declaring the os log module without
providing a log level. Because of that, the default log level
was used instead of CONFIG_KERNEL_LOG_LEVEL.
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
Since the tracing of threads being switched in/out has the
same instrumentation points, we can roll the tracing function
calls into the thread stats gathering functions.
This avoids duplicating code to call another function.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Adds the necessary bits to initialize TLS in the stack
area and sets up CPU registers during context switch.
Note that this does not enable TLS for all Xtensa SoCs.
This is because Xtensa SoCs are highly configurable,
so that each SoC can be considered a whole architecture.
So TLS needs to be enabled at the SoC level, instead of
at the arch level.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Implement the kernel "coherence" API on top of the linker
cached/uncached mapping work.
Add Xtensa handling for the stack coherence API.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
It's legal to have CONFIG_MP_NUM_CPUS > 1 and !CONFIG_SMP. The
tests/kernel/mp test does this as a unit test of the multiprocessor
facilities. Test the right tunable when deciding whether to blow away
static data or not.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This code had one purpose only: feed timing information into a
test. It was not used by anything else. The custom trace points
unfortunately were not accurate, and this test was delivering
information that conflicted with other tests we have, due to
the placement of such trace points in the architecture and
kernel code.
For such measurements we are planning to use the tracing
functionality in a special mode that would be used for metrics
without polluting the architecture and kernel code with
additional tracing and timing code.
Furthermore, much of the assembly code used had issues.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Move tracing switched_in and switched_out to the architecture code and
remove duplications. This changes swap tracing for x86, xtensa.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
The core kernel computes the initial stack pointer
for a thread, properly aligning it and subtracting out
any random offsets or thread-local storage areas.
arch_new_thread() no longer needs to make any calculations,
an initial stack frame may be placed at the bounds of
the new 'stack_ptr' parameter passed in. This parameter
replaces 'stack_size'.
thread->stack_info is now set before arch_new_thread()
is invoked, z_new_thread_init() has been removed.
The values populated may need to be adjusted on arches
which carve-out MPU guard space from the actual stack
buffer.
thread->stack_info now has a new member 'delta' which
indicates any offset applied for TLS or random offset.
It's used so the calculations don't need to be repeated
if the thread later drops to user mode.
CONFIG_INIT_STACKS logic is now performed inside
z_setup_new_thread(), before arch_new_thread() is called.
thread->stack_info is now defined as the canonical
user-accessible area within the stack object, including
random offsets and TLS. It will never include any
carved-out memory for MPU guards and must be updated at
runtime if guards are removed.
Available stack space is now optimized. Some arches may
need to significantly round up the buffer size to account
for page-level granularity or MPU power-of-two requirements.
This space is now accounted for and used by virtue of
the Z_THREAD_STACK_SIZE_ADJUST() call in z_setup_new_thread.
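For reference, the resulting interface looks like this (types
come from the kernel headers):
    void arch_new_thread(struct k_thread *thread, k_thread_stack_t *stack,
                         char *stack_ptr, k_thread_entry_t entry,
                         void *p1, void *p2, void *p3);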
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
MISRA-C wants the parameter names in a function implementation
to match the names used by the header prototype.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
arch_new_thread() passes along the thread priority and option
flags, but these are already initialized in thread->base and
can be accessed there if needed.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This operation is formally defined as rounding down a
potential stack pointer value to meet CPU and ABI
requirements. This was previously defined ad hoc as
STACK_ROUND_DOWN().
A new architecture constant ARCH_STACK_PTR_ALIGN is added.
Z_STACK_PTR_ALIGN() is defined in terms of it. This used to
be inconsistently specified as STACK_ALIGN or STACK_PTR_ALIGN;
in the latter case, STACK_ALIGN meant something else, typically
a required alignment for the base of a stack buffer.
STACK_ROUND_UP() is only used in practice by RISC-V; delete it
elsewhere.
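The definition is a one-liner in terms of the new constant:
    /* round a candidate stack pointer down to CPU/ABI alignment */
    #define Z_STACK_PTR_ALIGN(ptr) ROUND_DOWN((ptr), ARCH_STACK_PTR_ALIGN)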
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The core kernel z_setup_new_thread() calls into arch_new_thread(),
which calls back into the core kernel via z_new_thread_init().
Move everything that doesn't have to be in z_new_thread_init() to
z_setup_new_thread() and convert to an inline function.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Under multi-processing, only the first CPU (#0) needs to go
through setting up the kernel structs and clearing out BSS
(among others). There is no need for other CPUs to do those
tasks. Since each Xtensa core starts using the same boot
vector, CPUs other than #0 need to skip all the startup tasks
by not calling z_cstart(). So provide another entry point for
those CPUs. Note that the Xtensa arch is highly configurable,
so the implementation of the entry point is up to each
individual SoC config.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Under SMP, the main BSS section only needs to be zero-ed on CPU #0.
Other CPUs should not zero out BSS, or else it may cause CPU #0 to
crash on invalid data.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The set of interrupt stacks is now expressed as an array. We
also define the idle threads and their associated stacks this
way. This allows for iteration in cases where we have multiple
CPUs.
There is now a centralized declaration in kernel_internal.h.
On uniprocessor systems, z_interrupt_stacks has one element
and can be used in the same way as _interrupt_stack.
The IRQ stack for CPU 0 is now set in init.c instead of in
arch code.
The extern definition of the main thread stack is now removed,
this doesn't need to be in a header.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Xtensa uses two instructions to perform an atomic
compare-and-set operation: first setting the comparison
register, then the actual compare-and-set instruction. There is
a potential for a context switch between these two
instructions, and a restored context may then have the wrong
value in the comparison register. So we need to save and
restore the comparison register during context switching.
Fixes #21800
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This reverts commit 9987c2e2f9,
which spilled SoC configs into architecture files and is not
exactly desirable. So revert it.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Use the BOOTLOADER definition to separate bootloader code.
This allows the same file, reset-vector.S, to be used both when
building the bootloader and when CONFIG_XTENSA_RESET_VECTOR is
enabled.
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
The atomic_cas function was using the incorrect register when
determining whether the value was swapped. The swapping
instruction s32c1i in atomic_cas stores the value at the memory
location into register a4 regardless of whether swapping is
done. So register a4 should be used to determine whether a swap
occurred. However, register a3 (containing the oldValue
function argument) was used instead. Since register a5 contains
the old value at the address, loaded before the swapping
instruction, a3 and a5 contain the same value. Because a3 == a5
is always true in this case, the function would always return 1
even though the values were not swapped. So fix it by using the
correct register.
Also, in case the value is not swapped, jump to where zero is
returned instead of loading from memory and comparing again.
The function was simply looping until swapping was done, which
did not align with the API, where it should return 0 when
swapping is not done (regardless of whether the memory location
contains the old value or not).
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This adds the necessary bits to build the Xtensa HAL as
a module, and removes the bits to use the HAL built with
the Zephyr SDK.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Promote the private z_arch_* namespace, which specifies
the interface between the core kernel and the
architecture code, to a new top-level namespace named
arch_*.
This allows our documentation generation to create
online documentation for this set of interfaces,
and this set of interfaces is worth treating in a
more formal way anyway.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
When compiling the components under the arch directory, the compiler
include paths for arch and kernel private headers need to be specified.
This was previously done by adding 'zephyr_library_include_directories'
to CMakeLists.txt file for every component under the arch directory,
and this resulted in a significant amount of duplicate code.
This commit uses the CMake 'include_directories' command in the root
CMakeLists.txt to simplify specification of the private header include
paths for all the arch components.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
This commit refactors kernel and arch headers to establish a boundary
between private and public interface headers.
The refactoring strategy used in this commit is detailed in
the referenced issue.
This commit introduces the following major changes:
1. Establish a clear boundary between private and public headers by
removing "kernel/include" and "arch/*/include" from the global
include paths. Ideally, only kernel/ and arch/*/ source files should
reference the headers in these directories. If these headers must be
used by a component, these include paths shall be manually added to
the CMakeLists.txt file of the component. This is intended to
discourage applications from including private kernel and arch
headers either knowingly or unknowingly.
- kernel/include/ (PRIVATE)
This directory contains the private headers that provide private
kernel definitions which should not be visible outside the kernel
and arch source code. All public kernel definitions must be added
to an appropriate header located under include/.
- arch/*/include/ (PRIVATE)
This directory contains the private headers that provide private
architecture-specific definitions which should not be visible
outside the arch and kernel source code. All public architecture-
specific definitions must be added to an appropriate header located
under include/arch/*/.
- include/ AND include/sys/ (PUBLIC)
This directory contains the public headers that provide public
kernel definitions which can be referenced by both kernel and
application code.
- include/arch/*/ (PUBLIC)
This directory contains the public headers that provide public
architecture-specific definitions which can be referenced by both
kernel and application code.
2. Split arch_interface.h into "kernel-to-arch interface" and "public
arch interface" divisions.
- kernel/include/kernel_arch_interface.h
* provides private "kernel-to-arch interface" definition.
* includes arch/*/include/kernel_arch_func.h to ensure that the
interface function implementations are always available.
* includes sys/arch_interface.h so that public arch interface
definitions are automatically included when including this file.
- arch/*/include/kernel_arch_func.h
* provides architecture-specific "kernel-to-arch interface"
implementation.
* only the functions that will be used in kernel and arch source
files are defined here.
- include/sys/arch_interface.h
* provides "public arch interface" definition.
* includes include/arch/arch_inlines.h to ensure that the
architecture-specific public inline interface function
implementations are always available.
- include/arch/arch_inlines.h
* includes architecture-specific arch_inlines.h in
include/arch/*/arch_inline.h.
- include/arch/*/arch_inline.h
* provides architecture-specific "public arch interface" inline
function implementation.
* supersedes include/sys/arch_inline.h.
3. Refactor kernel and the existing architecture implementations.
- Remove circular dependency of kernel and arch headers. The
following general rules should be observed:
* Never include any private headers from public headers
* Never include kernel_internal.h in kernel_arch_data.h
* Always include kernel_arch_data.h from kernel_arch_func.h
* Never include kernel.h from kernel_struct.h either directly or
indirectly. Only add the kernel structures that must be referenced
from public arch headers in this file.
- Relocate syscall_handler.h to include/ so it can be used in the
public code. This is necessary because much user-mode public
code references the functions defined in this header.
- Relocate kernel_arch_thread.h to include/arch/*/thread.h. This is
necessary to provide architecture-specific thread definition for
'struct k_thread' in kernel.h.
- Remove any private header dependencies from public headers using
the following methods:
* If dependency is not required, simply omit
* If dependency is required,
- Relocate a portion of the required dependencies from the
private header to an appropriate public header OR
- Relocate the required private header to make it public.
This commit supersedes #20047, addresses #19666, and fixes #3056.
Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
Use this short header style in all Kconfig files:
# <description>
# <copyright>
# <license>
...
Also change all <description>s from
# Kconfig[.extension] - Foo-related options
to just
# Foo-related options
It's clear enough that it's about Kconfig.
The <description> cleanup was done with this command, along with some
manual cleanup (big letter at the start, etc.)
git ls-files '*Kconfig*' | \
xargs sed -i -E '1 s/#\s*Kconfig[\w.-]*\s*-\s*/# /'
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Same deal as in commit 7fdb525754 ("kconfig: Use 'default' instead of
'def_bool' in Kconfig.defconfig files"), but I hacked Kconfiglib to also
find cases where the type is given separately as e.g.
config FOO
int
default 3
Motivation (from a note in
https://docs.zephyrproject.org/latest/guides/kconfig/index.html):
For a symbol defined in multiple locations (e.g., in a
Kconfig.defconfig file in Zephyr), it is best to only give the
symbol type for the "base" definition of the symbol, and to use
'default' (instead of 'def_<type>' value) for the remaining
definitions. That way, if the base definition of the symbol is
removed, the symbol ends up without a type, which generates a
warning that points to the other definitions. That makes the extra
definitions easier to discover and remove.
It's also nice if 'def_bool' and the like turn into a semi-reliable flag
that the symbol is only defined in Kconfig.defconfig files. That might
be a sign that things could be cleaned up.
Will do a separate pass later to remove some symbols only defined in
Kconfig.defconfig files.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Unused since commit 6fd6b7e50a ("xtensa: remove legacy arch
implementation").
Found with a script.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Unused since commit 6fd6b7e50a ("xtensa: remove legacy arch
implementation").
Found with a script.
Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
include/sys/arch_inlines.h will contain all architecture APIs
that are used by public inline functions and macros,
with implementations deriving from include/arch/cpu.h.
kernel/include/arch_interface.h will contain everything
else, with implementations deriving from
arch/*/include/kernel_arch_func.h.
Instances of duplicate documentation for these APIs have been
removed; implementation details have been left in place.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This makes it clearer that this is an API that is expected
to be implemented at the architecture level.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface and
has been renamed z_arch_kernel_init().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
k_cpu_idle() and k_cpu_atomic_idle() were being directly
implemented by arch code.
Rename these implementations to z_arch_cpu_idle() and
z_arch_cpu_atomic_idle(), and call them from new inline
function definitions in kernel.h.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface
and is appropriately renamed z_arch_is_in_isr().
References from test cases changed to k_is_in_isr().
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This is part of the core kernel -> architecture interface
and should have a leading prefix z_arch_.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Various C and Assembly modules
make function calls to z_sys_trace_*. These merely call
corresponding sys_trace_* functions. This commit
simplifies these by making direct function calls
to the sys_trace_* functions from these modules.
Subsequently, the z_sys_trace_* functions are removed.
Signed-off-by: Mrinal Sen <msen@oticon.com>
We re-wrote the xtensa arch code, but never got around
to purging the old implementation.
Remove those boards which hadn't been moved to the new
arch code. These were all xt-sim simulator targets and not
real hardware.
Fixes: #18138
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This adds a simple infinite loop when a double exception is
raised. Without this, if a double exception occurs, arbitrary
code would be executed.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This follows z_arch_irq_enable()/z_arch_irq_disable(), so that
the SoC definitions are responsible for functions related to
multi-level interrupts.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>