Commit graph

2929 commits

Author SHA1 Message Date
Mohamed ElShahawi 7084662cc8 kernel: system_work_q: Mark queue thread as essential
Marking sysworkq as essential, so when it fails, the system will halt
instead of continuously working, and dependent components stay
in a broken state.

Signed-off-by: Mohamed ElShahawi <ExtremeGTX@hotmail.com>
2024-04-25 21:40:24 +02:00
Mohamed ElShahawi 9ba4653243 kernel: work_queue: make thread essential flag configurable
Allow the creator of a work_queue instance to choose whether
the work_queue thread should be marked as ESSENTIAL or not.

Signed-off-by: Mohamed ElShahawi <ExtremeGTX@hotmail.com>
2024-04-25 21:40:24 +02:00
Andy Ross 02b24911f7 kernel/sched: Fix lockless ordering in halt_thread()
We've had threads spinning on the thread state bits, but weren't being
careful to ensure that those bits were the last things seen to change
in a halting thread.  Move it to the end, and add a barrier for
correctness.

Signed-off-by: Andy Ross <andyross@google.com>
2024-04-25 15:12:02 +02:00
Andy Ross 20611f13ca sched: Optimize dummy thread usage on SMP
Nicolas Pitre points out that since these thread structs are just
dummies for the context swtiching, they can be presumed to be "write
only" and thus there's no point in having one per CPU, everyone can
share the same one.

The only gotcha is that we never really documented (nor really have a
place to document) that rule, so it's not theoretically impossible for
an architecture to read back what it might have written underneath
arch_switch().  Leave this in a separate commit for bisection
purposes, but the risk seems very low.

Signed-off-by: Andy Ross <andyross@google.com>
2024-04-25 15:12:02 +02:00
Andy Ross 61c70626a5 kernel/sched: Fix free-memory write when ISRs abort _current
After a k_thread_abort(), the resulting thread struct is documented as
unused/free memory that may be re-used (for example, to respawn a new
thread).

But in the special case of aborting the current thread from within an
ISR, that wasn't quite happening.  The scheduler cleanup would
complete, but the architecture layer would still try to context switch
away from the aborted thread on exit, and that can include writes to
the now-reused thread struct!  The specifics will depend on
architecture (some do a full context save on entry, most don't), but
in the case of USE_SWITCH=y it will at the very least write the
switch_handle field.

Fix this simply, with a per-cpu "switch dummy" thread struct for use
as a target for context switches like this.  There is some non-trivial
memory cost to that; thread structs on many architectures are large.

Pleasingly, this also addresses a known deadlock on SMP: because the
"spin in ISR" step now happens as the very last stage of
k_thread_abort() handling, the existing scheduler lock works to
serialize calls such that it's impossible for a cycle of threads to
independently decide to spin on each other: at least one will see
itself as "already aborting" and break the cycle.

Fixes #64646

Signed-off-by: Andy Ross <andyross@google.com>
2024-04-25 15:12:02 +02:00
Andy Ross 93dc7e7438 kernel/spinlock: Fix SPIN_VALIDATE in ISRs
Spinlocks taken in ISRs were storing the _current thread pointer of
the interrupted thread as the owner, which was never strictly correct
but was benign as the thread would never run until the lock was
released.

But now k_thread_abort(_current) in an ISR has been fixed to eliminate
all references to the (now aborted) thread struct, and _current points
to a dummy thread.  Handle that edge case in the validation framework.

Signed-off-by: Andy Ross <andyross@google.com>
2024-04-25 15:12:02 +02:00
Andy Ross 5fa2b6f377 kernel/sched: Refeactor/cleanup z_thread_halt()
Big change is to factor out a thread_halt_spin() utility to manage the
core complexity of this code: the situation where an ISR is asked to
abort a thread already running on another SMP CPU.

With that gone, things can be cleaned up quite a bit.  Remove early
returns, most of the "#if CONFIG_SMP" usage was superfluous and will
optimize out, unify and clean up the comments, etc...

No behavioral changes (hopefully), just refactoring.

Signed-off-by: Andy Ross <andyross@google.com>
2024-04-25 15:12:02 +02:00
Anas Nashif 297ad5186d kernel: increase main stack size for ztest on ARC
stack has been on the edge when using ztest, use 1024 for ARC as well.

Fixes #71797

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-24 10:49:05 +02:00
Anas Nashif 4593f0d71c kernel: priority queues: declare as static inlines
After the move to C files we got some drop in the performance when
running latency_measure. This patch declares the priority queue
functions as static inlines with minor optimizations.

The result for one metric (on qemu):

3.6 and before the anything was changed:

  Get data from LIFO (w/ ctx switch): 13087 ns

after original change (46484da502):

  Get data from LIFO (w/ ctx switch): 13663 ns

with this change:

  Get data from LIFO (w/ ctx switch): 12543 ns

So overall, a net gain of ~ 500ns that can be seen across the board on many
of the metrics.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-22 16:40:11 -04:00
Flavio Ceolin 11b85ee510 kernel: stack: Check possible overflow
Check possible overflow in k_stack data struct. An overflow
can happens resulting in a much smaller amount of memory allocation.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2024-04-22 15:20:39 -04:00
Ederson de Souza eeebb4d911 kernel: Device deferred initialization
Currently, all devices are initialized at boot time (following their
level and priority order). This patch introduces deferred
initialization: by setting the property `zephyr,deferred-init` on a
device on the devicetree, Zephyr will not initialized the device.

To initialize such devices, one has to call `device_init()`.

Deferred initialization is done by grouping all deferred devices on a
different ELF section. In this way, there's no need to consume more
memory to keep track of deferred devices. When `device_init()` is
called, Zephyr will scan the deferred devices section and call the
initialization function for the matching device. As this scanning is
done only during deferred device initialization, its cost should be
bearable.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2024-04-11 15:50:44 -04:00
Daniel Leung d0a90a0b33 kernel: add the ability to memory map thread stacks
This introduces support for memory mapped thread stacks,
where each thread stack is mapped into virtual memory
address space with two guard pages to catch
under-/over-flowing the stack. This is just on the kernel
side. Additional architecture code is required to fully
support this feature.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-04-10 07:44:27 -04:00
Daniel Leung 04c5632bd4 kernel: mm: introduce k_mem_phys_map()/_unmap()
This is similar to k_mem_map()/_unmap(). But instead of using
anonymous memory, the provided physical region is mapped
into virtual address instead. In addition to simple mapping
physical ro virtual addresses, the mapping also adds two
guard pages before and after the virtual region to catch
buffer under-/over-flow.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-04-10 07:44:27 -04:00
Daniel Leung 378131c266 kernel: add options to cleanup after aborting current thread
This adds the mechanism to do cleanup after k_thread_abort()
is called with the current thread. This is mainly used for
cleaning up things when the thread cannot be running, e.g.,
cleanup the thread stack.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-04-10 07:44:27 -04:00
Krzysztof Chruściński aee8dd1d33 kernel: timeout: Optimize setting next alarm
Next timeout was set unconditionally at the end of sys_clock_announce.
However, if one of the current expired timeouts was setting a new
timeout which is the first to execute then system clock was configured
twice. Lets configure system clock only once in the isr at the and of
sys_clock_announce.

If timeouts are frequent this optimization can reduce CPU load. In
many cases setting the new sys_clock timeout is the most time
consuming operation in the sys_clock isr handler. As an example,
on the target I used setting new sys_clock timeout is taking 6 uS of
9 uS spent in the isr and it takes 16 uS with the redundant call.

Signed-off-by: Krzysztof Chruściński <krzysztof.chruscinski@nordicsemi.no>
2024-04-09 13:55:07 -04:00
Simon Hein ebdb07a05c kernel: Clean up mailbox async msg configuration
Remove confusing and duplicate async message configuration
switches for mailboxes.

Signed-off-by: Simon Hein <Shein@baumer.com>
2024-04-09 11:05:55 +02:00
Anas Nashif 20b2c98add kernel: move nothread support to own file
Do not build threading support when CONFIG_MULTITHREADING=n is set and
move needed calls to a new file with the changes needed instead of the
ifdef party in sched.c

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-06 14:22:08 +03:00
Yong Cong Sin fbaf7dfdc1 kernel: banner: use BUILD_VERSION only if not empty
The `BUILD_VERSION` can be defined but empty when built
without git, causing version to be missing from the banner:

```
*** Booting Zephyr OS build  ***
Hello World! qemu_riscv64
```

Let's check if it is empty before using it, so that
`KERNEL_VERSION_STRING`, which is generated independently
with cmake can be used as a fallback:

```
*** Booting Zephyr OS build 3.5.0 ***
Hello World! qemu_riscv64
```

Signed-off-by: Yong Cong Sin <ycsin@meta.com>
2024-04-04 23:47:33 +02:00
Anas Nashif 8a88cd4805 kernel: thread: move thread states to header
Move state string defines into thread header.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif f5435b3df7 kernel: thread: move k_thread_priority_get
Move to thread.c alongside all other thread calls.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 5c170c7046 kernel: thread: rename is_preempt
Trivila rename to thread_is_preempt.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 6754cbd1b5 kernel: thread: move k_is_preempt_thread to thread.c
This belongs in thread.c

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 17c874f4fc kernel: thread: rename is_metairq
Trivial rename of is_metairq to thread_is_metairq.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 0b473ce925 kernel: rename sliceable -> thread_is_sliceable
Trivial rename of sliceable function.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 37df485463 kernel: split timeslicing/ipi code out of sched.c
Move both timeslicing and IPI code to own files.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif 31bc210bbc kernel: sched: remove unused prototype: z_is_thread_time_slicing
A prototype not used anywhere.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Anas Nashif ebb503ff7b kernel: move thread related helper function kthread.h
Move some helper functions to inernal kthread.h, to offload crowded
sched.c

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-04-01 18:47:36 -04:00
Daniel Leung d34351d994 kernel: align thread stack size declaration
When thread stack is defined as an array, K_THREAD_STACK_LEN()
is used to calculate the size for each stack in the array.
However, standalone thread stack has its size calculated by
Z_THREAD_STACK_SIZE_ADJUST() instead. Depending on the arch
alignment requirement, they may not be the same... which
could cause some confusions. So align them both to use
K_THREAD_STACK_LEN().

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-27 19:27:10 -04:00
Daniel Leung 6cd7936f57 kernel: align kernel stack size declaration
When kernel stack is defined as an array, K_KERNEL_STACK_LEN()
is used to calculate the size for each stack in the array.
However, standalone kernel stack has its size calculated by
Z_KERNEL_STACK_SIZE_ADJUST() instead. Depending on the arch
alignment requirement, they may not be the same... which
could cause some confusions. So align them both to use
K_KERNEL_STACK_LEN().

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-27 19:27:10 -04:00
Daniel Leung efe30749de kernel: rename Z_THREAD_STACK_BUFFER to K_THREAD_STACK_BUFFER
Simple rename to align the kernel naming scheme. This is being
used throughout the tree, especially in the architecture code.
As this is not a private API internal to kernel, prefix it
appropriately with K_.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-27 19:27:10 -04:00
Daniel Leung b69d2486fe kernel: rename Z_KERNEL_STACK_BUFFER to K_KERNEL_STACK_BUFFER
Simple rename to align the kernel naming scheme. This is being
used throughout the tree, especially in the architecture code.
As this is not a private API internal to kernel, prefix it
appropriately with K_.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-27 19:27:10 -04:00
Ederson de Souza 321e395a8f linker: Include libkernel.a in the whole-archive when llext is enabled
Differently from other libraries, which are included whole in the final
Zephyr ELF, libkernel.a itself isn't. Assuming this is intended to
enable optimisations (if it isn't, this patch will break things) - linker
can remove parts of the kernel that are not used by the application.

However, when considering Linkable Loadable Extensions (llext), this
optimisations can be counterproductive: for instance, syscalls that are
not used by the application won't be available for extensions. It won't
matter if someone "EXPORT_SYMBOL" for them, or even try to keep them
using LINKER_KEEP, they'll be gone.

To avoid that, this patches includes, when CONFIG_LLEXT=y, libkernel.a
inside the linker "whole-archive" block. This ends up making it consider
libkernel.a as a library whose all symbols should be kept. Note this
doesn't mean that all symbols will be there - things compiled out via
Kconfig will naturally still be out.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2024-03-26 19:31:56 -04:00
Simon Hein bcd1d19322 kernel: add closing comments to config endifs
Add a closing comment to the endif with the configuration
information to which the endif belongs too.
To make the code more clearer if the configs need adaptions.

Signed-off-by: Simon Hein <Shein@baumer.com>
2024-03-25 18:03:31 -04:00
Daniel Leung 6ea749de52 arch: rename arch_start_cpu() to arch_cpu_start()
Rename arch_start_cpu() to arch_cpu_start() so it belongs to
the "cpu" namespace.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-25 09:58:35 +00:00
Daniel Leung 3664ed64c3 arch: move arch_interface.h under zephyr/arch
arch_interface.h is for architecture and should not be
under sys/. So move it under include/zephyr/arch/.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
2024-03-25 09:58:35 +00:00
Flavio Ceolin e2f3840380 kernel: Fix possible overflow in k_mem_map
k_mem_map additionally allocates two guard pages that are not mapped.
These pages are not being accounted when checking the provided size and
when they are added an overflow can happen and the mapped memory is not
correct.

Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
2024-03-22 15:58:33 -04:00
Ederson de Souza 67bb6db3f8 syscall: Export all emitted syscalls, enabling them for extensions
Linkable loadable extensions can only use syscalls if they are exported
via EXPORT_SYSCALL (or EXPORT_SYMBOL). Instead of enabling used syscalls
one by one, this patch exports all of them automatically via
`gen_syscalls.py`. If CONFIG_LLEXT=n, the section where the exported
symbols live is discarded, so it should be a non-op when llext is not
enabled.

This patch also removes the now redundant EXPORT_SYSCALL macro. Note
that EXPORT_SYMBOL is still useful on different situations (and is
indeed used by the code generated by `gen_syscalls.py`).

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
2024-03-20 16:26:54 +00:00
TaiJu Wu 1f5f0cf838 sched: Remove multi-level queue priority limit
Modified bitmask to  bitmask array, it can make multilevel queue remove
32 bit prioriry limit.

We can scan bitmask array to find which queue have ready thread.

Only need the number of queues as priority because the priority
is checked on create_thread.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2024-03-12 19:37:40 -04:00
Andy Ross f2280d119d kernel/sched: Don't touch deadline values on queued threads
k_thread_deadline_set() would modify the thread's deadline and then,
if it was in the run queue, requeue it to put it at the right spot.
Sounds right, right?

It's wrong.  The deadline field is part of the thread priority, so
this results in a mis-ordered list.  For dlist backends, that's benign
as the removal works anyway, but if CONFIG_SCHED_SCALABLE=y we've now
broken the sorting order of an in-tree item and corrupted the rbtree!

Fixes #69935

Signed-off-by: Andy Ross <andyross@google.com>
2024-03-11 15:42:26 +01:00
Nicolas Pitre c31d646198 kernel: timeout: optimize z_timeout_expires()
This currently calls timeout_rem() then adds back the result of
elapsed() to cancel out the subtraction of another elapsed() call
within timeout_rem(). Better not to make any calls to elapsed() in
that case, especially given that elapsed() may incur hardware access
overhead through sys_clock_elapsed().

Let's move the elapsed() subtraction to the only user of timeout_rem()
that needs it and remove its invocation from the z_timeout_expires()
path entirely.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2024-03-08 18:05:10 +01:00
TaiJu Wu 39f710136e sched: finalize_cancel_locked can early return
The code `SYS_SLIST_FOR_EACH_CONTAINER_SAFE` just for remove work
from `pending_cancels`.
After removing work successfully, the function can return early.
It is unnecessary to iterate continuously.

Signed-off-by: TaiJu Wu <tjwu1217@gmail.com>
2024-03-07 19:40:51 -05:00
Peter Mitsis 9f7695dda0 kernel: Remove unused z_pend_curr_irqlock()
The routine z_pend_curr_irqlock() is no longer used anywhere.

Signed-off-by: Peter Mitsis <peter.mitsis@intel.com>
2024-03-07 11:51:06 -05:00
Anas Nashif 0d8da5ff93 kernel: rename scheduler spinlock variable and make it private
rename sched_spinlock to _sched_spinglock to maintain it is privacy and
to avoid any misuse.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 868f099d61 kernel: sched: z_set_prio -> z_thread_prio_set
Rename private function to make it clear what priority we are setting
and to be consistent across the code.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 477a04a098 kernel: rename h -> heap
Avoid single characker variables that renders code unreadable and might
cause conflicts in maing, similar to t for both timeout and thread in
some places.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 595ff63f00 kernel: thread: use consistent thread parameter
Use thread wherever it makes sense, using 't' in some places can get
confused with 't' used for timeouts for example.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 3dc0c4544b kernel: move float operations out of thread.c
Move float operation out and add missing vrfy hook for enabling float.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 6c003bdbcf kernel: remove unused code in headers
List of functions defined in headers and not being used anywhere.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 9e83413542 kernel: split thread monitor
Move thread monitor related functions, not enabled in most cases outside
of thread.c and cleanup headers.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00
Anas Nashif 5e591c38f1 kernel: do not export z_thread_priority_set
This function is only being used by a test, so instead of reimplementing
a syscall in the test, provide a Kconfig option to provide the
functionality that only works with tests and remove some of the
duplication and extra code.

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2024-03-06 19:27:28 -05:00