The scheduler lock is a nestable lock. Unlocking a still-held nested
lock shouldn't preempt the current thread.
k_sched_lock();
k_sched_lock();
k_sched_unlock(); /* <--- this shouldn't be a scheduling point */
k_sched_unlock(); /* <--- this is a scheduling point */
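For illustration only, a minimal sketch of the nesting idea (hypothetical
names, not the actual Zephyr internals): only the outermost unlock should
be treated as a scheduling point.
static int lock_nest;                   /* per-thread in the real kernel */
static void maybe_reschedule(void)
{
        /* stand-in for the real "ask should_preempt() and swap" step */
}
static void example_sched_lock(void)
{
        lock_nest++;                    /* nested calls just bump the count */
}
static void example_sched_unlock(void)
{
        if (--lock_nest == 0) {
                /* only the outermost unlock may reschedule */
                maybe_reschedule();
        }
}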
This commit changes the preempt_ok argument from 1 to 0. This lets
should_preempt() check whether it should preempt at that point or not.
This fixes #17869.
Signed-off-by: Yasushi SHOJI <y-shoji@ispace-inc.com>
Zero slice_ticks when we can't time slice, so that next_timeout() will
ignore the slice_ticks of _current_cpu and the system can stay in a
low-power state for a longer time.
Fixes: #17368.
Signed-off-by: Wentong Wu <wentong.wu@intel.com>
On SMP systems, currently scheduled threads are not in the run queue
and can't be unconditionally removed/added.
Fixes #17170
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The scheduler API has always allowed setting a zero slice size as a
way to disable timeslicing. But the workaround introduced for
CONFIG_SWAP_NONATOMIC forgot that convention, and was calling
reset_time_slice() with that zero value (i.e. requesting an immediate
interrupt) in circumstances where z_swap() had been interrupted
nonatomically.
In practice, this never happened. And if it did, it was a single
spurious no-op interrupt that no one cared about. Until it did,
anyway...
Now that ticks on nRF devices are at full 32 kHz speed, we can get
into a situation where the rapidly triggering timeslice interrupts are
interrupting z_swap() calls, and the process feeds back on itself and
becomes self-sustaining.
Put that test into the time slice code itself to prevent this kind of
mistake in the future.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Move internal and architecture specific headers from include/drivers to
subfolder for timer:
include/drivers/timer
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
Threads that are sleeping forever may be woken up with
k_wakeup(); this shouldn't fail assertions.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
The internal "reschedule" API has always understood the idea that it
might run in an ISR context where it can't swap. But it has always
been happy to swap away when in thread mode, even when the environment
contains an outer lock that would NOT be expecting to swap! As it
happened, the way irq locks are implemented (they store flag state
that can be restored without context) this would "work" even though it
was completely breaking the synchronization promise made by the outer
lock.
But now, with spinlocks, the error gets detected (albeit in a clumsy
way) in debug builds. The unexpected swap triggers SPIN_VALIDATE
failures in later threads (this gets reported as a "recursive" lock,
but what actually happened is that another thread got to run before
the lock was released and tried to grab the same lock).
Fix this so that swap can only be called in a situation where the irq
lock key it was passed would have the effect of unmasking interrupts.
Note that this is a real behavioral change that affects when swaps
occur: it's not impossible that there is code out there that actually
relies on this "lock breaking reschedule" for correct behavior. But
our previous implementation was irredeemably broken and I don't know
how to address that.
Fixes #16273
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Add k_usleep() API, analogous to k_sleep(), except that the argument
is in microseconds rather than milliseconds.
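A small usage sketch (the duration is illustrative):
/* wait roughly 50 microseconds for a peripheral to settle; with
 * k_sleep() the same wait would have to be expressed in milliseconds */
k_usleep(50);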
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Current z_impl_k_sleep() does double duty, converting between units
specified by the API and ticks, as well as implementing the sleeping
mechanism itself. This patch separates the API from the mechanism,
so that sleeps need not be tied to millisecond timescales.
Signed-off-by: Charles E. Youse <charles.youse@intel.com>
Checking the stack sentinel may abort the current thread, so
make this check before we determine what the next thread
to run is.
Fixes: #15037
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Controlling expression of if and iteration statements must have a
boolean type.
MISRA-C rule 14.4
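An illustrative before/after for this kind of change (names hypothetical):
/* before: the controlling expression is a plain integer */
if (count) {
        do_work();
}
/* after: explicitly boolean */
if (count != 0) {
        do_work();
}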
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The BIT macro uses an unsigned int, avoiding implementation-defined behavior
when shifting signed types.
MISRA-C rule 10.1
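Roughly the shape of the fix (the real macro lives in the public headers):
/* before: shifts a signed int, implementation-defined for the high bit */
/* #define BIT(n)  (1 << (n)) */
/* after: unsigned shift */
#define BIT(n)  (1UL << (n))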
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
On SMP, there was a bug where the logic that re-adds _current to the
run queue at swap time would accidentally reschedule threads that had
just gone to sleep, because the is_thread_prevented_from_running()
predicate only tests for threads that are "suspended" or "pending" and
not sleeping.
Overload _THREAD_SUSPENDED to indicate "sleeping" also. Simple fix
for an immediate bug, though long term we really want to unify all the
blocked conditions to prevent this kind of state bug.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Daniel Leung caught a good one: In the (SMP) case where we were
aborting a thread that was not currently scheduled, we were flagging
the DEAD state on _current and not the thread we were aborting! This
wasn't as fatal as it seems, as the thread that called z_sched_abort()
would effectively go on living (as a zombie?) in a state where it
would always be preempted, but would otherwise remain schedulable.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The workaround for nonatomic swap had yet another edge case: it would
save off the _current pointer when pending a thread so that the next
time slice interrupt could test it to see if the swap had actually
happened before assuming that _current could be rescheduled (if it
just pended itself, that's impossible). Then it would clear the
pending_current pointer so future interrupts wouldn't be confused.
BUT: it turns out that qemu, when faced with really rapid timer rates
that exceed its (host-based) timing accuracy, is perfectly willing to
"stack up" timer interrupts such the one goes pending before the
previous one is finished executing. In that case, we can enter the
SECOND timer interrupt, to try timeslicing a SECOND time, STILL before
the PendSV exception has run to actually effect the context switch.
Except this time pending_current has been cleared and we try to
reschedule the pended _current thread incorrectly. In theory real
hardware could do this too, though it would involve absolutely crazy
interrupt latency problems.
Work around this by moving the clear to the thread itself: immediately
after it wakes up from the pend call, it retakes a lock and clears
pending_current if it still matches _current. That is not a perfect
fix: there remains a 2-3 instruction race at that moment where we
return from pend and before we can lock interrupts again where a timer
interrupt will see an incorrect pointer. But I hammered at this and
couldn't make qemu do that (i.e. return from a timer interrupt but
flag a new one in just a cycle or two).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This was always doing a remove/add of the _current thread to the run
queue, which is wrong because in SMP _current isn't in the queue to
remove. But it went undetected until the recent dlist changes.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
In SMP, we are setting the _current pointer while holding the
scheduler spinlock locally, which means that when we try to release it
the validation layer (not the spinlock per se) will scream at us
because the thread that took the lock doesn't match the one releasing
it.
Special case this when validation is enabled.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The tracing fixes in commit e87193896a ("subsys: debug: tracing: Fix
thread tracing") were... not a readability win. The point appears to
have been to put a tracing hook immediately before and after the
assignment to the _current pointer. So do that in an abstracted
function and clean up _get_next_switch_handle() (which is a subtle and
important function already polluted with some unavoidable preprocessor
testing!)
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Currently thread abort doesn't work if a thread is currently scheduled
on a different CPU, because we have no way of delivering an interrupt
to the other CPU to force the issue. This patch adds a simple
framework for an architecture to provide such an IPI, implements it
for x86_64, and uses it to implement a spin loop in abort for the case
where a thread is currently scheduled elsewhere.
On SMP architectures (xtensa) where no such IPI is implemented, we
fall back to waiting on an arbitrary interrupt to occur. This "works"
for typical code (and all current tests), but of course it cannot be
guaranteed on such an architecture that k_thread_abort() will return
in finite time (e.g. the other thread on the other CPU might have
taken a spinlock and entered an infinite loop, so it will never
receive an interrupt to terminate itself)!
On non-SMP architectures this patch changes no code paths at all.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
In SMP, _current is not "queued". (The run queue only stores
unscheduled threads because we can't rely on the head of the list
being _current). We weren't updating the cache choice, which would
flag swap_ok, so calling k_thread_abort(_current) (for example, when a
thread exits from its entry function) would try to switch back into
the thread and then run off the end of the function.
Amusingly this was more benign than you'd think. Stumbled on it by
accident.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Update reserved function names starting with one underscore, replacing
them as follows:
'_k_' with 'z_'
'_K_' with 'Z_'
'_handler_' with 'z_handl_'
'_Cstart' with 'z_cstart'
'_Swap' with 'z_swap'
This renaming is done on both global and those static function names
in kernel/include and include/. Other static function names in kernel/
are renamed by removing the leading underscore. Other function names
not starting with any prefix listed above are renamed starting with
a 'z_' or 'Z_' prefix.
Function names starting with two or three leading underscores are not
automatically renamed since these names will collide with the variants
with two or three leading underscores.
Various generator scripts have also been updated as well as perf,
linker and usb files. These are
drivers/serial/uart_handlers.c
include/linker/kobject-text.ld
kernel/include/syscall_handler.h
scripts/gen_kobject_list.py
scripts/gen_syscall_header.py
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
Rename scheduler spinlock sched_lock to sched_spinlock as it will
collide with the cleanup of the reserved function name _sched_lock(),
which will also be called sched_lock().
Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
Nonatomic swap strikes again. These issues are all longstanding, but
were unmasked by the dlist work in commit d40b8ce1fb ("sys: dlist:
Add sys_dnode_is_linked") where list node pointers become nulls on
removal.
The previous fix was for a specific case where a timeslicing interrupt
would try to slice out the "wrong" current thread because the thread
has "just" pended itself. That was incomplete, because the parallel
code in k_sleep() didn't flag itself the same way.
And beyond that, it turns out to be basically impossible (now that I'm
thinking about it correctly) to prevent interrupt code from calling
into the scheduler to suspend a "just pended but not quite" current
and/or preempt away to another thread. In any of these cases, the
scheduler modifications to the state bits remain correct but the queue
nodes may be corrupt because the thread was already removed from the
ready queue. So we have to test and correct this at the lowest level,
where a thread is being removed from a priq: check that (1) the queue
is the ready queue and not a waitq, (2) the thread is the current
thread, and (3) it is already marked suspended and thus not in the queue.
There are lots of existing issues filed in the last few months all
pointing to odd instability on ARM platforms. I'm reasonably certain
this is the root cause for most or all of them.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This API doesn't use the normal thread priority comparison itself, so
doesn't get the magic that thread_base.prio provides. If called when
another thread should be run, this would preempt the current thread
always, even if the scheduler lock was taken.
That was benign until the recent spinlockification exposed it: a mutex in
the philosophers test run in preempt_only mode would swap away while
holding a spinlock (which used to work with irq locks) and fail later
with a "recursive" spinlock assert.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The k_sleep() locking was actually to protect the _current state from
preemption before the context switch, so document that and replace
with a spinlock. Should probably unify this with the rather cleaner
logic in pend_curr(), but right now "sleeping" and "pended" are
needlessly distinct states.
And we can remove the locking entirely from k_wakeup(). There's no
reason for any of that to need to be synchronized. Even if we're
racing with other thread modifications, the state on exit will be a
runnable thread without a timeout, or whatever timeout/pend state the
other side was requesting (i.e. it's a bug, but not one solved by
synchronization).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
These functions, for good design reason, take a locking key to
atomically release along with the context switch. But there's still a
common pattern in code to do a switch unconditionally by passing
irq_lock() directly. On SMP that's a little hurtful as it spams the
global lock. Provide an _unlocked() variant for
_Swap/_reschedule/_pend_curr for simplicity and efficiency.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Just like with _Swap(), we need two variants of these utilities which
can atomically release a lock and context switch. The naming shifts
(for byte count reasons) to _reschedule/_pend_curr, and both have an
_irqlock variant which takes the traditional locking.
Just refactoring. No logic changes.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
We want a _Swap() variant that can atomically release/restore a
spinlock state in addition to the legacy irqlock. The function as it
was is now named "_Swap_irqlock()", while _Swap() now refers to a
spinlock and takes two arguments. The former will be going away once
existing users (not that many! Swap() is an internal API, and the
long port away from legacy irqlocking is going to be happening mostly
in drivers) are ported to spinlocks.
Obviously on uniprocessor setups, these produce identical code. But
SMP requires that the correct API be used to maintain the global lock.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This adds a simple implementation of SMP CPU affinity to Zephyr. The
API is simple and doesn't try to invent abstractions like "cpu sets".
Each thread has an enable/disable flag associated with each CPU in the
system, and the bits can be turned on and off (for threads that are
not currently runnable, of course) using an easy three-function API.
Because the implementation picked requires enumerating runnable
threads in priority order looking for one that matches the current CPU,
this is not a good fit for the SCALABLE or MULTIQ scheduler backends,
so it currently can be enabled only for SCHED_DUMB (which is the
default anyway). Fancier algorithms do exist, but even the best of
them scale as O(N_CPUS), so aren't quite constant time and often
require significant memory overhead to keep separate lists for
different cpus/sets.
The intended use here is for apps that want to "pin" threads to
specific CPUs for latency control, or conversely to prevent certain
threads from taking time on specific CPUs to leave them free for fast
response.
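A usage sketch (thread object hypothetical; helper names as in the Zephyr
tree, and masks may only be changed while the thread is not runnable):
/* pin "worker" to CPU 0 before it starts running */
k_thread_cpu_mask_clear(&worker);       /* disable every CPU for the thread */
k_thread_cpu_mask_enable(&worker, 0);   /* then re-enable CPU 0 only */
k_thread_start(&worker);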
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Idle threads must (for obvious reasons!) always be preemptible from
the perspective of the scheduler. But when preemptive scheduling is
disabled, they are given a priority of -1, which is the lowest
COOPERATIVE priority. So the scheduler preemption logic needed an
extra test for this case and couldn't just rely on the existing
priority comparison. This was a measurable performance loss, as this
is a hot path on existing benchmarks.
Limit that test to circumstances (!CONFIG_PREEMPT_ENABLED) where it's
actually needed.
Longer term it would be better to just force the existence of one
"preemptible" thread priority always, but right now the number of
priorities and the state of the PREEMPT_ENABLED kconfig flag are
linked, and the existing interrupt return code (with no preemption,
you know with certainty which thread you are returning to and can skip
some work) on some platforms fails when I try this.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
For historical reasons, some architectures had a valid _current thread
pointer at initialization time and others didn't. So the scheduler
logic had a test that checks _current vs. NULL every time it needed to
check preemption, when this was only a workaround for initialization
state.
Fix things so that there is a dummy thread always (and clean up the
code to do a struct assignment instead of a memset of bare memory),
and we can remove that test from the scheduler hot path.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
GCC 6.2.0 is making frustratingly poor inlining decisions with some of
these routines, resulting in an awful lot of runtime calls for code
that is only ever expanded once or twice within the file.
Treat with targeted ALWAYS_INLINE's to force the issue. The
scheduler code is a hot path.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The sys_dlist_insert_*() functions had a behavior where a NULL
argument for the insertion position to sys_dlist_insert_after/before()
was interpreted as "the end of the list". We never used that
convention (except in one spot internal to dlist.h which was not
itself used anywhere), and of course already have an API for appending
and prepending to a list.
In practice this was a performance disaster. The NULL check is
virtually never provable statically by the compiler, so that test and
branch is present always. And worse, the check and call to another
function was pushing this beyond the complexity limit for gcc to
inline a function (at -Os optimization anyway), forcing us to use
function calls for what should be a ~8 instruction sequence. The
upshot is that dlist insertions were 2-3x slower than they needed to
be.
Deprecate these older APIs and introduce a new sys_dlist_insert() call
which can be much better optimized.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Whether a timeout is linked into the timeout queue can be determined
from the corresponding sys_dnode_t linked state. This removes the need
to use a special flag value in dticks to determine that the timeout is
inactive.
Update _abort_timeout to return an error code, rather than the flag
value, when the timeout to be aborted was not active.
Remove the _INACTIVE flag value, and replace its external uses with an
internal API function that checks whether a timeout is inactive.
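An illustrative sketch of the new check (the timeout struct's node field
name is an assumption here):
static inline bool timeout_is_inactive(struct _timeout *t)
{
        /* linked into the timeout queue <=> active; no dticks flag value */
        return !sys_dnode_is_linked(&t->node);
}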
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
CONTAINER_OF() on a NULL pointer returns some offset around NULL and not
another NULL pointer. We have to check for that ourselves.
This only worked because the dnode happened to be at the start of the
struct.
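The pattern being fixed, with a hypothetical container type and list:
struct item {
        int payload;
        sys_dnode_t node;       /* deliberately NOT the first member */
};
sys_dnode_t *n = sys_dlist_peek_head(&list);
/* CONTAINER_OF(NULL, ...) would yield a bogus non-NULL pointer here */
struct item *it = (n == NULL) ? NULL : CONTAINER_OF(n, struct item, node);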
Signed-off-by: Peter A. Bigot <pab@pabigot.com>
Timeslicing works by removing the _current thread from the run queue
and re-adding it at the end of its priority. On systems with a
_Swap() that can be preempted by a timer interrupt, that means it's
possible for the timeslice to try to slice out a thread that had
already pended itself!
This behavior used to be benign (or at least undetectable) as the
duplicated list operations were idempotent. But now the dlist code is
stricter about correctness and has exposed the bug -- it will blow up
if you try to remove an already-removed list node.
Fix (on affected platforms) by stashing the _current pointer in
_pend_current_thread() that is checked and cleared in the timer
interrupt. If we discover we're trying to interrupt a thread that's
already interrupted itself, we can safely exit z_time_slice() as a
noop. The timeslicing bookkeeping was already done for us underneath
the pend code.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This is a refactoring of the fix in commit 6c95dafd82 to limit its
application to affected platforms now that the root cause is
understood.
Note that the bug that fix was addressing was rare and seen only
after multi-hour sessions on Michael Scott's test rig. So if
something regresses, this is where to look!
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The recent change that added a locked z_set_timeout_expiry() API
obsoleted the subtle note about synchronization above
reset_time_slice(). None of that matters any more, the API is
synchronized internally in a conventional way.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
MISRA-C says all declarations of an object or function must use the
same name and qualifiers.
MISRA-C rule 8.3
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The order of evaluation of function calls used as the arguments of a
function is undefined (32) / unspecified (15-18) in C99.
MISRA-C rule 13.2 does not allow the value of an expression and its
side effects to happen in a non-deterministic order, in order to avoid
these undefined behaviors.
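For example (functions hypothetical):
/* the evaluation order of f() and g(), and of their side effects,
 * is not fixed here */
use(f(), g());
/* sequencing them explicitly removes the ambiguity */
int a = f();
int b = g();
use(a, b);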
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The system must not set the clock expiry via a backdoor, as it may
result in unbounded time drift of all scheduled timeouts.
Fixes: #11502
Signed-off-by: Pawel Dunaj <pawel.dunaj@nordicsemi.no>
The call to z_clock_set_timeout() was being made outside the timeout
lock, which can race against other contexts setting sooner-expiring
timeouts.
Also add a long comment to one spot (timeslicing) where this call is
made outside the timeout spinlock (inside the scheduler lock) and why
this is OK.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
If this function is itself interrupted by a timeslice event, the
slicing state can be corrupted. Just re-use the scheduler lock
instead of using a new spinlock; this is a low-latency function that
won't deadlock. Found by inspection.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There were a struct and a variable called _kernel. This is error prone
and a MISRA-C violation. This change gives the struct a unique
identifier.
MISRA-C rule 5.8
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
This commit introduces a k_sleep() return value, which provides
information about the actual sleep time. If the returned value is
non-zero, the thread slept for a shorter time than requested, which is
only possible if the thread has been woken up by a k_wakeup() call.
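A usage sketch (duration illustrative):
s32_t remaining = k_sleep(1000);
if (remaining != 0) {
        /* woken early by k_wakeup() before the full 1000 ms elapsed */
}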
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
In _pend_current_thread() the argument key is always an unsigned
integer type, and this function forces it to become a signed
integer. This is dangerous behavior and can't be trusted to
work as expected.
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
This API shouldn't take an int type; instead it should take
u32_t. This argument has to be similar to irq_lock() and
irq_unlock().
Signed-off-by: Adithya Baglody <adithya.nagaraj.baglody@intel.com>
In tickless mode, not all elapsed ticks may have been announced yet,
so future z_time_slice() calls will include "extra" ticks that we have
to account for when setting up the slice count.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
It's possible to interrupt a thread that has already scheduled a
timeout. Really this is a race against the usage of
_add_thread_timeout() and needs some design work to provide proper
locking (which is a distinct requirement from the scheduler lock and
timeout lock!), as the users of that API are spread around the kernel.
But existing usage always schedules the timeouts first, so this is
safe.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The timeout APIs are properly synchronized now. This irq_lock() (and
the comment explaining it) isn't needed anymore.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Now that the API has been fixed up, replace the existing timeout queue
with a much smaller version. The basic algorithm is unchanged:
timeouts are stored in a sorted dlist with each node holding a delta
time from the previous node in the list; the announce call just walks
this list pulling off the heads as needed. Advantages:
* Properly spinlocked and SMP-aware. The earlier timer implementation
relied on only CPU 0 doing timeout work, and on an irq_lock() being
taken before entry (something that was violated in a few spots).
Now any CPU can wake up for an event (or all of them) and everything
works correctly.
* The *_thread_timeout() API is now expressible as a clean wrapping
(just one-liners) around the lower-level interface based on function
pointer callbacks. As a result the timeout objects no longer need
to store backpointers to the thread and wait_q and have shrunk by
33%.
* MUCH smaller, to the tune of hundreds of lines of code removed.
* Future proof, in that all operations on the queue are now fronted by
just two entry points (_add_timeout() and z_clock_announce()) which
can easily be augmented with fancier data structures.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Now that this is known to be an unused value, remove it from the API.
Note that this caught a few spots where we were passing values (a
non-NULL wait_q with a NULL thread handle) that were always being
ignored before.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The existing timeout API wants to store a wait_q on which the thread
is waiting, but it only uses that value in one spot (and there only as
a boolean flag indicating "this thread is waiting on a wait_q").
As it happens threads can already store their own backpointers to a
wait_q (needed for the SCALABLE scheduler backend), so we should use
that instead.
This patch doesn't actually perform that unification yet. It
reorganizes things such that the pended_on field is always set at the
point of timeout interaction, and adds a bunch of asserts to make 100%
sure the logic is correct. The next patch will modify the API.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Instead of checking every time we hit the low-level context switch
path to see if the new thread has a "partner" with which it needs to
share time, just run the slice timer always and reset it from the
scheduler at the points where it has already decided a switch needs to
happen. In TICKLESS_KERNEL situations, we pay the cost of extra timer
interrupts at ~10Hz or whatever, which is low (note also that this
kind of regular wakeup architecture is required on SMP anyway so the
scheduler can "notice" threads scheduled by other CPUs). Advantages:
1. Much simpler logic. Significantly smaller code. No variance or
dependence on tickless modes or timer driver (beyond setting a
simple timeout).
2. No arch-specific assembly integration with _Swap() needed
3. Better performance on many workloads, as the accounting now happens
at most once per timer interrupt (~5 Hz) and at true reschedule
points, not on every unrelated context switch and interrupt return.
4. It's SMP-safe. The previous scheme kept the slice ticks as a
global variable, which was an unnoticed bug.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was no good reason to have these rather large functions in a
header. Put them into sys_clock.c for now, pending rework to the
system.
Now the API is clearly visible in a small header.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The tickless driver had a bunch of "hairy" APIs which forced the timer
drivers to do needless low-level accounting for the benefit of the
kernel, all of which then proceeded to implement them via cut and
paste. Specifically the "program_time" calls forced the driver to
expose to the kernel exactly when the next interrupt was due and how
much time had elapsed, in a parallel API to the existing "what time is
it" and "announce a tick" interrupts that carry the same information.
Remove these from the kernel, replacing them with synthesized logic
written in terms of the simpler APIs.
In some cases there will be a performance impact due to the use of the
64 bit uptime call, but that will go away soon.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The existing API had two almost identical functions: _set_time() and
_timer_idle_enter(). Both simply instruct the timer driver to set the
next timer interrupt expiration appropriately so that the call to
z_clock_announce() will be made at the requested number of ticks. On
most/all hardware, these should be implementable identically.
Unfortunately because they are specified differently, existing drivers
have implemented them in parallel.
Specify a new, unified, z_clock_set_timeout(). Document it clearly
for implementors. And provide a shim layer for legacy drivers that
will continue to use the old functions.
Note that this patch fixes an existing bug found by inspection: the
old call to _set_time() out of z_clock_announce() failed to test for
the "wait forever" case in the situation where clock_always_on is
true, meaning that a system that reached this point and then never set
another timeout would freeze its uptime clock incorrectly.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Change APIs that essentially return a boolean expression - 0 for
false and 1 for true - to return a bool.
MISRA-C rule 14.4
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Make if statement using pointers explicitly check whether the value is
NULL or not.
The C standard does not say that the null pointer is the same as a
pointer to memory address 0, and because of this it is good practice
to always compare against the NULL macro.
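An illustrative before/after (names hypothetical):
/* before: relies on the implicit pointer-to-boolean conversion */
if (thread) {
        ready_thread(thread);
}
/* after: explicit comparison against NULL */
if (thread != NULL) {
        ready_thread(thread);
}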
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
A lot of times this API is called during some cleanup, even if the
timeout was not set, to make the code simpler. In these cases it's not
necessary to check the return value. Adding a cast to acknowledge it.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the running thread and reset the timer,
otherwise it won't get any CPU time until the next interrupt fires at
some indeterminate time in the future.
This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains. The
code as it exists needs some rework to avoid all the #ifdef mess.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
When adding a new runnable thread in tickless mode, we need to detect
whether it will timeslice with the running thread and reset the
timer, otherwise it won't get any CPU time until the next interrupt
fires at some indeterminate time in the future.
This fixes the specific bug discussed in #7193, but the broader
problem of tickless and timeslicing interacting badly remains. The
code as it exists needs some rework to avoid all the #ifdef mess.
Note that the patch also moves _ready_thread() from a ksched.h inline
to sched.c.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Define generic interface and hooks for tracing to replace
kernel_event_logger and existing tracing facilities with something more
common.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
irq_lock() returns an unsigned int, though several places were using
a signed int. This commit fixes this behaviour.
In order to avoid this error happening again, a coccinelle script was
added and can be used to check for violations.
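The pattern the script is meant to catch, illustratively:
/* before (wrong type for the key): int key = irq_lock(); */
unsigned int key = irq_lock();      /* irq_lock() returns unsigned int */
/* ... critical section ... */
irq_unlock(key);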
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
The time slicing settings were kept in milliseconds while all related
operations were based on ticks. Continuous back and forth conversion
between ticks and milliseconds introduced an accumulating error due
to rounding in _ms_to_ticks() and __ticks_to_ms(). As a result, the
configured time slice duration was not achieved.
This commit removes excessive ticks <-> ms conversion by using ticks
as time unit for all operations related to time slicing.
Also, it fixes #8896 as well as #8897.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The _update_time_slice_before_swap() function directly compared
_time_slice_duration (expressed in ms) with value returned by
_get_remaining_program_time() which used ticks as a time unit.
Moreover, the _time_slice_duration was also used as an argument
for _set_time(), which expects time expressed in ticks.
This commit ensures that the same unit (ticks) is used in
comparison and timer adjustments.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The time slicing settings were kept in milliseconds while all related
operations were based on ticks. Continuous back and forth conversion
between ticks and milliseconds introduced an accumulating error due
to rounding in _ms_to_ticks() and __ticks_to_ms(). As a result, the
configured time slice duration was not achieved.
This commit removes excessive ticks <-> ms conversion by using ticks
as time unit for all operations related to time slicing.
Also, it fixes #8896 as well as #8897.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
The _update_time_slice_before_swap() function directly compared
_time_slice_duration (expressed in ms) with value returned by
_get_remaining_program_time() which used ticks as a time unit.
Moreover, the _time_slice_duration was also used as an argument
for _set_time(), which expects time expressed in ticks.
This commit ensures that the same unit (ticks) is used in
comparison and timer adjustments.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
This commit moves all implementations of _ms_to_ticks() into a
single file. Also, the function is now inline even if
_NEED_PRECISE_TICK_MS_CONVERSION is defined.
Signed-off-by: Piotr Zięcik <piotr.ziecik@nordicsemi.no>
Zephyr 1.12 removed the old scheduler and replaced it with the choice
of a "dumb" list or a balanced tree. But the old multi-queue
algorithm is still useful in the space between these two (applications
with large-ish numbers of runnable threads, but that don't need fancy
features like EDF or SMP affinity). So add it as a
CONFIG_SCHED_MULTIQ option.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Make these "choice" items instead of a single boolean that implies the
element unset.
Also renames WAITQ_FAST to WAITQ_SCALABLE, as the rbtree is really
only "fast" for large queue sizes (it's constant factor overhead is
bigger than a list's!)
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Uncovered by clang: we have some functions that are only used
conditionally, so guard them to make them only available when those
conditions are met.
Signed-off-by: Anas Nashif <anas.nashif@intel.com>
We are using _is_thread_prevented_from_running() to see if the
_current thread can be preempted in should_preempt(). The idea
being that even if the _current thread is a high priority coop
thread, we can still preempt it when it's pending, suspended,
etc.
This does not take into account if the thread is sleeping.
k_sleep() merely removes the thread from the ready_q and calls
Swap(). The scheduler will swap away from the thread temporarily
and then on the next cycle get stuck to the sleeping thread for
however long the sleep timeout is, doing exactly nothing because
other functions like _ready_thread() use _is_thread_ready() as a
check before proceeding.
We should use !_is_thread_ready() to take into account when threads
are waiting on a timer, and let other threads run in the meantime.
Signed-off-by: Michael Scott <michael@opensourcefoundries.com>
The should_preempt() code was catching some of the "unrunnable" cases
but not all of them, opening the possibility of failing to preempt a
just-pended thread and thus waking it up synchronously. There are
reports of this causing spin loops over k_poll() in the network stack
work queues (see #8049).
Note that the previous _is_dummy() call is folded into (the somewhat
verbosely named) _is_thread_prevented_from_running(), and that the
order of tests has been changed/optimized to hopefully catch common
cases earlier.
Suggested-by: Michael Scott <michael@opensourcefoundries.com>
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Recent changes post-scheduler-rewrite broke scheduling on SMP:
The "preempt_ok" feature added to isolate preemption points wasn't
honored in SMP mode. Fix this by adding a "swap_ok" field to the CPU
record (not the thread) which is set at the same time out of
update_cache().
The "queued" flag wasn't being maintained correctly when swapping away
from _current (it was added back to the queue, but the flag wasn't
set).
Abstract out a "should_preempt()" predicate so SMP and uniprocessor
paths share the same logic, which is distressingly subtle.
There were two places where _Swap() was predicated on
_get_next_ready_thread() != _current. That's no longer a benign
optimization in SMP, where the former function REMOVES the next thread
from the queue. Just call _Swap() directly in SMP, which has a
unified C implementation that does this test already. Don't change
other architectures in case it exposes bugs with _Swap() switching
back to the same thread (it should work, I just don't want to break
anything).
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The metairq feature exposed the fact that all of our arch code (and a
few mistaken spots in the scheduler too) was trying to interpret
"preemptible" threads independently.
As of the scheduler rewrite, that logic is entirely within sched.c and
doing it externally is redundant. And now that "cooperative" threads
can be preempted, it's wrong and produces test failures when used with
metairq threads.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Very simple implementation of deadline scheduling. Works by storing a
single word in each thread containing a deadline, setting it (as a
delta from "now") via a single new API call, and using it as extra
input to the existing thread priority comparison function when
priorities are equal.
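A usage sketch (thread handles and delta values hypothetical; the deltas
are relative to "now" as described above):
/* both threads share the same priority; with EDF the nearer deadline
 * wins the tie */
k_thread_deadline_set(tid_a, 1000);
k_thread_deadline_set(tid_b, 5000);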
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This patch adds a set of priorities at the (numerically) lowest end of
the range which have "meta-irq" behavior. Runnable threads at these
priorities will always be scheduled before threads at lower
priorities, EVEN IF those threads are otherwise cooperative and/or
have taken a scheduler lock.
Making such a thread runnable in any way thus has the effect of
"interrupting" the current task and running the meta-irq thread
synchronously, like an exception or system call. The intent is to use
these priorities to implement "interrupt bottom half" or "tasklet"
behavior, allowing driver subsystems to return from interrupt context
but be guaranteed that user code will not be executed (on the current
CPU) until the remaining work is finished.
As this breaks the "promise" of non-preemptibility granted by the
current API for cooperative threads, this tool probably shouldn't be
used from application code.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The scheduler rewrite added a regression in uniprocessor mode where
cooperative threads would be unexpectedly preempted, because nothing
was checking the preemption status of _current at the point where the
next-thread cache pointer was being updated.
Note that update_cache() needs a little more context: spots like
k_yield() that leave _current runnable need to be able to tell it that
"yes, preemption is OK here even though the thread is cooperative".
So it has a "preempt_ok" argument now.
Interestingly this didn't get caught because we don't test that. We
have lots and lots of tests of the converse cases (i.e. making sure
that threads get preempted when we expect them to), but nothing that
explicitly tries to jump in front of a cooperative thread.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
This replaces the existing scheduler (but not priority handling)
implementation with a somewhat simpler one. Behavior as to thread
selection does not change. New features:
+ Unifies SMP and uniprocessing selection code (with the sole
exception of the "cache" trick not being possible in SMP).
+ The old static multi-queue implementation is gone and has been
replaced with a build-time choice of either a "dumb" list
implementation (faster and significantly smaller for apps with only
a few threads) or a balanced tree queue which scales well to
arbitrary numbers of threads and priority levels. This is
controlled via the CONFIG_SCHED_DUMB kconfig variable.
+ The balanced tree implementation is usable symmetrically for the
wait_q abstraction, fixing a scalability glitch Zephyr had when many
threads were waiting on a single object. This can be selected via
CONFIG_WAITQ_FAST.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There were multiple spots where code was using the _wait_q_t
abstraction as a synonym for a dlist and doing direct list management
on them with the dlist APIs. Refactor _wait_q_t into a proper opaque
struct (not a typedef for sys_dlist_t) and write a simple wrapper API
for the existing usages. Now replacement of wait_q with a different
data structure is much cleaner.
Note that there were some SYS_DLIST_FOR_EACH_SAFE loops in mailbox.c
that got replaced by the normal/non-safe macro. While these loops do
mutate the list in the code body, they always do an early return in
those circumstances instead of returning into the macro'd for() loop,
so the _SAFE usage was needless.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Refactoring. Mempool wants to unpend all threads at once. It's
cleaner to do this in the scheduler instead of the IPC code.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The various macros to do checks in system call handlers all
implicitly would generate a kernel oops if a check failed.
This is undesirable for a few reasons:
* System call handlers that acquire resources in the handler
have no good recourse for cleanup if a check fails.
* In some cases we may want to propagate a return value back
to the caller instead of just killing the calling thread,
even though the base API doesn't do these checks.
These macros now all return a value, if nonzero is returned
the check failed. K_OOPS() now wraps these calls to generate
a kernel oops.
At the moment, the policy for all APIs has not changed. They
still all oops upon a failed check.
The macros now use the Z_ notation for private APIs.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
This was wrong in two ways, one subtle and one awful.
The subtle problem was that the IRQ lock isn't actually globally
recursive, it gets reset when you context switch (i.e. a _Swap()
implicitly releases and reacquires it). So the recursive count I was
keeping needs to be per-thread or else we risk deadlock any time we
swap away from a thread holding the lock.
And because part of my brain apparently knew this, there was an
"optimization" in the code that tested the current count vs. zero
outside the lock, on the argument that if it was non-zero we must
already hold the lock. Which would be true of a per-thread counter,
but NOT a global one: the other CPU may be holding that lock, and this
test will tell you *you* do. The upshot is that a recursive
irq_lock() would almost always SUCCEED INCORRECTLY when there was lock
contention. That this didn't break more things is amazing to me.
The rework is actually simpler than the original, thankfully. Though
there are some further subtleties:
* The lock state implied by irq_lock() allows the lock to be
implicitly released on context switch (i.e. you can _Swap() with the
lock held at a recursion level higher than 1, which needs to allow
other processes to run). So return paths into threads from _Swap()
and interrupt/exception exit need to check and restore the global
lock state, spinning as needed.
* The idle loop design specifies a k_cpu_idle() function that is on
common architectures expected to enable interrupts (for obvious
reasons), but there is no place to put non-arch code to wire it into
the global lock accounting. So on SMP, even CPU0 needs to use the
"dumb" spinning idle loop.
Finally this patch contains a simple bugfix too, found by inspection:
the interrupt return code used when CONFIG_SWITCH is enabled wasn't
correctly setting the active flag on the threads, opening up the
potential for a race that might result in a thread being scheduled on
two CPUs simultaneously.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a ton of junk in this header. Pare it down to just the
stuff actually used by code outside of sched.c, move the needed
internal stuff into sched.c itself, and drop everything else.
Note that (other than the tiny inlines that remain here in the header)
the scheduler interface exposed to the rest of the system is now
composed of just 12 functions.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Almost everywhere this was called, it was immediately followed by
_abort_thread_timeout(), for obvious reasons. The only exceptions
were in timeout and k_timer expiration (unifying these two would be
another good cleanup), which are peripheral parts of the scheduler and
can plausibly use a more "internal" API.
So make the common case the default, and expose the old behavior as
_unpend_thread_no_timeout(). (Along with identical changes for
_unpend_first_thread) Saves code bytes and simplifies scheduler
surface area for future synchronization work.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Now that other work has eliminated the two cases where we had to do a
reschedule "but yield even if we are cooperative", we can squash both
down to a single _reschedule() function which does almost exactly what
legacy _Swap() did, but wrapped as a proper scheduler API.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Everywhere the current thread is pended, the code is going to have to
do a _Swap() soon afterward, yet the scheduler API exposed these as
separate steps. Unify this pattern everywhere it appears, which saves
some code bytes and gets _Swap() out of the general scheduler API at
zero cost.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
There was a somewhat promiscuous pattern in the kernel where IPC
mechanisms would do something that might affect the current thread
choice, then check _must_switch_threads() (or occasionally
__must_switch_threads -- don't ask, the distinction is being replaced
by real English words), sometimes _is_in_isr() (but not always, even
in contexts where that looks like it would be a mistake), and then
call _Swap() if everything is OK, otherwise releasing the irq_lock().
Sometimes this was done directly, sometimes via the inverted test,
sometimes (poll, heh) by doing the test when the thread state was
modified and then needlessly passing the result up the call stack to
the point of the _Swap().
And some places were just calling _reschedule_threads(), which did all
this already.
Unify all this madness. The old _reschedule_threads() function has
split into two variants: _reschedule_yield() and
_reschedule_noyield(). The latter is the "normal" one that respects
the cooperative priority of the current thread (i.e. it won't switch
out even if there is a higher priority thread ready -- the current
thread has to pend itself first), the former is used in the handful of
places where code was doing a swap unconditionally, just to preserve
precise behavior across the refactor. I'm not at all convinced it
should exist...
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The mailbox code was written to use the _remove_thread_from_ready_q()
API directly, which would be good to get out of the scheduler internal
API. What it really wanted to do is to mark a thread "PENDING"
without actually adding it to a wait queue, which is sane enough (the
message stores the "thread to wake up on receipt" handle).
So allow that naturally in the _pend_thread() API by passing a NULL
wait_q. Really a wait_q needn't be the only way a thread can block.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
A priority value cannot be simultaneously higher than the maximum
possible value and smaller than the minimum value. Rewrite the
_VALID_PRIO() macro as a function so that if either of these
bounds is violated, the priority is considered invalid.
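A sketch of the reworked shape (name and bounds parameters hypothetical):
static inline bool prio_is_valid(int prio, int highest, int lowest)
{
        /* valid only inside [highest, lowest]; numerically lower values
         * are higher priority */
        return (prio >= highest) && (prio <= lowest);
}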
Coverity-CID: 182584
Coverity-CID: 182585
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
The scheduler has a kernel-internal _pend_thread() utility which
sounds like a function which will add an arbitrary thread to a wait_q.
This is essentially unsupportable in SMP, where that thread might
actually be executing on a different CPU.
Thankfully we never used it like that. The only spots outside the
scheduler that use the API are in pipes and mailbox, which both just
want to pend a DUMMY thread to track the timeout but will never try to
pend a true foreign thread.
Clarify the comment and add an assertion to make sure this promise
isn't broken in the future.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The scheduler exposed two APIs to do the same thing:
_add_thread_to_ready_q() was a low level primitive that in most cases
was wrapped by _ready_thread(), which also (1) checks that the thread
_is_ready() or exits, (2) flags the thread as "started" to handle the
case of a thread running for the first time out of a waitq timeout,
and (3) signals a logger event.
As it turns out, all existing usage was already checking case #1.
Case #2 can be better handled in the timeout resume path instead of on
every call. And case #3 was probably wrong to have been skipping
anyway (there were paths that could make a thread runnable without
logging).
Now _add_thread_to_ready_q() is an internal scheduler API, as it
probably always should have been.
This also moves some asserts from the inline _ready_thread() wrapper
to the underlying true function for code size reasons, otherwise the
extra use of the inline added by this patch blows past code size
limits on Quark D2000.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The xtensa asm2 layer had a function to select the next switch handle
to return into following an exception. There is no arch-specific code
there, it's just scheduler logic. Move it to the scheduler where it
belongs.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Names that begin with an underscore are reserved by the C standard.
This patch does not change names of functions defined and implemented
in header files.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
The scheduler needs a few tweaks to work in SMP mode:
1. The "cache" field just doesn't work. With more than one CPU,
caching the highest priority thread isn't useful as you may need N
of them at any given time before another thread is returned to the
scheduler. You could recalculate it at every change, but that
provides no performance benefit. Remove.
2. The "bitmask" designed to prevent the need to individually check
priorities is likewise dropped. This could work, but in fact on
our only current SMP system and with current K_NUM_PRIORITIES
values it provides no real benefit.
3. The individual threads now have a "current cpu" and "active" flag
so that the choice of the next thread to run can correctly skip
threads that are active on other CPUs.
The upshot is that a decent amount of code gets #if'd out, and the new
SMP implementations for _get_highest_ready_prio() and
_get_next_ready_thread() are simpler and smaller, at the expense of
having to drop older optimizations.
Note that scheduler synchronization is unchanged: all scheduler APIs
used to require that an irq_lock() be held, which means that they now
require the global spinlock via the same API. This should be a very
early candidate for lock granularity attention!
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
The xtensa-asm2 work included a patch that added nano_internal.h
includes in lots of places that needed to have _Swap defined, because
it had to break a cycle and this no longer got pulled in from the arch
headers.
Unfortunately those new includes created new and more amusing cycles
elsewhere which led to breakage on other platforms.
Break out the _Swap definition (only) into a separate header and use
that instead. Cleaner. Seems not to have any more hidden gotchas.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
_Swap() is defined in nano_internal.h. Everything calls _Swap().
Pretty much nothing that called _Swap() included nano_internal.h,
expecting it to be picked up automatically through other headers (as
it happened, from the kernel arch-specific include file). A new
_Swap() is going to need some other symbols in the inline definition,
so I needed to break that cycle. Now nothing sees _Swap() defined
anymore. Put nano_internal.h everywhere it's needed.
Our kernel includes remain a big awful yucky mess. This makes things
more correct but no less ugly. Needs cleanup.
Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
Having two implementations of the same thing is bad,
especially when one can just call the other inline version.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
All arguments coming from userspace have data type u32_t, but
base.prio has data type s8_t. A comparison between s8_t and u32_t
cannot be done directly. That's why the priority coming from userspace
(prio) is cast to the s8_t data type.
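Illustratively (helper shape hypothetical):
static bool prio_is_higher(u32_t prio_arg, s8_t current_prio)
{
        s8_t prio = (s8_t)prio_arg;     /* cast the userspace value first */
        return prio < current_prio;     /* now an s8_t vs s8_t comparison */
}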
Signed-off-by: Punit Vara <punit.vara@intel.com>
Use some preprocessor trickery to automatically deduce the amount of
arguments for the various _SYSCALL_HANDLERn() macros. Makes the grunt
work of converting a bunch of kernel APIs to system calls slightly
easier.
Signed-off-by: Leandro Pereira <leandro.pereira@intel.com>
User threads aren't trusted and shouldn't be able to alter the
scheduling assumptions of the system by making thread priorities more
favorable.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
We now have macros which should significantly reduce the amount of
boilerplate involved with defining system call handlers.
- Macros which define the proper prototype based on number of arguments
- "SIMPLE" variants which create handlers that don't need anything
other than object verification
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Use new _SYSCALL_OBJ/_SYSCALL_OBJ_INIT macros.
Use new _SYSCALL_MEMORY_READ/_SYSCALL_MEMORY_WRITE macros.
Some non-obvious checks changed to use _SYSCALL_VERIFY_MSG.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
SYS_DLIST_FOR_EACH_CONTAINER is preferable over using
SYS_DLIST_FOR_EACH_NODE, as that avoids casting directly, which assumes
the node field is always at the beginning.
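An illustrative before/after, with a hypothetical container type whose
node is not the first field and a hypothetical list:
struct item {
        int payload;
        sys_dnode_t node;
};
sys_dnode_t *n;
struct item *it;
/* before: node iteration plus a cast that assumes 'node' comes first */
SYS_DLIST_FOR_EACH_NODE(&list, n) {
        it = (struct item *)n;
}
/* after: the container macro names the member explicitly */
SYS_DLIST_FOR_EACH_CONTAINER(&list, it, node) {
        /* 'it' points at the containing struct item */
}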
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
If there are multiple preemptive threads with the same priority, and
any one thread gives up the CPU before its time slice expires (due to
yields, semaphore takes, queue operations, etc.), then the next
scheduled thread gets a smaller time slice than expected.
This patch fixes this issue by accounting for the time expired when a
thread releases the CPU before its time slice expires.
Jira: ZEP-2217/ZEP-2218
Signed-off-by: Youvedeep Singh <youvedeep.singh@intel.com>
The kernel tracks time slice usage with the _time_slice_elapsed global.
Every time the timer interrupt goes off and the timer driver calls
_nano_sys_clock_tick_announce() with the elapsed time, this is added to
_time_slice_elapsed. If it exceeds the total time slice, the thread is
moved to the back of the queue for that priority level and
_time_slice_elapsed is reset to zero.
In a non-tickless kernel, this is the only time _time_slice_elapsed is
reset. If a thread uses up a partial time slice, and then cooperatively
switches to another thread, the next thread will inherit the remaining
time slice, causing it not to be able to run as long as it ought to.
There does exist code to properly reset the elapsed count, but it was
only compiled in a tickless kernel. Now it is built any time
CONFIG_TIMESLICING is enabled.
Issue: ZEP-2107
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Fixes sparse warning:
<snip>/zephyr/kernel/sched.c:368:6: warning: symbol '_dump_ready_q' was not declared. Should it be static?
Change-Id: I156e89f1d74178bbd99cc25e532da544c7ebee60
Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
This places a sentinel value at the lowest 4 bytes of a stack
memory region and checks it at various intervals, including when
servicing interrupts or context switching.
This is implemented on all arches except ARC, which supports stack
bounds checking directly in hardware.
Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Adds event based scheduling logic to the kernel. Updates
management of timeouts, timers, idling etc. based on
time tracked at events rather than periodic ticks. Provides
interfaces for timers to announce and get next timer expiry
based on kernel scheduling decisions involving time slicing
of threads, timeouts and idling. Uses wall time units instead
of ticks in all scheduling activities.
The implementation involves changes in the following areas
1. Management of time in wall units like ms/us instead of ticks
The existing implementation already had an option to configure
number of ticks in a second. The new implementation builds on
top of that feature and provides an option to set the size of the
scheduling granularity to milliseconds or microseconds. This
allows most of the current implementation to be reused. Due to
this re-use and co-existence with tick based kernel, the names
of variables may contain the word "tick". However, in the
tickless kernel implementation, it represents the currently
configured time unit, which would be milliseconds or
microseconds. The APIs that take time as a parameter are not
impacted and they continue to pass time in milliseconds.
2. Timers would not be programmed in periodic mode
generating ticks. Instead they would be programmed in one
shot mode to generate events at the time the kernel scheduler
needs to gain control for its scheduling activities like
timers, timeouts, time slicing, idling etc.
3. The scheduler provides interfaces that the timer drivers
use to announce elapsed time and get the next time the scheduler
needs a timer event. It is possible that the scheduler may not
need another timer event, in which case the system would wait
for a non-timer event to wake it up if it is idling.
4. New APIs are defined to be implemented by timer drivers. Also
they need to handle timer events differently. These changes
have been done in the HPET timer driver. In future other timers
that support the tickless kernel should implement these APIs as well.
These APIs are to re-program the timer, update and announce
elapsed time.
5. Philosopher and timer_api applications have been enabled to
test tickless kernel. Separate configuration files are created
which define the necessary CONFIG flags. Run these apps using
following command
make pristine && make BOARD=qemu_x86 CONF_FILE=prj_tickless.conf qemu
Jira: ZEP-339 ZEP-1946 ZEP-948
Change-Id: I7d950c31bf1ff929a9066fad42c2f0559a2e5983
Signed-off-by: Ramesh Thomas <ramesh.thomas@intel.com>
Convert code to use u{8,16,32,64}_t and s{8,16,32,64}_t instead of C99
integer types. This handles the remaining includes and kernel, plus
touching up various points that we skipped because of include
dependencies. We also convert the PRI printf formatters in the arch
code over to normal formatters.
Jira: ZEP-2051
Change-Id: Iecbb12601a3ee4ea936fd7ddea37788a645b08b0
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
This reverts commit 7b9dc107a8.
We revert this as we intent to move away from {u}int{8,16,32,64}_t types
to our own internal types for sized variables so we shouldn't need the
PRI macros anymore.
Change-Id: I1d9d797fee47ca266867ae65656c150f8fe2adb2
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
To allow for various libc implementations (like newlib) in which the way
various {u}int{8,16,32}_t types are defined vary between both libc
implementations and across architectures we need to utilize the PRI
defines.
Change-Id: Ie884fb67015502288152ecbd64c37961a4f538e4
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
This has not bitten us yet, but it was a ticking timebomb.
This is similar to the issue that was found with irq_lock/irq_unlock
implementations on several architectures. Having a volatile variable is
not the way to force the sched_lock variable to be
incremented/decremented around the accesses to data it protects.
Instead, a compiler barrier must prevent the compiler from reordering
the memory accesses around setting of sched_lock. Needed in the inline
implementations _sched_lock()/_sched_unlock_no_reschedule(), which
resolve to simple decrement/increment of the per-thread sched_lock
variable.
Change-Id: I06f5b3524889f193efe69caa947118404b1be0b5
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>
Replace the existing Apache 2.0 boilerplate header with an SPDX tag
throughout the zephyr code tree. This patch was generated via a
script run over the master branch.
Also updated doc/porting/application.rst that had a dependency on
line numbers in a literal include.
Manually updated subsys/logging/sys_log.c that had a malformed
header in the original file. Also cleanup several cases that already
had a SPDX tag and we either got a duplicate or missed updating.
Jira: ZEP-1457
Change-Id: I6131a1d4ee0e58f5b938300c2d2fc77d2e69572c
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
The idle priority was not accounted for.
With this change, the philosophers demo runs in coop-only mode.
Change-Id: I23db33687bcf3b2107d5fc07977143730f62e476
Signed-off-by: Benjamin Walsh <walsh.benj@gmail.com>
It's calling a function on every iteration, it's more efficient to just
do the logic inline.
Change-Id: I166e377d4ffb3056749fd625cb789173030904ac
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
This will allow for an enhancement when checking if the thread is
preemptible when exiting an interrupt.
Change-Id: If93ccd1916eacb5e02a4d15b259fb74f9800d6f4
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Not needed, since only the thread itself can modify its own
sched_locked count.
Change-Id: I3d3d8be548d2b24ca14f51637cc58bda66f8b9ee
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Some tick frequencies lend themselves to optimized conversions from ms
to ticks and vice-versa.
- 1000Hz which does not need any conversion
- 500Hz, 250Hz, 125Hz where the division/multiplication are a straight
shift since they are power-of-two factors of 1000.
In addition, some more generally used values are made to use optimized
conversion equations rather than the generic one that uses 64-bit math,
and often results in calling compiler intrinsics.
These values are: 100Hz, 50Hz, 25Hz, 20Hz, 10Hz, 1Hz (the last one used
in some testing).
Avoiding the 64-bit math intrinsics has the additional benefit, in
addition to increased performance, of using a significant lower amount
of stack space: 52 bytes on ARM Cortex-M and 80 bytes on x86.
Change-Id: I080eb338a2637d6b1c6838c119af1a9fa37fe869
Signed-off-by: Benjamin Walsh <benjamin.walsh@windriver.com>
Also remove mentions of unified kernel in various places in the kernel,
samples and documentation.
Change-Id: Ice43bc73badbe7e14bae40fd6f2a302f6528a77d
Signed-off-by: Anas Nashif <anas.nashif@intel.com>