f9932c578d
We now have a different output, so capture it correctly. Signed-off-by: Anas Nashif <anas.nashif@intel.com>

| | | |
|---|---|---|
| .. | | |
| boards | | |
| src | | |
| CMakeLists.txt | | |
| Kconfig | | |
| prj.conf | | |
| README.rst | | |
| testcase.yaml | | |

Latency Measurements
####################

This benchmark measures the average latency of selected kernel capabilities,
including:

* Context switch time between preemptive threads using k_yield
* Context switch time between cooperative threads using k_yield
* Time to switch from ISR back to interrupted thread
* Time from ISR to executing a different thread (rescheduled)
* Time to signal a semaphore then test that semaphore
* Time to signal a semaphore then test that semaphore with a context switch
* Time to lock a mutex then unlock that mutex
* Time it takes to create a new thread (without starting it)
* Time it takes to start a newly created thread
* Time it takes to suspend a thread
* Time it takes to resume a suspended thread
* Time it takes to abort a thread
* Time it takes to add data to a FIFO/LIFO
* Time it takes to retrieve data from a FIFO/LIFO
* Time it takes to wait on a FIFO/LIFO (and context switch)
* Time it takes to wake and switch to a thread waiting on a FIFO/LIFO
* Time it takes to send and receive events
* Time it takes to wait for events (and context switch)
* Time it takes to wake and switch to a thread waiting for events
* Time it takes to push and pop to/from a k_stack
* Average time to allocate memory from the heap then free that memory

When userspace is enabled using the prj_user.conf configuration file, this
benchmark will, where possible, also test the above capabilities using various
configurations involving user threads:

* Kernel thread to kernel thread
* Kernel thread to user thread
* User thread to kernel thread
* User thread to user thread

Sample output of the benchmark (without userspace enabled)::

    *** Booting Zephyr OS build zephyr-v3.5.0-4267-g6ccdc31233a3 ***
    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 329 cycles , 2741 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 329 cycles , 2741 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 363 cycles , 3033 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 404 cycles , 3367 ns :
    thread.create.kernel.from.kernel - Create thread : 404 cycles , 3374 ns :
    thread.start.kernel.from.kernel - Start thread : 423 cycles , 3533 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 428 cycles , 3574 ns :
    thread.resume.kernel.from.kernel - Resume thread : 350 cycles , 2924 ns :
    thread.abort.kernel.from.kernel - Abort thread : 339 cycles , 2826 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 269 cycles , 2242 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 128 cycles , 1074 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 945 cycles , 7875 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 575 cycles , 4792 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 551 cycles , 4592 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 660 cycles , 5500 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 553 cycles , 4608 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 655 cycles , 5458 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 280 cycles , 2341 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 133 cycles , 1116 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 945 cycles , 7875 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 580 cycles , 4833 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 553 cycles , 4608 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 655 cycles , 5458 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 550 cycles , 4583 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 655 cycles , 5458 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 225 cycles , 1875 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 225 cycles , 1875 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 130 cycles , 1083 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 135 cycles , 1125 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 573 cycles , 4783 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 784 cycles , 6534 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 589 cycles , 4916 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 795 cycles , 6626 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 125 cycles , 1041 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 69 cycles , 575 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 494 cycles , 4116 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 599 cycles , 4992 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 692 cycles , 5767 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 715 cycles , 5958 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 166 cycles , 1391 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 82 cycles , 691 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 499 cycles , 4166 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 645 cycles , 5375 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 100 cycles , 833 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 40 cycles , 333 ns :
    heap.malloc.immediate - Average time for heap malloc : 627 cycles , 5225 ns :
    heap.free.immediate - Average time for heap free : 432 cycles , 3600 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL

Sample output of the benchmark (with userspace enabled)::

    *** Booting Zephyr OS build zephyr-v3.5.0-4268-g6af7a1230a08 ***
    thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 970 cycles , 8083 ns :
    thread.yield.preemptive.ctx.u_to_u - Context switch via k_yield : 1260 cycles , 10506 ns :
    thread.yield.preemptive.ctx.k_to_u - Context switch via k_yield : 1155 cycles , 9632 ns :
    thread.yield.preemptive.ctx.u_to_k - Context switch via k_yield : 1075 cycles , 8959 ns :
    thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 970 cycles , 8083 ns :
    thread.yield.cooperative.ctx.u_to_u - Context switch via k_yield : 1260 cycles , 10506 ns :
    thread.yield.cooperative.ctx.k_to_u - Context switch via k_yield : 1155 cycles , 9631 ns :
    thread.yield.cooperative.ctx.u_to_k - Context switch via k_yield : 1075 cycles , 8959 ns :
    isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 415 cycles , 3458 ns :
    isr.resume.different.thread.kernel - Return from ISR to another thread : 985 cycles , 8208 ns :
    isr.resume.different.thread.user - Return from ISR to another thread : 1180 cycles , 9833 ns :
    thread.create.kernel.from.kernel - Create thread : 989 cycles , 8249 ns :
    thread.start.kernel.from.kernel - Start thread : 1059 cycles , 8833 ns :
    thread.suspend.kernel.from.kernel - Suspend thread : 1030 cycles , 8583 ns :
    thread.resume.kernel.from.kernel - Resume thread : 994 cycles , 8291 ns :
    thread.abort.kernel.from.kernel - Abort thread : 2370 cycles , 19751 ns :
    thread.create.user.from.kernel - Create thread : 860 cycles , 7167 ns :
    thread.start.user.from.kernel - Start thread : 8965 cycles , 74713 ns :
    thread.suspend.user.from.kernel - Suspend thread : 1400 cycles , 11666 ns :
    thread.resume.user.from.kernel - Resume thread : 1174 cycles , 9791 ns :
    thread.abort.user.from.kernel - Abort thread : 2240 cycles , 18666 ns :
    thread.create.user.from.user - Create thread : 2105 cycles , 17542 ns :
    thread.start.user.from.user - Start thread : 9345 cycles , 77878 ns :
    thread.suspend.user.from.user - Suspend thread : 1590 cycles , 13250 ns :
    thread.resume.user.from.user - Resume thread : 1534 cycles , 12791 ns :
    thread.abort.user.from.user - Abort thread : 2850 cycles , 23750 ns :
    thread.start.kernel.from.user - Start thread : 1440 cycles , 12000 ns :
    thread.suspend.kernel.from.user - Suspend thread : 1219 cycles , 10166 ns :
    thread.resume.kernel.from.user - Resume thread : 1355 cycles , 11292 ns :
    thread.abort.kernel.from.user - Abort thread : 2980 cycles , 24834 ns :
    fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 315 cycles , 2625 ns :
    fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 209 cycles , 1749 ns :
    fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 1040 cycles , 8667 ns :
    fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 670 cycles , 5583 ns :
    fifo.put.alloc.immediate.user - Allocate to add data to FIFO (no ctx switch) : 1765 cycles , 14709 ns :
    fifo.get.free.immediate.user - Free when getting data from FIFO (no ctx switch) : 1410 cycles , 11750 ns :
    fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 1220 cycles , 10168 ns :
    fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 1285 cycles , 10708 ns :
    fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 1235 cycles , 10291 ns :
    fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1340 cycles , 11167 ns :
    fifo.get.free.blocking.u_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 1715 cycles , 14292 ns :
    fifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to FIFO (w/ ctx switch) : 1665 cycles , 13876 ns :
    fifo.get.free.blocking.k_to_u - Free when getting data from FIFO (w/ ctx siwtch) : 1565 cycles , 13042 ns :
    fifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1815 cycles , 15126 ns :
    fifo.get.free.blocking.u_to_u - Free when getting data from FIFO (w/ ctx siwtch) : 2045 cycles , 17042 ns :
    fifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to FIFO (w/ ctx switch) : 2140 cycles , 17834 ns :
    lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 309 cycles , 2583 ns :
    lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 219 cycles , 1833 ns :
    lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 1030 cycles , 8583 ns :
    lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 685 cycles , 5708 ns :
    lifo.put.alloc.immediate.user - Allocate to add data to LIFO (no ctx switch) : 1755 cycles , 14625 ns :
    lifo.get.free.immediate.user - Free when getting data from LIFO (no ctx switch) : 1405 cycles , 11709 ns :
    lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 1229 cycles , 10249 ns :
    lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 1290 cycles , 10751 ns :
    lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 1235 cycles , 10292 ns :
    lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 1310 cycles , 10917 ns :
    lifo.get.free.blocking.u_to_k - Free when getting data from LIFO (w/ ctx switch) : 1715 cycles , 14293 ns :
    lifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to LIFO (w/ ctx siwtch) : 1630 cycles , 13583 ns :
    lifo.get.free.blocking.k_to_u - Free when getting data from LIFO (w/ ctx switch) : 1554 cycles , 12958 ns :
    lifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 1805 cycles , 15043 ns :
    lifo.get.free.blocking.u_to_u - Free when getting data from LIFO (w/ ctx switch) : 2035 cycles , 16959 ns :
    lifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to LIFO (w/ ctx siwtch) : 2125 cycles , 17709 ns :
    events.post.immediate.kernel - Post events (nothing wakes) : 295 cycles , 2458 ns :
    events.set.immediate.kernel - Set events (nothing wakes) : 300 cycles , 2500 ns :
    events.wait.immediate.kernel - Wait for any events (no ctx switch) : 220 cycles , 1833 ns :
    events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 215 cycles , 1791 ns :
    events.post.immediate.user - Post events (nothing wakes) : 795 cycles , 6625 ns :
    events.set.immediate.user - Set events (nothing wakes) : 790 cycles , 6584 ns :
    events.wait.immediate.user - Wait for any events (no ctx switch) : 740 cycles , 6167 ns :
    events.wait_all.immediate.user - Wait for all events (no ctx switch) : 740 cycles , 6166 ns :
    events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 1190 cycles , 9918 ns :
    events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1464 cycles , 12208 ns :
    events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 1235 cycles , 10292 ns :
    events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1500 cycles , 12500 ns :
    events.wait.blocking.u_to_k - Wait for any events (w/ ctx switch) : 1580 cycles , 13167 ns :
    events.set.wake+ctx.k_to_u - Set events (w/ ctx switch) : 1630 cycles , 13583 ns :
    events.wait_all.blocking.u_to_k - Wait for all events (w/ ctx switch) : 1765 cycles , 14708 ns :
    events.post.wake+ctx.k_to_u - Post events (w/ ctx switch) : 1795 cycles , 14960 ns :
    events.wait.blocking.k_to_u - Wait for any events (w/ ctx switch) : 1375 cycles , 11459 ns :
    events.set.wake+ctx.u_to_k - Set events (w/ ctx switch) : 1825 cycles , 15209 ns :
    events.wait_all.blocking.k_to_u - Wait for all events (w/ ctx switch) : 1555 cycles , 12958 ns :
    events.post.wake+ctx.u_to_k - Post events (w/ ctx switch) : 1995 cycles , 16625 ns :
    events.wait.blocking.u_to_u - Wait for any events (w/ ctx switch) : 1765 cycles , 14708 ns :
    events.set.wake+ctx.u_to_u - Set events (w/ ctx switch) : 1989 cycles , 16583 ns :
    events.wait_all.blocking.u_to_u - Wait for all events (w/ ctx switch) : 2085 cycles , 17376 ns :
    events.post.wake+ctx.u_to_u - Post events (w/ ctx switch) : 2290 cycles , 19084 ns :
    semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 220 cycles , 1833 ns :
    semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 130 cycles , 1083 ns :
    semaphore.give.immediate.user - Give a semaphore (no waiters) : 710 cycles , 5917 ns :
    semaphore.take.immediate.user - Take a semaphore (no blocking) : 655 cycles , 5458 ns :
    semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 1135 cycles , 9458 ns :
    semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 1244 cycles , 10374 ns :
    semaphore.take.blocking.k_to_u - Take a semaphore (context switch) : 1325 cycles , 11048 ns :
    semaphore.give.wake+ctx.u_to_k - Give a semaphore (context switch) : 1610 cycles , 13416 ns :
    semaphore.take.blocking.u_to_k - Take a semaphore (context switch) : 1499 cycles , 12499 ns :
    semaphore.give.wake+ctx.k_to_u - Give a semaphore (context switch) : 1434 cycles , 11957 ns :
    semaphore.take.blocking.u_to_u - Take a semaphore (context switch) : 1690 cycles , 14090 ns :
    semaphore.give.wake+ctx.u_to_u - Give a semaphore (context switch) : 1800 cycles , 15000 ns :
    condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1385 cycles , 11542 ns :
    condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1420 cycles , 11833 ns :
    condvar.wait.blocking.k_to_u - Wait for a condvar (context switch) : 1537 cycles , 12815 ns :
    condvar.signal.wake+ctx.u_to_k - Signal a condvar (context switch) : 1950 cycles , 16250 ns :
    condvar.wait.blocking.u_to_k - Wait for a condvar (context switch) : 2025 cycles , 16875 ns :
    condvar.signal.wake+ctx.k_to_u - Signal a condvar (context switch) : 1715 cycles , 14298 ns :
    condvar.wait.blocking.u_to_u - Wait for a condvar (context switch) : 2313 cycles , 19279 ns :
    condvar.signal.wake+ctx.u_to_u - Signal a condvar (context switch) : 2225 cycles , 18541 ns :
    stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 244 cycles , 2041 ns :
    stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 195 cycles , 1630 ns :
    stack.push.immediate.user - Add data to k_stack (no ctx switch) : 714 cycles , 5956 ns :
    stack.pop.immediate.user - Get data from k_stack (no ctx switch) : 1009 cycles , 8414 ns :
    stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 1234 cycles , 10291 ns :
    stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 1360 cycles , 11333 ns :
    stack.pop.blocking.u_to_k - Get data from k_stack (w/ ctx switch) : 2084 cycles , 17374 ns :
    stack.push.wake+ctx.k_to_u - Add data to k_stack (w/ ctx switch) : 1665 cycles , 13875 ns :
    stack.pop.blocking.k_to_u - Get data from k_stack (w/ ctx switch) : 1544 cycles , 12874 ns :
    stack.push.wake+ctx.u_to_k - Add data to k_stack (w/ ctx switch) : 1850 cycles , 15422 ns :
    stack.pop.blocking.u_to_u - Get data from k_stack (w/ ctx switch) : 2394 cycles , 19958 ns :
    stack.push.wake+ctx.u_to_u - Add data to k_stack (w/ ctx switch) : 2155 cycles , 17958 ns :
    mutex.lock.immediate.recursive.kernel - Lock a mutex : 155 cycles , 1291 ns :
    mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 57 cycles , 475 ns :
    mutex.lock.immediate.recursive.user - Lock a mutex : 665 cycles , 5541 ns :
    mutex.unlock.immediate.recursive.user - Unlock a mutex : 585 cycles , 4875 ns :
    heap.malloc.immediate - Average time for heap malloc : 640 cycles , 5341 ns :
    heap.free.immediate - Average time for heap free : 436 cycles , 3633 ns :
    ===================================================================
    PROJECT EXECUTION SUCCESSFUL
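
Every result line in the two sample outputs appears to follow one layout: test name, description, average cycles, average nanoseconds. The following is a minimal sketch, assuming that layout holds for every result line, of how a captured log could be parsed to compare a kernel-only run with a userspace run. The names ``Result``, ``parse_results``, ``slowdown`` and ``implied_mhz`` are hypothetical helpers for illustration and are not part of the benchmark. The clock rate it derives is only what the sample figures imply (roughly 120 MHz), not a documented property of any board.

```python
import re
from typing import Dict, NamedTuple


class Result(NamedTuple):
    description: str
    cycles: int
    ns: int


# Assumed layout: "<name> - <description> : <N> cycles , <M> ns :"
LINE_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+-\s+(?P<desc>.+?)\s*:\s*"
    r"(?P<cycles>\d+)\s+cycles\s*,\s*(?P<ns>\d+)\s+ns\s*:?\s*$"
)


def parse_results(log: str) -> Dict[str, Result]:
    """Map each test name to its result; non-result lines are skipped."""
    results: Dict[str, Result] = {}
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            results[m["name"]] = Result(
                m["desc"], int(m["cycles"]), int(m["ns"])
            )
    return results


def slowdown(baseline: Dict[str, Result],
             other: Dict[str, Result]) -> Dict[str, float]:
    """Cycle-count ratio other/baseline for tests present in both runs."""
    return {
        name: other[name].cycles / baseline[name].cycles
        for name in baseline
        if name in other and baseline[name].cycles > 0
    }


def implied_mhz(result: Result) -> float:
    """Clock rate implied by one line's cycles and nanoseconds (approximate)."""
    return result.cycles * 1000.0 / result.ns


# A few lines copied from the sample outputs above.
KERNEL_ONLY = """\
*** Booting Zephyr OS build zephyr-v3.5.0-4267-g6ccdc31233a3 ***
semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 125 cycles , 1041 ns :
fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 945 cycles , 7875 ns :
PROJECT EXECUTION SUCCESSFUL
"""

USERSPACE = """\
semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 220 cycles , 1833 ns :
semaphore.give.immediate.user - Give a semaphore (no waiters) : 710 cycles , 5917 ns :
fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 1040 cycles , 8667 ns :
"""

if __name__ == "__main__":
    base = parse_results(KERNEL_ONLY)
    user = parse_results(USERSPACE)
    for name, ratio in sorted(slowdown(base, user).items()):
        print(f"{name}: {ratio:.2f}x")
```

Only tests whose names occur in both logs are compared, so entries that exist solely in the userspace run (those ending in ``.user`` or involving ``u_to_*`` / ``*_to_u``) are left out of the ratio table.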