.. _rtio_api:

Real Time I/O (RTIO)
####################

.. contents::
    :local:
    :depth: 2

.. image:: rings.png
  :width: 800
  :alt: Submissions and Completion Ring Queues

RTIO provides a framework for doing asynchronous operation chains with event
driven I/O. This section covers the RTIO API, queues, executor, iodev,
and common usage patterns with peripheral devices.

RTIO takes a lot of inspiration from Linux's io_uring in its operations and API
as that API matches up well with hardware transfer queues and descriptions such as
DMA transfer lists.

Problem
*******

An application wishing to do complex DMA or interrupt driven operations today
in Zephyr requires direct knowledge of the hardware and how it works. The DMA
API has no understanding of other Zephyr devices or how they relate to one
another.

This means doing complex audio, video, or sensor streaming requires direct
hardware knowledge or leaky abstractions over DMA controllers. Neither is ideal.

To enable asynchronous operations, especially with DMA, a description of what
to do is needed, rather than direct operations through C and callbacks. Enabling
DMA features such as channels with priority and sequences of transfers requires
more than a simple list of descriptions.

Using DMA and/or interrupt driven I/O shouldn't dictate whether or not a
call is blocking.

Inspiration, introducing io_uring
*********************************

It's better not to reinvent the wheel (or ring in this case), and io_uring, an
API from the Linux kernel, provides a winning model. In io_uring there are two
lock-free ring buffers acting as queues shared between the kernel and a userland
application. One queue holds submission entries, which may be chained and flushed
to create concurrent sequential requests. A second queue holds completion queue
events. Only a single syscall, ``io_uring_enter`` (which liburing wraps as
``io_uring_submit``), is required to execute many operations. This call may
block the caller when a number of operations to wait on is given.

This model maps well to DMA and interrupt driven transfers. A request to do a
sequence of operations in an asynchronous way directly relates
to the way hardware typically works with interrupt driven state machines
potentially involving multiple peripheral IPs like bus and DMA controllers.

Submission Queue
****************

The submission queue (sq) is the description of the operations
to perform in concurrent chains.

For example, imagine a typical SPI transfer where you wish to write a
register address and then read from it. The sequence of operations might be:

1. Chip Select
2. Clock Enable
3. Write register address into SPI transmit register
4. Read from the SPI receive register into a buffer
5. Disable clock
6. Disable Chip Select

If anything in this chain of operations fails, give up. Some of those operations
can be embodied in a device abstraction that understands that a read or write
implicitly means setting up the clock and chip select. The transactional nature
of the request also needs to be embodied in some manner. Of the operations above,
perhaps the read could be done using DMA as it's large enough to make sense. That
requires an understanding of how to set up the device's particular DMA to do so.

The above sequence of operations is embodied in RTIO as a chain of
submission queue entries (sqe). Chaining is done by setting a bitflag in
an sqe to signify the next sqe must wait on the current one.

Because the chip select and clocking are common to a particular SPI controller
and device on the bus, they are embodied in what RTIO calls an iodev.

Multiple operations against the same iodev are done in the order provided as
soon as possible. If two operation chains use the same device at varying
points, it's possible one chain will have to wait for another to complete.

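The chaining rule described above can be illustrated with a small host-side
model. This is only a sketch and does not use the RTIO API: ``demo_sqe``,
``demo_cqe``, ``DEMO_SQE_CHAINED``, and ``demo_run`` are hypothetical names
invented for this example, and the choice of ``-ECANCELED`` for entries skipped
after a failure is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the RTIO types. */
#define DEMO_SQE_CHAINED 0x1u /* the next entry must wait on this one */

struct demo_sqe {
	unsigned int flags;
	int (*op)(void *arg); /* returns 0 on success, negative errno on failure */
	void *arg;
};

struct demo_cqe {
	int result;
};

/*
 * Run the entries in order and write one completion per submission.
 * Within a chain, the first failure cancels every remaining entry of that
 * chain. An entry without DEMO_SQE_CHAINED ends its chain, so the entry
 * after it starts with a clean slate.
 */
static size_t demo_run(const struct demo_sqe *sqes, size_t n,
		       struct demo_cqe *cqes)
{
	int chain_err = 0;

	assert(sqes != NULL && cqes != NULL);

	for (size_t i = 0; i < n; i++) {
		if (chain_err != 0) {
			cqes[i].result = -ECANCELED; /* model assumption */
		} else {
			cqes[i].result = sqes[i].op(sqes[i].arg);
			chain_err = cqes[i].result;
		}
		if ((sqes[i].flags & DEMO_SQE_CHAINED) == 0u) {
			chain_err = 0; /* this entry ends its chain */
		}
	}
	return n;
}

/* Two trivial operations used to exercise the model. */
static int demo_ok(void *arg)
{
	(void)arg;
	return 0;
}

static int demo_fail(void *arg)
{
	(void)arg;
	return -EIO;
}
```

Submitting the entries ``ok|CHAINED``, ``fail|CHAINED``, ``ok``, ``ok`` to this
model yields the results ``0``, ``-EIO``, ``-ECANCELED``, ``0``: the third entry
belongs to the failed chain and is skipped, while the fourth is independent and
still runs.
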
Completion Queue
****************

In order to know when an sqe has completed there is a completion
queue (cq) with completion queue events (cqe). An sqe, once completed, results in
a cqe being pushed into the cq. The ordering of cqes may not be the same as the
ordering of sqes. A chain of sqes will however ensure ordering and failure
cascading.

Other potential schemes are possible but a completion queue is a well-trodden
idea with io_uring and other similar operating system APIs.

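To make the idea of a lock-free completion ring concrete, here is a minimal
single-producer, single-consumer ring buffer written with C11 atomics. It is a
generic sketch of the technique, not the RTIO implementation: ``ring_cqe``,
``demo_cq``, ``demo_cq_push``, and ``demo_cq_pop`` are hypothetical names, and
the fixed size of four entries is chosen only for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CQ_SIZE 4u /* number of slots, must be a power of two */

static_assert((DEMO_CQ_SIZE & (DEMO_CQ_SIZE - 1u)) == 0u,
	      "DEMO_CQ_SIZE must be a power of two");

/* Hypothetical completion event: a result code plus caller context. */
struct ring_cqe {
	int32_t result;
	void *userdata;
};

struct demo_cq {
	_Atomic uint32_t head; /* next slot to consume, written by the consumer */
	_Atomic uint32_t tail; /* next slot to produce, written by the producer */
	struct ring_cqe slots[DEMO_CQ_SIZE];
};

/* Producer side: returns false when the ring is full. */
static bool demo_cq_push(struct demo_cq *q, struct ring_cqe e)
{
	uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);

	if (tail - head == DEMO_CQ_SIZE) {
		return false;
	}
	q->slots[tail & (DEMO_CQ_SIZE - 1u)] = e;
	/* Publish the slot only after its contents are written. */
	atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool demo_cq_pop(struct demo_cq *q, struct ring_cqe *out)
{
	uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

	if (head == tail) {
		return false;
	}
	*out = q->slots[head & (DEMO_CQ_SIZE - 1u)];
	/* Hand the slot back to the producer only after it has been copied. */
	atomic_store_explicit(&q->head, head + 1u, memory_order_release);
	return true;
}
```

The indices only ever increase and are masked on access, so ``tail - head``
gives the fill level even after the counters wrap. Each index has exactly one
writer, which is why no lock or compare-and-swap is needed in this
single-producer, single-consumer case.
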
Executor
********

The RTIO executor is a low overhead concurrent I/O task scheduler. It ensures
certain request flags provide the expected behavior. It takes a list of
submissions and works through them in order. Various flags change how
submissions are worked through: flags exist to form in-order chains of
submissions, transactional sets of submissions, and multi-shot
(continuously producing) requests.

IO Device
*********

Turning submission queue entries (sqe) into completion queue events (cqe) is the
job of objects implementing the iodev (IO device) API. This API accepts requests
in the form of the iodev submit API call. It is the iodev's job to work
through its internal queue of submissions and convert them into completions. In
effect every iodev can be viewed as an independent, event driven, actor-like
object that accepts a never ending queue of I/O-like requests. How the iodev
does this work is up to the author of the iodev. Perhaps the entire queue of
operations can be converted to a set of DMA transfer descriptors, meaning the
hardware does almost all of the real work.

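The shape of such an object can be sketched as a small table of function
pointers plus per-device state. The example below is a host-side model under
stated assumptions, not the RTIO iodev API: ``demo_iodev``, ``demo_iodev_api``,
``demo_req``, ``demo_complete``, and the loopback device are all hypothetical,
and the request is completed synchronously inside ``submit`` purely to keep the
example short. A real device would typically complete later, for instance from
an interrupt handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_op { DEMO_OP_TX, DEMO_OP_RX };

/* Hypothetical request: what to do, and where the outcome is reported. */
struct demo_req {
	enum demo_op op;
	uint8_t *buf;
	size_t len;
	int result; /* filled in on completion */
	int done;   /* set to 1 once the request has completed */
};

struct demo_iodev;

/* Every device implements the same single entry point. */
struct demo_iodev_api {
	void (*submit)(struct demo_iodev *dev, struct demo_req *req);
};

struct demo_iodev {
	const struct demo_iodev_api *api;
	void *data; /* device-specific state */
};

/* Turn a submission into a completion. */
static void demo_complete(struct demo_req *req, int result)
{
	req->result = result;
	req->done = 1;
}

/* A loopback device: TX stores bytes, RX returns the stored bytes. */
struct demo_loopback {
	uint8_t store[16];
	size_t used;
};

static void loopback_submit(struct demo_iodev *dev, struct demo_req *req)
{
	assert(dev != NULL && req != NULL);

	struct demo_loopback *lb = dev->data;

	if (req->op == DEMO_OP_TX && req->len <= sizeof(lb->store)) {
		memcpy(lb->store, req->buf, req->len);
		lb->used = req->len;
		demo_complete(req, 0);
	} else if (req->op == DEMO_OP_RX && req->len >= lb->used) {
		memcpy(req->buf, lb->store, lb->used);
		demo_complete(req, (int)lb->used); /* bytes read */
	} else {
		demo_complete(req, -1); /* request does not fit */
	}
}

static const struct demo_iodev_api loopback_api = {
	.submit = loopback_submit,
};
```

A caller only ever touches ``dev->api->submit``, so swapping the loopback
device for one backed by DMA descriptors would not change the submitting code.
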
Cancellation
************

Canceling an already queued operation is possible but not guaranteed. If the
SQE has not yet started, it's likely that a call to :c:func:`rtio_sqe_cancel`
will remove the SQE and never run it. If, however, the SQE has already started
running, the cancel request will be ignored.

Memory pools
************

In some cases requests to read may not know how much data will be produced.
Alternatively, a reader might be handling data from multiple iodevs where
the frequency of the data is unpredictable. In these cases it may be wasteful
to bind memory to in-flight read requests. Instead, with memory pools, the memory
to read into is left to the iodev to allocate from a memory pool associated with
the RTIO context that the read was associated with. To create such an RTIO
context the :c:macro:`RTIO_DEFINE_WITH_MEMPOOL` macro can be used. It allows
creating an RTIO context with a dedicated pool of "memory blocks" which can be
consumed by the iodev. Below is a snippet setting up the RTIO context with a
memory pool. The memory pool has 128 blocks, each block is 16 bytes, and the data
is 4 byte aligned.

.. code-block:: C

  #include <zephyr/rtio/rtio.h>

  #define SQ_SIZE       4
  #define CQ_SIZE       4
  #define MEM_BLK_COUNT 128
  #define MEM_BLK_SIZE  16
  #define MEM_BLK_ALIGN 4

  RTIO_DEFINE_WITH_MEMPOOL(rtio_context,
      SQ_SIZE, CQ_SIZE, MEM_BLK_COUNT, MEM_BLK_SIZE, MEM_BLK_ALIGN);

When a read is needed, the caller simply needs to replace the call to
:c:func:`rtio_sqe_prep_read` (which takes a pointer to a buffer and a length)
with a call to :c:func:`rtio_sqe_prep_read_with_pool`. The iodev requires
only a small change, which works with both pre-allocated data buffers and
the mempool. When the read is ready, instead of getting the buffers directly
from the :c:struct:`rtio_iodev_sqe`, the iodev should get the buffer and count
by calling :c:func:`rtio_sqe_rx_buf` like so:

.. code-block:: C

  uint8_t *buf;
  uint32_t buf_len;
  int rc = rtio_sqe_rx_buf(iodev_sqe, MIN_BUF_LEN, DESIRED_BUF_LEN, &buf, &buf_len);

  if (rc != 0) {
    LOG_ERR("Failed to get buffer of at least %u bytes", MIN_BUF_LEN);
    return;
  }

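Because the pool hands out whole blocks, the minimum and desired lengths passed
by the iodev translate into block counts. The helper functions below are a
simplified model of that arithmetic using the block size from the snippet
above. They are hypothetical (``demo_blocks_for`` and ``demo_pick_len`` are not
RTIO functions), and the fallback policy shown is an assumption: the real
allocator may behave differently, for example because it also needs the blocks
to be contiguous.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BLK_COUNT 128u
#define MEM_BLK_SIZE  16u

/* Whole blocks needed to hold len bytes, rounding up. */
static uint32_t demo_blocks_for(uint32_t len)
{
	return len / MEM_BLK_SIZE + ((len % MEM_BLK_SIZE) != 0u ? 1u : 0u);
}

/*
 * Model of choosing a buffer length when free_blocks blocks are available:
 * - the desired length (rounded up to whole blocks) if it fits,
 * - otherwise everything that is free, as long as that covers min_len,
 * - otherwise 0 to signal that the request cannot be satisfied.
 */
static uint32_t demo_pick_len(uint32_t free_blocks, uint32_t min_len,
			      uint32_t desired_len)
{
	assert(min_len <= desired_len);

	uint32_t want = demo_blocks_for(desired_len);
	uint32_t need = demo_blocks_for(min_len);

	if (free_blocks >= want) {
		return want * MEM_BLK_SIZE;
	}
	if (free_blocks >= need) {
		return free_blocks * MEM_BLK_SIZE;
	}
	return 0u;
}
```

With 128 blocks of 16 bytes the pool holds 2048 bytes in total. In this model a
request with a minimum of 16 bytes and a desired length of 64 bytes receives
64 bytes when the pool is empty of other users, 32 bytes when only two blocks
are free, and fails when no block is free.
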
Finally, the consumer will be able to access the allocated buffer via
:c:func:`rtio_cqe_get_mempool_buffer`.

.. code-block:: C

  uint8_t *buf;
  uint32_t buf_len;
  int rc = rtio_cqe_get_mempool_buffer(&rtio_context, &cqe, &buf, &buf_len);

  if (rc != 0) {
    LOG_ERR("Failed to get mempool buffer");
    return rc;
  }

  /* Release the cqe events (note that the buffer is not released yet) */
  rtio_cqe_release_all(&rtio_context);

  /* Do something with the memory */

  /* Release the mempool buffer */
  rtio_release_buffer(&rtio_context, buf);

When to Use
***********

RTIO is useful in cases where concurrent or batch-like I/O flows are needed.

From the driver/hardware perspective, the API enables batching of I/O requests,
potentially in an optimal way. Many requests to the same SPI peripheral, for
example, might be translated entirely to hardware command queues or DMA transfer
descriptors, meaning the hardware can potentially do more of the work.

There is a small cost to each RTIO context and iodev. This cost should be weighed
against using a thread for each concurrent I/O operation or custom queues and
threads per peripheral. RTIO is much lower cost than either.

API Reference
*************

.. doxygengroup:: rtio

MPSC Lock-free Queue API
========================

.. doxygengroup:: rtio_mpsc

SPSC Lock-free Queue API
========================

.. doxygengroup:: rtio_spsc