zephyr/lib/os
Nicolas Pitre 822dfbd012 lib/os/prf.c: alternate implementation for _ldiv5()
_ldiv5() is an optimized divide-by-5 function that is smaller and
faster than the generic libgcc implementation.

Yet it can be made even smaller and faster with this replacement
implementation based on a reciprocal multiplication plus some tricks.

For example, here's the assembly from the original code on ARM:

_ldiv5:
        ldr     r3, [r0]
        movw    ip, #52429
        ldr     r1, [r0, #4]
        movt    ip, 52428
        adds    r3, r3, #2
        push    {r4, r5, r6, r7, lr}
        mov     lr, #0
        adc     r1, r1, lr
        adds    r2, lr, lr
        umull   r7, r6, ip, r1
        lsr     r6, r6, #2
        adc     r7, r6, r6
        adds    r2, r2, r2
        adc     r7, r7, r7
        adds    r2, r2, lr
        adc     r7, r7, r6
        subs    r3, r3, r2
        sbc     r7, r1, r7
        lsr     r2, r3, #3
        orr     r2, r2, r7, lsl #29
        umull   r2, r1, ip, r2
        lsr     r2, r1, #2
        lsr     r7, r1, #31
        lsl     r1, r2, #3
        adds    r4, lr, r1
        adc     r5, r6, r7
        adds    r2, r1, r1
        adds    r2, r2, r2
        adds    r2, r2, r1
        subs    r2, r3, r2
        umull   r3, r2, ip, r2
        lsr     r2, r2, #2
        adds    r4, r4, r2
        adc     r5, r5, #0
        strd    r4, [r0]
        pop     {r4, r5, r6, r7, pc}

And here's the resulting assembly with this commit applied:

_ldiv5:
        push    {r4, r5, r6, r7}
        movw    r4, #13107
        ldr     r6, [r0]
        movt    r4, 13107
        ldr     r1, [r0, #4]
        mov     r3, #0
        umull   r6, r7, r6, r4
        add     r2, r4, r4, lsl #1
        umull   r4, r5, r1, r4
        adds    r1, r6, r2
        adc     r2, r7, r2
        adds    ip, r6, r4
        adc     r1, r7, r5
        adds    r2, ip, r2
        adc     r2, r1, r3
        adds    r2, r4, r2
        adc     r3, r5, r3
        strd    r2, [r0]
        pop     {r4, r5, r6, r7}
        bx      lr

So we're down to 20 instructions from 36 initially, with only 2 umull
instructions instead of 3, and a slightly smaller stack footprint.
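
For readers unfamiliar with the technique, reciprocal multiplication
replaces a division by a constant with a multiply and a shift. The sketch
below is illustrative only and is not the code from prf.c: the helper names
are hypothetical, and it works on 32-bit values rather than the 64-bit value
that _ldiv5() handles. It uses the two constants that the movw/movt pairs in
the listings above build, 0xCCCCCCCD in the first and 0x33333333 in the
second.

```c
#include <stdint.h>

/*
 * Illustrative only: hypothetical helper names, not the code from prf.c.
 * Both helpers return floor(n / 5) for any 32-bit n.
 */

/* 0xCCCCCCCD = ceil(2^34 / 5): multiply, then drop the low 34 bits. */
static uint32_t div5_ceil_recip(uint32_t n)
{
	return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 34);
}

/*
 * 0x33333333 = (2^32 - 1) / 5 is slightly smaller than 2^32 / 5, so the
 * product is biased by one extra multiple of the constant before the
 * low 32 bits are dropped.
 */
static uint32_t div5_floor_recip(uint32_t n)
{
	return (uint32_t)((((uint64_t)n + 1) * 0x33333333u) >> 32);
}
```

The second form needs no final shift beyond taking the high word of the
product. The bias term is presumably one of the "tricks" the commit message
alludes to, but that reading is an assumption.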

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
2020-11-09 13:23:25 -08:00
assert.c assert: Completely remove file info and condition expression 2020-01-13 13:59:55 +01:00
base64.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
CMakeLists.txt shell: support floating point output with newlib 2020-09-03 21:53:09 +02:00
crc7_sw.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
crc8_sw.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
crc16_sw.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
crc32_sw.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
dec.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
fdtable.c lib: fdtable: fix z_free_fd multiple calls fd leak 2020-09-10 16:04:36 -05:00
heap-validate.c lib/os/heap: make "solo free headers" into first-class citizens 2020-07-14 19:35:52 -04:00
heap.c lib/os/heap: Correct aligned_alloc sizing for small heaps 2020-10-23 12:52:04 -04:00
heap.h code-guideline: Fixing code violation 10.4 Rule 2020-10-01 17:13:29 -04:00
hex.c lib: hex: Remove constant expression 2020-09-02 13:45:50 -04:00
json.c misc: Replace assert include and calls by sys/__assert.h equivalent 2020-10-02 11:42:40 +02:00
Kconfig lib/os/heap: remove big_heap restriction for aligned allocations 2020-07-14 19:35:52 -04:00
mempool.c code-guideline: Fixing code violation 10.4 Rule 2020-10-01 17:13:29 -04:00
mutex.c kernel/timeout: Make timeout arguments an opaque type 2020-03-31 19:40:47 -04:00
notify.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
onoff.c code-guideline: Fixing code violation 10.4 Rule 2020-10-01 17:13:29 -04:00
prf.c lib/os/prf.c: alternate implementation for _ldiv5() 2020-11-09 13:23:25 -08:00
printk.c code-guideline: Fixing code violation 10.4 Rule 2020-10-01 17:13:29 -04:00
rb.c cleanup: include/: move misc/rb.h to sys/rb.h 2019-06-27 22:55:49 -04:00
ring_buffer.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
sem.c code-guideline: Fixing code violation 10.4 Rule 2020-10-01 17:13:29 -04:00
thread_entry.c lib: os: remove dead code 2019-06-18 09:08:01 -04:00
timeutil.c zephyr: replace zephyr integer types with C99 types 2020-06-08 08:23:57 -05:00
work_q.c os: work_q: Use NULL instead of 0 2020-09-02 13:45:50 -04:00