Manuel Pégourié-Gonnard 8c76a9489e aria: turn macro into static inline function
Besides documenting types better and so on, this give the compiler more room
to optimise either for size or performance.

Here are some before/after measurements of:
- size of aria.o in bytes (less is better)
- instruction count for the selftest function (less is better)
with various -O flags.

Before:
O	aria.o	ins
s	10896	37,256
2	11176	37,199
3	12248	27,752

After:
O	aria.o	ins
s	8784	41,408
2	11112	37,001
3	13096	27,438

The new version allows the compiler to reach smaller size with -Os while
maintaining (actually slightly improving) performance with -O2 and -O3.

Measurements were done on x86_64 (but since this is mainly about inlining
code, this should transpose well to other platforms) using the following
helper program and script, after disabling CBC, CFB and CTR in config.h, in
order to focus on the core functions.

==> st.c <==
 #include "mbedtls/aria.h"

int main( void ) {
    return mbedtls_aria_self_test( 0 );
}

==> p.sh <==
 #!/bin/sh

set -eu

ccount () {
    (
    valgrind --tool=callgrind --dump-line=no --callgrind-out-file=/dev/null --collect-atstart=no --toggle-collect=main $1
    ) 2>&1 | sed -n -e 's/.*refs: *\([0-9,]*\)/\1/p'
}

printf "O\taria.o\tins\n"
for O in s 2 3; do
    GCC="gcc -Wall -Wextra -Werror -Iinclude"

    $GCC -O$O -c library/aria.c
    $GCC -O1 st.c aria.o -o st
   ./st

    SIZE=$( du -b aria.o | cut -f1 )
    INS=$( ccount ./st )

    printf "$O\t$SIZE\t$INS\n"
done
2018-02-27 12:39:12 +01:00
..
2018-02-27 12:39:12 +01:00
2018-02-27 12:39:12 +01:00
2017-10-10 19:04:27 +03:00
2018-02-27 12:39:12 +01:00
2018-02-27 12:39:12 +01:00
2018-02-22 10:24:30 +00:00
2018-02-22 10:24:30 +00:00
2018-02-22 10:24:30 +00:00
2017-10-29 17:53:52 +02:00
2018-02-27 12:39:12 +01:00
2018-01-29 10:24:50 +01:00
2018-02-06 15:59:38 +02:00