Many of the internal library headers don't have include guards because
they aren't needed. Their absence might look like a bug, though, and it
doesn't hurt to add them. So add them.
Update https://github.com/ebiggers/libdeflate/issues/117
Remove the ability of matchfinder_init() and matchfinder_rebase() to
fail due to the matchfinder memory size being misaligned. Instead,
require that the size always be 128-byte aligned -- which is already the
case. Also, make the matchfinder memory always be 32-byte aligned --
which doesn't really have any downside.
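The idea of turning an alignment failure into a precondition can be sketched as follows. This is only an illustration: the EXAMPLE_* constants and example_* functions are hypothetical names, and the fill value is an assumption, not the library's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the numbers in the message above. */
#define EXAMPLE_MEM_ALIGNMENT   32   /* alignment of the memory itself */
#define EXAMPLE_SIZE_ALIGNMENT  128  /* the size must be a multiple of this */

/* Round a pointer up to the next multiple of 'align' (a power of two). */
static void *example_align_ptr(void *p, size_t align)
{
	uintptr_t addr = (uintptr_t)p;

	return (void *)((addr + align - 1) & ~(uintptr_t)(align - 1));
}

/*
 * Initialization that cannot fail: alignment is a documented precondition
 * checked by assertions rather than reported as a runtime error.
 */
static void example_matchfinder_init(int16_t *mem, size_t size)
{
	size_t i;

	assert((uintptr_t)mem % EXAMPLE_MEM_ALIGNMENT == 0);
	assert(size % EXAMPLE_SIZE_ALIGNMENT == 0);

	/* Assumed fill value; a real matchfinder may use something else. */
	for (i = 0; i < size / sizeof(mem[0]); i++)
		mem[i] = INT16_MIN;
}
```

With both alignments guaranteed, a vectorized fill loop would not need a scalar prologue or epilogue, which is presumably why the stricter requirement costs nothing.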
Add a CRC32 implementation that uses the ARM CRC32 instructions.
This is simpler and faster than the PMULL implementation. On AWS
Graviton2, the performance improvement is about 70%. On Hikey960, the
performance improvement is about 30% for the Cortex-A53 cores or about
5% for the Cortex-A73 cores.
Based on work by Greg V <greg@unrelenting.technology>
(https://github.com/ebiggers/libdeflate/pull/45)
and Andrew Steinborn <git@steinborn.me>
(https://github.com/ebiggers/libdeflate/pull/76).
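A minimal sketch of a CRC32 built on the ARM CRC32 instructions is shown below, using the ACLE intrinsics __crc32d and __crc32b from <arm_acle.h>. It is not the code added by this commit: the example_* names are hypothetical, the ARM path is selected at compile time via __ARM_FEATURE_CRC32 rather than at runtime, a little-endian target is assumed, and a bitwise fallback is included so the sketch builds elsewhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_CRC32)
#  include <arm_acle.h>

/* ARM path: 8 bytes per instruction, then a byte-at-a-time tail. */
static uint32_t example_crc32_update(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len >= 8) {
		uint64_t v;

		memcpy(&v, p, 8);	/* unaligned-safe load; assumes little endian */
		crc = __crc32d(crc, v);
		p += 8;
		len -= 8;
	}
	while (len--)
		crc = __crc32b(crc, *p++);
	return crc;
}
#else
/* Portable fallback: one bit at a time, reflected polynomial 0xEDB88320. */
static uint32_t example_crc32_update(uint32_t crc, const uint8_t *p, size_t len)
{
	int i;

	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1)));
	}
	return crc;
}
#endif

/* zlib-style interface: pass 0 as the initial CRC. */
uint32_t example_crc32(uint32_t crc, const void *buf, size_t len)
{
	return ~example_crc32_update(~crc, (const uint8_t *)buf, len);
}
```

The instruction-based loop has no lookup tables and no folding constants, which is what makes it simpler than a carryless-multiplication approach.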
If support for CRC32 instructions is detected, set
ARM_CPU_FEATURE_CRC32. Also define
COMPILER_SUPPORTS_CRC32_TARGET_INTRINSICS when appropriate, and update
run_tests.sh to toggle the crc32 feature for testing.
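One way such detection might look on Linux/arm64 is sketched below, using getauxval(AT_HWCAP) and the kernel's HWCAP_CRC32 bit. This is an assumption about the mechanism, not a copy of the library's code; the EXAMPLE_* flag value and function name are hypothetical, and other platforms simply report no features.

```c
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#  include <sys/auxv.h>
#  ifndef HWCAP_CRC32
#    define HWCAP_CRC32 (1 << 7)	/* arm64 hwcap bit for the CRC32 extension */
#  endif
#endif

/* Hypothetical flag value, for illustration only. */
#define EXAMPLE_ARM_CPU_FEATURE_CRC32 0x00000004u

/* Return the set of detected features; empty on non-arm64 or non-Linux. */
static uint32_t example_arm_cpu_features(void)
{
	uint32_t features = 0;

#if defined(__linux__) && defined(__aarch64__)
	if (getauxval(AT_HWCAP) & HWCAP_CRC32)
		features |= EXAMPLE_ARM_CPU_FEATURE_CRC32;
#endif
	return features;
}
```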
Make test-only builds of libdeflate support an environment variable
LIBDEFLATE_DISABLE_CPU_FEATURES that contains a comma-separated list of
CPU features to disable, such as "avx512bw,avx2,sse2".
This makes it possible to test all the variants of dynamically
dispatched code without editing the source code.
Note that this environment variable is not a stable interface, so put the
support for it behind a scary-looking option TEST_SUPPORT__DO_NOT_USE.
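Parsing such a list and masking the detected features could be sketched as below. The feature table, bit values, and example_* names are hypothetical and chosen only for illustration; only the variable name LIBDEFLATE_DISABLE_CPU_FEATURES comes from the message above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct example_cpu_feature {
	uint32_t bit;
	const char *name;
};

/* Hypothetical feature table; the bit assignments are arbitrary. */
static const struct example_cpu_feature example_features[] = {
	{ 1u << 0, "sse2" },
	{ 1u << 1, "avx2" },
	{ 1u << 2, "avx512bw" },
};

/*
 * Return the mask of features named in a comma-separated list.
 * Unknown names and empty entries are ignored; NULL means "none".
 */
static uint32_t example_parse_disabled(const char *list)
{
	uint32_t mask = 0;
	size_t i;

	while (list && *list) {
		size_t n = strcspn(list, ",");	/* length of the current name */

		for (i = 0; i < sizeof(example_features) /
				sizeof(example_features[0]); i++) {
			const char *name = example_features[i].name;

			if (strlen(name) == n && memcmp(list, name, n) == 0)
				mask |= example_features[i].bit;
		}
		list += n;
		if (*list == ',')
			list++;
	}
	return mask;
}

/* Remove any features listed in the environment from the detected set. */
static uint32_t example_apply_disabled(uint32_t detected)
{
	return detected &
	       ~example_parse_disabled(getenv("LIBDEFLATE_DISABLE_CPU_FEATURES"));
}
```

A test script could then run the same binary several times with different values of the variable to exercise each dispatch target.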
Allow building libdeflate without linking to any libc functions by using
'make FREESTANDING=1'. When using such a library build, the user will
need to call libdeflate_set_memory_allocator() before anything else,
since malloc() and free() will be unavailable.
[Folded in fix from Ingvar Stepanyan to use -nostdlib, and made
freestanding_tests() check that no libs are linked to.]
Update https://github.com/ebiggers/libdeflate/issues/62
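For a user of a FREESTANDING=1 build, the allocator requirement described above could be met with something as simple as a static bump allocator. The sketch below is an assumption about how a caller might do this; the example_* names and the arena size are hypothetical, and the registration call is shown only in a comment so that the sketch compiles without the library.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ARENA_SIZE ((size_t)1 << 20)	/* hypothetical 1 MiB arena */

static _Alignas(64) unsigned char example_arena[EXAMPLE_ARENA_SIZE];
static size_t example_arena_used;

/* Bump allocation: hand out 64-byte-aligned chunks; NULL when full. */
static void *example_malloc(size_t size)
{
	size_t start = (example_arena_used + 63) & ~(size_t)63;

	if (start > EXAMPLE_ARENA_SIZE || size > EXAMPLE_ARENA_SIZE - start)
		return NULL;
	example_arena_used = start + size;
	return &example_arena[start];
}

/* A bump allocator never reclaims individual allocations. */
static void example_free(void *ptr)
{
	(void)ptr;
}

/*
 * In a program linked against the freestanding library, before any other
 * libdeflate call:
 *
 *	libdeflate_set_memory_allocator(example_malloc, example_free);
 */
```

A bump allocator suits a program that creates a fixed set of compressors or decompressors once; anything that frees and reallocates repeatedly would need a real allocator.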
Move the x86 and ARM-specific code into their own directories to prevent
it from cluttering up the main library. This will make it a bit easier
to add new architecture-specific code.
But to avoid complicating things too much for people who aren't using
the provided Makefile, we still just compile all .c files for all
architectures (irrelevant ones end up #ifdef'ed out), and the headers
are included explicitly for each architecture so that an
architecture-specific include path isn't needed. So, now people just
need to compile both lib/*.c and lib/*/*.c instead of only lib/*.c.
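The "compile everything, #ifdef out what doesn't apply" approach might look roughly like this. The file layout in the comments and the example_* names are hypothetical; the sketch collapses what would be separate files into one so that it stands alone.

```c
#include <string.h>

/* Would live in a hypothetical lib/arm/example.c */
#if defined(__arm__) || defined(__aarch64__)
static const char *example_arm_impl(void)
{
	return "arm";
}
#endif

/* Would live in a hypothetical lib/x86/example.c */
#if defined(__i386__) || defined(__x86_64__)
static const char *example_x86_impl(void)
{
	return "x86";
}
#endif

/* Generic code picks whichever implementation survived preprocessing. */
const char *example_select_impl(void)
{
#if defined(__arm__) || defined(__aarch64__)
	return example_arm_impl();
#elif defined(__i386__) || defined(__x86_64__)
	return example_x86_impl();
#else
	return "generic";
#endif
}
```

Because every file compiles to nothing on a non-matching architecture, a build system can pass all of the .c files to the compiler without per-architecture source lists.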