13 Commits

Author SHA1 Message Date
Eric Biggers
83a1bbf1d3 lib: consistently use include guards
A lot of the internal library headers don't have include guards because
they aren't needed.  It might look like a bug, though, and it doesn't
hurt to add them.  So do this.

Update https://github.com/ebiggers/libdeflate/issues/117
2021-03-12 00:07:30 -08:00
Eric Biggers
ff8634427b lib/matchfinder: simplify init and rebase
Remove the ability of matchfinder_init() and matchfinder_rebase() to
fail due to the matchfinder memory size being misaligned.  Instead,
require that the size always be 128-byte aligned -- which is already the
case.  Also, make the matchfinder memory always be 32-byte aligned --
which doesn't really have any downside.
2020-10-25 22:42:25 -07:00
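The alignment requirements in this commit can be illustrated with a small sketch. The macro and helper names below are hypothetical and are not libdeflate's actual identifiers; only the 32-byte and 128-byte figures come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones quoted in the commit message. */
#define SKETCH_MEM_ALIGNMENT  32   /* alignment of the matchfinder memory */
#define SKETCH_SIZE_ALIGNMENT 128  /* alignment of the matchfinder size */

/* Return true if 'value' is a multiple of 'alignment' (a power of two). */
static bool is_aligned(uintptr_t value, uintptr_t alignment)
{
	return (value & (alignment - 1)) == 0;
}

/* Round 'size' up to the next multiple of 'alignment' (a power of two). */
static size_t round_up(size_t size, size_t alignment)
{
	return (size + alignment - 1) & ~(alignment - 1);
}
```

If the size is always rounded up this way when it is defined, an init or rebase routine has no misalignment case left to report, which is presumably why the failure path could be removed.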
Eric Biggers
ea88fa822f lib/arm/crc32: add support for ARM CRC32 instructions
Add a CRC32 implementation that uses the ARM CRC32 instructions.

This is simpler and faster than the PMULL implementation.  On AWS
Graviton2, the performance improvement is about 70%.  On Hikey960, the
performance improvement is about 30% for the Cortex-A53 cores or about
5% for the Cortex-A73 cores.

Based on work by Greg V <greg@unrelenting.technology>
(https://github.com/ebiggers/libdeflate/pull/45)
and Andrew Steinborn <git@steinborn.me>
(https://github.com/ebiggers/libdeflate/pull/76).
2020-10-10 23:03:50 -07:00
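For orientation, here is a portable bit-at-a-time reference for the CRC-32 checksum (the gzip polynomial, reflected form 0xEDB88320). It is not the code from this commit; it only sketches the per-byte update that an ARM build could hand to an intrinsic such as `__crc32b` from `<arm_acle.h>`. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into the running (non-inverted) CRC, one bit at a time. */
static uint32_t crc32_update_byte(uint32_t crc, uint8_t byte)
{
	int bit;

	crc ^= byte;
	for (bit = 0; bit < 8; bit++)
		crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	return crc;
}

/* CRC-32 of a whole buffer, with the usual pre- and post-inversion. */
static uint32_t crc32_reference(const uint8_t *p, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;

	for (i = 0; i < len; i++)
		crc = crc32_update_byte(crc, p[i]);
	return ~crc;
}
```

The standard check value applies: the CRC-32 of the ASCII string "123456789" is 0xCBF43926.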
Eric Biggers
2eeaa9282e lib/arm/cpu_features: recognize the crc32 feature
If support for CRC32 instructions is detected, set
ARM_CPU_FEATURE_CRC32.  Also define
COMPILER_SUPPORTS_CRC32_TARGET_INTRINSICS when appropriate, and update
run_tests.sh to toggle the crc32 feature for testing.
2020-10-10 23:03:50 -07:00
Eric Biggers
7373bdc9ff lib/arm/cpu_features: reorganize arm feature macros
Reorganize some confusing logic.
2020-10-10 23:03:50 -07:00
Eric Biggers
5729095d2d lib/cpu_features: support disabling CPU features for testing
Make test-only builds of libdeflate support an environment variable
LIBDEFLATE_DISABLE_CPU_FEATURES that contains a list of CPU features to
disable like "avx512bw,avx2,sse2".

This makes it possible to test all the variants of dynamically
dispatched code without editing the source code.

Note, this environment variable is not a stable interface, so put the
support for it behind a scary-looking option TEST_SUPPORT__DO_NOT_USE.
2020-10-05 00:35:19 -07:00
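One way such a feature list could be turned into a bitmask is sketched below. This is an assumption about the general approach, not the parsing code from the commit; the feature bits, the table, and the function name are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only. */
#define SKETCH_FEATURE_SSE2     (1u << 0)
#define SKETCH_FEATURE_AVX2     (1u << 1)
#define SKETCH_FEATURE_AVX512BW (1u << 2)

struct sketch_feature {
	uint32_t bit;
	const char *name;
};

static const struct sketch_feature sketch_features[] = {
	{ SKETCH_FEATURE_SSE2,     "sse2" },
	{ SKETCH_FEATURE_AVX2,     "avx2" },
	{ SKETCH_FEATURE_AVX512BW, "avx512bw" },
};

/*
 * Parse a comma-separated list such as "avx512bw,avx2,sse2" and return the
 * mask of features to disable.  Unknown names are silently ignored.
 */
static uint32_t parse_disabled_features(const char *list)
{
	uint32_t mask = 0;
	size_t count = sizeof(sketch_features) / sizeof(sketch_features[0]);

	if (list == NULL)
		return 0;
	while (*list != '\0') {
		size_t len = strcspn(list, ",");  /* length of this token */
		size_t i;

		for (i = 0; i < count; i++) {
			const char *name = sketch_features[i].name;

			/* Require an exact match, not just a shared prefix. */
			if (strlen(name) == len && memcmp(list, name, len) == 0)
				mask |= sketch_features[i].bit;
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return mask;
}
```

A caller would then clear the returned bits from the detected feature set, for example `features &= ~parse_disabled_features(getenv("LIBDEFLATE_DISABLE_CPU_FEATURES"))`.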
Eric Biggers
27d5a74f03 lib: add freestanding support
Allow building libdeflate without linking to any libc functions by using
'make FREESTANDING=1'.  When using such a library build, the user will
need to call libdeflate_set_memory_allocator() before anything else,
since malloc() and free() will be unavailable.

[Folded in fix from Ingvar Stepanyan to use -nostdlib, and made
 freestanding_tests() check that no libs are linked to.]

Update https://github.com/ebiggers/libdeflate/issues/62
2020-04-17 22:32:49 -07:00
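In a freestanding build the caller has to supply the allocator. The sketch below is a minimal bump allocator over a static buffer, with malloc()- and free()-compatible signatures. The arena size and all names except libdeflate_set_memory_allocator() are hypothetical, and a real workload may need far more memory than this.

```c
#include <stddef.h>

#define ARENA_SIZE (1u << 16)  /* hypothetical; size it for your workload */

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;  /* always a multiple of 16 */

/* Hand out 16-byte-aligned chunks from the arena; NULL when exhausted. */
static void *arena_malloc(size_t size)
{
	size_t aligned = (size + 15) & ~(size_t)15;
	void *p;

	if (aligned < size || aligned > ARENA_SIZE - arena_used)
		return NULL;
	p = &arena[arena_used];
	arena_used += aligned;
	return p;
}

/* A bump allocator never reclaims memory, so freeing is a no-op. */
static void arena_free(void *ptr)
{
	(void)ptr;
}

/*
 * With a freestanding libdeflate, register these before any other call:
 *
 *     libdeflate_set_memory_allocator(arena_malloc, arena_free);
 */
```

A no-op free is only reasonable if compressor and decompressor objects are allocated once and kept for the lifetime of the program; otherwise a real free list is needed.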
Eric Biggers
a735fa830f lib, programs: remove all unnecessary 'extern' keywords
'extern' on function declarations is redundant.
2020-04-17 21:27:56 -07:00
Eric Biggers
6eef15d6f3 lib/arm: fix PMULL detection on AArch64
2018-03-03 12:47:50 -08:00
Eric Biggers
fc2ea22b44 lib/arm: add ARM PMULL implementation of CRC-32
Add an ARM PMULL implementation of CRC-32.  This is based on a patch by
Jun He <jun.he@linaro.org> as well as the x86 PCLMUL implementation.
2018-02-18 23:03:26 -08:00
Eric Biggers
fb5c6a8c85 lib/arm: allow choosing adler32_neon() at runtime
Now that we detect CPU features on ARM, allow the NEON implementation of
Adler-32 to be selected at runtime based on the presence of the NEON
feature.
2018-02-18 23:03:26 -08:00
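Runtime selection of this kind is commonly done through a function pointer chosen from the detected CPU features. The sketch below shows the pattern under that assumption; it is not libdeflate's dispatch code. The feature bit and function names are hypothetical, and the "NEON" variant is only a stand-in that calls the portable code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CPU_FEATURE_NEON (1u << 0)  /* hypothetical feature bit */

typedef uint32_t (*adler32_func_t)(uint32_t adler, const uint8_t *p,
				   size_t len);

/* Portable Adler-32 update; the modulus 65521 is from RFC 1950. */
static uint32_t adler32_generic(uint32_t adler, const uint8_t *p, size_t len)
{
	uint32_t s1 = adler & 0xFFFF;
	uint32_t s2 = adler >> 16;
	size_t i;

	for (i = 0; i < len; i++) {
		s1 = (s1 + p[i]) % 65521;
		s2 = (s2 + s1) % 65521;
	}
	return (s2 << 16) | s1;
}

/* Stand-in for a vectorized version; a real one would use NEON intrinsics. */
static uint32_t adler32_neon_standin(uint32_t adler, const uint8_t *p,
				     size_t len)
{
	return adler32_generic(adler, p, len);
}

/* Pick an implementation from the feature mask detected at runtime. */
static adler32_func_t select_adler32(uint32_t cpu_features)
{
	if (cpu_features & SKETCH_CPU_FEATURE_NEON)
		return adler32_neon_standin;
	return adler32_generic;
}
```

Every variant must return the same checksum for the same input, so the choice affects speed only. Adler-32 starts from the value 1.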
Eric Biggers
2575ede5ff lib/arm: add ARM CPU feature detection (Linux only for now)
2018-02-18 23:03:26 -08:00
Eric Biggers
4829a5add2 lib: refactor architecture-specific code
Move the x86 and ARM-specific code into their own directories to prevent
it from cluttering up the main library.  This will make it a bit easier
to add new architecture-specific code.

But to avoid complicating things too much for people who aren't using
the provided Makefile, we still just compile all .c files for all
architectures (irrelevant ones end up #ifdef'ed out), and the headers
are included explicitly for each architecture so that an
architecture-specific include path isn't needed.  So, now people just
need to compile both lib/*.c and lib/*/*.c instead of only lib/*.c.
2018-02-18 23:03:26 -08:00