android_tests is only useful for local testing, and it wasn't being run
in Travis CI. Move it into a separate script to avoid complicating
run_tests.sh.
This was only useful to me for local testing, and I no longer have the
MIPS router it needs. Its main purpose was to test a big-endian system,
but that's now covered by testing s390x with Travis CI.
This script only worked for me to do local testing and wasn't otherwise
used. In particular, the Windows build tests in Travis CI don't use
this script, nor does the make-windows-releases script use it.
Compression is based on heuristics, so we can't guarantee in every
circumstance that the compression ratio will improve as the compression
level increases. The tests need to be 100% reliable though, so drop
this part of the test.
Move the user-specified CFLAGS to the end of the CFLAGS definition, so
that warnings the Makefile enables can be disabled using -Wno-$foo.
This is useful when old compilers give false positive warnings.
This is needed to avoid the following error when using
-fsanitize=undefined with gcc:
    lib/x86/adler32_impl.h:214:2: runtime error: signed integer overflow:
    1951294680 + 1956941400 cannot be represented in type 'int'
Note that this isn't seen when using -fsanitize=undefined with clang.
Old compilers don't have unsigned vector types, so work around that.
Add a CRC32 implementation that uses the ARM CRC32 instructions.
This is simpler and faster than the PMULL implementation. On AWS
Graviton2, the performance improvement is about 70%. On Hikey960, the
performance improvement is about 30% for the Cortex-A53 cores or about
5% for the Cortex-A73 cores.
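For illustration only, here is a minimal sketch of the idea, not the code
added by this commit. It assumes a GCC- or clang-style compiler that
defines __ARM_FEATURE_CRC32 and provides <arm_acle.h>; the name
crc32_sketch is made up, and the bitwise fallback exists only so the
sketch builds on other targets. Runtime CPU feature detection is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Use the ACLE intrinsics only where the compiler advertises them.
 * The 8-byte loads below assume a little-endian target.
 */
#if defined(__ARM_FEATURE_CRC32) && !defined(__ARM_BIG_ENDIAN)
#  include <arm_acle.h>
#  define USE_ARM_CRC32 1
#else
#  define USE_ARM_CRC32 0
#endif

/* Hypothetical helper: gzip-style CRC-32, chainable like zlib's crc32(). */
static uint32_t crc32_sketch(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	assert(p != NULL || len == 0);
	crc = ~crc;
#if USE_ARM_CRC32
	/* One CRC32X instruction consumes 8 input bytes. */
	while (len >= 8) {
		uint64_t v;

		memcpy(&v, p, sizeof(v));
		crc = __crc32d(crc, v);
		p += 8;
		len -= 8;
	}
	/* Finish the tail one byte at a time. */
	while (len--)
		crc = __crc32b(crc, *p++);
#else
	/* Portable bit-at-a-time fallback using the reflected polynomial. */
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1)));
	}
#endif
	return ~crc;
}
```

Both paths compute the same checksum, so the standard check string
"123456789" can be used to compare them.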
Based on work by Greg V <greg@unrelenting.technology>
(https://github.com/ebiggers/libdeflate/pull/45)
and Andrew Steinborn <git@steinborn.me>
(https://github.com/ebiggers/libdeflate/pull/76).
If support for CRC32 instructions is detected, set
ARM_CPU_FEATURE_CRC32. Also define
COMPILER_SUPPORTS_CRC32_TARGET_INTRINSICS when appropriate, and update
run_tests.sh to toggle the crc32 feature for testing.
android_build.sh no longer works with recent NDKs, and it has a lot of
logic to use old NDKs directly that wasn't really necessary because it
could have just required standalone toolchains instead.
Recent NDKs (r19 and later) come with standalone toolchains by default.
Also, they now only include clang, not gcc.
Modify the script to just support these recent NDKs. Also, default to
arm64 and add support for enabling CRC instructions.
Some users may require a valid DEFLATE, zlib, or gzip stream but know
ahead of time that particular inputs are not compressible. zlib
supports "level 0" for this use case. Support this in libdeflate too.
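As a rough sketch of what uncompressed output looks like, assuming it is
built from "stored" blocks (BTYPE=00 in RFC 1951): each block is a header
byte, a 16-bit LEN, its one's complement NLEN, and then the raw bytes.
The helper name deflate_store_sketch is hypothetical and this is not
libdeflate's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not libdeflate's API): wrap 'in' in stored DEFLATE
 * blocks.  Returns the number of bytes written, or 0 if 'out' is too small.
 */
static size_t deflate_store_sketch(const void *in, size_t in_len,
				   uint8_t *out, size_t out_cap)
{
	const uint8_t *p = in;
	size_t out_len = 0;

	assert(out != NULL);
	for (;;) {
		/* LEN is a 16-bit field, so a block holds at most 65535 bytes. */
		size_t chunk = in_len < 0xFFFF ? in_len : 0xFFFF;
		int is_final = (chunk == in_len);

		if (out_cap - out_len < chunk + 5)
			return 0;
		/* BFINAL in bit 0, BTYPE=00, rest of the byte is padding. */
		out[out_len++] = (uint8_t)is_final;
		/* LEN, then NLEN = ~LEN, both little endian. */
		out[out_len++] = (uint8_t)(chunk & 0xFF);
		out[out_len++] = (uint8_t)(chunk >> 8);
		out[out_len++] = (uint8_t)(~chunk & 0xFF);
		out[out_len++] = (uint8_t)((~chunk >> 8) & 0xFF);
		if (chunk != 0) {
			memcpy(&out[out_len], p, chunk);
			out_len += chunk;
			p += chunk;
			in_len -= chunk;
		}
		if (is_final)
			return out_len;
	}
}
```

Every stored block adds 5 bytes of overhead, so a 3-byte input becomes
8 bytes and an empty input becomes a single 5-byte final block.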
Resolves https://github.com/ebiggers/libdeflate/issues/86
To test the different CRC-32 and Adler-32 implementations, use
LIBDEFLATE_DISABLE_CPU_FEATURES instead of running some hack-ish
'sed' commands to edit the source code.
Make test-only builds of libdeflate support an environment variable
LIBDEFLATE_DISABLE_CPU_FEATURES that contains a comma-separated list of
CPU features to disable, such as "avx512bw,avx2,sse2".
This makes it possible to test all the variants of dynamically
dispatched code without editing the source code.
Note that this environment variable is not a stable interface, so put
the support for it behind a scary-looking option,
TEST_SUPPORT__DO_NOT_USE.
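A minimal sketch of how such a list could be turned into a mask of
features to hide from the dispatcher. The feature bits, the table, and
the function names below are assumptions made for the example, not
libdeflate's actual definitions; unknown names are simply ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical feature bits, for illustration only. */
#define FEATURE_SSE2		0x1u
#define FEATURE_AVX2		0x2u
#define FEATURE_AVX512BW	0x4u
#define ALL_FEATURES		(FEATURE_SSE2 | FEATURE_AVX2 | FEATURE_AVX512BW)

static const struct {
	const char *name;
	uint32_t bit;
} feature_table[] = {
	{ "sse2",	FEATURE_SSE2 },
	{ "avx2",	FEATURE_AVX2 },
	{ "avx512bw",	FEATURE_AVX512BW },
};

/* Parse a list like "avx512bw,avx2,sse2" into a bitmask. */
static uint32_t parse_disabled_features(const char *list)
{
	uint32_t mask = 0;

	if (list == NULL)
		return 0;
	while (*list != '\0') {
		size_t n = strcspn(list, ",");	/* length of this name */
		size_t i;

		for (i = 0; i < sizeof(feature_table) / sizeof(feature_table[0]); i++) {
			const char *name = feature_table[i].name;

			/* Require an exact match, not just a common prefix. */
			if (strlen(name) == n && memcmp(list, name, n) == 0)
				mask |= feature_table[i].bit;
		}
		list += n;
		if (*list == ',')
			list++;
	}
	return mask;
}

/* Remove any features named in the environment variable. */
static uint32_t apply_disabled_features(uint32_t detected)
{
	assert((detected & ~ALL_FEATURES) == 0);
	return detected &
	       ~parse_disabled_features(getenv("LIBDEFLATE_DISABLE_CPU_FEATURES"));
}
```

With this shape, a test run can set the variable to "avx2" and the
dispatcher would then fall back to the next-best implementation.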
In cpuid() in the '__i386__ && __PIC__' case, the second output operand
is written to before the input operands are used. So the second output
operand needs the earlyclobber constraint.
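To make the constraint concrete, here is a hedged sketch of a cpuid
wrapper in GCC/clang extended asm. It is not the exact code in
libdeflate; cpuid_sketch and HAVE_CPUID_SKETCH are made-up names. In the
'__i386__ && __PIC__' branch, operand %1 is written by the first 'mov'
while the inputs in eax and ecx are still unread, hence "=&r".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#  define HAVE_CPUID_SKETCH 1
#else
#  define HAVE_CPUID_SKETCH 0
#endif

/*
 * Hypothetical wrapper: fills regs[] with eax, ebx, ecx, edx.
 * Returns 1 on x86, or 0 (with regs[] zeroed) on other targets.
 */
static int cpuid_sketch(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
	assert(regs != NULL);
#if HAVE_CPUID_SKETCH && defined(__i386__) && defined(__PIC__)
	/*
	 * ebx may be reserved for the GOT pointer, so save it in a scratch
	 * register around cpuid.  The scratch register is written before
	 * the inputs are consumed, so it must be marked earlyclobber ("&").
	 */
	__asm__ volatile("mov %%ebx, %1\n\t"
			 "cpuid\n\t"
			 "xchg %%ebx, %1"
			 : "=a" (regs[0]), "=&r" (regs[1]),
			   "=c" (regs[2]), "=d" (regs[3])
			 : "a" (leaf), "c" (subleaf));
	return 1;
#elif HAVE_CPUID_SKETCH
	/* ebx is freely usable here, so no scratch register is needed. */
	__asm__ volatile("cpuid"
			 : "=a" (regs[0]), "=b" (regs[1]),
			   "=c" (regs[2]), "=d" (regs[3])
			 : "a" (leaf), "c" (subleaf));
	return 1;
#else
	(void)leaf;
	(void)subleaf;
	regs[0] = regs[1] = regs[2] = regs[3] = 0;
	return 0;
#endif
}
```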
Don't assume that lib_common.h and libdeflate.h don't include
<stdlib.h>. Currently this change doesn't matter unless someone uses
-DFREESTANDING for a Windows build, which isn't supported anyway, but we
might as well clean this up.
Update https://github.com/ebiggers/libdeflate/pull/68
gcc 10 is miscompiling libdeflate on x86_64 at -O3 due to a regression
in how packed structs are handled
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94994).
Work around this by just always using memcpy() for unaligned accesses.
It's unclear whether the "packed struct" approach is still worth
maintaining. Currently I'm only aware that it's useful with old
versions of gcc on arm32. Hopefully, compilers are good enough now that
we can simply use memcpy() everywhere.
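For reference, the memcpy() approach to unaligned accesses looks roughly
like the sketch below. The helper names are illustrative and may not
match the ones in the source tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from a possibly unaligned address. */
static inline uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	/* A fixed-size memcpy() has no alignment requirement. */
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Store a 32-bit value to a possibly unaligned address. */
static inline void store_u32_unaligned(uint32_t v, void *p)
{
	assert(p != NULL);
	memcpy(p, &v, sizeof(v));
}
```

Unlike a cast to a packed struct, this relies only on standard C, and it
gives the compiler no packed-struct code to get wrong.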
Update https://github.com/ebiggers/libdeflate/issues/64