193 Commits

Author SHA1 Message Date
Eric Biggers
6c26eb18ea prog_util: add ASSERT() macro 2018-12-23 12:03:00 -06:00
Eric Biggers
becd91bb63 lib/arm: NEON intrinsics require hardware floating point support
NEON intrinsics cannot be used when compiling for an ARM CPU without
hardware floating point support, e.g. the Debian armel port.  In this
case arm_neon.h cannot even be included as it causes an #error.

[Based on a patch by Adrian Bunk <bunk@debian.org>, but changed to check
 for __ARM_FP instead of !__SOFTFP__ to be consistent with arm_neon.h,
 and added a comment.]
2018-12-22 00:00:04 -06:00
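
The guard described in becd91bb63 could look roughly like the sketch below. It is an illustration only, not the library's actual header: `CAN_USE_NEON_INTRINSICS` and `neon_intrinsics_usable()` are hypothetical names, and the only facts taken from the commit message are that `__ARM_FP` is the macro being checked and that `arm_neon.h` must not be included on soft-float targets.

```c
/*
 * Sketch: include arm_neon.h only when the target has both NEON and
 * hardware floating point.  CAN_USE_NEON_INTRINSICS is a hypothetical
 * name chosen for this example.
 */
#if defined(__ARM_NEON) && defined(__ARM_FP)
#  include <arm_neon.h>
#  define CAN_USE_NEON_INTRINSICS 1
#else
   /* e.g. a soft-float ARM target, or a non-ARM target */
#  define CAN_USE_NEON_INTRINSICS 0
#endif

/* Hypothetical helper so callers can query the compile-time decision. */
int neon_intrinsics_usable(void)
{
	return CAN_USE_NEON_INTRINSICS;
}
```

On a non-ARM or soft-float build the header is never seen by the compiler, so the `#error` mentioned in the commit message cannot trigger.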
Eric Biggers
d6d50c6955 Fix stack alignment in 32-bit Windows builds
Resolves https://github.com/ebiggers/libdeflate/issues/35
2018-12-08 10:11:11 -08:00
Eric Biggers
906c54c16f Makefile: make changing libdeflate.h trigger a rebuild 2018-12-08 10:11:11 -08:00
Eric Biggers
65fd37d987 Add soname to shared library
To match common shared library packaging conventions: name the shared
library libdeflate.so.0, with matching soname, and make libdeflate.so
a symlink that points to it.
2018-12-06 21:43:08 -08:00
Eric Biggers
7fad94b8c9 Include import library in Windows binary releases
Previously:
	- libdeflate.dll: the dynamic library
	- libdeflate.lib: the static library

Now:
	- libdeflate.dll: the dynamic library
	- libdeflate.lib: the import library
	- libdeflatestatic.lib: the static library
2018-12-06 20:13:18 -08:00
Eric Biggers
2b6689d8aa Support 'make install' and 'make uninstall' 2018-06-14 22:58:46 -07:00
Eric Biggers
89b2d68aac README updates 2018-05-18 19:33:51 -07:00
ebiggers
203c1a8989
Merge pull request #32 from antonblanchard/ppc64_unaligned
Set UNALIGNED_ACCESS_IS_FAST on powerpc64
2018-05-12 21:49:24 -07:00
Anton Blanchard
9205845a16 Set UNALIGNED_ACCESS_IS_FAST on powerpc64
All 64-bit PowerPC CPUs handle unaligned accesses reasonably fast, so
set UNALIGNED_ACCESS_IS_FAST.

Decompression of the snappy html test case is almost 50% faster on
POWER9 with this patch applied.
2018-05-13 07:48:40 +10:00
Eric Biggers
9a327aae41 v1.0 v1.0 2018-04-13 22:46:08 -07:00
Eric Biggers
e9d1014161 tools/checksum_benchmarks.sh: fix detecting/disabling NEON on AArch64 2018-03-03 13:04:13 -08:00
Eric Biggers
6eef15d6f3 lib/arm: fix PMULL detection on AArch64 2018-03-03 12:47:50 -08:00
Eric Biggers
fc2ea22b44 lib/arm: add ARM PMULL implementation of CRC-32
Add an ARM PMULL implementation of CRC-32.  This is based on a patch by
Jun He <jun.he@linaro.org> as well as the x86 PCLMUL implementation.
2018-02-18 23:03:26 -08:00
Eric Biggers
1fb34f86b5 lib: add template for vectorized CRC-32 implementations 2018-02-18 23:03:26 -08:00
Eric Biggers
0c62e25464 tools/run_tests.sh: detect gcc without multilib support 2018-02-18 23:03:26 -08:00
Eric Biggers
5f3afad793 tools/run_tests.sh: run checksum benchmarks 2018-02-18 23:03:26 -08:00
Eric Biggers
2f4315c21c tools/run_tests.sh: more Android tests 2018-02-18 23:03:26 -08:00
Eric Biggers
794a40401d tools/android_build.sh: move -pie to LDFLAGS 2018-02-18 23:03:26 -08:00
Eric Biggers
4282583b9b tools/android_build.sh: support crypto extensions 2018-02-18 23:03:26 -08:00
Eric Biggers
e7aa4666e0 tools/checksum_benchmarks.sh: various improvements
Make it compatible with the new code organization, make it run the
test_checksums program for each implementation, and run each
implementation in both 64-bit and 32-bit modes.
2018-02-18 23:03:26 -08:00
Eric Biggers
bf0797e666 programs/test_checksums: test Adler-32 overflow cases 2018-02-18 23:03:26 -08:00
Eric Biggers
fb5c6a8c85 lib/arm: allow choosing adler32_neon() at runtime
Now that we detect CPU features on ARM, allow the NEON implementation of
Adler-32 to be selected at runtime based on the presence of the NEON
feature.
2018-02-18 23:03:26 -08:00
Eric Biggers
2575ede5ff lib/arm: add ARM CPU feature detection (Linux only for now) 2018-02-18 23:03:26 -08:00
Eric Biggers
8d58d51160 common: detect ARM NEON and PMULL target intrinsics 2018-02-18 23:03:26 -08:00
Eric Biggers
1617206086 lib/x86: allow choosing adler32_sse2() at runtime
Now that we detect CPU features on 32-bit x86, allow the SSE2
implementation of Adler-32 to be selected at runtime based on the
presence of the SSE2 feature.
2018-02-18 23:03:26 -08:00
Eric Biggers
0d1260be99 lib/x86: allow CPU feature detection on 32-bit x86
The SSE2, AVX2, BMI2, etc. code actually works on 32-bit x86 if the CPU
has those features.  So there is no need to restrict it to x86_64-only.
2018-02-18 23:03:26 -08:00
Eric Biggers
58978af429 lib: make CPU feature masks and dispatch pointers volatile
Use 'volatile' for the CPU feature masks and dispatched function
pointers.  We don't need memory barriers for them, so 'volatile' is good
enough to stop the compiler from inserting bogus reads/writes.
2018-02-18 23:03:26 -08:00
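
A minimal sketch of the pattern 58978af429 refers to, under stated assumptions: every name here (`checksum_func_t`, `checksum_impl`, `dispatch_checksum`, `checksum_generic`) is hypothetical, and the byte-sum "checksum" is a stand-in rather than anything from the library. The point is only the placement of `volatile` on the dispatched function pointer, which is first set to a dispatcher and then overwritten with the chosen implementation.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*checksum_func_t)(const uint8_t *buf, size_t len);

/* Hypothetical portable implementation: a plain byte sum. */
static uint32_t checksum_generic(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

static uint32_t dispatch_checksum(const uint8_t *buf, size_t len);

/*
 * The pointer itself is volatile (not the function it points to), so
 * the compiler performs exactly one load per call and one store when
 * the implementation is chosen.
 */
static volatile checksum_func_t checksum_impl = dispatch_checksum;

/* First call lands here, picks an implementation, and caches it. */
static uint32_t dispatch_checksum(const uint8_t *buf, size_t len)
{
	/* A real dispatcher would consult the CPU feature mask here. */
	checksum_func_t f = checksum_generic;

	checksum_impl = f;
	return f(buf, len);
}

uint32_t checksum(const uint8_t *buf, size_t len)
{
	return checksum_impl(buf, len);
}
```

If two threads race through the dispatcher, both store the same pointer value, which is consistent with the commit message's remark that no memory barriers are needed.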
Eric Biggers
4829a5add2 lib: refactor architecture-specific code
Move the x86 and ARM-specific code into their own directories to prevent
it from cluttering up the main library.  This will make it a bit easier
to add new architecture-specific code.

But to avoid complicating things too much for people who aren't using
the provided Makefile, we still just compile all .c files for all
architectures (irrelevant ones end up #ifdef'ed out), and the headers
are included explicitly for each architecture so that an
architecture-specific include path isn't needed.  So, now people just
need to compile both lib/*.c and lib/*/*.c instead of only lib/*.c.
2018-02-18 23:03:26 -08:00
Eric Biggers
0191c6bc26 lib: remove unused x86_cpu_features functionality
Remove the unused CPU features as well as the DEBUG code.
2018-02-18 23:03:26 -08:00
Eric Biggers
f76dcd5ee1 common: replace COMPILER_SUPPORTS_TARGET_INTRINSICS
Replace COMPILER_SUPPORTS_TARGET_INTRINSICS with macros for the
individual features, since COMPILER_SUPPORTS_TARGET_INTRINSICS was
x86-specific and would cause confusion when we try to use intrinsics in
'target' functions for other architectures.
2018-02-18 23:03:26 -08:00
Eric Biggers
5a9d25a892 Support multi-member gzip files 2017-11-20 00:35:24 -08:00
Eric Biggers
3d96a83ef9 v0.8 v0.8 2017-07-29 14:38:03 -07:00
Eric Biggers
48dcf684ec Improve instructions for building libdeflate on Windows 2017-06-09 18:53:04 -07:00
Eric Biggers
1726e9e87f benchmark: make it easier to integrate other compression/decompression engines 2017-05-29 18:37:01 -07:00
Eric Biggers
65a119ddfd run_tests.sh: test for same output on big endian CPU 2017-05-29 18:25:07 -07:00
Eric Biggers
8067f44e8c run_tests.sh: portability improvements
Support running the run_tests.sh script on more types of systems.
2017-05-29 17:44:50 -07:00
Eric Biggers
aadf6d8198 deflate_compress: produce same results on all CPUs 2017-05-29 17:44:50 -07:00
Eric Biggers
e42013f92e bt_matchfinder: produce same results on big endian CPUs 2017-05-29 17:38:58 -07:00
Eric Biggers
c4aca64dcb hc_matchfinder: produce same results on big endian CPUs 2017-05-29 17:38:58 -07:00
Eric Biggers
a53e457a5a matchfinder_common: fix conditions for vectorized init and rebase 2017-05-29 17:38:55 -07:00
Eric Biggers
1f8090cda1 Makefile: set $(AR) more reliably when building with MinGW 2017-04-30 21:07:05 -07:00
Eric Biggers
671e2bb5b5 Compile programs with -D_POSIX_C_SOURCE=200809L
The _DEFAULT_SOURCE feature test macro is only supported by glibc 2.19
and later.  As a result, various things were not being defined when
building with an older glibc version, causing compile errors.  Instead,
_POSIX_C_SOURCE=200809L should expose everything we need.
2017-04-12 22:11:17 -07:00
Eric Biggers
27c13370cb run_tests.sh: run tests on arm64 2017-03-19 12:29:49 -07:00
Eric Biggers
a32bdb097d v0.7 v0.7 2017-01-14 20:56:17 -08:00
Eric Biggers
f2f0df7274 deflate_compress: fix corruption with long literal run
When the block splitting algorithm was implemented, it became possible
for the compressor to use longer blocks, up to ~300KB.  Unfortunately it
was overlooked that this can allow literal runs > 65535 bytes, while in
one place the length of a literal run was still being stored in a u16.
To overflow the litrunlen and hit the bug the data would have had to
have far fewer matches than random data, which is possible but very
unusual.  Fix the bug by reserving more space to hold the litrunlen, and
add a test for it.
2017-01-14 20:51:03 -08:00
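
The overflow behind f2f0df7274 can be illustrated with a small sketch. The function names are hypothetical and the loops only model counting a literal run; the actual compressor code and the exact fix are not shown in this log, so the wider `uint32_t` counter is an assumption standing in for "reserving more space".

```c
#include <stdint.h>

/* Models a literal-run counter stored in 16 bits (the buggy case). */
uint16_t count_literals_u16(uint32_t n)
{
	uint16_t litrunlen = 0;

	for (uint32_t i = 0; i < n; i++)
		litrunlen++;	/* wraps back to 0 after 65535 */
	return litrunlen;
}

/* Models the same counter with more space reserved for it. */
uint32_t count_literals_u32(uint32_t n)
{
	uint32_t litrunlen = 0;

	for (uint32_t i = 0; i < n; i++)
		litrunlen++;
	return litrunlen;
}
```

A run of 65536 literals reads back as 0 from the 16-bit counter, which is how a long enough literal run could silently corrupt the output.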
Eric Biggers
e79444be27 Fix compilation with icc 2016-11-07 19:45:37 -08:00
Eric Biggers
28cc14994b run_tests.sh: look for other clang versions 2016-11-04 21:27:50 -07:00
Eric Biggers
3a3d2da7c2 Fix compilation with clang 3.7 2016-11-04 21:24:44 -07:00
Eric Biggers
2ea8ddae66 Don't use 'defined' in macro expansion
With clang 3.9:
	warning: macro expansion producing 'defined' has undefined
		 behavior [-Wexpansion-to-defined]

Just eliminate the tests for clang and icc; they shouldn't be necessary.
2016-10-30 12:40:47 -07:00
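
For context on the warning quoted in 2ea8ddae66, the sketch below shows the kind of pattern `-Wexpansion-to-defined` flags and one common portable alternative. Note that the commit itself took a different route (it removed the clang and icc tests entirely); `FEATURE_A`, `FEATURE_B`, `HAVE_FEATURE` and `have_feature()` are hypothetical names used only for this illustration.

```c
/*
 * Pattern that triggers the warning: 'defined' appears as a result of
 * macro expansion inside #if, which the C standard leaves undefined.
 *
 *	#define HAVE_FEATURE (defined(FEATURE_A) || defined(FEATURE_B))
 *	#if HAVE_FEATURE
 *	...
 *	#endif
 */

#define FEATURE_A	/* hypothetical feature macro, set for the example */

/*
 * Alternative: evaluate 'defined' directly in the #if condition and
 * give the macro a plain 0 or 1 value.
 */
#if defined(FEATURE_A) || defined(FEATURE_B)
#  define HAVE_FEATURE 1
#else
#  define HAVE_FEATURE 0
#endif

int have_feature(void)
{
#if HAVE_FEATURE
	return 1;
#else
	return 0;
#endif
}
```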