Allow building libdeflate without linking to any libc functions by using
'make FREESTANDING=1'. When using such a library build, the user will
need to call libdeflate_set_memory_allocator() before anything else,
since malloc() and free() will be unavailable.
[Folded in fix from Ingvar Stepanyan to use -nostdlib, and made
freestanding_tests() check that no libs are linked to.]
Update https://github.com/ebiggers/libdeflate/issues/62
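As a rough illustration of what a user of a freestanding build might supply,
here is a minimal sketch of a malloc()/free()-compatible bump allocator over
a static buffer. The names arena_malloc(), arena_free(), ARENA_SIZE and
ARENA_ALIGN are hypothetical and the arena size is arbitrary; it assumes a
C11 compiler for _Alignas.

```c
#include <stddef.h>

/* Hypothetical fixed arena; the size is arbitrary for illustration. */
#define ARENA_SIZE  (1u << 20)
#define ARENA_ALIGN 16u

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* malloc()-compatible bump allocator; returns NULL when the arena is full. */
static void *arena_malloc(size_t size)
{
	/* Round the request up to a multiple of ARENA_ALIGN. */
	size_t aligned = (size + (ARENA_ALIGN - 1)) & ~(size_t)(ARENA_ALIGN - 1);
	void *p;

	/* Reject wrapped sizes and requests that do not fit. */
	if (aligned < size || aligned > ARENA_SIZE - arena_used)
		return NULL;
	p = &arena[arena_used];
	arena_used += aligned;
	return p;
}

/* free()-compatible no-op; a bump allocator never releases memory. */
static void arena_free(void *ptr)
{
	(void)ptr;
}
```

With such a pair in hand, the intended first call would be
libdeflate_set_memory_allocator(arena_malloc, arena_free), made before any
compressor or decompressor is allocated.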
In preparation for adding custom memory allocator support, don't call
the standard memory allocation functions directly but rather wrap them
with libdeflate_malloc() and libdeflate_free().
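One possible shape for such wrappers is sketched below. The function names
come from this change, but the bodies and the malloc_hook/free_hook
variables are assumptions made for illustration, not necessarily what the
library does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed hooks, defaulting to the standard libc allocator. */
static void *(*malloc_hook)(size_t) = malloc;
static void (*free_hook)(void *) = free;

/* All library allocations go through this wrapper. */
void *libdeflate_malloc(size_t size)
{
	return malloc_hook(size);
}

/* All library frees go through this wrapper. */
void libdeflate_free(void *ptr)
{
	free_hook(ptr);
}
```

Routing every allocation through one pair of functions means a later change
only has to swap the two hooks to support a custom allocator.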
Unfortunately, MSVC only accepts __stdcall after the return type, while
gcc only accepts __attribute__((visibility("default"))) before the
return type. So we need a macro in each location.
Also, MSVC doesn't define __i386__; that's gcc-specific. So instead use
'_WIN32 && !_WIN64' to detect 32-bit Windows.
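The two-macro arrangement can be sketched roughly as follows. SKETCH_EXPORT,
SKETCH_ABI and sketch_add() are hypothetical names chosen for this example;
they are not the macros the library defines.

```c
/* Calling convention: goes after the return type (32-bit Windows only). */
#if defined(_WIN32) && !defined(_WIN64)
#  define SKETCH_ABI __stdcall
#else
#  define SKETCH_ABI
#endif

/* Symbol visibility: goes before the return type (gcc-style compilers). */
#if defined(__GNUC__) && !defined(_WIN32)
#  define SKETCH_EXPORT __attribute__((visibility("default")))
#else
#  define SKETCH_EXPORT
#endif

/* Each macro sits in the position its compiler requires. */
SKETCH_EXPORT int SKETCH_ABI sketch_add(int a, int b);

SKETCH_EXPORT int SKETCH_ABI sketch_add(int a, int b)
{
	return a + b;
}
```

On any given platform at most one of the two macros expands to something,
so the declaration stays valid for both compilers.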
Another build_decode_table() optimization: rather than filling all the
entries for each codeword using strided stores, just fill one initially
and fill the rest by memcpy()s as the table is incrementally expanded.
Also make some other cleanups and small optimizations.
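The doubling idea can be illustrated with the simplified sketch below. It is
not the actual build_decode_table(); struct sketch_code and
fill_table_by_doubling() are hypothetical, and the sketch assumes a complete
prefix code whose codewords are already bit-reversed, sorted by increasing
length, and no longer than table_bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical codeword description used only by this sketch. */
struct sketch_code {
	unsigned rev_bits;  /* bit-reversed codeword */
	unsigned len;       /* codeword length in bits */
	uint32_t entry;     /* decode table entry for this codeword */
};

/* 'table' must have room for (1 << table_bits) entries. */
static void fill_table_by_doubling(uint32_t *table, unsigned table_bits,
				   const struct sketch_code *codes, size_t n)
{
	size_t cur_size = 1;   /* entries currently valid: 1 << cur_len */
	unsigned cur_len = 0;
	size_t i;

	table[0] = 0;
	for (i = 0; i < n; i++) {
		/* Grow the table to this codeword's length by copying
		 * the lower half into the upper half. */
		while (cur_len < codes[i].len) {
			memcpy(&table[cur_size], table,
			       cur_size * sizeof(table[0]));
			cur_size <<= 1;
			cur_len++;
		}
		/* One store per codeword instead of strided stores. */
		table[codes[i].rev_bits] = codes[i].entry;
	}
	/* Finish expanding to the full table size. */
	while (cur_len < table_bits) {
		memcpy(&table[cur_size], table, cur_size * sizeof(table[0]));
		cur_size <<= 1;
		cur_len++;
	}
}
```

Each codeword is written exactly once; every later doubling replicates it
into all the slots whose low bits match.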
Further improve build_decode_table() performance by splitting the "fill
direct entries" and "fill subtable pointers and subtables" steps into
separate loops and making some other optimizations.
Improve libdeflate's worst-case performance decompressing malicious
DEFLATE streams by about 14x, bringing it within a factor of about 2x of
zlib, by skipping rebuilding the decode tables for the static Huffman
codes when they're already loaded into the decompressor.
This improves performance decompressing a stream of all empty static
Huffman blocks from about 0.36 MB/s to 175 MB/s, or the original
reproducer given on the GitHub issue from about 3.3 MB/s to 219 MB/s.
A regression test is added for these cases as well as the empty dynamic
Huffman blocks case to verify worst-case performance comparable to zlib.
Resolves https://github.com/ebiggers/libdeflate/issues/33
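The caching logic amounts to something like the sketch below. The struct,
field and function names are hypothetical, and a counter stands in for the
expensive table build so the effect is visible.

```c
#include <stdbool.h>

/* Hypothetical decompressor state used only by this sketch. */
struct sketch_decompressor {
	bool static_codes_loaded;  /* tables currently hold the static codes */
	unsigned table_builds;     /* number of expensive rebuilds so far */
};

/* Stand-in for the costly decode table construction. */
static void sketch_build_tables(struct sketch_decompressor *d)
{
	d->table_builds++;
}

/* Static Huffman block: rebuild only if the static tables are not loaded. */
static void sketch_begin_static_block(struct sketch_decompressor *d)
{
	if (d->static_codes_loaded)
		return;
	sketch_build_tables(d);
	d->static_codes_loaded = true;
}

/* Dynamic Huffman block: always rebuilds, and invalidates the cache. */
static void sketch_begin_dynamic_block(struct sketch_decompressor *d)
{
	sketch_build_tables(d);
	d->static_codes_loaded = false;
}
```

A stream of back-to-back static blocks then pays for one table build
instead of one per block.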
Use 'volatile' for the CPU feature masks and dispatched function
pointers. We don't need memory barriers for them, so 'volatile' is good
enough to stop the compiler from inserting bogus reads/writes.
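A minimal sketch of the dispatched-function-pointer half of this pattern
follows. All names are hypothetical, and the CPU feature check is elided:
the dispatcher here always selects the generic implementation.

```c
#include <stddef.h>

typedef unsigned (*sum_func_t)(const unsigned char *, size_t);

/* Portable fallback implementation. */
static unsigned sum_generic(const unsigned char *p, size_t n)
{
	unsigned sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += p[i];
	return sum;
}

static unsigned dispatch_sum(const unsigned char *p, size_t n);

/* The pointer itself is volatile, so each call does exactly one read. */
static volatile sum_func_t sum_impl = dispatch_sum;

/* First call lands here, picks an implementation, and caches it. */
static unsigned dispatch_sum(const unsigned char *p, size_t n)
{
	/* Real code would consult the CPU feature mask here. */
	sum_func_t f = sum_generic;

	sum_impl = f;
	return f(p, n);
}

/* Public entry point: always calls through the cached pointer. */
static unsigned sum_bytes(const unsigned char *p, size_t n)
{
	return sum_impl(p, n);
}
```

If two threads race on the first call, both store the same pointer value,
which is why no memory barrier is needed in this scheme.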
Move the x86 and ARM-specific code into their own directories to prevent
it from cluttering up the main library. This will make it a bit easier
to add new architecture-specific code.
But to avoid complicating things too much for people who aren't using
the provided Makefile, we still just compile all .c files for all
architectures (irrelevant ones end up #ifdef'ed out), and the headers
are included explicitly for each architecture so that an
architecture-specific include path isn't needed. So, now people just
need to compile both lib/*.c and lib/*/*.c instead of only lib/*.c.
I've decided to simplify and standardize the licensing status for the
library by using the MIT license instead of CC0 (a.k.a. "public
domain"). This eliminates the somewhat controversial 4(a) clause in
CC0, and, for this and other reasons, should (somewhat ironically) make
it easier for some people to use and contribute to the project.
Note: copyright will apply to new changes and to new versions of the
work as a whole. Of course, versions previously released as public
domain remain public domain where legally recognized.
It was reported that API symbols were being "exported" from the static
library built with MSVC, causing them to remain exported after being
linked into another program. It turns out this was actually a problem
outside of MSVC as well. The solution is to always build the static and
shared libraries from different object files, where the API symbols are
exported from the shared library object files but not from the static
library object files.
Reported-by: Joergen Ibsen <ji@ibse.dk>
* Bring in common headers and program code from xpack project
* Move program code to programs/
* Move library code to lib/
* GNU89 and MSVC2010 compatibility
* Other changes