libdeflate

cuberite/libdeflate

mirror of https://github.com/cuberite/libdeflate.git synced 2025-09-14 06:49:09 -04:00

Commit Graph

Author	SHA1	Message	Date

Eric Biggers

29dfcfd866

lib/matchfinder: support dynamic dispatch for init and rebase

Currently the optimized implementations of matchfinder_init() and
matchfinder_rebase() are chosen via static dispatch.  That means that
the AVX-2 implementations usually aren't used.

Fix this by using dynamic dispatch, like what libdeflate does for the
Adler-32 and CRC-32 checksums and for DEFLATE decompression.

Based on work by Andrew Steinborn <git@steinborn.me>
(https://github.com/ebiggers/libdeflate/pull/77).  He wrote:

"The main impact is on x86: the AVX2 matchfinder can now be properly
dynamically dispatched at runtime and if -mavx2 is included in CFLAGS
(or -march set to any platform with AVX2 support). On my Ryzen 9 3900X,
I got an approximately 1% boost in deflate time (measured with a
uncompressed tarball of the Silesia corpus) using just the changes in
this PR and the regular CFLAGS, and a 2.7% boost when specifying -mavx2
as CFLAGS. (I also tested with an Intel Xeon Skylake c5.large EC2
instance, and did not see any performance regression)."

2020-10-28 19:20:53 -07:00
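
To illustrate the dynamic dispatch this commit message describes, here is a minimal sketch in C. It is not libdeflate's actual code: every name (mf_init, mf_init_generic, mf_init_avx2_standin, cpu_has_avx2) is hypothetical, the fill value is an arbitrary placeholder, and the "AVX2" variant is a plain-C stand-in rather than real intrinsics. The CPU check assumes the GCC/Clang builtin __builtin_cpu_supports() and falls back to "no AVX2" elsewhere. Thread-safety of the pointer update is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*mf_init_fn)(int16_t *data, size_t count);

/* Portable fallback: always available, whatever the CPU. */
static void mf_init_generic(int16_t *data, size_t count)
{
    for (size_t i = 0; i < count; i++)
        data[i] = INT16_MIN;    /* placeholder fill value (assumption) */
}

/* Stand-in for an AVX2 build of the same routine. A real version would
 * use intrinsics; this sketch simply reuses the generic loop. */
static void mf_init_avx2_standin(int16_t *data, size_t count)
{
    mf_init_generic(data, count);
}

/* Hypothetical runtime CPU check. */
static int cpu_has_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;   /* unknown compiler or architecture: assume no AVX2 */
#endif
}

static void mf_init_dispatch(int16_t *data, size_t count);

/* Starts out pointing at the dispatcher; the first call replaces it
 * with the implementation chosen for the CPU we are running on. */
static mf_init_fn mf_init_impl = mf_init_dispatch;

static void mf_init_dispatch(int16_t *data, size_t count)
{
    mf_init_impl = cpu_has_avx2() ? mf_init_avx2_standin : mf_init_generic;
    mf_init_impl(data, count);
}

/* Public entry point: callers never see which variant runs. */
void mf_init(int16_t *data, size_t count)
{
    assert(data != NULL || count == 0);
    mf_init_impl(data, count);
}
```

With static dispatch the choice would instead be made by the preprocessor at build time, so an AVX2 variant would only be used when the whole library is compiled with AVX2 enabled. The function-pointer approach defers the choice to the first call.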

Eric Biggers

ff8634427b

lib/matchfinder: simplify init and rebase

Remove the ability of matchfinder_init() and matchfinder_rebase() to
fail due to the matchfinder memory size being misaligned.  Instead,
require that the size always be 128-byte aligned -- which is already the
case.  Also, make the matchfinder memory always be 32-byte aligned --
which doesn't really have any downside.

2020-10-25 22:42:25 -07:00
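
The alignment requirements named in this commit message (size a multiple of 128 bytes, memory aligned to 32 bytes) can be sketched as follows. This is an illustration only, not libdeflate's allocator: the macro and function names are hypothetical, and the over-allocate-and-round-up approach is just one portable way to obtain aligned memory. The init function asserts its preconditions instead of returning a failure code, mirroring the simplification described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MF_MEM_ALIGNMENT  32u   /* required alignment of the memory */
#define MF_SIZE_ALIGNMENT 128u  /* required multiple for the size   */

/* Round n up to a multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Over-allocate with malloc() and return a 32-byte-aligned pointer
 * inside the block. *raw receives the pointer to pass to free(). */
static void *alloc_aligned(size_t size, void **raw)
{
    unsigned char *base = malloc(size + MF_MEM_ALIGNMENT - 1);
    if (base == NULL) {
        *raw = NULL;
        return NULL;
    }
    *raw = base;
    /* Bytes to skip so that base + pad is a multiple of the alignment. */
    size_t pad = (size_t)(-(uintptr_t)base & (MF_MEM_ALIGNMENT - 1));
    return base + pad;
}

/* Cannot fail: misaligned input is a caller bug, caught by assert(). */
static void mf_init_checked(int16_t *mem, size_t size_bytes)
{
    assert(is_aligned(mem, MF_MEM_ALIGNMENT));
    assert(size_bytes % MF_SIZE_ALIGNMENT == 0);
    for (size_t i = 0; i < size_bytes / sizeof(mem[0]); i++)
        mem[i] = 0;
}
```

A caller would round the requested size up once, allocate, and then call the init routine without checking a return value.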

Eric Biggers

4829a5add2

lib: refactor architecture-specific code

Move the x86 and ARM-specific code into their own directories to prevent
it from cluttering up the main library.  This will make it a bit easier
to add new architecture-specific code.

But to avoid complicating things too much for people who aren't using
the provided Makefile, we still just compile all .c files for all
architectures (irrelevant ones end up #ifdef'ed out), and the headers
are included explicitly for each architecture so that an
architecture-specific include path isn't needed.  So, now people just
need to compile both lib/*.c and lib/*/*.c instead of only lib/*.c.

2018-02-18 23:03:26 -08:00
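
The "compile everything, #ifdef out what is irrelevant" approach from this commit message can be sketched in a single file. The file paths in the comments and all function and macro names are hypothetical; only the compiler-predefined architecture macros are real. In a real tree each guarded section would be its own .c file that compiles to nothing on other architectures.

```c
#include <assert.h>
#include <string.h>

/* Contents of a hypothetical lib/x86/helper.c: compiled on every
 * platform, but the body only survives preprocessing on x86. */
#if defined(__x86_64__) || defined(__i386__) || \
    defined(_M_X64) || defined(_M_IX86)
#define HAVE_X86_HELPER 1
const char *x86_only_helper(void)
{
    return "x86 code compiled in";
}
#endif

/* Contents of a hypothetical lib/arm/helper.c, guarded the same way. */
#if defined(__arm__) || defined(__aarch64__) || defined(_M_ARM64)
#define HAVE_ARM_HELPER 1
const char *arm_only_helper(void)
{
    return "ARM code compiled in";
}
#endif

/* Generic code: uses whichever architecture-specific helper exists. */
const char *active_arch_code(void)
{
#if defined(HAVE_X86_HELPER)
    return x86_only_helper();
#elif defined(HAVE_ARM_HELPER)
    return arm_only_helper();
#else
    return "generic code only";
#endif
}
```

Because the guards live inside the sources, a build system without per-architecture logic can pass every file to the compiler and still get a correct library.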

3 Commits