GET_NUM_COUNTERS() actually returns a value about 4 times higher than
intended, due to missing parentheses; ((num_syms) + 3 / 4) is supposed
to be (((num_syms) + 3) / 4), or equivalently DIV_ROUND_UP(num_syms, 4)
as it was in my original code from wimlib commit 394751ae1302. This
value affects the performance of Huffman code generation.
But oddly enough, I'm measuring that the actual behavior results in
better performance than the intended behavior. This could be due to
differences between DEFLATE and LZX (the compression format I had
originally tuned this value to), or a different CPU, or both. Moreover,
I can't make it significantly faster by trying other values.
So, make GET_NUM_COUNTERS() intentionally expand to simply num_syms.
Omit the rounding up to a multiple of 4, which doesn't seem helpful.
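In code form (the first two fragments are quoted from the discussion
above; the last line is the new definition):

    /* Buggy: integer division makes 3 / 4 == 0, so this reduces to
     * just num_syms -- about 4x the intended value. */
    ((num_syms) + 3 / 4)

    /* Intended: divide by 4, rounding up, i.e.
     * DIV_ROUND_UP(num_syms, 4). */
    (((num_syms) + 3) / 4)

    /* New: keep the measured-faster behavior, made explicit. */
    #define GET_NUM_COUNTERS(num_syms)  (num_syms)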
Solaris already defines 'len_t' in <sys/types.h>, which causes a build
error.
This typedef isn't important, so just remove it and use u8 directly.
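A minimal illustration of the conflict (the array name and size below
are placeholders, not the actual code):

    #include <sys/types.h>  /* on Solaris, this already defines len_t */

    typedef u8 len_t;       /* error: redefinition of 'len_t' */

    /* Fix: drop the typedef and use u8 directly, e.g.: */
    u8 lens[MAX_SYMS];      /* 'MAX_SYMS' is a placeholder */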
Fixes https://github.com/ebiggers/libdeflate/issues/159
Currently, if the decompressor runs out of input, it may spin until it
completely fills the output buffer, at which point it returns
LIBDEFLATE_INSUFFICIENT_SPACE. This occurs because the decompressor
implicitly appends zeroes to the input. It's designed this way because
precisely detecting overreads in the inner loop would hurt performance,
as normally it's hard to distinguish between real overreads and the
expected overreads that occur due to reading variable-length codewords.
This behavior is fine for decompressing with a known output size, which
is the recommended use case of libdeflate. However, it's not so great
for decompressing with an unknown output size, as it may cause the user
to allocate an increasingly large buffer until they run out of memory.
Be more friendly to this use case by checking for excessive overreads
more often, and returning LIBDEFLATE_BAD_DATA when one is detected. In theory
the new check won't hurt performance, as it only happens when the end of
the input has been reached, which was already being checked for.
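A rough sketch of the idea (names are hypothetical; the real code
differs):

    /* The refill path already detects reaching the end of the real
     * input, at which point it starts supplying implicit zeroes.
     * Count those and fail once they are clearly excessive, instead
     * of spinning until the output buffer fills up. */
    if (in_next == in_end) {
        if (++overread_count > MAX_OVERREAD_BYTES)  /* hypothetical */
            return LIBDEFLATE_BAD_DATA;
    }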
Fixes https://github.com/ebiggers/libdeflate/issues/157
Don't allocate an obviously-too-large buffer for uncompressed data if
ISIZE becomes corrupted in a small file. This isn't too effective, as
we still must allow over 1000x expansion, but we might as well do this.
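A sketch of the kind of check involved (hypothetical names; DEFLATE's
theoretical maximum expansion is a bit over 1000x, about 1032x):

    /* Don't trust an ISIZE that implies more expansion than DEFLATE
     * could possibly produce from this much compressed input. */
    if (isize_hint / 1032 > in_nbytes)
        isize_hint = 0;   /* ignore the corrupted hint */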
Update https://github.com/ebiggers/libdeflate/issues/157
The block splitting algorithm works by examining successive chunks of
data and ending the block when a chunk differs significantly from the
rest of the block. Currently, the data chunk where the change is
detected is included in the block, which is suboptimal -- after all, we
know that it's different from the rest of the block. Better results
could be achieved by ending the block just before the chunk.
Implement this in the near-optimal compressor. This slightly improves
its compression ratio.
Note: I also tested an implementation of this for the lazy compressor.
It improves compression ratio too, but it doesn't seem worthwhile there
from a performance and complexity standpoint.
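In the near-optimal compressor, the change looks roughly like this
(hypothetical sketch; variable names simplified):

    /* When the latest chunk's statistics differ significantly from
     * the rest of the block, end the block where that chunk began
     * instead of after it. */
    if (should_end_block(&c->split_stats, in_block_begin, in_next, in_end))
        in_block_end = chunk_begin;   /* exclude the differing chunk */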
The MIN_BLOCK_LENGTH of 10000 is too high; shorter blocks are sometimes
worthwhile. Reduce MIN_BLOCK_LENGTH to 5000, but penalize very short
blocks so they are only chosen when they clearly seem worthwhile.
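One way such a penalty can be applied in the end-of-block decision
(hypothetical names and formula; the real tuning differs):

    /* Require a proportionally larger statistical difference to end
     * a block that is still short, so very short blocks are chosen
     * only when the difference is unmistakable. */
    cutoff = base_cutoff(stats);                      /* hypothetical */
    if (block_length < 2 * MIN_BLOCK_LENGTH)
        cutoff += cutoff * (2 * MIN_BLOCK_LENGTH - block_length) /
                  (2 * MIN_BLOCK_LENGTH);
    end_block = (total_delta >= cutoff);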
When choosing which literals and matches to use for the block split
statistics in deflate_compress_near_optimal(), take into account the
min_len heuristic. This should give slightly more accurate results.
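Roughly, the idea (hypothetical sketch):

    /* Only tally a match that the parser could actually use, i.e.
     * one meeting the chosen minimum match length; otherwise tally
     * the byte as a literal. */
    if (best_len >= min_len)
        observe_match(&c->split_stats, best_len);
    else
        observe_literal(&c->split_stats, *in_next);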
It no longer seems worthwhile to have bt_matchfinder_get_matches()
return best_len separately, especially since it didn't work as
expected: it didn't handle length-3 matches.
Ensure that whenever deflate_choose_*() is called to choose a literal or
match, observe_*() is also called to tally it in the block split
statistics (except in deflate_compress_fastest(), which doesn't use the
block split statistics). Previously, the lazy compressor contained an
optimization that made the block split statistics differ from the actual
lazy parse. But that optimization no longer seems to be worthwhile.
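With that optimization removed, each parse decision pairs the two
calls, roughly like this (sketch; arguments simplified):

    if (length >= min_len) {
        observe_match(&c->split_stats, length);
        deflate_choose_match(c, length, offset, &seq);   /* simplified */
    } else {
        observe_literal(&c->split_stats, *in_next);
        deflate_choose_literal(c, *in_next, &litrunlen); /* simplified */
    }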
For fastest, greedy, lazy, and lazy2: save memory by reducing the length
of the sequence store and forcing a split when it fills up (see the
sketch below).
For fastest: increase the max block length, but use a relatively short
sequence store that will cause shorter blocks to be used often.
For all: allow the final block to exceed the soft maximum length if it
avoids having to create a block below the minimum length.
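With a shorter sequence store, "store full" simply becomes one more
reason to end the current block (hypothetical sketch):

    if (seq_store_full(c) ||                          /* hypothetical */
        should_end_block(&c->split_stats, in_block_begin, in_next, in_end))
        break;   /* flush the current block */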
Introduce deflate_compress_fastest(), and use it for level 1. It uses
the ht_matchfinder instead of the hc_matchfinder, and it skips the block
splitting algorithm. This speeds up level 1, without changing the
compression ratio too much (relative to what is expected for level 1).
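Conceptually, the ht_matchfinder replaces hash chains with a small
hash table that remembers only the most recent position(s) per hash,
so there is no chain to walk (hypothetical sketch; helper names are
placeholders):

    u32 candidate = hash_tab[hash];                  /* one lookup */
    hash_tab[hash] = cur_pos;                        /* one insertion */
    u32 len = match_len_at(in, candidate, cur_pos);  /* hypothetical */
    if (len >= MIN_MATCH_LEN)
        emit_match(len, cur_pos - candidate);        /* hypothetical */
    else
        emit_literal(in[cur_pos]);                   /* hypothetical */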
Further improve the way the near-optimal parser estimates symbol costs:
- When setting a block's initial costs, weigh the default costs and the
  previous block's costs differently, depending on how different the
  current block seems to be from the previous block (see the sketch
  after this list).
- When determining the "default" costs, take into account how many
literals appear in the block and how frequent matches seem to be.
- Increase BIT_COST from 8 to 16, to increase precision in calculations.
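A rough sketch of the first point (hypothetical names; the real
computation differs):

    /* Blend the previous block's final costs with the default costs;
     * 'w' grows with how similar the new block's statistics look to
     * the previous block's. */
    for (sym = 0; sym < num_syms; sym++) {
        costs[sym] = (prev_costs[sym] * w +
                      default_costs[sym] * (W_MAX - w)) / W_MAX;
    }
    /* Costs are fixed-point, in units of 1/BIT_COST of a bit, so
     * raising BIT_COST from 8 to 16 doubles the precision of such
     * blends. */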
When the near-optimal parser sets the initial costs for a block, it
takes into account the costs of the previous block (if there is one).
However, the costs for the previous block were not updated after the
final codes were built, so the costs from the last optimization pass
were used instead of the final costs.
Make it so that the final costs are used, as intended.
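Conceptually, the fix refreshes the saved costs once more after the
final codes are built (hypothetical sketch; function names may differ
from the actual code):

    deflate_make_huffman_codes(&c->freqs, &c->codes);  /* final codes */
    /* Previously missing: recompute the saved costs from the final
     * codeword lengths, so the next block's initial costs start from
     * the final codes rather than the last optimization pass. */
    set_costs_from_codes(c, &c->codes.lens);           /* hypothetical */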
Also, include the DEFLATE_END_OF_BLOCK symbol in the non-final codes.
In general, these changes improve the compression ratio slightly.
With deflate_compress_near_optimal(), some data benefits more than
originally thought from larger values of max_search_depth and
nice_match_length. Some data even needs these parameters to be fairly
high for deflate_compress_near_optimal() to compress more than
deflate_compress_lazy2(). Bump these parameters up a bit.
In the greedy and lazy compressors, automatically increase the minimum
match length from the default of 3 if the data doesn't contain many
different literals. This greatly improves the compression ratio of
levels 1-9 on certain types of data, such as DNA sequencing data, while
not worsening the ratio on other types of data.
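The heuristic can be sketched like this (illustrative thresholds and
sample size only; the real tuning differs):

    static unsigned
    choose_min_match_len(const u8 *data, size_t n)    /* hypothetical */
    {
        bool used[256] = { false };
        unsigned num_used = 0;
        size_t i;

        /* Sample the data and count distinct byte values. */
        for (i = 0; i < n && i < 4096; i++) {
            if (!used[data[i]]) {
                used[data[i]] = true;
                num_used++;
            }
        }
        /* Few distinct literals => literals are cheap to encode, so
         * require longer matches. */
        if (num_used <= 4)       /* e.g. DNA: A, C, G, T */
            return 5;
        if (num_used <= 16)
            return 4;
        return 3;                /* DEFLATE's usual minimum */
    }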
The near-optimal compressor (used by compression levels 10-12) continues
to use a minimum match length of 3, since it already does a better job
of deciding when short matches are worthwhile. (The method for setting the
initial costs needs improvement; later commits address that.)
Resolves https://github.com/ebiggers/libdeflate/issues/57
Instead of switching directly from the lazy compressor at level 7 to the
near-optimal compressor at level 8, use the lazy2 compressor at levels
8-9 and don't switch to near-optimal until level 10.
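Simplified sketch of the revised mapping from compression level to
algorithm (per-level parameter tuning not shown):

    if (level == 1)
        impl = deflate_compress_fastest;
    else if (level <= 7)
        impl = deflate_compress_lazy;   /* lower levels here use greedy */
    else if (level <= 9)
        impl = deflate_compress_lazy2;
    else
        impl = deflate_compress_near_optimal;   /* levels 10-12 */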
This avoids poor compression ratio and bad performance (both
significantly worse than level 7, and significantly worse than zlib) at
levels 8-9 on data where the near-optimal compressor doesn't do well
until the parameters are cranked up.
On data where the near-optimal compressor *does* do well, this change
worsens the compression ratio of levels 8-9, but also speeds them up a
lot, thus positioning them relative to zlib similarly to the lower levels (i.e.
much faster and slightly stronger, rather than slightly faster and much
stronger). The difference between levels 9 and 10 is increased, but
that's perhaps the least bad place to have a discontinuity.
Resolves https://github.com/ebiggers/libdeflate/issues/85