1205 Commits

Author SHA1 Message Date
Marcus Holland-Moritz
c42d168726 nilsimsa2 -> nilsimsa 2023-12-17 23:00:07 +01:00
Marcus Holland-Moritz
2546cc94f4 Nuke nilsimsa v1 2023-12-17 23:00:07 +01:00
Marcus Holland-Moritz
7891608c82 Factor out nilsimsa2 ordering 2023-12-17 23:00:07 +01:00
Marcus Holland-Moritz
a90768ae0f More inode methods 2023-12-17 23:00:07 +01:00
Marcus Holland-Moritz
b49dd782c6 add inode_ordering 2023-12-17 22:59:37 +01:00
Marcus Holland-Moritz
f54ac8d50e add fragment_chunkable 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
7b24687c56 performance improvement: fs_path -> less_revpath 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
0f362be4ee Decouple segmenter from inode / os_access / mmif 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
78c15ad028 Ownership cleanup 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
1f575df739 Use receiver abstraction for similarity_ordering 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
94b1384e68 Add simple receiver abstraction 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
a0d00bac2b Add multihreaded nilsimsa ordering using similarity_ordering 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
94a66087a9 Add similarity_ordering module 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
94b875868e Logging & timing for file_scanner 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
b309d7165b CMakeLists test cleanup 2023-12-17 22:58:51 +01:00
Marcus Holland-Moritz
663a95be63 Update TODO 2023-12-17 22:58:06 +01:00
Marcus Holland-Moritz
56dfe58695 DWARFS_UNLIKELY -> [[unlikely]] 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
35af027b27 block_manager -> segmenter 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
8b05c2b338 Show categories in long help 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
e86cd07246 Generate list of seen categories 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
a75381b0ef Contextual option logging 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
5d191a6dbf Add incompressible categorizer test 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
35ae53b3fe More options for incompressible categorizer 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
3b981ea406 Make default category available to categorizers 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
dbd5502f82 Add some pcmaudio categorizer tests 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
0f0505947c Add PCM audio test files 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
3b5b15eaf8 Add TODOs for ranges when we switch to C++23 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
9d5969adb7 Better modeling of metadata requirements 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
e08faf2c0c Basic working FLAC compression 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
4d5c039f12 Add pcm_sample_transformer 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
6ddbcad93b Perform metadata check 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
c2da034983 Compression metadata 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
3f0d7c14fd Category-dependent block compression 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
92226a73bd Parsing more categorized options 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
735883d641 More specific pcmaudio category naming 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
eac7fd9424 Add support for requesting metadata sample from categorizer 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
cba0291894 Ability to dump all inodes 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
7990b3d5e3 Allow dumping of contextual options 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
34beffceb3 Integrate categorizers into inode manager 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
611d1ef28d Refactor similarity handling in inode manager 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
04701f09a9 Replace optional<uint32> by plain uint32 + flags field 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
3c3d67a2d6 Add chunks vector to single_inode_fragment 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
62e3805b13 Update categorizer_manager interface 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
d6d279fbb4 Fix AIFF parser bug, reduce min size 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
2544195abf Check for invalid chunk size in WAV64 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
99adfdbf22 Clean up pcmaudio categorizer and add wav-like formats 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
0d7f08515a More metadata checks 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
c17ab2b44a Fix a few bugs found by fuzzing 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
0d25c6e704 Basic categorizer fuzzer 2023-12-17 21:59:11 +01:00
Marcus Holland-Moritz
4bcbb3bfe9 CAF format support 2023-12-17 21:59:11 +01:00