Update benchmarks

Marcus Holland-Moritz 2020-12-09 23:26:19 +01:00
parent c234dcc0ca
commit 96afa05b77

README.md

@@ -182,28 +182,28 @@ versions will produce substantially faster code:
1.42 ± 0.01 times faster than './dwarfs_test-gcc-O3'
1.48 ± 0.01 times faster than './dwarfs_test-gcc-O2'
$ hyperfine -L prog $(echo ./mkdwarfs-* | tr ' ' ,) '{prog} --no-progress --log-level warn -i tree -o /dev/null -C null'
Benchmark #1: ./mkdwarfs-clang-O2 --no-progress --log-level warn -i tree -o /dev/null -C null
$ hyperfine -L prog $(echo mkdwarfs-* | tr ' ' ,) '{prog} --no-progress --log-level warn -i tree -o /dev/null -C null'
Benchmark #1: mkdwarfs-clang-O2 --no-progress --log-level warn -i tree -o /dev/null -C null
Time (mean ± σ): 4.358 s ± 0.033 s [User: 6.364 s, System: 0.622 s]
Range (min … max): 4.321 s … 4.408 s 10 runs
Benchmark #2: ./mkdwarfs-clang-O3 --no-progress --log-level warn -i tree -o /dev/null -C null
Benchmark #2: mkdwarfs-clang-O3 --no-progress --log-level warn -i tree -o /dev/null -C null
Time (mean ± σ): 4.282 s ± 0.035 s [User: 6.249 s, System: 0.623 s]
Range (min … max): 4.244 s … 4.349 s 10 runs
Benchmark #3: ./mkdwarfs-gcc-O2 --no-progress --log-level warn -i tree -o /dev/null -C null
Benchmark #3: mkdwarfs-gcc-O2 --no-progress --log-level warn -i tree -o /dev/null -C null
Time (mean ± σ): 6.212 s ± 0.031 s [User: 8.185 s, System: 0.638 s]
Range (min … max): 6.159 s … 6.250 s 10 runs
Benchmark #4: ./mkdwarfs-gcc-O3 --no-progress --log-level warn -i tree -o /dev/null -C null
Benchmark #4: mkdwarfs-gcc-O3 --no-progress --log-level warn -i tree -o /dev/null -C null
Time (mean ± σ): 5.740 s ± 0.037 s [User: 7.742 s, System: 0.645 s]
Range (min … max): 5.685 s … 5.796 s 10 runs
Summary
'./mkdwarfs-clang-O3 --no-progress --log-level warn -i tree -o /dev/null -C null' ran
1.02 ± 0.01 times faster than './mkdwarfs-clang-O2 --no-progress --log-level warn -i tree -o /dev/null -C null'
1.34 ± 0.01 times faster than './mkdwarfs-gcc-O3 --no-progress --log-level warn -i tree -o /dev/null -C null'
1.45 ± 0.01 times faster than './mkdwarfs-gcc-O2 --no-progress --log-level warn -i tree -o /dev/null -C null'
'mkdwarfs-clang-O3 --no-progress --log-level warn -i tree -o /dev/null -C null' ran
1.02 ± 0.01 times faster than 'mkdwarfs-clang-O2 --no-progress --log-level warn -i tree -o /dev/null -C null'
1.34 ± 0.01 times faster than 'mkdwarfs-gcc-O3 --no-progress --log-level warn -i tree -o /dev/null -C null'
1.45 ± 0.01 times faster than 'mkdwarfs-gcc-O2 --no-progress --log-level warn -i tree -o /dev/null -C null'
These measurements were made with gcc-9.3.0 and clang-10.0.1.
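In case the `hyperfine` invocation looks opaque: `-L prog <list>` defines a comma-separated parameter list, and hyperfine substitutes each value for `{prog}` in the command template, so a single invocation benchmarks all four binaries. The embedded `echo ... | tr` simply builds that list from a shell glob; given the four binaries above, it would expand to:

$ echo mkdwarfs-* | tr ' ' ,
mkdwarfs-clang-O2,mkdwarfs-clang-O3,mkdwarfs-gcc-O2,mkdwarfs-gcc-O3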
@@ -332,54 +332,59 @@ SquashFS that is the default setting for DwarFS:
For DwarFS, I'm sticking to the defaults:
$ time mkdwarfs -i install -o perl-install.dwarfs
16:17:32.906738 scanning install
16:17:46.908065 waiting for background scanners...
16:18:17.922033 assigning directory and link inodes...
16:18:18.259412 finding duplicate files...
16:18:33.110617 saved 28.2 GiB / 47.65 GiB in 1782826/1927501 duplicate files
16:18:33.110713 waiting for inode scanners...
16:18:37.406764 assigning device inodes...
16:18:37.463228 assigning pipe/socket inodes...
16:18:37.518980 building metadata...
16:18:37.519079 building blocks...
16:18:37.519095 saving names and links...
16:18:37.519551 ordering 144675 inodes by similarity...
16:18:38.010929 updating name and link indices...
16:18:38.121606 144675 inodes ordered [602ms]
16:18:38.121690 assigning file inodes...
16:31:51.415939 waiting for block compression to finish...
16:31:51.416127 saving chunks...
16:31:51.444823 saving directories...
16:31:53.812482 waiting for compression to finish...
16:32:38.117797 compressed 47.65 GiB to 544.9 MiB (ratio=0.0111677)
16:32:38.786630 filesystem created without errors [905.9s]
19:35:38.104864 scanning install
19:35:50.481783 waiting for background scanners...
19:37:42.425217 assigning directory and link inodes...
19:37:42.832262 finding duplicate files...
19:37:59.513989 saved 28.2 GiB / 47.65 GiB in 1782826/1927501 duplicate files
19:37:59.514061 waiting for inode scanners...
19:38:33.911012 assigning device inodes...
19:38:33.967734 assigning pipe/socket inodes...
19:38:34.023778 building metadata...
19:38:34.023900 building blocks...
19:38:34.023935 saving names and links...
19:38:34.025091 ordering 144675 inodes using nilsimsa similarity...
19:38:34.033828 nilsimsa: depth=25000, limit=255
19:38:34.505456 updating name and link indices...
19:38:34.984478 pre-sorted index (1016186 name, 639519 path lookups) [950.6ms]
19:47:11.530656 144675 inodes ordered [517.5s]
19:47:11.530750 waiting for segmenting/blockifying to finish...
19:51:40.399851 saving chunks...
19:51:40.438092 saving directories...
19:51:45.202445 waiting for compression to finish...
19:52:47.077210 compressed 47.65 GiB to 486.8 MiB (ratio=0.00997673)
19:52:47.763533 filesystem created without errors [1030s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 330733/330733 dirs, 0/0 links, 1927501/1927501(2440) files
original size: 47.65 GiB, dedupe: 28.2 GiB (1782826 files), segment: 12.42 GiB
filesystem: 7.027 GiB in 450 blocks (754024 chunks, 144675/144675 inodes)
compressed filesystem: 450 blocks/544.9 MiB written
███████████████████████████████████████████████████████████████████████▏100% |
330733 dirs, 0/2440 soft/hard links, 1927501/1927501 files, 0 other
original size: 47.65 GiB, dedupe: 28.2 GiB (1782826 files), segment: 12.61 GiB
filesystem: 6.842 GiB in 438 blocks (757215 chunks, 144675/144675 inodes)
compressed filesystem: 438 blocks/486.8 MiB written
████████████████████████████████████████████████████████████████████████▏100% -
real 15m5.982s
user 111m45.629s
sys 2m51.002s
real 17m9.873s
user 135m47.100s
sys 3m44.958s
So in this comparison, `mkdwarfs` is more than 4 times faster than `mksquashfs`.
In total CPU time, it's actually 7 times less CPU resources.
In terms of total CPU time, it actually uses 6 times fewer CPU resources.
$ ls -l perl-install.*fs
-rw-r--r-- 1 mhx users 571363322 Dec 8 16:32 perl-install.dwarfs
-rw-r--r-- 1 mhx users 510428994 Dec 9 19:52 perl-install.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Nov 25 00:37 perl-install.squashfs
In terms of compression ratio, the **DwarFS file system is more than 8 times
In terms of compression ratio, the **DwarFS file system is more than 9 times
smaller than the SquashFS file system**. With DwarFS, the content has been
**compressed down to 1.1% (!) of its original size**.
**compressed down to less than 1% (!) of its original size**. This compression
ratio only considers the data stored in the individual files, not the actual
disk space used. On the original EXT4 file system, according to `du`, the
source folder uses 54 GiB, so **the DwarFS image actually only uses 0.88% of
the original space**.
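To double-check these ratios, here's a quick `bc` sketch using the byte size of the new image from the `ls -l` listing below (510428994 bytes), first against the 47.65 GiB of file data, then against the 54 GiB of disk usage reported by `du`:

$ echo 'scale=4; 510428994 / (47.65 * 1024^3)' | bc
.0099
$ echo 'scale=4; 510428994 / (54 * 1024^3)' | bc
.0088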
When using identical block sizes for both file systems, the difference,
quite expectedly, becomes a lot less dramatic:
$ time sudo mksquashfs install perl-install-1M.squashfs -comp zstd -Xcompression-level 22 -b 1M
$ time mksquashfs install perl-install-1M.squashfs -comp zstd -Xcompression-level 22 -b 1M
real 41m55.004s
user 340m30.012s
@@ -387,17 +392,17 @@ quite expectedly, becomes a lot less dramatic:
$ time mkdwarfs -i install -o perl-install-1M.dwarfs -S 20
real 26m26.987s
user 245m11.438s
sys 2m29.048s
real 26m39.166s
user 266m15.018s
sys 2m15.315s
$ ll -h perl-install-1M.*
-rw-r--r-- 1 mhx users 2.8G Nov 30 10:34 perl-install-1M.dwarfs
-rw-r--r-- 1 root root 4.0G Nov 30 10:05 perl-install-1M.squashfs
$ ls -l perl-install-1M.*
-rw-r--r-- 1 mhx users 2962199351 Dec 9 21:13 perl-install-1M.dwarfs
-rw-r--r-- 1 mhx users 4198944768 Nov 30 10:05 perl-install-1M.squashfs
But the point is that this is really where SquashFS tops out, as it doesn't
support larger block sizes. And as you'll see below, the larger blocks don't
necessarily negatively impact performance.
support larger block sizes. And as you'll see below, the larger blocks that
DwarFS is using don't necessarily negatively impact performance.
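Note that the two tools express block size differently: `mksquashfs` takes it in bytes (`-b 1M`), while `mkdwarfs -S` takes the number of bits, so `-S 20` selects 2^20-byte blocks, i.e. the same 1 MiB:

$ echo $((1 << 20))
1048576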
DwarFS also features an option to recompress an existing file system with
a different compression algorithm. This can be useful as it allows relatively
@@ -407,44 +412,28 @@ system with the best possible compression (`-l 9`):
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma.dwarfs -l 9
16:47:52.221803 filesystem rewritten [657.8s]
20:35:11.312517 filesystem rewritten [610s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
filesystem: 7.027 GiB in 450 blocks (0 chunks, 0 inodes)
compressed filesystem: 450/450 blocks/458 MiB written
█████████████████████████████████████████████████████████████████████▏100% /
filesystem: 6.842 GiB in 438 blocks (0 chunks, 0 inodes)
compressed filesystem: 438/438 blocks/407.5 MiB written
█████████████████████████████████████████████████████████████████████▏100% -
real 10m57.942s
user 120m58.836s
sys 1m41.885s
real 10m10.098s
user 117m43.919s
sys 1m40.449s
$ ls -l perl-*.dwarfs
-rw-r--r-- 1 mhx users 571363322 Dec 8 16:32 perl-install.dwarfs
-rw-r--r-- 1 mhx users 480277450 Dec 8 16:47 perl-lzma.dwarfs
-rw-r--r-- 1 mhx users 510428994 Dec 9 19:52 perl-install.dwarfs
-rw-r--r-- 1 mhx users 427324302 Dec 9 20:35 perl-lzma.dwarfs
This reduces the file system size by another 16%, pushing the total
compression ratio below 1%.
You *may* be able to push things even further: there's the `nilsimsa`
ordering option which enables a somewhat experimental LSH ordering
scheme that's significantly slower than the default `similarity`
scheme, but can deliver even better clustering of similar data. It
also has the advantage that the ordering can be run while already
compressing data, which counters the slowness of the algorithm. On
the same Perl dataset, I was able to get these file system sizes
without a significant change in file system build time:
$ ll perl-install-nilsimsa*.dwarfs
-rw-r--r-- 1 mhx users 534735009 Dec 8 17:13 perl-nilsimsa.dwarfs
-rw-r--r-- 1 mhx users 449068734 Dec 8 17:25 perl-nilsimsa-lzma.dwarfs
That's another 6-7% reduction in file system size for both the default
ZSTD as well as the LZMA compression.
compression ratio to 0.84% (or 0.74% when considering disk usage).
To get an idea of how fast the file system is in actual use, a quick
test I've done is to freshly mount the file system created above and
run each of the 1139 `perl` executables to print their version.
$ hyperfine -c "umount mnt" -p "umount mnt; ./dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 4.092 s ± 0.031 s [User: 2.183 s, System: 4.355 s]
Range (min … max): 4.022 s … 4.122 s 10 runs
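To unpack what this benchmark actually does, here's a rough stand-alone equivalent of a single run. The `-P procs 5 20 -D 5` part makes hyperfine repeat the measurement with 5, 10, 15 and 20 parallel processes; the `-P16` below is just a stand-in for one of those values:

$ dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4
$ ls -1 mnt/*/*/bin/perl5* | wc -l
1139
$ ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P16 sh -c '$0 -v >/dev/null'
$ umount mnt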
@@ -488,7 +477,7 @@ and the original EXT4 (at `install`). DwarFS is around 15% slower here:
Using the lzma-compressed file system, the metrics for *initial* runs look
considerably worse:
$ hyperfine -c "umount mnt" -p "umount mnt; ./dwarfs perl-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 20.372 s ± 0.135 s [User: 2.338 s, System: 4.511 s]
Range (min … max): 20.208 s … 20.601 s 10 runs
@@ -514,7 +503,7 @@ with 4 cores, I did more tests with both SquashFS and DwarFS
(just because on the 6 core box my kernel didn't have support
for zstd in SquashFS):
hyperfine -c 'sudo umount /tmp/perl/install' -p 'umount /tmp/perl/install; ./dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount /tmp/perl/install; sudo mount -t squashfs perl-install.squashfs /tmp/perl/install; sleep 1' -n squashfs-zstd "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
hyperfine -c 'sudo umount /tmp/perl/install' -p 'umount /tmp/perl/install; dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount /tmp/perl/install; sudo mount -t squashfs perl-install.squashfs /tmp/perl/install; sleep 1' -n squashfs-zstd "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs-zstd
Time (mean ± σ): 2.071 s ± 0.372 s [User: 1.727 s, System: 2.866 s]
Range (min … max): 1.711 s … 2.532 s 10 runs
@@ -532,7 +521,7 @@ SquashFS also uses significantly more CPU power. However, the numbers
shown above for DwarFS obviously don't include the time spent in the
`dwarfs` process, so I repeated the test outside of hyperfine:
$ time ./dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4 -f
$ time dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4 -f
real 0m8.463s
user 0m3.821s
@@ -605,48 +594,50 @@ the difference is really marginal.
### With SquashFS & xz
This test uses slightly less pathological input data: the root filesystem of
a recent Raspberry Pi OS release.
a recent Raspberry Pi OS release. This file system also contains device inodes,
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
$ time mkdwarfs -i raspbian -o raspbian.dwarfs
17:42:39.027848 scanning raspbian
17:42:39.303335 waiting for background scanners...
17:42:39.898659 assigning directory and link inodes...
17:42:39.912519 finding duplicate files...
17:42:40.014950 saved 31.05 MiB / 1007 MiB in 1617/34582 duplicate files
17:42:40.015532 waiting for inode scanners...
17:42:40.793437 assigning device inodes...
17:42:40.794597 assigning pipe/socket inodes...
17:42:40.795254 building metadata...
17:42:40.795307 building blocks...
17:42:40.795315 saving names and links...
17:42:40.795396 ordering 32965 inodes by similarity...
17:42:40.820329 32965 inodes ordered [24.85ms]
17:42:40.820450 assigning file inodes...
17:42:40.837679 updating name and link indices...
17:43:58.270277 waiting for block compression to finish...
17:43:58.271058 saving chunks...
17:43:58.276149 saving directories...
17:43:58.414952 waiting for compression to finish...
17:44:16.324006 compressed 1007 MiB to 297 MiB (ratio=0.294999)
17:44:16.360627 filesystem created without errors [97.33s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
20:38:27.027066 scanning raspbian
20:38:27.303344 waiting for background scanners...
20:38:27.897725 assigning directory and link inodes...
20:38:27.912619 finding duplicate files...
20:38:27.993716 saved 31.05 MiB / 1007 MiB in 1617/34582 duplicate files
20:38:27.993807 waiting for inode scanners...
20:38:30.347655 assigning device inodes...
20:38:30.348996 assigning pipe/socket inodes...
20:38:30.349817 building metadata...
20:38:30.349871 building blocks...
20:38:30.349928 saving names and links...
20:38:30.350001 ordering 32965 inodes using nilsimsa similarity...
20:38:30.351653 nilsimsa: depth=25000, limit=255
20:38:30.384726 updating name and link indices...
20:38:30.405993 pre-sorted index (55243 name, 26489 path lookups) [54.25ms]
20:39:45.269645 32965 inodes ordered [74.92s]
20:39:45.270350 waiting for segmenting/blockifying to finish...
20:39:58.430655 saving chunks...
20:39:58.434939 saving directories...
20:39:58.692174 waiting for compression to finish...
20:40:12.221693 compressed 1007 MiB to 287.3 MiB (ratio=0.285326)
20:40:12.253930 filesystem created without errors [105.2s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 4435/4435 dirs, 5908/5908 links, 34582/34582(473) files
original size: 1007 MiB, dedupe: 31.05 MiB (1617 files), segment: 52.66 MiB
filesystem: 923 MiB in 58 blocks (46074 chunks, 32965/32965 inodes)
compressed filesystem: 58 blocks/297 MiB written
███████████████████████████████████████████████████████████████████████▏100% /
4435 dirs, 5908/473 soft/hard links, 34582/34582 files, 7 other
original size: 1007 MiB, dedupe: 31.05 MiB (1617 files), segment: 52.83 MiB
filesystem: 922.8 MiB in 58 blocks (46518 chunks, 32965/32965 inodes)
compressed filesystem: 58 blocks/287.3 MiB written
████████████████████████████████████████████████████████████████████▏100% /
real 1m37.384s
user 14m57.678s
sys 0m16.968s
real 1m45.393s
user 18m33.459s
sys 0m16.648s
Again, SquashFS uses the same compression options:
$ time mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
$ time sudo time mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 12 processors
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
[===============================================================/] 38644/38644 100%
[====================================================================-] 38644/38644 100%
Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
compressed data, compressed metadata, compressed fragments,
@@ -690,40 +681,42 @@ Again, SquashFS uses the same compression options:
nobody (65534)
adm (4)
mem (8)
1112.37user 2.61system 1:54.97elapsed 969%CPU (0avgtext+0avgdata 1736540maxresident)k
0inputs+743896outputs (0major+460065minor)pagefaults 0swaps
real 1m54.673s
user 18m32.152s
sys 0m2.501s
real 1m54.997s
user 18m32.386s
sys 0m2.627s
The difference in speed is almost negligible. SquashFS is just a bit
slower here. In terms of compression, the difference also isn't huge:
$ ll raspbian.* *.xz -h
-rw-r--r-- 1 root root 297M Dec 8 17:44 raspbian.dwarfs
-rw-r--r-- 1 mhx users 364M Nov 29 23:31 raspbian.squashfs
-rw-r--r-- 1 root root 288M Dec 9 20:40 raspbian.dwarfs
-rw-r--r-- 1 root root 364M Dec 9 22:31 raspbian.squashfs
-rw-r--r-- 1 mhx users 297M Aug 20 12:47 2020-08-20-raspios-buster-armhf-lite.img.xz
Interestingly, `xz` actually can't compress the whole original image
much better.
much better than DwarFS.
We can again try to increase the DwarFS compression level:
We can once again try to increase the DwarFS compression level:
$ time mkdwarfs -i raspbian.dwarfs -o raspbian-9.dwarfs -l 9 --recompress
17:58:56.711149 filesystem rewritten [86.46s]
22:36:29.271336 filesystem rewritten [89.19s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
filesystem: 923 MiB in 58 blocks (0 chunks, 0 inodes)
compressed filesystem: 58/58 blocks/266.5 MiB written
██████████████████████████████████████████████████████████████████▏100% -
filesystem: 922.8 MiB in 58 blocks (0 chunks, 0 inodes)
compressed filesystem: 58/58 blocks/257.8 MiB written
██████████████████████████████████████████████████████████████████▏100% |
real 1m26.496s
user 15m50.757s
sys 0m14.183s
real 1m29.259s
user 15m50.695s
sys 0m14.454s
Now that actually gets the DwarFS image size well below that of the
`xz` archive:
$ ll -h raspbian-9.dwarfs *.xz
-rw-r--r-- 1 root root 267M Nov 29 23:54 raspbian-9.dwarfs
-rw-r--r-- 1 mhx users 258M Dec 9 22:40 raspbian-9.dwarfs
-rw-r--r-- 1 mhx users 297M Aug 20 12:47 2020-08-20-raspios-buster-armhf-lite.img.xz
However, if you actually build a tarball and compress that (instead of
@@ -768,15 +761,20 @@ I first tried `wimcapture` on the perl dataset:
sys 1m2.743s
$ ll perl-install.*
-rw-r--r-- 1 mhx users 582654491 Nov 29 23:52 perl-install.dwarfs
-rw-r--r-- 1 mhx users 510428994 Dec 9 19:52 perl-install.dwarfs
-rw-r--r-- 1 mhx users 1016971956 Dec 6 00:12 perl-install.wim
-rw-r--r-- 1 mhx users 4748902400 Nov 25 00:37 perl-install.squashfs
So wimlib is definitely much better than squashfs, in terms of both
compression ratio and speed. DwarFS is still about 30% faster to create
the file system and the DwarFS file system is more than 40% smaller.
compression ratio and speed. DwarFS is still about 20% faster to create
the file system and the DwarFS file system is almost 50% smaller.
When switching to LZMA and metadata compression, the DwarFS file system
is more than 50% smaller (wimlib uses LZMS compression by default).
is almost 60% smaller (wimlib uses LZMS compression by default). And
when additionally using the same block size as wimlib (64 MiB) for DwarFS,
the file system image shrinks to a third of the size of the *wim* file:
-rw-r--r-- 1 mhx users 335611507 Dec 9 23:11 perl-install-64M.dwarfs
-rw-r--r-- 1 mhx users 1016971956 Dec 6 00:12 perl-install.wim
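The exact command for that 64 MiB image isn't shown here; assuming the same `-S` convention as above (block size in bits, and 2^26 bytes = 64 MiB) plus the LZMA setting used earlier, it would presumably look something like:

$ mkdwarfs -i install -o perl-install-64M.dwarfs -l 9 -S 26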
What's a bit surprising is that mounting a *wim* file takes quite a bit
of time:
@@ -790,7 +788,7 @@ of time:
Mounting the DwarFS image takes almost no time in comparison:
$ time ./dwarfs perl-install.dwarfs mnt
$ time dwarfs perl-install.dwarfs mnt
00:36:42.626580 dwarfs (0.2.3)
real 0m0.010s
@@ -802,7 +800,7 @@ initializes the file system in the background. However, even when
running it in the foreground, initializing the file system takes only
a few milliseconds:
$ ./dwarfs perl-install.dwarfs mnt -f
$ dwarfs perl-install.dwarfs mnt -f
00:35:44.975437 dwarfs (0.2.3)
00:35:44.987450 file system initialized [5.064ms]
@@ -863,7 +861,7 @@ This *still* takes surprisingly long to mount:
However, it's really usable as a file system, even though it's about
4-5 times slower than the DwarFS image:
$ hyperfine -c 'umount /tmp/perl/install' -p 'umount /tmp/perl/install; ./dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount /tmp/perl/install; wimmount perl-install-nonsolid.wim /tmp/perl/install; sleep 1' -n wimlib "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
$ hyperfine -c 'umount /tmp/perl/install' -p 'umount /tmp/perl/install; dwarfs perl-install.dwarfs /tmp/perl/install -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount /tmp/perl/install; wimmount perl-install-nonsolid.wim /tmp/perl/install; sleep 1' -n wimlib "ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs
Time (mean ± σ): 2.295 s ± 0.362 s [User: 1.823 s, System: 3.173 s]
Range (min … max): 1.813 s … 2.606 s 10 runs
@@ -948,45 +946,46 @@ a lot less redundancy:
And repeating the same task with `mkdwarfs`:
$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
14:52:09.009618 scanning install-small
14:52:09.195087 waiting for background scanners...
14:52:09.612164 assigning directory and link inodes...
14:52:09.618281 finding duplicate files...
14:52:09.718756 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
14:52:09.718837 waiting for inode scanners...
14:52:09.926978 assigning device inodes...
14:52:09.927745 assigning pipe/socket inodes...
14:52:09.928211 building metadata...
14:52:09.928293 building blocks...
14:52:09.928302 saving names and links...
14:52:09.928382 ordering 3559 inodes by similarity...
14:52:09.930836 3559 inodes ordered [2.401ms]
14:52:09.930891 assigning file inodes...
14:52:09.933716 updating name and link indices...
14:52:27.051383 waiting for block compression to finish...
14:52:27.072944 saving chunks...
14:52:27.074108 saving directories...
14:52:27.154133 waiting for compression to finish...
14:52:40.508238 compressed 611.8 MiB to 25.76 MiB (ratio=0.0420963)
14:52:40.525452 filesystem created without errors [31.52s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
23:16:57.167629 scanning install-small
23:16:57.391819 waiting for background scanners...
23:16:57.796123 assigning directory and link inodes...
23:16:57.801648 finding duplicate files...
23:16:57.920016 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
23:16:57.920098 waiting for inode scanners...
23:16:58.715802 assigning device inodes...
23:16:58.716760 assigning pipe/socket inodes...
23:16:58.717360 building metadata...
23:16:58.717455 building blocks...
23:16:58.717495 saving names and links...
23:16:58.717663 ordering 3559 inodes using nilsimsa similarity...
23:16:58.718019 nilsimsa: depth=25000, limit=255
23:16:58.724538 updating name and link indices...
23:16:58.726197 pre-sorted index (3360 name, 2127 path lookups) [8.088ms]
23:16:58.870936 3559 inodes ordered [153.2ms]
23:16:58.871021 waiting for segmenting/blockifying to finish...
23:17:17.143687 saving chunks...
23:17:17.144423 saving directories...
23:17:17.198977 waiting for compression to finish...
23:17:29.321495 compressed 611.8 MiB to 24.18 MiB (ratio=0.039523)
23:17:29.337482 filesystem created without errors [32.17s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 3334/3334 dirs, 0/0 links, 26401/26401 files
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 142.8 MiB
filesystem: 201.2 MiB in 13 blocks (9847 chunks, 3559/3559 inodes)
compressed filesystem: 13 blocks/25.76 MiB written
██████████████████████████████████████████████████████████████████████▏100% |
3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 other
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 130 MiB
filesystem: 214.1 MiB in 14 blocks (10229 chunks, 3559/3559 inodes)
compressed filesystem: 14 blocks/24.18 MiB written
████████████████████████████████████████████████████████████████████▏100% \
real 0m31.553s
user 3m21.854s
sys 0m3.726s
real 0m32.230s
user 3m37.366s
sys 0m3.950s
So `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 80 times
less CPU resources. At the same time, the DwarFS file system is 25% smaller:
fewer CPU resources. At the same time, the DwarFS file system is 30% smaller:
$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 27006735 Dec 8 14:52 perl-install-small.dwarfs
-rw-r--r-- 1 mhx users 25355862 Dec 9 23:17 perl-install-small.dwarfs
I noticed that the `blockifying` step that took ages for the full dataset
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
@@ -997,46 +996,47 @@ In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
LZMA compression (which is what `mkcromfs` uses by default):
$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
15:05:59.344501 scanning install-small
15:05:59.529269 waiting for background scanners...
15:05:59.933753 assigning directory and link inodes...
15:05:59.938668 finding duplicate files...
15:06:00.026974 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
15:06:00.027054 waiting for inode scanners...
15:06:00.240184 assigning device inodes...
15:06:00.241129 assigning pipe/socket inodes...
15:06:00.241723 building metadata...
15:06:00.241803 building blocks...
15:06:00.241840 saving names and links...
15:06:00.241992 ordering 3559 inodes by similarity...
15:06:00.246133 3559 inodes ordered [4.057ms]
15:06:00.246219 assigning file inodes...
15:06:00.248957 updating name and link indices...
15:06:19.132473 waiting for block compression to finish...
15:06:19.133229 saving chunks...
15:06:19.134430 saving directories...
15:06:19.192477 waiting for compression to finish...
15:06:33.125893 compressed 611.8 MiB to 21.06 MiB (ratio=0.0344202)
15:06:33.136930 filesystem created without errors [33.79s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
23:20:50.363882 scanning install-small
23:20:50.584318 waiting for background scanners...
23:20:50.970406 assigning directory and link inodes...
23:20:50.976176 finding duplicate files...
23:20:51.091204 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
23:20:51.091322 waiting for inode scanners...
23:20:51.877998 assigning device inodes...
23:20:51.878976 assigning pipe/socket inodes...
23:20:51.879486 building metadata...
23:20:51.879548 building blocks...
23:20:51.879587 saving names and links...
23:20:51.879786 ordering 3559 inodes using nilsimsa similarity...
23:20:51.880238 nilsimsa: depth=25000, limit=255
23:20:51.887597 updating name and link indices...
23:20:51.889207 pre-sorted index (3360 name, 2127 path lookups) [8.836ms]
23:20:52.007817 3559 inodes ordered [127.9ms]
23:20:52.007903 waiting for segmenting/blockifying to finish...
23:21:12.038054 saving chunks...
23:21:12.039631 saving directories...
23:21:12.134903 waiting for compression to finish...
23:21:26.166560 compressed 611.8 MiB to 19.62 MiB (ratio=0.0320689)
23:21:26.181937 filesystem created without errors [35.82s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 3334/3334 dirs, 0/0 links, 26401/26401 files
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 142.8 MiB
filesystem: 201.2 MiB in 13 blocks (9847 chunks, 3559/3559 inodes)
compressed filesystem: 13 blocks/21.06 MiB written
██████████████████████████████████████████████████████████████████████▏100% \
3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 other
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 130 MiB
filesystem: 214.1 MiB in 14 blocks (10229 chunks, 3559/3559 inodes)
compressed filesystem: 14 blocks/19.62 MiB written
████████████████████████████████████████████████████████████████████▏100% /
real 0m33.834s
user 3m56.922s
sys 0m4.328s
real 0m35.876s
user 4m15.218s
sys 0m4.881s
$ ls -l perl-install-small*.*fs
-rw-r--r-- 1 mhx users 22082143 Dec 8 15:06 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 20573703 Dec 9 23:21 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 26928161 Dec 8 15:05 perl-install-small.dwarfs
-rw-r--r-- 1 mhx users 25355862 Dec 9 23:17 perl-install-small.dwarfs
It only takes 2 seconds longer to build the DwarFS file system with LZMA
compression, but reduces the size even further to make it almost 40%
It only takes 3 seconds longer to build the DwarFS file system with LZMA
compression, but reduces the size even further to make it more than 40%
smaller than the Cromfs file system.
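A quick check of that claim against the `ls -l` sizes above (35328512 bytes for Cromfs, 20573703 bytes for the LZMA-compressed DwarFS image):

$ echo 'scale=3; (35328512 - 20573703) / 35328512' | bc
.417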
I would have added some benchmarks with the Cromfs FUSE driver, but sadly
@@ -1097,39 +1097,40 @@ Admittedly, this isn't a fair comparison. EROFS has a fixed block size
of 4 KiB, and it uses LZ4 compression. If we tweak DwarFS to the same
parameters, we get:
$ time mkdwarfs -i install-small/ -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
15:06:48.432260 scanning install-small/
15:06:48.646910 waiting for background scanners...
15:06:49.041670 assigning directory and link inodes...
15:06:49.047244 finding duplicate files...
15:06:49.155198 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
15:06:49.155279 waiting for inode scanners...
15:06:49.363318 assigning device inodes...
15:06:49.364154 assigning pipe/socket inodes...
15:06:49.364580 building metadata...
15:06:49.364649 building blocks...
15:06:49.364679 saving names and links...
15:06:49.364773 ordering 3559 inodes by similarity...
15:06:49.367529 3559 inodes ordered [2.678ms]
15:06:49.367601 assigning file inodes...
15:06:49.370936 updating name and link indices...
15:07:00.850769 waiting for block compression to finish...
15:07:00.850953 saving chunks...
15:07:00.852170 saving directories...
15:07:00.906353 waiting for compression to finish...
15:07:00.907786 compressed 611.8 MiB to 140.4 MiB (ratio=0.229396)
15:07:00.917665 filesystem created without errors [12.49s]
$ time mkdwarfs -i install-small -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
23:24:16.028616 scanning install-small
23:24:16.250673 waiting for background scanners...
23:24:16.644500 assigning directory and link inodes...
23:24:16.650032 finding duplicate files...
23:24:16.771272 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
23:24:16.771353 waiting for inode scanners...
23:24:17.542446 assigning device inodes...
23:24:17.543254 assigning pipe/socket inodes...
23:24:17.543557 building metadata...
23:24:17.543600 building blocks...
23:24:17.543618 saving names and links...
23:24:17.543737 ordering 3559 inodes using nilsimsa similarity...
23:24:17.544001 nilsimsa: depth=25000, limit=255
23:24:17.548094 pre-sorted index (3360 name, 2127 path lookups) [4.041ms]
23:24:17.553405 updating name and link indices...
23:24:17.668847 3559 inodes ordered [125ms]
23:24:17.668936 waiting for segmenting/blockifying to finish...
23:24:27.310633 saving chunks...
23:24:27.314192 saving directories...
23:24:27.367761 waiting for compression to finish...
23:24:27.368812 compressed 611.8 MiB to 140.3 MiB (ratio=0.229247)
23:24:27.382266 filesystem created without errors [11.35s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 3334/3334 dirs, 0/0 links, 26401/26401(0) files
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 1.466 MiB
filesystem: 342.6 MiB in 87697 blocks (91884 chunks, 3559/3559 inodes)
compressed filesystem: 87697 blocks/140.4 MiB written
██████████████████████████████████████████████████████████████████████▏100% -
3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 other
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 1.541 MiB
filesystem: 342.5 MiB in 87678 blocks (91908 chunks, 3559/3559 inodes)
compressed filesystem: 87678 blocks/140.3 MiB written
██████████████████████████████████████████████████████████████████████▏100% \
real 0m12.690s
user 0m33.772s
sys 0m4.031s
real 0m11.383s
user 0m40.879s
sys 0m2.497s
It finishes in less than half the time and produces an output image
that's half the size of the EROFS image.
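For reference, the EROFS image used in this comparison isn't built in this excerpt; with erofs-utils, the matching invocation would presumably be something along these lines (an assumption, using the same LZ4HC level 9 as the `mkdwarfs` run above):

$ mkfs.erofs -zlz4hc,9 perl-install-small.erofs install-small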