mirror of https://github.com/mhx/dwarfs.git
synced 2025-08-04 02:06:22 -04:00

Markdown cleanup

This commit is contained in:
parent 9ad4dd655f
commit 569966b752

README.md (+191 −191)
@ -6,22 +6,22 @@ A fast high compression read-only file system

## Table of contents

- [Overview](#overview)
- [History](#history)
- [Building and Installing](#building-and-installing)
    - [Dependencies](#dependencies)
    - [Building](#building)
    - [Installing](#installing)
    - [Experimental Python Scripting Support](#experimental-python-scripting-support)
- [Usage](#usage)
- [Comparison](#comparison)
    - [With SquashFS](#with-squashfs)
    - [With SquashFS & xz](#with-squashfs--xz)
    - [With lrzip](#with-lrzip)
    - [With zpaq](#with-zpaq)
    - [With wimlib](#with-wimlib)
    - [With Cromfs](#with-cromfs)
    - [With EROFS](#with-erofs)
## Overview

@ -45,20 +45,20 @@ less CPU resources.

Distinct features of DwarFS are:
- Clustering of files by similarity using a similarity hash function.
  This makes it easier to exploit the redundancy across file boundaries.

- Segmentation analysis across file system blocks in order to reduce
  the size of the uncompressed file system. This saves memory when
  using the compressed file system and thus potentially allows for
  higher cache hit rates as more data can be kept in the cache.

- Highly multi-threaded implementation. Both the file
  [system creation tool](doc/mkdwarfs.md) as well as the
  [FUSE driver](doc/dwarfs.md) are able to make good use of the
  many cores of your system.

- Optional experimental Python scripting support to provide custom
  filtering and ordering functionality.
## History

@ -129,6 +129,7 @@ will be automatically resolved if you build with tests.

A good starting point for apt-based systems is probably:

```
$ apt install \
    g++ \
    clang \
@ -161,6 +162,7 @@ A good starting point for apt-based systems is probably:
    libfmt-dev \
    libfuse3-dev \
    libgoogle-glog-dev
```

Note that when building with `gcc`, the optimization level will be
set to `-O2` instead of the CMake default of `-O3` for release
@ -168,30 +170,37 @@ builds. At least with versions up to `gcc-10`, the `-O3` build is
[up to 70% slower](https://github.com/mhx/dwarfs/issues/14) than a
build with `-O2`.
### Building

Firstly, either clone the repository...

```
$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs
```

...or unpack the release archive:

```
$ tar xvf dwarfs-x.y.z.tar.bz2
$ cd dwarfs-x.y.z
```

Once all dependencies have been installed, you can build DwarFS
using:

```
$ mkdir build
$ cd build
$ cmake .. -DWITH_TESTS=1
$ make -j$(nproc)
```

You can then run tests with:

```
$ make test
```

All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
as a memory allocator by default, as it typically uses much less
@ -203,7 +212,9 @@ To disable the use of `jemalloc`, pass `-DUSE_JEMALLOC=0` on the
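Assuming the same build directory as in the earlier example, the full configure step with `jemalloc` disabled might then look like this (a sketch; `-DUSE_JEMALLOC=0` is the only relevant addition):

```shell
$ cmake .. -DWITH_TESTS=1 -DUSE_JEMALLOC=0
$ make -j$(nproc)
```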

Installing is as easy as:

```
$ sudo make install
```

Though you don't have to install the tools to play with them.

@ -212,13 +223,17 @@ Though you don't have to install the tools to play with them.

You can build `mkdwarfs` with experimental support for Python
scripting:

```
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1
```

This also requires Boost.Python. If you have multiple Python
versions installed, you can explicitly specify the version to
build against:

```
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1 -DWITH_PYTHON_VERSION=3.8
```

Note that only Python 3 is supported. You can take a look at
[scripts/example.py](scripts/example.py) to get an idea for
@ -259,6 +274,7 @@ NVME drive, so most of its contents were likely cached.

I'm using the same compression type and compression level for
SquashFS that is the default setting for DwarFS:

```
$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
@ -292,9 +308,11 @@ SquashFS that is the default setting for DwarFS:

real 32m54.713s
user 501m46.382s
sys 0m58.528s
```

For DwarFS, I'm sticking to the defaults:

```
$ time mkdwarfs -i install -o perl-install.dwarfs
I 11:33:33.310931 scanning install
I 11:33:39.026712 waiting for background scanners...
@ -333,13 +351,16 @@ For DwarFS, I'm sticking to the defaults:

real 5m23.030s
user 78m7.554s
sys 1m47.968s
```

So in this comparison, `mkdwarfs` is **more than 6 times faster** than `mksquashfs`,
both in terms of CPU time and wall clock time.
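Those ratios can be double-checked against the quoted `time` outputs (using `user` time as a proxy for total CPU time):

```python
# Timings quoted above, converted to seconds.
mksquashfs_wall = 32 * 60 + 54.713   # real 32m54.713s
mkdwarfs_wall = 5 * 60 + 23.030      # real 5m23.030s
mksquashfs_cpu = 501 * 60 + 46.382   # user 501m46.382s
mkdwarfs_cpu = 78 * 60 + 7.554       # user 78m7.554s

print(f"wall: {mksquashfs_wall / mkdwarfs_wall:.1f}x, "
      f"cpu: {mksquashfs_cpu / mkdwarfs_cpu:.1f}x")

# Both ratios exceed 6, consistent with the claim above.
assert mksquashfs_wall / mkdwarfs_wall > 6
assert mksquashfs_cpu / mkdwarfs_cpu > 6
```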

```
$ ll perl-install.*fs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
```

In terms of compression ratio, the **DwarFS file system is more than 10 times
smaller than the SquashFS file system**. With DwarFS, the content has been
@ -351,21 +372,27 @@ the original space**.

Here's another comparison using `lzma` compression instead of `zstd`:

```
$ time mksquashfs install perl-install-lzma.squashfs -comp lzma

real 13m42.825s
user 205m40.851s
sys 3m29.088s
```

```
$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9

real 3m43.937s
user 49m45.295s
sys 1m44.550s
```

```
$ ll perl-install-lzma.*fs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
-rw-r--r-- 1 mhx users 3838406656 Mar 3 20:50 perl-install-lzma.squashfs
```

It's immediately obvious that the runs are significantly faster and the
resulting images are significantly smaller. Still, `mkdwarfs` is about
@ -383,21 +410,27 @@ uses a block size of 128KiB, whereas `mkdwarfs` uses 16MiB blocks by default,
or even 64MiB blocks with `-l9`. When using identical block sizes for both
file systems, the difference, quite expectedly, becomes a lot less dramatic:

```
$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M

real 15m43.319s
user 139m24.533s
sys 0m45.132s
```

```
$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3

real 4m25.973s
user 52m15.100s
sys 7m41.889s
```

```
$ ll perl-install*.*fs
-rw-r--r-- 1 mhx users 935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
-rw-r--r-- 1 mhx users 3407474688 Mar 3 21:54 perl-install-lzma-1M.squashfs
```

Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
DwarFS to reference file chunks from up to two previous filesystem blocks.
@ -413,6 +446,7 @@ fast experimentation with different algorithms and options without requiring
a full rebuild of the file system. For example, recompressing the above file
system with the best possible compression (`-l 9`):

```
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
I 20:28:03.246534 filesystem rewritten without errors [148.3s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
@ -423,11 +457,14 @@ system with the best possible compression (`-l 9`):

real 2m28.279s
user 37m8.825s
sys 0m43.256s
```

```
$ ll perl-*.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 390845518 Mar 4 20:28 perl-lzma-re.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
```

Note that while the recompressed filesystem is smaller than the original image,
it is still a lot bigger than the filesystem we previously built with `-l9`.
@ -438,6 +475,7 @@ In terms of how fast the file system is when using it, a quick test
I've done is to freshly mount the filesystem created above and run
each of the 1139 `perl` executables to print their version.

```
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 1.810 s ± 0.013 s [User: 1.847 s, System: 0.623 s]
@ -454,6 +492,7 @@ each of the 1139 `perl` executables to print their version.
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 1.149 s ± 0.015 s [User: 2.128 s, System: 0.781 s]
Range (min … max): 1.136 s … 1.186 s 10 runs
```

These timings are for *initial* runs on a freshly mounted file system,
running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that
@ -462,6 +501,7 @@ it takes only about 1 millisecond per Perl binary.

Following are timings for *subsequent* runs, both on DwarFS (at `mnt`)
and the original XFS (at `install`). DwarFS is around 15% slower here:

```
$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 347.0 ms ± 7.2 ms [User: 1.755 s, System: 0.452 s]
@ -484,10 +524,12 @@ and the original XFS (at `install`). DwarFS is around 15% slower here:
1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
```

Using the lzma-compressed file system, the metrics for *initial* runs look
considerably worse (about an order of magnitude):

```
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 10.660 s ± 0.057 s [User: 1.952 s, System: 0.729 s]
@ -504,6 +546,7 @@ considerably worse (about an order of magnitude):
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
Time (mean ± σ): 9.004 s ± 0.298 s [User: 2.134 s, System: 0.736 s]
Range (min … max): 8.611 s … 9.555 s 10 runs
```

So you might want to consider using `zstd` instead of `lzma` if you'd
like to optimize for file system performance. It's also the default
@ -511,6 +554,7 @@ compression used by `mkdwarfs`.

Now here's a comparison with the SquashFS filesystem:

```
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs-zstd
Time (mean ± σ): 1.151 s ± 0.015 s [User: 2.147 s, System: 0.769 s]
@ -523,17 +567,20 @@ Now here's a comparison with the SquashFS filesystem:
Summary
'dwarfs-zstd' ran
5.85 ± 0.08 times faster than 'squashfs-zstd'
```

So DwarFS is almost six times faster than SquashFS. But what's more,
SquashFS also uses significantly more CPU power. However, the numbers
shown above for DwarFS obviously don't include the time spent in the
`dwarfs` process, so I repeated the test outside of hyperfine:

```
$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f

real 0m4.569s
user 0m2.154s
sys 0m1.846s
```

So in total, DwarFS was using 5.7 seconds of CPU time, whereas
SquashFS was using 20.2 seconds, almost four times as much. Ignore
@ -546,13 +593,13 @@ used, [Tie::Hash::Indexed](https://github.com/mhx/Tie-Hash-Indexed),
has an XS component that requires a C compiler to build. So this really
accesses a lot of different stuff in the file system:

- The `perl` executables and their shared libraries

- The Perl modules used for writing the Makefile

- Perl's C header files used for building the module

- More Perl modules used for running the tests

I wrote a little script to be able to run multiple builds in parallel:

@ -574,28 +621,36 @@ The following command will run up to 16 builds in parallel on the 8 core
Xeon CPU, including debug, optimized and threaded versions of all Perl
releases between 5.10.0 and 5.33.3, a total of 624 `perl` installations:

```
$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh
```

Tests were done with a cleanly mounted file system to make sure the caches
were empty. `ccache` was primed to make sure all compiler runs could be
satisfied from the cache. With SquashFS, the timing was:

```
real 0m52.385s
user 8m10.333s
sys 4m10.056s
```

And with DwarFS:

```
real 0m50.469s
user 9m22.597s
sys 1m18.469s
```

So, frankly, not much of a difference, with DwarFS being just a bit faster.
The `dwarfs` process itself used:

```
real 0m56.686s
user 0m18.857s
sys 0m21.058s
```

So again, DwarFS used less raw CPU power overall, but in terms of wallclock
time, the difference is really marginal.
@ -606,6 +661,7 @@ This test uses slightly less pathological input data: the root filesystem of
a recent Raspberry Pi OS release. This file system also contains device inodes,
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:

```
$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
I 21:30:29.812562 scanning raspbian
I 21:30:29.908984 waiting for background scanners...
@ -640,9 +696,11 @@ so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:

real 0m46.711s
user 10m39.038s
sys 0m8.123s
```

Again, SquashFS uses the same compression options:

```
$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
@ -694,62 +752,77 @@ Again, SquashFS uses the same compression options:

real 0m50.124s
user 9m41.708s
sys 0m1.727s
```

The difference in speed is almost negligible. SquashFS is just a bit
slower here. In terms of compression, the difference also isn't huge:

```
$ ls -lh raspbian.* *.xz
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
-rw-r--r-- 1 root root 287M Mar 4 21:31 raspbian.dwarfs
-rw-r--r-- 1 root root 364M Mar 4 21:33 raspbian.squashfs
```

Interestingly, `xz` actually can't compress the whole original image
better than DwarFS.

We can even again try to increase the DwarFS compression level:

```
$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9

real 0m54.161s
user 8m40.109s
sys 0m7.101s
```

Now that actually gets the DwarFS image size well below that of the
`xz` archive:

```
$ ls -lh raspbian-9.dwarfs *.xz
-rw-r--r-- 1 root root 244M Mar 4 21:36 raspbian-9.dwarfs
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
```

Even if you actually build a tarball and compress that (instead of
compressing the EXT4 file system itself), `xz` isn't quite able to
match the DwarFS image size:

```
$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
100 % 246.9 MiB / 1,037.2 MiB = 0.238 13 MiB/s 1:18

real 1m18.226s
user 6m35.381s
sys 0m2.205s
```

```
$ ls -lh raspbian.tar.xz
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
```
DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
that allows extraction of a filesystem image without the FUSE driver.
So here's a comparison of the extraction speed:

```
$ time sudo tar xf raspbian.tar.xz -C out1

real 0m12.846s
user 0m12.313s
sys 0m1.616s
```

```
$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2

real 0m3.825s
user 0m13.234s
sys 0m1.382s
```

So `dwarfsextract` is almost 4 times faster thanks to using multiple
worker threads for decompression. It's writing about 300 MiB/s in this
@ -759,14 +832,18 @@ Another nice feature of `dwarfsextract` is that it allows you to directly
output data in an archive format, so you could create a tarball from
your image without extracting the files to disk:

```
$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz
```

This has the interesting side-effect that the resulting tarball will
likely be smaller than the one built straight from the directory:

```
$ ls -lh raspbian*.tar.xz
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
-rw-r--r-- 1 mhx users 240M Mar 4 23:52 raspbian2.tar.xz
```

That's because `dwarfsextract` writes files in inode-order, and by
default inodes are ordered by similarity for the best possible
@ -784,14 +861,17 @@ When I first read about `lrzip`, I was pretty certain it would easily
beat DwarFS. So let's take a look. `lrzip` operates on a single file,
so it's necessary to first build a tarball:

```
$ time tar cf perl-install.tar install

real 2m9.568s
user 0m3.757s
sys 0m26.623s
```

Now we can run `lrzip`:

```
$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 16
@ -814,12 +894,15 @@ Now we can run `lrzip`:

real 57m32.472s
user 81m44.104s
sys 4m50.221s
```

That definitely took a while. This is about an order of magnitude
slower than `mkdwarfs` and it barely makes use of the 8 cores.

```
$ ll -h perl-install.tar.lrzip
-rw-r--r-- 1 mhx users 500M Mar 6 21:16 perl-install.tar.lrzip
```

This is a surprisingly disappointing result. The archive is 65% larger
than a DwarFS image at `-l9` that takes less than 4 minutes to build.
@ -828,6 +911,7 @@ unpacking the archive first.

That being said, it *is* better than just using `xz` on the tarball:

```
$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
perl-install.tar (1/1)
100 % 4,317.0 MiB / 49.0 GiB = 0.086 24 MiB/s 34:55
@ -835,9 +919,12 @@ That being said, it *is* better than just using `xz` on the tarball:

real 34m55.450s
user 543m50.810s
sys 0m26.533s
```

```
$ ll perl-install.tar.xz -h
-rw-r--r-- 1 mhx users 4.3G Mar 6 22:59 perl-install.tar.xz
```
### With zpaq

@ -850,10 +937,13 @@ can be used.

Anyway, how does it fare in terms of speed and compression performance?

```
$ time zpaq a perl-install.zpaq install -m5
```

After a few million lines of output that (I think) cannot be turned off:

```
2258234 +added, 0 -removed.

0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
@ -862,30 +952,34 @@ After a few million lines of output that (I think) cannot be turned off:

real 47m8.104s
user 714m44.286s
sys 3m6.751s
```

So it's an order of magnitude slower than `mkdwarfs` and uses 14 times
as much CPU resources as `mkdwarfs -l9`. The resulting archive is pretty
close in size to the default configuration DwarFS image, but it's more
than 50% bigger than the image produced by `mkdwarfs -l9`.

```
$ ll perl-install*.*
-rw-r--r-- 1 mhx users 490227707 Mar 7 01:38 perl-install.zpaq
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
```

What's *really* surprising is how slow it is to extract the `zpaq`
archive again:

```
$ time zpaq x perl-install.zpaq
2798.097 seconds (all OK)

real 46m38.117s
user 711m18.734s
sys 3m47.876s
```

That's 700 times slower than extracting the DwarFS image.
### With wimlib

[wimlib](https://wimlib.net/) is a really interesting project that is
@ -896,6 +990,7 @@ quite a rich set of features, so it's definitely worth taking a look at.

I first tried `wimcapture` on the perl dataset:

```
$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
@ -905,12 +1000,15 @@ I first tried `wimcapture` on the perl dataset:

real 15m23.310s
user 174m29.274s
sys 0m42.921s
```

```
$ ll perl-install.*
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
-rw-r--r-- 1 mhx users 1016981520 Mar 6 21:12 perl-install.wim
```

So wimlib is definitely much better than squashfs, in terms of both
compression ratio and speed. DwarFS is however about 3 times faster to
@ -921,43 +1019,52 @@ When switching to LZMA compression, the DwarFS file system is more than

What's a bit surprising is that mounting a *wim* file takes quite a bit
of time:

```
$ time wimmount perl-install.wim mnt
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.

real 0m2.038s
user 0m1.764s
sys 0m0.242s
```

Mounting the DwarFS image takes almost no time in comparison:

```
$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)

real 0m0.003s
user 0m0.003s
sys 0m0.000s
```

That's just because it immediately forks into background by default and
initializes the file system in the background. However, even when
running it in the foreground, initializing the file system takes only
about 60 milliseconds:

```
$ dwarfs perl-install.dwarfs mnt -f
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
I 00:25:03.248061 file system initialized [60.95ms]
```

If you actually build the DwarFS file system with uncompressed metadata,
mounting is basically instantaneous:

```
$ dwarfs perl-install-meta.dwarfs mnt -f
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
I 00:27:52.671066 file system initialized [2.879ms]
```

I've tried running the benchmark where all 1139 `perl` executables
print their version with the wimlib image, but after about 10 minutes,
it still hadn't finished the first run (with the DwarFS image, one run
took slightly more than 2 seconds). I then tried the following instead:

```
$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
real 0m0.802s
real 0m0.652s
@ -972,6 +1079,7 @@ took slightly more than 2 seconds). I then tried the following instead:
real 0m1.809s
real 0m1.790s
real 0m2.115s
```

Judging from that, it would have probably taken about half an hour
for a single run, which makes at least the `--solid` wim image pretty
@ -982,6 +1090,7 @@ that DwarFS actually organizes data internally. However, judging by the
warning when mounting a solid image, it's probably not ideal when using
the image as a mounted file system. So I tried again without `--solid`:

```
$ time wimcapture --unix-data install perl-install-nonsolid.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
@ -991,25 +1100,31 @@ the image as a mounted file system. So I tried again without `--solid`:

real 8m39.034s
user 64m58.575s
sys 0m32.003s
```

This is still more than 3 minutes slower than `mkdwarfs`. However, it
yields an image that's almost 10 times the size of the DwarFS image
and comparable in size to the SquashFS image:

```
$ ll perl-install-nonsolid.wim -h
-rw-r--r-- 1 mhx users 4.6G Mar 6 23:24 perl-install-nonsolid.wim
```

This *still* takes surprisingly long to mount:

```
$ time wimmount perl-install-nonsolid.wim mnt

real 0m1.603s
user 0m1.327s
sys 0m0.275s
```

However, it's really usable as a file system, even though it's about
4-5 times slower than the DwarFS image:

```
$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs
Time (mean ± σ): 1.149 s ± 0.019 s [User: 2.147 s, System: 0.739 s]
@ -1022,7 +1137,7 @@ However, it's really usable as a file system, even though it's about
Summary
'dwarfs' ran
6.56 ± 0.12 times faster than 'wimlib'
```

### With Cromfs

@ -1035,6 +1150,7 @@ Here's a run on the Perl dataset, with the block size set to 16 MiB to
match the default of DwarFS, and with additional options suggested to
speed up compression:

```
$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
@ -1050,6 +1166,7 @@ speed up compression:

real 29m9.634s
user 201m37.816s
sys 2m15.005s
```

So it processed 21 MiB out of 48 GiB in half an hour, using almost
twice as much CPU resources as DwarFS for the *whole* file system.
@ -1062,6 +1179,7 @@ I then tried once more with a smaller version of the Perl dataset.
This only has 20 versions (instead of 1139) of Perl, and obviously
a lot less redundancy:

```
$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
@ -1092,9 +1210,11 @@ a lot less redundancy:

real 27m38.833s
user 277m36.208s
sys 11m36.945s
```

And repeating the same task with `mkdwarfs`:

```
$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
21:13:38.131724 scanning install-small
21:13:38.320139 waiting for background scanners...
@ -1129,13 +1249,16 @@ And repeating the same task with `mkdwarfs`:

real 0m33.007s
user 3m43.324s
sys 0m4.015s
```

So `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 75 times
less CPU resources. At the same time, the DwarFS file system is 30% smaller:

```
$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
```

I noticed that the `blockifying` step that took ages for the full dataset
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
@ -1145,6 +1268,7 @@ behaviour that's slowing down `mkcromfs`.

In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
LZMA compression (which is what `mkcromfs` uses by default):

```
$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
21:16:21.874975 scanning install-small
21:16:22.092201 waiting for background scanners...
@ -1179,11 +1303,14 @@ LZMA compression (which is what `mkcromfs` uses by default):

real 0m48.683s
user 2m24.905s
sys 0m3.292s
```

```
$ ls -l perl-install-small*.*fs
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
```
|
||||
|
||||
It takes about 15 seconds longer to build the DwarFS file system with LZMA
|
||||
compression (this is still 35 times faster than Cromfs), but reduces the
|
||||
@@ -1203,6 +1330,7 @@ supports LZ4 compression.

I was feeling lucky and decided to run it on the full Perl dataset:

```
$ time mkfs.erofs perl-install.erofs install -zlz4hc,9 -d2
mkfs.erofs 1.2
c_version: [ 1.2]
@@ -1213,17 +1341,21 @@ I was feeling lucky and decided to run it on the full Perl dataset:
real 912m42.601s
user 903m2.777s
sys 1m52.812s
```

As you can tell, after more than 15 hours I just gave up. In those
15 hours, `mkfs.erofs` had produced a 13 GiB output file:

```
$ ll -h perl-install.erofs
-rw-r--r-- 1 mhx users 13G Dec 9 14:42 perl-install.erofs
```

I don't think this would have been very useful to compare with DwarFS.

Just as for Cromfs, I re-ran with the smaller Perl dataset:

```
$ time mkfs.erofs perl-install-small.erofs install-small -zlz4hc,9 -d2
mkfs.erofs 1.2
c_version: [ 1.2]
@@ -1233,20 +1365,24 @@ Just as for Cromfs, I re-ran with the smaller Perl dataset:
real 0m27.844s
user 0m20.570s
sys 0m1.848s
```

That was surprisingly quick, which makes me think that, again, there
might be some accidentally quadratic complexity hiding in `mkfs.erofs`.
The output file it produced is an order of magnitude larger than the
DwarFS image:

```
$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 26928161 Dec 8 15:05 perl-install-small.dwarfs
-rw-r--r-- 1 mhx users 296488960 Dec 9 14:45 perl-install-small.erofs
```
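
Checking the "order of magnitude" claim against the two sizes listed above:

```python
erofs_size = 296488960   # perl-install-small.erofs
dwarfs_size = 26928161   # perl-install-small.dwarfs
print(round(erofs_size / dwarfs_size, 1))  # -> 11.0
```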

Admittedly, this isn't a fair comparison. EROFS has a fixed block size
of 4 KiB, and it uses LZ4 compression. If we tweak DwarFS to the same
parameters, we get:

```
$ time mkdwarfs -i install-small -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
21:21:18.136796 scanning install-small
21:21:18.376998 waiting for background scanners...
@@ -1281,6 +1417,7 @@ parameters, we get:
real 0m9.075s
user 0m37.718s
sys 0m2.427s
```

It finishes in less than half the time and produces an output image
that's half the size of the EROFS image.
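
Two details here are easy to verify: `-S 12` really does match EROFS's fixed
block size, since DwarFS's block size is two to the power of the
`--block-size-bits` value, and the timing claim holds with room to spare:

```python
assert 2 ** 12 == 4096           # -S 12 -> 4 KiB blocks, same as EROFS
print(round(9.075 / 27.844, 2))  # -> 0.33, i.e. well under half the EROFS time
```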

@@ -1,11 +1,9 @@
# dwarfs-format(5) -- DwarFS File System Format v2.3

## DESCRIPTION

This document describes the DwarFS file system format, version 2.3.

## FILE STRUCTURE

A DwarFS file system image is just a sequence of blocks. Each block has the
@@ -65,26 +63,24 @@ A couple of notes:
larger than the one it supports. However, a new program will still
read all file systems with a smaller minor version number.

### Section Types

There are currently 3 different section types.

- `BLOCK` (0):
  A block of data. This is where all file data is stored. There can be
  an arbitrary number of blocks of this type.

- `METADATA_V2_SCHEMA` (7):
  The schema used to lay out the `METADATA_V2` block contents. This is
  stored in "compact" thrift encoding.

- `METADATA_V2` (8):
  This section contains the bulk of the metadata. It's essentially just
  a collection of bit-packed arrays and structures. The exact layout of
  each list and structure depends on the actual data and is stored
  separately in `METADATA_V2_SCHEMA`.
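
For readers writing their own tooling, the three section type values above
can be captured as constants (a sketch; only the numeric values come from
the format description, the class is illustrative):

```python
from enum import IntEnum

class SectionType(IntEnum):
    BLOCK = 0               # file data
    METADATA_V2_SCHEMA = 7  # "compact" thrift schema describing the metadata
    METADATA_V2 = 8         # bit-packed metadata, interpreted via the schema

assert SectionType(7) is SectionType.METADATA_V2_SCHEMA
```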

## METADATA FORMAT

Here is a high-level overview of how all the bits and pieces relate
@@ -169,17 +165,12 @@ list. The index into this list is the `inode_num` from `dir_entries`,
but you can perform direct lookups based on the inode number as well.
The `inodes` list is strictly in the following order:

- directory inodes (`S_IFDIR`)
- symlink inodes (`S_IFLNK`)
- regular *unique* file inodes (`S_IFREG`)
- regular *shared* file inodes (`S_IFREG`)
- character/block device inodes (`S_IFCHR`, `S_IFBLK`)
- socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)

The offsets can thus be found by using a binary search with a
predicate on the inode mode. The shared file offset can be found

@@ -1,5 +1,4 @@
# dwarfs(1) -- mount highly compressed read-only file system

## SYNOPSIS

@@ -14,14 +13,16 @@ but it has some distinct features.
Other than that, it's pretty straightforward to use. Once you've created a
file system image using mkdwarfs(1), you can mount it with:

```
dwarfs image.dwarfs /path/to/mountpoint
```

## OPTIONS

In addition to the regular FUSE options, `dwarfs` supports the following
options:

- `-o cachesize=`*value*:
  Size of the block cache, in bytes. You can append suffixes
  (`k`, `m`, `g`) to specify the size in KiB, MiB and GiB,
  respectively. Note that this is not the upper memory limit
@@ -31,12 +32,12 @@ options:
  with it, which can use a significant amount of additional
  memory. For more details, see mkdwarfs(1).

- `-o workers=`*value*:
  Number of worker threads to use for decompressing blocks.
  If you have a lot of CPUs, increasing this number can help
  speed up access to files in the filesystem.

- `-o decratio=`*value*:
  The ratio over which a block is fully decompressed. Blocks
  are only decompressed partially, so each block has to carry
  the decompressor state with it until it is fully decompressed.
@@ -49,18 +50,18 @@ options:
  we keep the partially decompressed block, but if we've
  decompressed more than 80%, we'll fully decompress it.

- `-o offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in
  the image, or use `auto` to detect the offset automatically.
  This is only useful for images that have some header located
  before the actual filesystem data.

- `-o mlock=none`|`try`|`must`:
  Set this to `try` or `must` instead of the default `none` to
  try or require `mlock()`ing of the file system metadata into
  memory.

- `-o enable_nlink`:
  Set this option if you want correct hardlink counts for regular
  files. If this is not specified, the hardlink count will be 1.
  Enabling this will slow down the initialization of the fuse
@@ -70,7 +71,7 @@ options:
  will also consume more memory to hold the hardlink count table.
  This will be 4 bytes for every regular file inode.

- `-o readonly`:
  Show all file system entries as read-only. By default, DwarFS
  will preserve the original writeability, which is obviously a
  lie as it's a read-only file system. However, this is needed
@@ -80,7 +81,7 @@ options:
  overlays and want the file system to reflect its read-only
  state, you can set this option.

- `-o (no_)cache_image`:
  By default, `dwarfs` tries to ensure that the compressed file
  system image will not be cached by the kernel (i.e. the default
  is `-o no_cache_image`). This will reduce the memory consumption
@@ -91,7 +92,7 @@ options:
  `-o cache_image` to keep the compressed image data in the kernel
  cache.

- `-o (no_)cache_files`:
  By default, files in the mounted file system will be cached by
  the kernel (i.e. the default is `-o cache_files`). This will
  significantly improve performance when accessing the same files
@@ -103,14 +104,14 @@ options:
  though it's likely that the kernel will already do the right thing
  even when the cache is enabled.

- `-o debuglevel=`*name*:
  Use this for different levels of verbosity along with either
  the `-f` or `-d` FUSE options. This can give you some insight
  into what the file system driver is doing internally, but it's
  mainly meant for debugging and the `debug` and `trace` levels
  in particular will slow down the driver.

- `-o tidy_strategy=`*name*:
  Use one of the following strategies to tidy the block cache:

  - `none`:
@@ -128,14 +129,14 @@ options:
    cache is traversed and all blocks that have been fully or
    partially swapped out by the kernel will be removed.

- `-o tidy_interval=`*time*:
  Used only if `tidy_strategy` is not `none`. This is the interval
  at which the cache tidying thread wakes up to look for blocks
  that can be removed from the cache. This must be an integer value.
  Suffixes `ms`, `s`, `m`, `h` are supported. If no suffix is given,
  the value will be assumed to be in seconds.

- `-o tidy_max_age=`*time*:
  Used only if `tidy_strategy` is `time`. A block will be removed
  from the cache if it hasn't been used for this time span. This must
  be an integer value. Suffixes `ms`, `s`, `m`, `h` are supported.
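
The suffix handling for the size and time options above can be sketched like
this (hypothetical helpers, not the actual driver code; sizes use binary
units, and a bare time value means seconds):

```python
_SIZE_SUFFIX = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
_TIME_SUFFIX = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}

def parse_size(text):
    """e.g. '512m' -> 536870912 bytes, per the cachesize description."""
    suffix = text[-1].lower()
    if suffix in _SIZE_SUFFIX:
        return int(text[:-1]) * _SIZE_SUFFIX[suffix]
    return int(text)

def parse_time(text):
    """e.g. '500ms' -> 0.5, '5m' -> 300.0; longest suffix matched first."""
    for suffix in ("ms", "h", "m", "s"):
        if text.endswith(suffix):
            return float(text[:-len(suffix)]) * _TIME_SUFFIX[suffix]
    return float(text)

print(parse_size("512m"))   # -> 536870912
print(parse_time("500ms"))  # -> 0.5
```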
@@ -145,7 +146,7 @@ There's two particular FUSE options that you'll likely need at some
point, e.g. when trying to set up an `overlayfs` mount on top of
a DwarFS image:

- `-o allow_root` and `-o allow_other`:
  These will ensure that the mounted file system can be read by
  either `root` or any other user in addition to the user that
  started the fuse driver. So if you're running `dwarfs` as a
@@ -193,27 +194,33 @@ set of Perl versions back.

Here's what you need to do:

- Create a set of directories. In my case, these are all located
  in `/tmp/perl` as this was the original install location.

  ```
  cd /tmp/perl
  mkdir install-ro
  mkdir install-rw
  mkdir install-work
  mkdir install
  ```

- Mount the DwarFS image. `-o allow_root` is needed to make sure
  `overlayfs` has access to the mounted file system. In order
  to use `-o allow_root`, you may have to uncomment or add
  `user_allow_other` in `/etc/fuse.conf`.

  ```
  dwarfs perl-install.dwarfs install-ro -o allow_root
  ```

- Now set up `overlayfs`.

  ```
  sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
  ```

- That's it. You should now be able to access a writeable version
  of your DwarFS image in `install`.

You can go even further than that. Say you have different sets of
@@ -223,7 +230,9 @@ the read-write directory after unmounting the `overlayfs`, and
selectively add this by passing a colon-separated list to the
`lowerdir` option when setting up the `overlayfs` mount:

```
sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
```

If you want *this* merged overlay to be writable, just add in the
`upperdir` and `workdir` options from before again.
@@ -1,5 +1,4 @@
# dwarfsck(1) -- check DwarFS image

## SYNOPSIS

@@ -15,42 +14,42 @@ with a non-zero exit code.

## OPTIONS

- `-i`, `--input=`*file*:
  Path to the filesystem image.

- `-d`, `--detail=`*value*:
  Level of filesystem information detail. The default is 2. Higher values
  mean more output. Values larger than 6 will currently not provide any
  further detail.

- `-O`, `--image-offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in the image.
  Use `auto` to detect the offset automatically. This is also the default.
  This is only useful for images that have some header located before the
  actual filesystem data.

- `-H`, `--print-header`:
  Print the header located before the filesystem image to stdout. If no
  header is present, the program will exit with a non-zero exit code.

- `-n`, `--num-workers=`*value*:
  Number of worker threads used for integrity checking.

- `--check-integrity`:
  In addition to performing a fast checksum check, also perform a (much
  slower) verification of the embedded SHA-512/256 hashes.

- `--json`:
  Print a simple JSON representation of the filesystem metadata. Please
  note that the format is *not* stable.

- `--export-metadata=`*file*:
  Export all filesystem metadata in JSON format.

- `--log-level=`*name*:
  Specify a logging level.

- `--help`:
  Show program help, including option defaults.

## AUTHOR

@@ -1,9 +1,8 @@
# dwarfsextract(1) -- extract DwarFS image

## SYNOPSIS

`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]
`dwarfsextract` `-i` *image* `-f` *format* [`-o` *file*] [*options*...]

## DESCRIPTION

@@ -35,32 +34,32 @@ to disk:

## OPTIONS

- `-i`, `--input=`*file*:
  Path to the source filesystem.

- `-o`, `--output=`*directory*|*file*:
  If no format is specified, this is the directory to which the contents
  of the filesystem should be extracted. If a format is specified, this
  is the name of the output archive. This option can be omitted, in which
  case the default is to extract the files to the current directory, or
  to write the archive data to stdout.

- `-O`, `--image-offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in the image.
  Use `auto` to detect the offset automatically. This is also the default.
  This is only useful for images that have some header located before the
  actual filesystem data.

- `-f`, `--format=`*format*:
  The archive format to produce. If this is left empty or unspecified,
  files will be extracted to the output directory (or the current directory
  if no output directory is specified). For a full list of supported formats,
  see libarchive-formats(5).

- `-n`, `--num-workers=`*value*:
  Number of worker threads used for extracting the filesystem.

- `-s`, `--cache-size=`*value*:
  Size of the block cache, in bytes. You can append suffixes (`k`, `m`, `g`)
  to specify the size in KiB, MiB and GiB, respectively. Note that this is
  not the upper memory limit of the process, as there may be blocks in
@@ -68,10 +67,10 @@ to disk:
  fully decompressed yet will carry decompressor state along with it, which
  can use a significant amount of additional memory.

- `--log-level=`*name*:
  Specify a logging level.

- `--help`:
  Show program help, including option defaults.

## AUTHOR

@@ -1,9 +1,8 @@
# mkdwarfs(1) -- create highly compressed read-only file systems

## SYNOPSIS

`mkdwarfs` `-i` *path* `-o` *file* [*options*...]
`mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...]

## DESCRIPTION

@@ -26,17 +25,17 @@ After that, you can mount it with dwarfs(1):

There are two mandatory options for specifying the input and output:

- `-i`, `--input=`*path*|*file*:
  Path to the root directory containing the files from which you want to
  build a filesystem. If the `--recompress` option is given, this argument
  is the source filesystem.

- `-o`, `--output=`*file*:
  File name of the output filesystem.

Most other options are concerned with compression tuning:

- `-l`, `--compress-level=`*value*:
  Compression level to use for the filesystem. **If you are unsure, please
  stick to the default level of 7.** This is intended to provide some
  sensible defaults and will depend on which compression libraries were
@@ -53,7 +52,7 @@ Most other options are concerned with compression tuning:
  `--window-step` and `--order`. See the output of `mkdwarfs --help` for
  a table listing the exact defaults used for each compression level.

- `-S`, `--block-size-bits=`*value*:
  The block size used for the compressed filesystem. The actual block size
  is two to the power of this value. Larger block sizes will offer better
  overall compression ratios, but will be slower and consume more memory
@@ -61,7 +60,7 @@ Most other options are concerned with compression tuning:
  least partially decompressed into memory. Values between 20 and 26, i.e.
  between 1MiB and 64MiB, usually work quite well.

- `-N`, `--num-workers=`*value*:
  Number of worker threads used for building the filesystem. This defaults
  to the number of processors available on your system. Use this option if
  you want to limit the resources used by `mkdwarfs`.
@@ -75,7 +74,7 @@ Most other options are concerned with compression tuning:
  individual filesystem blocks in the background. Ordering, segmenting
  and block building are, again, single-threaded and run independently.

- `-B`, `--max-lookback-blocks=`*value*:
  Specify how many of the most recent blocks to scan for duplicate segments.
  By default, only the current block will be scanned. The larger this number,
  the more duplicate segments will likely be found, which may further improve
@@ -84,7 +83,7 @@ Most other options are concerned with compression tuning:
  files can now potentially span multiple filesystem blocks. Passing `-B0`
  will completely disable duplicate segment search.

- `-W`, `--window-size=`*value*:
  Window size of cyclic hash used for segmenting. This is again an exponent
  to a base of two. Cyclic hashes are used by `mkdwarfs` for finding
  identical segments across multiple files. This is done on top of duplicate
@@ -101,7 +100,7 @@ Most other options are concerned with compression tuning:
  size will grow. Passing `-W0` will completely disable duplicate segment
  search.

- `-w`, `--window-step=`*value*:
  This option specifies how often cyclic hash values are stored for lookup.
  It is specified relative to the window size, as a base-2 exponent that
  divides the window size. To give a concrete example, if `--window-size=16`
@@ -114,7 +113,7 @@ Most other options are concerned with compression tuning:
  If you use a larger value for this option, the increments become *smaller*,
  and `mkdwarfs` will be slightly slower and use more memory.

- `--bloom-filter-size`=*value*:
  The segmenting algorithm uses a bloom filter to determine quickly if
  there is *no* match at a given position. This will filter out more than
  90% of bad matches quickly with the default bloom filter size. The default
@@ -123,7 +122,7 @@ Most other options are concerned with compression tuning:
  be able to see some improvement. If you're tight on memory, then decreasing
  this will potentially save a few MiBs.

- `-L`, `--memory-limit=`*value*:
  Approximately how much memory you want `mkdwarfs` to use during filesystem
  creation. Note that currently this will only affect the block manager
  component, i.e. the number of filesystem blocks that are in flight but
@@ -134,24 +133,24 @@ Most other options are concerned with compression tuning:
  algorithms, so if you're short on memory it might be worth tweaking the
  compression options.

- `-C`, `--compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for file system data.
  The value for this option is a colon-separated list. The first item is
  the compression algorithm, the remaining items are its options. Options
  can be either boolean or have a value. For details on which algorithms
  and options are available, see the output of `mkdwarfs --help`. `zstd`
  will give you the best compression while still keeping decompression
  *very* fast. `lzma` will compress even better, but decompression will
  be around ten times slower.

- `--schema-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for the metadata schema.
  Takes the same arguments as `--compression` above. The schema is *very*
  small, in the hundreds of bytes, so this is only relevant for extremely
  small file systems. The default (`zstd`) has been shown to give considerably
  better results than any other algorithm.

- `--metadata-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for the metadata.
  Takes the same arguments as `--compression` above. The metadata has been
  optimized for very little redundancy and leaving it uncompressed, the
@@ -161,7 +160,7 @@ Most other options are concerned with compression tuning:
  care about mount time, you can safely choose `lzma` compression here, as
  the data will only have to be decompressed once when mounting the image.

- `--recompress`[`=all`|`=block`|`=metadata`|`=none`]:
  Take an existing DwarFS file system and recompress it using different
  compression algorithms. If no argument or `all` is given, all sections
  in the file system image will be recompressed. Note that *only* the
@@ -177,7 +176,7 @@ Most other options are concerned with compression tuning:
  metadata to uncompressed metadata without having to rebuild or recompress
  all the other data.

- `-P`, `--pack-metadata=auto`|`none`|[`all`|`chunk_table`|`directories`|`shared_files`|`names`|`names_index`|`symlinks`|`symlinks_index`|`force`|`plain`[`,`...]]:
  Which metadata information to store in packed format. This is primarily
  useful when storing metadata uncompressed, as it allows for smaller
  metadata block size without having to turn on compression. Keep in mind,
@@ -189,34 +188,34 @@ Most other options are concerned with compression tuning:
  systems that contain hundreds of thousands of files.
  See [Metadata Packing](#metadata-packing) for more details.

- `--set-owner=`*uid*:
  Set the owner for all entities in the file system. This can reduce the
  size of the file system. If the input only has a single owner already,
  setting this won't make any difference.

- `--set-group=`*gid*:
  Set the group for all entities in the file system. This can reduce the
  size of the file system. If the input only has a single group already,
  setting this won't make any difference.

- `--set-time=`*time*|`now`:
  Set the time stamps for all entities to this value. This can significantly
  reduce the size of the file system. You can pass either a unix time stamp
  or `now`.

- `--keep-all-times`:
  As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
  the `mtime` field in order to save metadata space. If you want to save
  `atime` and `ctime` as well, use this option.

- `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:
  Specify the resolution with which time stamps are stored. By default,
  time stamps are stored with second resolution. You can specify "odd"
  resolutions as well, e.g. something like 15 second resolution is
  entirely possible. Moving from second to minute resolution, for example,
  will save roughly 6 bits per file system entry in the metadata block.

- `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:
  The order in which inodes will be written to the file system. Choosing `none`,
  the inodes will be stored in the order in which they are discovered. With
  `path`, they will be sorted asciibetically by path name of the first file
@@ -243,35 +242,35 @@ Most other options are concerned with compression tuning:
  Last but not least, if scripting support is built into `mkdwarfs`, you can
  choose `script` to let the script determine the order.

- `--remove-empty-dirs`:
  Removes all empty directories from the output file system, recursively.
  This is particularly useful when using scripts that filter out a lot of
  file system entries.

- `--with-devices`:
  Include character and block devices in the output file system. These are
  not included by default, and due to security measures in FUSE, they will
  never work in the mounted file system. However, they can still be copied
  out of the mounted file system, for example using `rsync`.

- `--with-specials`:
  Include named fifos and sockets in the output file system. These are not
  included by default.

- `--header=`*file*:
  Read header from file and place it before the output filesystem image.
  Can be used with `--recompress` to add or replace a header.

- `--remove-header`:
  Remove header from a filesystem image. Only useful with `--recompress`.

- `--log-level=`*name*:
  Specify a logging level.

- `--no-progress`:
  Don't show progress output while building filesystem.

- `--progress=none`|`simple`|`ascii`|`unicode`:
  Choosing `none` is equivalent to specifying `--no-progress`. `simple`
  will print a single line of progress information whenever the progress
  has significantly changed, but at most once every 2 seconds. This is
@@ -281,14 +280,14 @@ Most other options are concerned with compression tuning:
  you can switch to `ascii`, which is like `unicode`, but looks less
  fancy.

- `--help`:
  Show program help, including defaults, compression level detail and
  supported compression algorithms.
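
Two of the numeric relationships quoted in the option descriptions above are
easy to verify (a quick check, not taken from the original manual):

```python
import math

# --block-size-bits: the actual block size is two to the power of the value.
print(2 ** 20)               # -> 1048576 (1 MiB, i.e. -S 20)
print(2 ** 26 // (1 << 20))  # -> 64 (MiB, i.e. -S 26)

# --time-resolution: second -> minute resolution saves ~log2(60) bits/entry.
print(round(math.log2(60), 1))  # -> 5.9, i.e. "roughly 6 bits"
```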

If experimental Python support was compiled into `mkdwarfs`, you can use the
following option to enable customizations via the scripting interface:

- `--script=`*file*[`:`*class*[`(`arguments`...)`]]:
  Specify the Python script to load. The class name is optional if there's
  a class named `mkdwarfs` in the script. It is also possible to pass
  arguments to the constructor.
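A minimal invocation sketch (the script name, class name, and constructor argument are all hypothetical): in the first form, no explicit class name is needed because the script is assumed to define a class named `mkdwarfs`.

```shell
# Sketch: myscript.py is a hypothetical script defining a class named
# mkdwarfs, so the class name can be omitted. The second form shows a
# hypothetical class being passed a constructor argument.
mkdwarfs -i /path/to/input -o out.dwarfs --script=myscript.py
mkdwarfs -i /path/to/input -o out.dwarfs --script='myscript.py:ordering("path")'
```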
@@ -342,28 +341,28 @@ However, there are several options to choose from that allow you to
further reduce metadata size without having to compress the metadata.
These options are controlled by the `--pack-metadata` option.

- `auto`:
  This is the default. It will enable both `names` and `symlinks`.

- `none`:
  Don't enable any packing. However, string tables (i.e. names and
  symlinks) will still be stored in "compact" rather than "plain"
  format. In order to force storage in plain format, use `plain`.

- `all`:
  Enable all packing options. This does *not* force packing of
  string tables (i.e. names and symlinks) if the packing would
  actually increase the size, which can happen if the string tables
  are actually small. In order to force string table packing, use
  `all,force`.

- `chunk_table`:
  Delta-compress chunk tables. This can reduce the size of the
  chunk tables for large file systems and help compression; however,
  it will likely require a lot of memory when unpacking the tables
  again. Only use this if you know what you're doing.

- `directories`:
  Pack the directories table by storing first entry pointers
  delta-compressed and completely removing parent directory pointers.
  The parent directory pointers can be rebuilt by tree traversal
@@ -372,12 +371,12 @@ These options are controlled by the `--pack-metadata` option.
  will likely require a lot of memory when unpacking the tables
  again. Only use this if you know what you're doing.

- `shared_files`:
  Pack the shared files table. This is only useful if the filesystem
  contains lots of non-hardlinked duplicates. It gets more efficient
  the more copies of a file are in the filesystem.
- `names`,`symlinks`:
  Compress the names and symlink targets using the
  [fsst](https://github.com/cwida/fsst) compression scheme. This
  compresses each individual entry separately using a small,

@@ -392,17 +391,17 @@ These options are controlled by the `--pack-metadata` option.
  than the uncompressed strings. If this is the case, the strings
  will be stored uncompressed, unless `force` is also specified.
- `names_index`,`symlinks_index`:
  Delta-compress the names and symlink target indices. The same
  caveats apply as for `chunk_table`.

- `force`:
  Forces the compression of the `names` and `symlinks` tables,
  even if that would make them use more memory than the
  uncompressed tables. This is really only useful for testing
  and development.

- `plain`:
  Store string tables in "plain" format. The plain format uses
  Frozen Thrift arrays and was used in earlier metadata versions.
  It is useful for debugging, but wastes up to one byte per string.

@@ -430,7 +429,6 @@ further compress the block. So if you're really desperately trying
to reduce the image size, enabling `all` packing would be an option
at the cost of using a lot more memory when using the filesystem.
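To tie the packing options together, here is a sketch (placeholder paths; as noted above, `all,force` also packs string tables even when that makes them larger):

```shell
# Sketch with placeholder paths: smallest possible metadata, at the
# cost of higher memory use when the filesystem is later mounted.
mkdwarfs -i /path/to/input -o small.dwarfs --pack-metadata=all,force

# More conservative choice: only compress names and symlink targets,
# which is what the `auto` default enables anyway.
mkdwarfs -i /path/to/input -o out.dwarfs --pack-metadata=names,symlinks
```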

## INTERNAL OPERATION

Internally, `mkdwarfs` runs in two completely separate phases. The first