Markdown cleanup

Marcus Holland-Moritz 2021-10-27 00:57:28 +02:00
parent 9ad4dd655f
commit 569966b752
6 changed files with 1318 additions and 1185 deletions

README.md

@@ -6,22 +6,22 @@ A fast high compression read-only file system

## Table of contents

- [Overview](#overview)
- [History](#history)
- [Building and Installing](#building-and-installing)
  - [Dependencies](#dependencies)
  - [Building](#building)
  - [Installing](#installing)
  - [Experimental Python Scripting Support](#experimental-python-scripting-support)
- [Usage](#usage)
- [Comparison](#comparison)
  - [With SquashFS](#with-squashfs)
  - [With SquashFS & xz](#with-squashfs--xz)
  - [With lrzip](#with-lrzip)
  - [With zpaq](#with-zpaq)
  - [With wimlib](#with-wimlib)
  - [With Cromfs](#with-cromfs)
  - [With EROFS](#with-erofs)

## Overview
@@ -45,20 +45,20 @@ less CPU resources.

Distinct features of DwarFS are:

- Clustering of files by similarity using a similarity hash function.
  This makes it easier to exploit the redundancy across file boundaries.

- Segmentation analysis across file system blocks in order to reduce
  the size of the uncompressed file system. This saves memory when
  using the compressed file system and thus potentially allows for
  higher cache hit rates as more data can be kept in the cache.

- Highly multi-threaded implementation. Both the file
  [system creation tool](doc/mkdwarfs.md) and the
  [FUSE driver](doc/dwarfs.md) are able to make good use of the
  many cores of your system.

- Optional experimental Python scripting support to provide custom
  filtering and ordering functionality.

## History
@@ -129,6 +129,7 @@ will be automatically resolved if you build with tests.

A good starting point for apt-based systems is probably:

```
$ apt install \
    g++ \
    clang \
@@ -161,6 +162,7 @@ A good starting point for apt-based systems is probably:
    libfmt-dev \
    libfuse3-dev \
    libgoogle-glog-dev
```

Note that when building with `gcc`, the optimization level will be
set to `-O2` instead of the CMake default of `-O3` for release
@@ -168,30 +170,37 @@ builds. At least with versions up to `gcc-10`, the `-O3` build is
[up to 70% slower](https://github.com/mhx/dwarfs/issues/14) than a
build with `-O2`.
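
If you nevertheless want to try an `-O3` build yourself, the release
flags can be overridden on the `cmake` command line. This is plain
CMake rather than a DwarFS-specific switch, and it assumes the
project's CMakeLists doesn't force its own flags on top; a sketch:

```
# override the flags CMake passes to the compiler for release builds
$ cmake .. -DWITH_TESTS=1 -DCMAKE_CXX_FLAGS_RELEASE="-O3 -DNDEBUG"
```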

### Building

Firstly, either clone the repository...

```
$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs
```

...or unpack the release archive:

```
$ tar xvf dwarfs-x.y.z.tar.bz2
$ cd dwarfs-x.y.z
```

Once all dependencies have been installed, you can build DwarFS
using:

```
$ mkdir build
$ cd build
$ cmake .. -DWITH_TESTS=1
$ make -j$(nproc)
```

You can then run tests with:

```
$ make test
```

All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
as a memory allocator by default, as it typically uses much less
@@ -203,7 +212,9 @@ To disable the use of `jemalloc`, pass `-DUSE_JEMALLOC=0` on the
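
In other words, presumably something like this (a sketch based on the
flag named above):

```
# build without the jemalloc allocator
$ cmake .. -DWITH_TESTS=1 -DUSE_JEMALLOC=0
```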

Installing is as easy as:

```
$ sudo make install
```

Though you don't have to install the tools to play with them.

@@ -212,13 +223,17 @@ Though you don't have to install the tools to play with them.

You can build `mkdwarfs` with experimental support for Python
scripting:

```
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1
```

This also requires Boost.Python. If you have multiple Python
versions installed, you can explicitly specify the version to
build against:

```
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1 -DWITH_PYTHON_VERSION=3.8
```

Note that only Python 3 is supported. You can take a look at
[scripts/example.py](scripts/example.py) to get an idea for
@@ -259,6 +274,7 @@ NVME drive, so most of its contents were likely cached.

I'm using the same compression type and compression level for
SquashFS that is the default setting for DwarFS:

```
$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
@@ -292,9 +308,11 @@ SquashFS that is the default setting for DwarFS:
real    32m54.713s
user    501m46.382s
sys     0m58.528s
```

For DwarFS, I'm sticking to the defaults:

```
$ time mkdwarfs -i install -o perl-install.dwarfs
I 11:33:33.310931 scanning install
I 11:33:39.026712 waiting for background scanners...
@@ -333,13 +351,16 @@ For DwarFS, I'm sticking to the defaults:
real    5m23.030s
user    78m7.554s
sys     1m47.968s
```

So in this comparison, `mkdwarfs` is **more than 6 times faster** than `mksquashfs`,
both in terms of CPU time and wall clock time.

```
$ ll perl-install.*fs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
```

In terms of compression ratio, the **DwarFS file system is more than 10 times
smaller than the SquashFS file system**. With DwarFS, the content has been
@@ -351,21 +372,27 @@ the original space**.

Here's another comparison using `lzma` compression instead of `zstd`:

```
$ time mksquashfs install perl-install-lzma.squashfs -comp lzma

real    13m42.825s
user    205m40.851s
sys     3m29.088s
```

```
$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9

real    3m43.937s
user    49m45.295s
sys     1m44.550s
```

```
$ ll perl-install-lzma.*fs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
-rw-r--r-- 1 mhx users 3838406656 Mar 3 20:50 perl-install-lzma.squashfs
```

It's immediately obvious that the runs are significantly faster and the
resulting images are significantly smaller. Still, `mkdwarfs` is about
@@ -383,21 +410,27 @@ uses a block size of 128KiB, whereas `mkdwarfs` uses 16MiB blocks by default,
or even 64MiB blocks with `-l9`. When using identical block sizes for both
file systems, the difference, quite expectedly, becomes a lot less dramatic:

```
$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M

real    15m43.319s
user    139m24.533s
sys     0m45.132s
```

```
$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3

real    4m25.973s
user    52m15.100s
sys     7m41.889s
```

```
$ ll perl-install*.*fs
-rw-r--r-- 1 mhx users 935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
-rw-r--r-- 1 mhx users 3407474688 Mar 3 21:54 perl-install-lzma-1M.squashfs
```

Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
DwarFS to reference file chunks from up to two previous filesystem blocks.
@@ -413,6 +446,7 @@ fast experimentation with different algorithms and options without requiring
a full rebuild of the file system. For example, recompressing the above file
system with the best possible compression (`-l 9`):

```
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
I 20:28:03.246534 filesystem rewritten without errors [148.3s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
@@ -423,11 +457,14 @@ system with the best possible compression (`-l 9`):
real    2m28.279s
user    37m8.825s
sys     0m43.256s
```

```
$ ll perl-*.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 390845518 Mar 4 20:28 perl-lzma-re.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
```

Note that while the recompressed filesystem is smaller than the original image,
it is still a lot bigger than the filesystem we previously built with `-l9`.
@@ -438,6 +475,7 @@ In terms of how fast the file system is when using it, a quick test
I've done is to freshly mount the filesystem created above and run
each of the 1139 `perl` executables to print their version.

```
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.810 s ±  0.013 s    [User: 1.847 s, System: 0.623 s]
@@ -454,6 +492,7 @@ each of the 1139 `perl` executables to print their version.
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.149 s ±  0.015 s    [User: 2.128 s, System: 0.781 s]
  Range (min … max):    1.136 s …  1.186 s    10 runs
```

These timings are for *initial* runs on a freshly mounted file system,
running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that
@@ -462,6 +501,7 @@ it takes only about 1 millisecond per Perl binary.

Following are timings for *subsequent* runs, both on DwarFS (at `mnt`)
and the original XFS (at `install`). DwarFS is around 15% slower here:

```
$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     347.0 ms ±   7.2 ms    [User: 1.755 s, System: 0.452 s]
@@ -484,10 +524,12 @@ and the original XFS (at `install`). DwarFS is around 15% slower here:
    1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
    1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
    1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
```

Using the lzma-compressed file system, the metrics for *initial* runs look
considerably worse (about an order of magnitude):

```
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     10.660 s ±  0.057 s    [User: 1.952 s, System: 0.729 s]
@@ -504,6 +546,7 @@ considerably worse (about an order of magnitude):
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.004 s ±  0.298 s    [User: 2.134 s, System: 0.736 s]
  Range (min … max):    8.611 s …  9.555 s    10 runs
```

So you might want to consider using `zstd` instead of `lzma` if you'd
like to optimize for file system performance. It's also the default
@@ -511,6 +554,7 @@ compression used by `mkdwarfs`.
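
To select the compression explicitly rather than rely on the default,
`mkdwarfs` accepts a `-C` option; the `-C lz4hc:level=9` invocation
further down in this README uses the same `algorithm:level=N` syntax.
A sketch (the output file name here is made up):

```
# explicitly request zstd at level 22
$ mkdwarfs -i install -o perl-install-zstd.dwarfs -C zstd:level=22
```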

Now here's a comparison with the SquashFS filesystem:

```
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs-zstd
  Time (mean ± σ):      1.151 s ±  0.015 s    [User: 2.147 s, System: 0.769 s]
@@ -523,17 +567,20 @@ Now here's a comparison with the SquashFS filesystem:
Summary
  'dwarfs-zstd' ran
    5.85 ± 0.08 times faster than 'squashfs-zstd'
```

So DwarFS is almost six times faster than SquashFS. But what's more,
SquashFS also uses significantly more CPU power. However, the numbers
shown above for DwarFS obviously don't include the time spent in the
`dwarfs` process, so I repeated the test outside of hyperfine:

```
$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f

real    0m4.569s
user    0m2.154s
sys     0m1.846s
```

So in total, DwarFS was using 5.7 seconds of CPU time, whereas
SquashFS was using 20.2 seconds, almost four times as much. Ignore
@@ -546,13 +593,13 @@ used, [Tie::Hash::Indexed](https://github.com/mhx/Tie-Hash-Indexed),
has an XS component that requires a C compiler to build. So this really
accesses a lot of different stuff in the file system:

- The `perl` executables and their shared libraries
- The Perl modules used for writing the Makefile
- Perl's C header files used for building the module
- More Perl modules used for running the tests

I wrote a little script to be able to run multiple builds in parallel:
@@ -574,28 +621,36 @@ The following command will run up to 16 builds in parallel on the 8 core
Xeon CPU, including debug, optimized and threaded versions of all Perl
releases between 5.10.0 and 5.33.3, a total of 624 `perl` installations:

```
$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh
```

Tests were done with a cleanly mounted file system to make sure the caches
were empty. `ccache` was primed to make sure all compiler runs could be
satisfied from the cache. With SquashFS, the timing was:

```
real    0m52.385s
user    8m10.333s
sys     4m10.056s
```

And with DwarFS:

```
real    0m50.469s
user    9m22.597s
sys     1m18.469s
```

So, frankly, not much of a difference, with DwarFS being just a bit faster.
The `dwarfs` process itself used:

```
real    0m56.686s
user    0m18.857s
sys     0m21.058s
```

So again, DwarFS used less raw CPU power overall, but in terms of wallclock
time, the difference is really marginal.
@@ -606,6 +661,7 @@ This test uses slightly less pathological input data: the root filesystem of
a recent Raspberry Pi OS release. This file system also contains device inodes,
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:

```
$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
I 21:30:29.812562 scanning raspbian
I 21:30:29.908984 waiting for background scanners...
@@ -640,9 +696,11 @@ so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
real    0m46.711s
user    10m39.038s
sys     0m8.123s
```

Again, SquashFS uses the same compression options:

```
$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
@@ -694,62 +752,77 @@ Again, SquashFS uses the same compression options:
real    0m50.124s
user    9m41.708s
sys     0m1.727s
```

The difference in speed is almost negligible. SquashFS is just a bit
slower here. In terms of compression, the difference also isn't huge:

```
$ ls -lh raspbian.* *.xz
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
-rw-r--r-- 1 root root 287M Mar 4 21:31 raspbian.dwarfs
-rw-r--r-- 1 root root 364M Mar 4 21:33 raspbian.squashfs
```

Interestingly, `xz` actually can't compress the whole original image
better than DwarFS.

We can even again try to increase the DwarFS compression level:

```
$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9

real    0m54.161s
user    8m40.109s
sys     0m7.101s
```

Now that actually gets the DwarFS image size well below that of the
`xz` archive:

```
$ ls -lh raspbian-9.dwarfs *.xz
-rw-r--r-- 1 root root 244M Mar 4 21:36 raspbian-9.dwarfs
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
```

Even if you actually build a tarball and compress that (instead of
compressing the EXT4 file system itself), `xz` isn't quite able to
match the DwarFS image size:

```
$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
  100 %     246.9 MiB / 1,037.2 MiB = 0.238    13 MiB/s       1:18

real    1m18.226s
user    6m35.381s
sys     0m2.205s
```

```
$ ls -lh raspbian.tar.xz
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
```

DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
that allows extraction of a filesystem image without the FUSE driver.
So here's a comparison of the extraction speed:

```
$ time sudo tar xf raspbian.tar.xz -C out1

real    0m12.846s
user    0m12.313s
sys     0m1.616s
```

```
$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2

real    0m3.825s
user    0m13.234s
sys     0m1.382s
```

So `dwarfsextract` is almost 4 times faster thanks to using multiple
worker threads for decompression. It's writing about 300 MiB/s in this
@@ -759,14 +832,18 @@ Another nice feature of `dwarfsextract` is that it allows you to directly
output data in an archive format, so you could create a tarball from
your image without extracting the files to disk:

```
$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz
```

This has the interesting side-effect that the resulting tarball will
likely be smaller than the one built straight from the directory:

```
$ ls -lh raspbian*.tar.xz
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
-rw-r--r-- 1 mhx users 240M Mar 4 23:52 raspbian2.tar.xz
```

That's because `dwarfsextract` writes files in inode-order, and by
default inodes are ordered by similarity for the best possible
@@ -784,14 +861,17 @@ When I first read about `lrzip`, I was pretty certain it would easily
beat DwarFS. So let's take a look. `lrzip` operates on a single file,
so it's necessary to first build a tarball:

```
$ time tar cf perl-install.tar install

real    2m9.568s
user    0m3.757s
sys     0m26.623s
```

Now we can run `lrzip`:

```
$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 16
@@ -814,12 +894,15 @@ Now we can run `lrzip`:
real    57m32.472s
user    81m44.104s
sys     4m50.221s
```

That definitely took a while. This is about an order of magnitude
slower than `mkdwarfs` and it barely makes use of the 8 cores.

```
$ ll -h perl-install.tar.lrzip
-rw-r--r-- 1 mhx users 500M Mar 6 21:16 perl-install.tar.lrzip
```

This is a surprisingly disappointing result. The archive is 65% larger
than a DwarFS image at `-l9` that takes less than 4 minutes to build.
@@ -828,6 +911,7 @@ unpacking the archive first.

That being said, it *is* better than just using `xz` on the tarball:

```
$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
perl-install.tar (1/1)
  100 %     4,317.0 MiB / 49.0 GiB = 0.086    24 MiB/s    34:55
@@ -835,9 +919,12 @@ That being said, it *is* better than just using `xz` on the tarball:
real    34m55.450s
user    543m50.810s
sys     0m26.533s
```

```
$ ll perl-install.tar.xz -h
-rw-r--r-- 1 mhx users 4.3G Mar 6 22:59 perl-install.tar.xz
```

### With zpaq

@@ -850,10 +937,13 @@ can be used.

Anyway, how does it fare in terms of speed and compression performance?

```
$ time zpaq a perl-install.zpaq install -m5
```

After a few million lines of output that (I think) cannot be turned off:

```
2258234 +added, 0 -removed.

0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
@@ -862,30 +952,34 @@ After a few million lines of output that (I think) cannot be turned off:
real    47m8.104s
user    714m44.286s
sys     3m6.751s
```

So it's an order of magnitude slower than `mkdwarfs` and uses 14 times
as much CPU resources as `mkdwarfs -l9`. The resulting archive is pretty
close in size to the default configuration DwarFS image, but it's more
than 50% bigger than the image produced by `mkdwarfs -l9`.

```
$ ll perl-install*.*
-rw-r--r-- 1 mhx users 490227707 Mar 7 01:38 perl-install.zpaq
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
```

What's *really* surprising is how slow it is to extract the `zpaq`
archive again:

```
$ time zpaq x perl-install.zpaq
2798.097 seconds (all OK)

real    46m38.117s
user    711m18.734s
sys     3m47.876s
```

That's 700 times slower than extracting the DwarFS image.

### With wimlib

[wimlib](https://wimlib.net/) is a really interesting project that is
@@ -896,6 +990,7 @@ quite a rich set of features, so it's definitely worth taking a look at.

I first tried `wimcapture` on the perl dataset:

```
$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
@@ -905,12 +1000,15 @@ I first tried `wimcapture` on the perl dataset:
real    15m23.310s
user    174m29.274s
sys     0m42.921s
```

```
$ ll perl-install.*
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
-rw-r--r-- 1 mhx users 1016981520 Mar 6 21:12 perl-install.wim
```

So wimlib is definitely much better than squashfs, in terms of both
compression ratio and speed. DwarFS is however about 3 times faster to
@@ -921,43 +1019,52 @@ When switching to LZMA compression, the DwarFS file system is more than

What's a bit surprising is that mounting a *wim* file takes quite a bit
of time:

```
$ time wimmount perl-install.wim mnt
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.

real    0m2.038s
user    0m1.764s
sys     0m0.242s
```

Mounting the DwarFS image takes almost no time in comparison:

```
$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)

real    0m0.003s
user    0m0.003s
sys     0m0.000s
```

That's just because it immediately forks into background by default and
initializes the file system in the background. However, even when
running it in the foreground, initializing the file system takes only
about 60 milliseconds:

```
$ dwarfs perl-install.dwarfs mnt -f
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
I 00:25:03.248061 file system initialized [60.95ms]
```

If you actually build the DwarFS file system with uncompressed metadata,
mounting is basically instantaneous:

```
$ dwarfs perl-install-meta.dwarfs mnt -f
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
I 00:27:52.671066 file system initialized [2.879ms]
```
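
Such an image can presumably be produced by disabling metadata
compression at build time. This is only a sketch, assuming the
`--metadata-compression` option from mkdwarfs(1) with `null` as its
"no compression" choice; check the man page of your version:

```
# build an image whose metadata block is stored uncompressed
$ mkdwarfs -i install -o perl-install-meta.dwarfs --metadata-compression null
```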

I've tried running the benchmark where all 1139 `perl` executables
print their version with the wimlib image, but after about 10 minutes,
it still hadn't finished the first run (with the DwarFS image, one run
took slightly more than 2 seconds). I then tried the following instead:

```
$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
real    0m0.802s
real    0m0.652s
@@ -972,6 +1079,7 @@ took slightly more than 2 seconds). I then tried the following instead:
real    0m1.809s
real    0m1.790s
real    0m2.115s
```

Judging from that, it would have probably taken about half an hour
for a single run, which makes at least the `--solid` wim image pretty
@@ -982,6 +1090,7 @@ that DwarFS actually organizes data internally. However, judging by the
warning when mounting a solid image, it's probably not ideal when using
the image as a mounted file system. So I tried again without `--solid`:

```
$ time wimcapture --unix-data install perl-install-nonsolid.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
@@ -991,25 +1100,31 @@ the image as a mounted file system. So I tried again without `--solid`:
real    8m39.034s
user    64m58.575s
sys     0m32.003s
```

This is still more than 3 minutes slower than `mkdwarfs`. However, it
yields an image that's almost 10 times the size of the DwarFS image
and comparable in size to the SquashFS image:

```
$ ll perl-install-nonsolid.wim -h
-rw-r--r-- 1 mhx users 4.6G Mar 6 23:24 perl-install-nonsolid.wim
```

This *still* takes surprisingly long to mount:

```
$ time wimmount perl-install-nonsolid.wim mnt

real    0m1.603s
user    0m1.327s
sys     0m0.275s
```

However, it's really usable as a file system, even though it's about
4-5 times slower than the DwarFS image:

```
$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs
  Time (mean ± σ):      1.149 s ±  0.019 s    [User: 2.147 s, System: 0.739 s]
@@ -1022,7 +1137,7 @@ However, it's really usable as a file system, even though it's about
Summary
  'dwarfs' ran
    6.56 ± 0.12 times faster than 'wimlib'
```

### With Cromfs

@@ -1035,6 +1150,7 @@ Here's a run on the Perl dataset, with the block size set to 16 MiB to
match the default of DwarFS, and with additional options suggested to
speed up compression:

```
$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
@@ -1050,6 +1166,7 @@ speed up compression:
real    29m9.634s
user    201m37.816s
sys     2m15.005s
```

So it processed 21 MiB out of 48 GiB in half an hour, using almost
twice as much CPU resources as DwarFS for the *whole* file system.
@@ -1062,6 +1179,7 @@ I then tried once more with a smaller version of the Perl dataset.
This only has 20 versions (instead of 1139) of Perl, and obviously
a lot less redundancy:

```
$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
@@ -1092,9 +1210,11 @@ a lot less redundancy:
real    27m38.833s
user    277m36.208s
sys     11m36.945s
```

And repeating the same task with `mkdwarfs`:

```
$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
21:13:38.131724 scanning install-small
21:13:38.320139 waiting for background scanners...
@@ -1129,13 +1249,16 @@ And repeating the same task with `mkdwarfs`:
real    0m33.007s
user    3m43.324s
sys     0m4.015s
```

So `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 75 times
less CPU resources. At the same time, the DwarFS file system is 30% smaller:

```
$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
```

I noticed that the `blockifying` step that took ages for the full dataset
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
@@ -1145,6 +1268,7 @@ behaviour that's slowing down `mkcromfs`.

In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
LZMA compression (which is what `mkcromfs` uses by default):

```
$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
21:16:21.874975 scanning install-small
21:16:22.092201 waiting for background scanners...
@@ -1179,11 +1303,14 @@ LZMA compression (which is what `mkcromfs` uses by default):
real    0m48.683s
user    2m24.905s
sys     0m3.292s
```

```
$ ls -l perl-install-small*.*fs
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
```

It takes about 15 seconds longer to build the DwarFS file system with LZMA
compression (this is still 35 times faster than Cromfs), but reduces the
@@ -1203,6 +1330,7 @@ supports LZ4 compression.

I was feeling lucky and decided to run it on the full Perl dataset:

```
$ time mkfs.erofs perl-install.erofs install -zlz4hc,9 -d2
mkfs.erofs 1.2
        c_version:   [     1.2]
@@ -1213,17 +1341,21 @@ I was feeling lucky and decided to run it on the full Perl dataset:
real    912m42.601s
user    903m2.777s
sys     1m52.812s
```

As you can tell, after more than 15 hours I just gave up. In those
15 hours, `mkfs.erofs` had produced a 13 GiB output file:

```
$ ll -h perl-install.erofs
-rw-r--r-- 1 mhx users 13G Dec 9 14:42 perl-install.erofs
```

I don't think this would have been very useful to compare with DwarFS.
Just as for Cromfs, I re-ran with the smaller Perl dataset:

```
$ time mkfs.erofs perl-install-small.erofs install-small -zlz4hc,9 -d2
mkfs.erofs 1.2
        c_version:   [     1.2]
@@ -1233,20 +1365,24 @@ Just as for Cromfs, I re-ran with the smaller Perl dataset:
real    0m27.844s
user    0m20.570s
sys     0m1.848s
```

That was surprisingly quick, which makes me think that, again, there
might be some accidentally quadratic complexity hiding in `mkfs.erofs`.
The output file it produced is an order of magnitude larger than the
DwarFS image:

```
$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 26928161 Dec 8 15:05 perl-install-small.dwarfs
-rw-r--r-- 1 mhx users 296488960 Dec 9 14:45 perl-install-small.erofs
```

Admittedly, this isn't a fair comparison. EROFS has a fixed block size
of 4 KiB, and it uses LZ4 compression. If we tweak DwarFS to the same
parameters, we get:

```
$ time mkdwarfs -i install-small -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
21:21:18.136796 scanning install-small
21:21:18.376998 waiting for background scanners...
@@ -1281,6 +1417,7 @@ parameters, we get:
real    0m9.075s
user    0m37.718s
sys     0m2.427s
```

It finishes in less than half the time and produces an output image
that's half the size of the EROFS image.

doc/dwarfs-format.md

@@ -1,11 +1,9 @@
# dwarfs-format(5) -- DwarFS File System Format v2.3

## DESCRIPTION

This document describes the DwarFS file system format, version 2.3.

## FILE STRUCTURE

A DwarFS file system image is just a sequence of blocks. Each block has the
@@ -65,26 +63,24 @@ A couple of notes:
larger than the one it supports. However, a new program will still
read all file systems with a smaller minor version number.

### Section Types

There are currently 3 different section types.

- `BLOCK` (0):
  A block of data. This is where all file data is stored. There can be
  an arbitrary number of blocks of this type.

- `METADATA_V2_SCHEMA` (7):
  The schema used to layout the `METADATA_V2` block contents. This is
  stored in "compact" thrift encoding.

- `METADATA_V2` (8):
  This section contains the bulk of the metadata. It's essentially just
  a collection of bit-packed arrays and structures. The exact layout of
  each list and structure depends on the actual data and is stored
  separately in `METADATA_V2_SCHEMA`.

## METADATA FORMAT

Here is a high-level overview of how all the bits and pieces relate
@@ -169,17 +165,12 @@ list. The index into this list is the `inode_num` from `dir_entries`,
but you can perform direct lookups based on the inode number as well.
The `inodes` list is strictly in the following order:

- directory inodes (`S_IFDIR`)
- symlink inodes (`S_IFLNK`)
- regular *unique* file inodes (`S_IFREG`)
- regular *shared* file inodes (`S_IFREG`)
- character/block device inodes (`S_IFCHR`, `S_IFBLK`)
- socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)

The offsets can thus be found by using a binary search with a
predicate on the inode mode. The shared file offset can be found

doc/dwarfs.md

@@ -1,5 +1,4 @@
# dwarfs(1) -- mount highly compressed read-only file system

## SYNOPSIS
@@ -14,14 +13,16 @@ but it has some distinct features.

Other than that, it's pretty straightforward to use. Once you've created a
file system image using mkdwarfs(1), you can mount it with:

```
dwarfs image.dwarfs /path/to/mountpoint
```
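
Unmounting works as for any other FUSE file system; the benchmark
commands in the README simply use `umount`:

```
umount /path/to/mountpoint
```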

## OPTIONS

In addition to the regular FUSE options, `dwarfs` supports the following
options:

- `-o cachesize=`*value*:
  Size of the block cache, in bytes. You can append suffixes
  (`k`, `m`, `g`) to specify the size in KiB, MiB and GiB,
  respectively. Note that this is not the upper memory limit
@@ -31,12 +32,12 @@ options:
  with it, which can use a significant amount of additional
  memory. For more details, see mkdwarfs(1).

- `-o workers=`*value*:
  Number of worker threads to use for decompressing blocks.
  If you have a lot of CPUs, increasing this number can help
  speed up access to files in the filesystem.

- `-o decratio=`*value*:
  The ratio over which a block is fully decompressed. Blocks
  are only decompressed partially, so each block has to carry
  the decompressor state with it until it is fully decompressed.

@@ -49,18 +50,18 @@
we keep the partially decompressed block, but if we've
decompressed more than 80%, we'll fully decompress it.

- `-o offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in
  the image, or use `auto` to detect the offset automatically.
  This is only useful for images that have some header located
  before the actual filesystem data.

- `-o mlock=none`|`try`|`must`:
  Set this to `try` or `must` instead of the default `none` to
  try or require `mlock()`ing of the file system metadata into
  memory.

- `-o enable_nlink`:
  Set this option if you want correct hardlink counts for regular
  files. If this is not specified, the hardlink count will be 1.
  Enabling this will slow down the initialization of the fuse

@@ -70,7 +71,7 @@
will also consume more memory to hold the hardlink count table.
This will be 4 bytes for every regular file inode.

- `-o readonly`:
  Show all file system entries as read-only. By default, DwarFS
  will preserve the original writeability, which is obviously a
  lie as it's a read-only file system. However, this is needed

@@ -80,7 +81,7 @@
overlays and want the file system to reflect its read-only
state, you can set this option.

- `-o (no_)cache_image`:
  By default, `dwarfs` tries to ensure that the compressed file
  system image will not be cached by the kernel (i.e. the default
  is `-o no_cache_image`). This will reduce the memory consumption

@@ -91,7 +92,7 @@
`-o cache_image` to keep the compressed image data in the kernel
cache.

- `-o (no_)cache_files`:
  By default, files in the mounted file system will be cached by
  the kernel (i.e. the default is `-o cache_files`). This will
  significantly improve performance when accessing the same files

@@ -103,14 +104,14 @@
though it's likely that the kernel will already do the right thing
even when the cache is enabled.
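
For example, a mount that combines several of these options, using a larger
cache, more decompression workers and metadata locked into memory, might
look like this (the values are purely illustrative):

```
dwarfs image.dwarfs /path/to/mountpoint -o cachesize=1g,workers=8,mlock=try
```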

- `-o debuglevel=`*name*:
  Use this for different levels of verbosity along with either
  the `-f` or `-d` FUSE options. This can give you some insight
  over what the file system driver is doing internally, but it's
  mainly meant for debugging and the `debug` and `trace` levels
  in particular will slow down the driver.

- `-o tidy_strategy=`*name*:
  Use one of the following strategies to tidy the block cache:

  - `none`:

@@ -128,14 +129,14 @@
    cache is traversed and all blocks that have been fully or
    partially swapped out by the kernel will be removed.

- `-o tidy_interval=`*time*:
  Used only if `tidy_strategy` is not `none`. This is the interval
  at which the cache tidying thread wakes up to look for blocks
  that can be removed from the cache. This must be an integer value.
  Suffixes `ms`, `s`, `m`, `h` are supported. If no suffix is given,
  the value will be assumed to be in seconds.

- `-o tidy_max_age=`*time*:
  Used only if `tidy_strategy` is `time`. A block will be removed
  from the cache if it hasn't been used for this time span. This must
  be an integer value. Suffixes `ms`, `s`, `m`, `h` are supported.
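
For example, to let the driver evict blocks that haven't been used
recently, you could combine the tidy options like this (the values are
purely illustrative):

```
dwarfs image.dwarfs /path/to/mountpoint -o tidy_strategy=time,tidy_interval=30s,tidy_max_age=10m
```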

@@ -145,7 +146,7 @@ There are two particular FUSE options that you'll likely need at some
point, e.g. when trying to set up an `overlayfs` mount on top of
a DwarFS image:

- `-o allow_root` and `-o allow_other`:
  These will ensure that the mounted file system can be read by
  either `root` or any other user in addition to the user that
  started the fuse driver. So if you're running `dwarfs` as a

@@ -193,27 +194,33 @@ set of Perl versions back.
Here's what you need to do:

- Create a set of directories. In my case, these are all located
  in `/tmp/perl` as this was the original install location.

  ```
  cd /tmp/perl
  mkdir install-ro
  mkdir install-rw
  mkdir install-work
  mkdir install
  ```

- Mount the DwarFS image. `-o allow_root` is needed to make sure
  `overlayfs` has access to the mounted file system. In order
  to use `-o allow_root`, you may have to uncomment or add
  `user_allow_other` in `/etc/fuse.conf`.

  ```
  dwarfs perl-install.dwarfs install-ro -o allow_root
  ```

- Now set up `overlayfs`.

  ```
  sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
  ```

- That's it. You should now be able to access a writeable version
  of your DwarFS image in `install`.

You can go even further than that. Say you have different sets of

@@ -223,7 +230,9 @@ the read-write directory after unmounting the `overlayfs`, and
selectively add this by passing a colon-separated list to the
`lowerdir` option when setting up the `overlayfs` mount:

```
sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
```

If you want *this* merged overlay to be writable, just add in the
`upperdir` and `workdir` options from before again.

@@ -1,5 +1,4 @@
# dwarfsck(1) -- check DwarFS image

## SYNOPSIS

@@ -15,42 +14,42 @@ with a non-zero exit code.

## OPTIONS

- `-i`, `--input=`*file*:
  Path to the filesystem image.

- `-d`, `--detail=`*value*:
  Level of filesystem information detail. The default is 2. Higher values
  mean more output. Values larger than 6 will currently not provide any
  further detail.

- `-O`, `--image-offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in the image.
  Use `auto` to detect the offset automatically. This is also the default.
  This is only useful for images that have some header located before the
  actual filesystem data.

- `-H`, `--print-header`:
  Print the header located before the filesystem image to stdout. If no
  header is present, the program will exit with a non-zero exit code.

- `-n`, `--num-workers=`*value*:
  Number of worker threads used for integrity checking.

- `--check-integrity`:
  In addition to performing a fast checksum check, also perform a (much
  slower) verification of the embedded SHA-512/256 hashes.

- `--json`:
  Print a simple JSON representation of the filesystem metadata. Please
  note that the format is *not* stable.

- `--export-metadata=`*file*:
  Export all filesystem metadata in JSON format.

- `--log-level=`*name*:
  Specify a logging level.

- `--help`:
  Show program help, including option defaults.
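
For example, to get verbose detail and also verify the embedded hashes of
an image (the file name is illustrative):

```
dwarfsck -i image.dwarfs -d 4 --check-integrity
```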

## AUTHOR

@@ -1,9 +1,8 @@
# dwarfsextract(1) -- extract DwarFS image

## SYNOPSIS

`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]
`dwarfsextract` `-i` *image* `-f` *format* [`-o` *file*] [*options*...]

@@ -35,32 +34,32 @@ to disk:

## OPTIONS

- `-i`, `--input=`*file*:
  Path to the source filesystem.

- `-o`, `--output=`*directory*|*file*:
  If no format is specified, this is the directory to which the contents
  of the filesystem should be extracted. If a format is specified, this
  is the name of the output archive. This option can be omitted, in which
  case the default is to extract the files to the current directory, or
  to write the archive data to stdout.

- `-O`, `--image-offset=`*value*|`auto`:
  Specify the byte offset at which the filesystem is located in the image.
  Use `auto` to detect the offset automatically. This is also the default.
  This is only useful for images that have some header located before the
  actual filesystem data.

- `-f`, `--format=`*format*:
  The archive format to produce. If this is left empty or unspecified,
  files will be extracted to the output directory (or the current directory
  if no output directory is specified). For a full list of supported formats,
  see libarchive-formats(5).

- `-n`, `--num-workers=`*value*:
  Number of worker threads used for extracting the filesystem.

- `-s`, `--cache-size=`*value*:
  Size of the block cache, in bytes. You can append suffixes (`k`, `m`, `g`)
  to specify the size in KiB, MiB and GiB, respectively. Note that this is
  not the upper memory limit of the process, as there may be blocks in

@@ -68,10 +67,10 @@ to disk:
fully decompressed yet will carry decompressor state along with it, which
can use a significant amount of additional memory.

- `--log-level=`*name*:
  Specify a logging level.

- `--help`:
  Show program help, including option defaults.
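
For example, assuming your libarchive build supports the `gnutar` format
listed in libarchive-formats(5), you could turn an image into a tarball
like this (file names are illustrative):

```
dwarfsextract -i image.dwarfs -f gnutar -o image.tar
```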

## AUTHOR

@@ -1,9 +1,8 @@
# mkdwarfs(1) -- create highly compressed read-only file systems

## SYNOPSIS

`mkdwarfs` `-i` *path* `-o` *file* [*options*...]
`mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...]

@@ -26,17 +25,17 @@ After that, you can mount it with dwarfs(1):
There are two mandatory options for specifying the input and output
(see the example below):

- `-i`, `--input=`*path*|*file*:
  Path to the root directory containing the files from which you want to
  build a filesystem. If the `--recompress` option is given, this argument
  is the source filesystem.

- `-o`, `--output=`*file*:
  File name of the output filesystem.
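
A minimal invocation needs nothing more than these two options (the paths
are illustrative):

```
mkdwarfs -i /path/to/tree -o tree.dwarfs
```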

Most other options are concerned with compression tuning:

- `-l`, `--compress-level=`*value*:
  Compression level to use for the filesystem. **If you are unsure, please
  stick to the default level of 7.** This is intended to provide some
  sensible defaults and will depend on which compression libraries were

@@ -53,7 +52,7 @@
`--window-step` and `--order`. See the output of `mkdwarfs --help` for
a table listing the exact defaults used for each compression level.

- `-S`, `--block-size-bits=`*value*:
  The block size used for the compressed filesystem. The actual block size
  is two to the power of this value. Larger block sizes will offer better
  overall compression ratios, but will be slower and consume more memory

@@ -61,7 +60,7 @@
least partially decompressed into memory. Values between 20 and 26, i.e.
between 1MiB and 64MiB, usually work quite well.
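
Since the value is a base-2 exponent, passing `-S 24`, for example,
selects 2^24 bytes, i.e. 16 MiB blocks (paths are illustrative):

```
mkdwarfs -i /path/to/tree -o tree.dwarfs -S 24
```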

- `-N`, `--num-workers=`*value*:
  Number of worker threads used for building the filesystem. This defaults
  to the number of processors available on your system. Use this option if
  you want to limit the resources used by `mkdwarfs`.

@@ -75,7 +74,7 @@
individual filesystem blocks in the background. Ordering, segmenting
and block building are, again, single-threaded and run independently.

- `-B`, `--max-lookback-blocks=`*value*:
  Specify how many of the most recent blocks to scan for duplicate segments.
  By default, only the current block will be scanned. The larger this number,
  the more duplicate segments will likely be found, which may further improve

@@ -84,7 +83,7 @@
files can now potentially span multiple filesystem blocks. Passing `-B0`
will completely disable duplicate segment search.

- `-W`, `--window-size=`*value*:
  Window size of cyclic hash used for segmenting. This is again an exponent
  to a base of two. Cyclic hashes are used by `mkdwarfs` for finding
  identical segments across multiple files. This is done on top of duplicate

@@ -101,7 +100,7 @@
size will grow. Passing `-W0` will completely disable duplicate segment
search.

- `-w`, `--window-step=`*value*:
  This option specifies how often cyclic hash values are stored for lookup.
  It is specified relative to the window size, as a base-2 exponent that
  divides the window size. To give a concrete example, if `--window-size=16`

@@ -114,7 +113,7 @@
If you use a larger value for this option, the increments become *smaller*,
and `mkdwarfs` will be slightly slower and use more memory.
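
As a purely illustrative sketch of how these segmenting options combine,
the following would scan the four most recent blocks, use a 4 KiB (2^12)
hash window and store hash values at half-window steps:

```
mkdwarfs -i /path/to/tree -o tree.dwarfs -B4 -W12 -w1
```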

- `--bloom-filter-size`=*value*:
  The segmenting algorithm uses a bloom filter to determine quickly if
  there is *no* match at a given position. This will filter out more than
  90% of bad matches quickly with the default bloom filter size. The default

@@ -123,7 +122,7 @@
be able to see some improvement. If you're tight on memory, then decreasing
this will potentially save a few MiBs.

- `-L`, `--memory-limit=`*value*:
  Approximately how much memory you want `mkdwarfs` to use during filesystem
  creation. Note that currently this will only affect the block manager
  component, i.e. the number of filesystem blocks that are in flight but

@@ -134,24 +133,24 @@
algorithms, so if you're short on memory it might be worth tweaking the
compression options.

- `-C`, `--compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for file system data.
  The value for this option is a colon-separated list. The first item is
  the compression algorithm, the remaining items are its options. Options
  can be either boolean or have a value. For details on which algorithms
  and options are available, see the output of `mkdwarfs --help`. `zstd`
  will give you the best compression while still keeping decompression
  *very* fast. `lzma` will compress even better, but decompression will
  be around ten times slower.
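
For example, to trade decompression speed for a smaller image, you might
pick LZMA for the block data (the `level` option name is an assumption
here; check `mkdwarfs --help` for the options your build actually supports):

```
mkdwarfs -i /path/to/tree -o tree.dwarfs -C lzma:level=9
```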

- `--schema-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for the metadata schema.
  Takes the same arguments as `--compression` above. The schema is *very*
  small, in the hundreds of bytes, so this is only relevant for extremely
  small file systems. The default (`zstd`) has been shown to give considerably
  better results than any other algorithm.

- `--metadata-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
  The compression algorithm and configuration used for the metadata.
  Takes the same arguments as `--compression` above. The metadata has been
  optimized for very little redundancy and leaving it uncompressed, the

@@ -161,7 +160,7 @@
care about mount time, you can safely choose `lzma` compression here, as
the data will only have to be decompressed once when mounting the image.

- `--recompress`[`=all`|`=block`|`=metadata`|`=none`]:
  Take an existing DwarFS file system and recompress it using different
  compression algorithms. If no argument or `all` is given, all sections
  in the file system image will be recompressed. Note that *only* the

@@ -177,7 +176,7 @@
metadata to uncompressed metadata without having to rebuild or recompress
all the other data.
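
As a sketch, recompressing an existing image with different block
compression could look like this (file names and the `level` option name
are illustrative; see `mkdwarfs --help`):

```
mkdwarfs -i old.dwarfs -o new.dwarfs --recompress -C zstd:level=19
```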

- `-P`, `--pack-metadata=auto`|`none`|[`all`|`chunk_table`|`directories`|`shared_files`|`names`|`names_index`|`symlinks`|`symlinks_index`|`force`|`plain`[`,`...]]:
  Which metadata information to store in packed format. This is primarily
  useful when storing metadata uncompressed, as it allows for smaller
  metadata block size without having to turn on compression. Keep in mind,

@@ -189,34 +188,34 @@
systems that contain hundreds of thousands of files.
See [Metadata Packing](#metadata-packing) for more details.

- `--set-owner=`*uid*:
  Set the owner for all entities in the file system. This can reduce the
  size of the file system. If the input only has a single owner already,
  setting this won't make any difference.

- `--set-group=`*gid*:
  Set the group for all entities in the file system. This can reduce the
  size of the file system. If the input only has a single group already,
  setting this won't make any difference.

- `--set-time=`*time*|`now`:
  Set the time stamps for all entities to this value. This can significantly
  reduce the size of the file system. You can pass either a unix time stamp
  or `now`.
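
For example, to normalize ownership and time stamps in one go (the values
are illustrative):

```
mkdwarfs -i /path/to/tree -o tree.dwarfs --set-owner=0 --set-group=0 --set-time=now
```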

- `--keep-all-times`:
  As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
  the `mtime` field in order to save metadata space. If you want to save
  `atime` and `ctime` as well, use this option.

- `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:
  Specify the resolution with which time stamps are stored. By default,
  time stamps are stored with second resolution. You can specify "odd"
  resolutions as well, e.g. something like 15 second resolution is
  entirely possible. Moving from second to minute resolution, for example,
  will save roughly 6 bits per file system entry in the metadata block.

- `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:
  The order in which inodes will be written to the file system. If you
  choose `none`, the inodes will be stored in the order in which they are
  discovered. With `path`, they will be sorted asciibetically by path name
  of the first file

@@ -243,35 +242,35 @@
Last but not least, if scripting support is built into `mkdwarfs`, you can
choose `script` to let the script determine the order.

- `--remove-empty-dirs`:
  Removes all empty directories from the output file system, recursively.
  This is particularly useful when using scripts that filter out a lot of
  file system entries.

- `--with-devices`:
  Include character and block devices in the output file system. These are
  not included by default, and due to security measures in FUSE, they will
  never work in the mounted file system. However, they can still be copied
  out of the mounted file system, for example using `rsync`.

- `--with-specials`:
  Include named fifos and sockets in the output file system. These are not
  included by default.

- `--header=`*file*:
  Read header from file and place it before the output filesystem image.
  Can be used with `--recompress` to add or replace a header.

- `--remove-header`:
  Remove header from a filesystem image. Only useful with `--recompress`.
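
As a sketch, you could use these options to prepend a header to an existing
image without recompressing it, and then mount it using automatic offset
detection as described in dwarfs(1) (file names are illustrative):

```
mkdwarfs -i image.dwarfs -o image-hdr.dwarfs --recompress=none --header=header.bin
dwarfs image-hdr.dwarfs /path/to/mountpoint -o offset=auto
```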

- `--log-level=`*name*:
  Specify a logging level.

- `--no-progress`:
  Don't show progress output while building the filesystem.

- `--progress=none`|`simple`|`ascii`|`unicode`:
  Choosing `none` is equivalent to specifying `--no-progress`. `simple`
  will print a single line of progress information whenever the progress
  has significantly changed, but at most once every 2 seconds. This is

@@ -281,14 +280,14 @@
you can switch to `ascii`, which is like `unicode`, but looks less
fancy.

- `--help`:
  Show program help, including defaults, compression level detail and
  supported compression algorithms.

If experimental Python support was compiled into `mkdwarfs`, you can use the
following option to enable customizations via the scripting interface:

- `--script=`*file*[`:`*class*[`(`arguments`...)`]]:
  Specify the Python script to load. The class name is optional if there's
  a class named `mkdwarfs` in the script. It is also possible to pass
  arguments to the constructor.

@@ -342,28 +341,28 @@ However, there are several options to choose from that allow you to
further reduce metadata size without having to compress the metadata.
These options are controlled by the `--pack-metadata` option.

- `auto`:
  This is the default. It will enable both `names` and `symlinks`.

- `none`:
  Don't enable any packing. However, string tables (i.e. names and
  symlinks) will still be stored in "compact" rather than "plain"
  format. In order to force storage in plain format, use `plain`.

- `all`:
  Enable all packing options. This does *not* force packing of
  string tables (i.e. names and symlinks) if the packing would
  actually increase the size, which can happen if the string tables
  are actually small. In order to force string table packing, use
  `all,force`.

- `chunk_table`:
  Delta-compress chunk tables. This can reduce the size of the
  chunk tables for large file systems and help compression, however,
  it will likely require a lot of memory when unpacking the tables
  again. Only use this if you know what you're doing.

- `directories`:
  Pack directories table by storing first entry pointers delta-
  compressed and completely removing parent directory pointers.
  The parent directory pointers can be rebuilt by tree traversal

@@ -372,12 +371,12 @@
will likely require a lot of memory when unpacking the tables
again. Only use this if you know what you're doing.

- `shared_files`:
  Pack shared files table. This is only useful if the filesystem
  contains lots of non-hardlinked duplicates. It gets more efficient
  the more copies of a file are in the filesystem.

- `names`,`symlinks`:
  Compress the names and symlink targets using the
  [fsst](https://github.com/cwida/fsst) compression scheme. This
  compresses each individual entry separately using a small,

@@ -392,17 +391,17 @@
than the uncompressed strings. If this is the case, the strings
will be stored uncompressed, unless `force` is also specified.

- `names_index`,`symlinks_index`:
  Delta-compress the names and symlink targets indices. The same
  caveats apply as for `chunk_table`.

- `force`:
  Forces the compression of the `names` and `symlinks` tables,
  even if that would make them use more memory than the
  uncompressed tables. This is really only useful for testing
  and development.

- `plain`:
  Store string tables in "plain" format. The plain format uses
  Frozen thrift arrays and was used in earlier metadata versions.
  It is useful for debugging, but wastes up to one byte per string.

@@ -430,7 +429,6 @@
to reduce the image size, enabling `all` packing would be an option
at the cost of using a lot more memory when using the filesystem.
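
For example, building an image with all metadata packing options enabled
is simply (paths are illustrative):

```
mkdwarfs -i /path/to/tree -o tree.dwarfs --pack-metadata=all
```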

## INTERNAL OPERATION

Internally, `mkdwarfs` runs in two completely separate phases. The first