mirror of
https://github.com/mhx/dwarfs.git
synced 2025-08-04 02:06:22 -04:00
Markdown cleanup
parent
9ad4dd655f
commit
569966b752
191
README.md
@ -6,22 +6,22 @@ A fast high compression read-only file system
|
|||||||
|
|
||||||
## Table of contents
|
## Table of contents
|
||||||
|
|
||||||
* [Overview](#overview)
|
- [Overview](#overview)
|
||||||
* [History](#history)
|
- [History](#history)
|
||||||
* [Building and Installing](#building-and-installing)
|
- [Building and Installing](#building-and-installing)
|
||||||
* [Dependencies](#dependencies)
|
- [Dependencies](#dependencies)
|
||||||
* [Building](#building)
|
- [Building](#building)
|
||||||
* [Installing](#installing)
|
- [Installing](#installing)
|
||||||
* [Experimental Python Scripting Support](#experimental-python-scripting-support)
|
- [Experimental Python Scripting Support](#experimental-python-scripting-support)
|
||||||
* [Usage](#usage)
|
- [Usage](#usage)
|
||||||
* [Comparison](#comparison)
|
- [Comparison](#comparison)
|
||||||
* [With SquashFS](#with-squashfs)
|
- [With SquashFS](#with-squashfs)
|
||||||
* [With SquashFS & xz](#with-squashfs--xz)
|
- [With SquashFS & xz](#with-squashfs--xz)
|
||||||
* [With lrzip](#with-lrzip)
|
- [With lrzip](#with-lrzip)
|
||||||
* [With zpaq](#with-zpaq)
|
- [With zpaq](#with-zpaq)
|
||||||
* [With wimlib](#with-wimlib)
|
- [With wimlib](#with-wimlib)
|
||||||
* [With Cromfs](#with-cromfs)
|
- [With Cromfs](#with-cromfs)
|
||||||
* [With EROFS](#with-erofs)
|
- [With EROFS](#with-erofs)
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
@ -45,20 +45,20 @@ less CPU resources.
|
|||||||
|
|
||||||
Distinct features of DwarFS are:
|
Distinct features of DwarFS are:
|
||||||
|
|
||||||
* Clustering of files by similarity using a similarity hash function.
|
- Clustering of files by similarity using a similarity hash function.
|
||||||
This makes it easier to exploit the redundancy across file boundaries.
|
This makes it easier to exploit the redundancy across file boundaries.
|
||||||
|
|
||||||
* Segmentation analysis across file system blocks in order to reduce
|
- Segmentation analysis across file system blocks in order to reduce
|
||||||
the size of the uncompressed file system. This saves memory when
|
the size of the uncompressed file system. This saves memory when
|
||||||
using the compressed file system and thus potentially allows for
|
using the compressed file system and thus potentially allows for
|
||||||
higher cache hit rates as more data can be kept in the cache.
|
higher cache hit rates as more data can be kept in the cache.
|
||||||
|
|
||||||
* Highly multi-threaded implementation. Both the file
|
- Highly multi-threaded implementation. Both the file
|
||||||
[system creation tool](doc/mkdwarfs.md) as well as the
|
[system creation tool](doc/mkdwarfs.md) as well as the
|
||||||
[FUSE driver](doc/dwarfs.md) are able to make good use of the
|
[FUSE driver](doc/dwarfs.md) are able to make good use of the
|
||||||
many cores of your system.
|
many cores of your system.
|
||||||
|
|
||||||
* Optional experimental Python scripting support to provide custom
|
- Optional experimental Python scripting support to provide custom
|
||||||
filtering and ordering functionality.
|
filtering and ordering functionality.
|
||||||
|
|
||||||
## History
|
## History
|
||||||
@ -129,6 +129,7 @@ will be automatically resolved if you build with tests.
|
|||||||
|
|
||||||
A good starting point for apt-based systems is probably:
|
A good starting point for apt-based systems is probably:
|
||||||
|
|
||||||
|
```
|
||||||
$ apt install \
|
$ apt install \
|
||||||
g++ \
|
g++ \
|
||||||
clang \
|
clang \
|
||||||
@ -161,6 +162,7 @@ A good starting point for apt-based systems is probably:
|
|||||||
libfmt-dev \
|
libfmt-dev \
|
||||||
libfuse3-dev \
|
libfuse3-dev \
|
||||||
libgoogle-glog-dev
|
libgoogle-glog-dev
|
||||||
|
```
|
||||||
|
|
||||||
Note that when building with `gcc`, the optimization level will be
|
Note that when building with `gcc`, the optimization level will be
|
||||||
set to `-O2` instead of the CMake default of `-O3` for release
|
set to `-O2` instead of the CMake default of `-O3` for release
|
||||||
@ -168,30 +170,37 @@ builds. At least with versions up to `gcc-10`, the `-O3` build is
|
|||||||
[up to 70% slower](https://github.com/mhx/dwarfs/issues/14) than a
|
[up to 70% slower](https://github.com/mhx/dwarfs/issues/14) than a
|
||||||
build with `-O2`.
|
build with `-O2`.
|
||||||
|
|
||||||
|
|
||||||
### Building
|
### Building
|
||||||
|
|
||||||
Firstly, either clone the repository...
|
Firstly, either clone the repository...
|
||||||
|
|
||||||
|
```
|
||||||
$ git clone --recurse-submodules https://github.com/mhx/dwarfs
|
$ git clone --recurse-submodules https://github.com/mhx/dwarfs
|
||||||
$ cd dwarfs
|
$ cd dwarfs
|
||||||
|
```
|
||||||
|
|
||||||
...or unpack the release archive:
|
...or unpack the release archive:
|
||||||
|
|
||||||
|
```
|
||||||
$ tar xvf dwarfs-x.y.z.tar.bz2
|
$ tar xvf dwarfs-x.y.z.tar.bz2
|
||||||
$ cd dwarfs-x.y.z
|
$ cd dwarfs-x.y.z
|
||||||
|
```
|
||||||
|
|
||||||
Once all dependencies have been installed, you can build DwarFS
|
Once all dependencies have been installed, you can build DwarFS
|
||||||
using:
|
using:
|
||||||
|
|
||||||
|
```
|
||||||
$ mkdir build
|
$ mkdir build
|
||||||
$ cd build
|
$ cd build
|
||||||
$ cmake .. -DWITH_TESTS=1
|
$ cmake .. -DWITH_TESTS=1
|
||||||
$ make -j$(nproc)
|
$ make -j$(nproc)
|
||||||
|
```
|
||||||
|
|
||||||
You can then run tests with:
|
You can then run tests with:
|
||||||
|
|
||||||
|
```
|
||||||
$ make test
|
$ make test
|
||||||
|
```
|
||||||
|
|
||||||
All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
|
All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
|
||||||
as a memory allocator by default, as it typically uses much less
|
as a memory allocator by default, as it typically uses much less
|
||||||
@ -203,7 +212,9 @@ To disable the use of `jemalloc`, pass `-DUSE_JEMALLOC=0` on the
|
|||||||
|
|
||||||
Installing is as easy as:
|
Installing is as easy as:
|
||||||
|
|
||||||
|
```
|
||||||
$ sudo make install
|
$ sudo make install
|
||||||
|
```
|
||||||
|
|
||||||
Though you don't have to install the tools to play with them.
|
Though you don't have to install the tools to play with them.
|
||||||
|
|
||||||
@ -212,13 +223,17 @@ Though you don't have to install the tools to play with them.
|
|||||||
You can build `mkdwarfs` with experimental support for Python
|
You can build `mkdwarfs` with experimental support for Python
|
||||||
scripting:
|
scripting:
|
||||||
|
|
||||||
|
```
|
||||||
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1
|
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1
|
||||||
|
```
|
||||||
|
|
||||||
This also requires Boost.Python. If you have multiple Python
|
This also requires Boost.Python. If you have multiple Python
|
||||||
versions installed, you can explicitly specify the version to
|
versions installed, you can explicitly specify the version to
|
||||||
build against:
|
build against:
|
||||||
|
|
||||||
|
```
|
||||||
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1 -DWITH_PYTHON_VERSION=3.8
|
$ cmake .. -DWITH_TESTS=1 -DWITH_PYTHON=1 -DWITH_PYTHON_VERSION=3.8
|
||||||
|
```
|
||||||
|
|
||||||
Note that only Python 3 is supported. You can take a look at
|
Note that only Python 3 is supported. You can take a look at
|
||||||
[scripts/example.py](scripts/example.py) to get an idea for
|
[scripts/example.py](scripts/example.py) to get an idea for
|
||||||
@ -259,6 +274,7 @@ NVME drive, so most of its contents were likely cached.
|
|||||||
I'm using the same compression type and compression level for
|
I'm using the same compression type and compression level for
|
||||||
SquashFS that is the default setting for DwarFS:
|
SquashFS that is the default setting for DwarFS:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
|
$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
|
||||||
Parallel mksquashfs: Using 16 processors
|
Parallel mksquashfs: Using 16 processors
|
||||||
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
|
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
|
||||||
@ -292,9 +308,11 @@ SquashFS that is the default setting for DwarFS:
|
|||||||
real 32m54.713s
|
real 32m54.713s
|
||||||
user 501m46.382s
|
user 501m46.382s
|
||||||
sys 0m58.528s
|
sys 0m58.528s
|
||||||
|
```
|
||||||
|
|
||||||
For DwarFS, I'm sticking to the defaults:
|
For DwarFS, I'm sticking to the defaults:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install -o perl-install.dwarfs
|
$ time mkdwarfs -i install -o perl-install.dwarfs
|
||||||
I 11:33:33.310931 scanning install
|
I 11:33:33.310931 scanning install
|
||||||
I 11:33:39.026712 waiting for background scanners...
|
I 11:33:39.026712 waiting for background scanners...
|
||||||
@ -333,13 +351,16 @@ For DwarFS, I'm sticking to the defaults:
|
|||||||
real 5m23.030s
|
real 5m23.030s
|
||||||
user 78m7.554s
|
user 78m7.554s
|
||||||
sys 1m47.968s
|
sys 1m47.968s
|
||||||
|
```
|
||||||
|
|
||||||
So in this comparison, `mkdwarfs` is **more than 6 times faster** than `mksquashfs`,
|
So in this comparison, `mkdwarfs` is **more than 6 times faster** than `mksquashfs`,
|
||||||
both in terms of CPU time and wall clock time.
|
both in terms of CPU time and wall clock time.
|
||||||
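As a quick sanity check of the "more than 6 times faster" figure, the `time` values printed by the two runs above can be converted to seconds and divided. This is only a sketch: the helper names `to_seconds` and `cpu_seconds` are made up for illustration, and the numbers are copied by hand from the output shown above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '32m54.713s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_seconds(run: dict) -> float:
    """CPU time is the sum of user and sys time."""
    return to_seconds(run["user"]) + to_seconds(run["sys"])


# Values copied from the mksquashfs and mkdwarfs runs above.
squashfs = {"real": "32m54.713s", "user": "501m46.382s", "sys": "0m58.528s"}
dwarfs = {"real": "5m23.030s", "user": "78m7.554s", "sys": "1m47.968s"}

wall_speedup = to_seconds(squashfs["real"]) / to_seconds(dwarfs["real"])
cpu_speedup = cpu_seconds(squashfs) / cpu_seconds(dwarfs)

print(f"wall clock: {wall_speedup:.2f}x, CPU time: {cpu_speedup:.2f}x")
```

Both ratios come out slightly above 6, which matches the claim for wall clock time as well as CPU time.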
|
|
||||||
|
```
|
||||||
$ ll perl-install.*fs
|
$ ll perl-install.*fs
|
||||||
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
|
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
|
||||||
|
```
|
||||||
|
|
||||||
In terms of compression ratio, the **DwarFS file system is more than 10 times
|
In terms of compression ratio, the **DwarFS file system is more than 10 times
|
||||||
smaller than the SquashFS file system**. With DwarFS, the content has been
|
smaller than the SquashFS file system**. With DwarFS, the content has been
|
||||||
@ -351,21 +372,27 @@ the original space**.
|
|||||||
|
|
||||||
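The "more than 10 times smaller" figure can be reproduced from the byte counts in the `ll` listing above. The snippet below is a minimal sketch; the variable names are illustrative and the sizes are copied by hand from that listing.

```python
# Image sizes in bytes, copied from the `ll perl-install.*fs` listing above.
squashfs_bytes = 4_748_902_400
dwarfs_bytes = 447_230_618

# How many times larger the SquashFS image is than the DwarFS image.
size_ratio = squashfs_bytes / dwarfs_bytes

# The DwarFS image size expressed as a percentage of the SquashFS image size.
relative_size_percent = 100 * dwarfs_bytes / squashfs_bytes

print(f"SquashFS / DwarFS: {size_ratio:.1f}x")
print(f"DwarFS is {relative_size_percent:.1f}% of the SquashFS image size")
```

With these numbers the ratio is a little over 10.6, so the DwarFS image is under a tenth of the size of the SquashFS image.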
Here's another comparison using `lzma` compression instead of `zstd`:
|
Here's another comparison using `lzma` compression instead of `zstd`:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mksquashfs install perl-install-lzma.squashfs -comp lzma
|
$ time mksquashfs install perl-install-lzma.squashfs -comp lzma
|
||||||
|
|
||||||
real 13m42.825s
|
real 13m42.825s
|
||||||
user 205m40.851s
|
user 205m40.851s
|
||||||
sys 3m29.088s
|
sys 3m29.088s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9
|
$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9
|
||||||
|
|
||||||
real 3m43.937s
|
real 3m43.937s
|
||||||
user 49m45.295s
|
user 49m45.295s
|
||||||
sys 1m44.550s
|
sys 1m44.550s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install-lzma.*fs
|
$ ll perl-install-lzma.*fs
|
||||||
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 3838406656 Mar 3 20:50 perl-install-lzma.squashfs
|
-rw-r--r-- 1 mhx users 3838406656 Mar 3 20:50 perl-install-lzma.squashfs
|
||||||
|
```
|
||||||
|
|
||||||
It's immediately obvious that the runs are significantly faster and the
|
It's immediately obvious that the runs are significantly faster and the
|
||||||
resulting images are significantly smaller. Still, `mkdwarfs` is about
|
resulting images are significantly smaller. Still, `mkdwarfs` is about
|
||||||
@ -383,21 +410,27 @@ uses a block size of 128KiB, whereas `mkdwarfs` uses 16MiB blocks by default,
|
|||||||
or even 64MiB blocks with `-l9`. When using identical block sizes for both
|
or even 64MiB blocks with `-l9`. When using identical block sizes for both
|
||||||
file systems, the difference, quite expectedly, becomes a lot less dramatic:
|
file systems, the difference, quite expectedly, becomes a lot less dramatic:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M
|
$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M
|
||||||
|
|
||||||
real 15m43.319s
|
real 15m43.319s
|
||||||
user 139m24.533s
|
user 139m24.533s
|
||||||
sys 0m45.132s
|
sys 0m45.132s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3
|
$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3
|
||||||
|
|
||||||
real 4m25.973s
|
real 4m25.973s
|
||||||
user 52m15.100s
|
user 52m15.100s
|
||||||
sys 7m41.889s
|
sys 7m41.889s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install*.*fs
|
$ ll perl-install*.*fs
|
||||||
-rw-r--r-- 1 mhx users 935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
|
-rw-r--r-- 1 mhx users 935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 3407474688 Mar 3 21:54 perl-install-lzma-1M.squashfs
|
-rw-r--r-- 1 mhx users 3407474688 Mar 3 21:54 perl-install-lzma-1M.squashfs
|
||||||
|
```
|
||||||
|
|
||||||
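To see why `-S20` on the `mkdwarfs` side corresponds to `-b 1M` on the `mksquashfs` side, it helps to spell out the block sizes. The sketch below assumes that `-S` takes the base-2 logarithm of the block size in bytes; under that assumption the 16MiB and 64MiB block sizes mentioned above would correspond to 24 and 26 bits. The helper name `block_size_bytes` is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def block_size_bytes(block_size_bits: int) -> int:
    """Block size in bytes, assuming the option value is log2 of the size."""
    return 1 << block_size_bits


# 131072 is the block size reported by mksquashfs in its output above.
print(f"mksquashfs default: {131072 // KIB} KiB")

# Assumed mapping of bit counts to the block sizes named in the text.
for bits in (20, 24, 26):
    print(f"-S{bits}: {block_size_bytes(bits) // MIB} MiB")
```

Under that assumption, `-S20` gives 1 MiB blocks, the same block size that `-b 1M` requests from `mksquashfs`.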
Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
|
Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
|
||||||
DwarFS to reference file chunks from up to two previous filesystem blocks.
|
DwarFS to reference file chunks from up to two previous filesystem blocks.
|
||||||
@ -413,6 +446,7 @@ fast experimentation with different algorithms and options without requiring
|
|||||||
a full rebuild of the file system. For example, recompressing the above file
|
a full rebuild of the file system. For example, recompressing the above file
|
||||||
system with the best possible compression (`-l 9`):
|
system with the best possible compression (`-l 9`):
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
|
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
|
||||||
I 20:28:03.246534 filesystem rewrittenwithout errors [148.3s]
|
I 20:28:03.246534 filesystem rewrittenwithout errors [148.3s]
|
||||||
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
|
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
|
||||||
@ -423,11 +457,14 @@ system with the best possible compression (`-l 9`):
|
|||||||
real 2m28.279s
|
real 2m28.279s
|
||||||
user 37m8.825s
|
user 37m8.825s
|
||||||
sys 0m43.256s
|
sys 0m43.256s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-*.dwarfs
|
$ ll perl-*.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 390845518 Mar 4 20:28 perl-lzma-re.dwarfs
|
-rw-r--r-- 1 mhx users 390845518 Mar 4 20:28 perl-lzma-re.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
||||||
|
```
|
||||||
|
|
||||||
Note that while the recompressed filesystem is smaller than the original image,
|
Note that while the recompressed filesystem is smaller than the original image,
|
||||||
it is still a lot bigger than the filesystem we previously built with `-l9`.
|
it is still a lot bigger than the filesystem we previously built with `-l9`.
|
||||||
@ -438,6 +475,7 @@ In terms of how fast the file system is when using it, a quick test
|
|||||||
I've done is to freshly mount the filesystem created above and run
|
I've done is to freshly mount the filesystem created above and run
|
||||||
each of the 1139 `perl` executables to print their version.
|
each of the 1139 `perl` executables to print their version.
|
||||||
|
|
||||||
|
```
|
||||||
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
||||||
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
|
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
|
||||||
Time (mean ± σ): 1.810 s ± 0.013 s [User: 1.847 s, System: 0.623 s]
|
Time (mean ± σ): 1.810 s ± 0.013 s [User: 1.847 s, System: 0.623 s]
|
||||||
@ -454,6 +492,7 @@ each of the 1139 `perl` executables to print their version.
|
|||||||
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
|
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
|
||||||
Time (mean ± σ): 1.149 s ± 0.015 s [User: 2.128 s, System: 0.781 s]
|
Time (mean ± σ): 1.149 s ± 0.015 s [User: 2.128 s, System: 0.781 s]
|
||||||
Range (min … max): 1.136 s … 1.186 s 10 runs
|
Range (min … max): 1.136 s … 1.186 s 10 runs
|
||||||
|
```
|
||||||
|
|
||||||
These timings are for *initial* runs on a freshly mounted file system,
|
These timings are for *initial* runs on a freshly mounted file system,
|
||||||
running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that
|
running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that
|
||||||
@ -462,6 +501,7 @@ it takes only about 1 millisecond per Perl binary.
|
|||||||
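The "about 1 millisecond per Perl binary" estimate is simply the mean wall clock time of the 20-process run divided by the number of executables. A minimal sketch, using the figures quoted above (the variable names are illustrative):

```python
# Mean wall clock time of the run with 20 parallel processes, in seconds.
mean_runtime_s = 1.149

# Number of `perl` executables that were started during the run.
executables = 1139

# Average wall clock time spent per executable, in milliseconds.
per_binary_ms = 1000 * mean_runtime_s / executables

print(f"{per_binary_ms:.2f} ms per executable")
```

Note that this is an average over the whole parallel run, not the latency of starting a single binary.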
Following are timings for *subsequent* runs, both on DwarFS (at `mnt`)
|
Following are timings for *subsequent* runs, both on DwarFS (at `mnt`)
|
||||||
and the original XFS (at `install`). DwarFS is around 15% slower here:
|
and the original XFS (at `install`). DwarFS is around 15% slower here:
|
||||||
|
|
||||||
|
```
|
||||||
$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
||||||
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
|
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
|
||||||
Time (mean ± σ): 347.0 ms ± 7.2 ms [User: 1.755 s, System: 0.452 s]
|
Time (mean ± σ): 347.0 ms ± 7.2 ms [User: 1.755 s, System: 0.452 s]
|
||||||
@ -484,10 +524,12 @@ and the original XFS (at `install`). DwarFS is around 15% slower here:
|
|||||||
1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
|
1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
|
||||||
1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
|
1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
|
||||||
1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
|
1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
|
||||||
|
```
|
||||||
|
|
||||||
Using the lzma-compressed file system, the metrics for *initial* runs look
|
Using the lzma-compressed file system, the metrics for *initial* runs look
|
||||||
considerably worse (about an order of magnitude):
|
considerably worse (about an order of magnitude):
|
||||||
|
|
||||||
|
```
|
||||||
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
|
||||||
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
|
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
|
||||||
Time (mean ± σ): 10.660 s ± 0.057 s [User: 1.952 s, System: 0.729 s]
|
Time (mean ± σ): 10.660 s ± 0.057 s [User: 1.952 s, System: 0.729 s]
|
||||||
@ -504,6 +546,7 @@ considerably worse (about an order of magnitude):
|
|||||||
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
|
Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
|
||||||
Time (mean ± σ): 9.004 s ± 0.298 s [User: 2.134 s, System: 0.736 s]
|
Time (mean ± σ): 9.004 s ± 0.298 s [User: 2.134 s, System: 0.736 s]
|
||||||
Range (min … max): 8.611 s … 9.555 s 10 runs
|
Range (min … max): 8.611 s … 9.555 s 10 runs
|
||||||
|
```
|
||||||
|
|
||||||
So you might want to consider using `zstd` instead of `lzma` if you'd
|
So you might want to consider using `zstd` instead of `lzma` if you'd
|
||||||
like to optimize for file system performance. It's also the default
|
like to optimize for file system performance. It's also the default
|
||||||
@ -511,6 +554,7 @@ compression used by `mkdwarfs`.
|
|||||||
|
|
||||||
Now here's a comparison with the SquashFS filesystem:
|
Now here's a comparison with the SquashFS filesystem:
|
||||||
|
|
||||||
|
```
|
||||||
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
||||||
Benchmark #1: dwarfs-zstd
|
Benchmark #1: dwarfs-zstd
|
||||||
Time (mean ± σ): 1.151 s ± 0.015 s [User: 2.147 s, System: 0.769 s]
|
Time (mean ± σ): 1.151 s ± 0.015 s [User: 2.147 s, System: 0.769 s]
|
||||||
@ -523,17 +567,20 @@ Now here's a comparison with the SquashFS filesystem:
|
|||||||
Summary
|
Summary
|
||||||
'dwarfs-zstd' ran
|
'dwarfs-zstd' ran
|
||||||
5.85 ± 0.08 times faster than 'squashfs-zstd'
|
5.85 ± 0.08 times faster than 'squashfs-zstd'
|
||||||
|
```
|
||||||
|
|
||||||
So DwarFS is almost six times faster than SquashFS. But what's more,
|
So DwarFS is almost six times faster than SquashFS. But what's more,
|
||||||
SquashFS also uses significantly more CPU power. However, the numbers
|
SquashFS also uses significantly more CPU power. However, the numbers
|
||||||
shown above for DwarFS obviously don't include the time spent in the
|
shown above for DwarFS obviously don't include the time spent in the
|
||||||
`dwarfs` process, so I repeated the test outside of hyperfine:
|
`dwarfs` process, so I repeated the test outside of hyperfine:
|
||||||
|
|
||||||
|
```
|
||||||
$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f
|
$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f
|
||||||
|
|
||||||
real 0m4.569s
|
real 0m4.569s
|
||||||
user 0m2.154s
|
user 0m2.154s
|
||||||
sys 0m1.846s
|
sys 0m1.846s
|
||||||
|
```
|
||||||
|
|
||||||
So in total, DwarFS was using 5.7 seconds of CPU time, whereas
|
So in total, DwarFS was using 5.7 seconds of CPU time, whereas
|
||||||
SquashFS was using 20.2 seconds, almost four times as much. Ignore
|
SquashFS was using 20.2 seconds, almost four times as much. Ignore
|
||||||
@ -546,13 +593,13 @@ used, [Tie::Hash::Indexed](https://github.com/mhx/Tie-Hash-Indexed),
|
|||||||
has an XS component that requires a C compiler to build. So this really
|
has an XS component that requires a C compiler to build. So this really
|
||||||
accesses a lot of different stuff in the file system:
|
accesses a lot of different stuff in the file system:
|
||||||
|
|
||||||
* The `perl` executables and their shared libraries
|
- The `perl` executables and their shared libraries
|
||||||
|
|
||||||
* The Perl modules used for writing the Makefile
|
- The Perl modules used for writing the Makefile
|
||||||
|
|
||||||
* Perl's C header files used for building the module
|
- Perl's C header files used for building the module
|
||||||
|
|
||||||
* More Perl modules used for running the tests
|
- More Perl modules used for running the tests
|
||||||
|
|
||||||
I wrote a little script to be able to run multiple builds in parallel:
|
I wrote a little script to be able to run multiple builds in parallel:
|
||||||
|
|
||||||
@ -574,28 +621,36 @@ The following command will run up to 16 builds in parallel on the 8 core
|
|||||||
Xeon CPU, including debug, optimized and threaded versions of all Perl
|
Xeon CPU, including debug, optimized and threaded versions of all Perl
|
||||||
releases between 5.10.0 and 5.33.3, a total of 624 `perl` installations:
|
releases between 5.10.0 and 5.33.3, a total of 624 `perl` installations:
|
||||||
|
|
||||||
|
```
|
||||||
$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh
|
$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh
|
||||||
|
```
|
||||||
|
|
||||||
Tests were done with a cleanly mounted file system to make sure the caches
|
Tests were done with a cleanly mounted file system to make sure the caches
|
||||||
were empty. `ccache` was primed to make sure all compiler runs could be
|
were empty. `ccache` was primed to make sure all compiler runs could be
|
||||||
satisfied from the cache. With SquashFS, the timing was:
|
satisfied from the cache. With SquashFS, the timing was:
|
||||||
|
|
||||||
|
```
|
||||||
real 0m52.385s
|
real 0m52.385s
|
||||||
user 8m10.333s
|
user 8m10.333s
|
||||||
sys 4m10.056s
|
sys 4m10.056s
|
||||||
|
```
|
||||||
|
|
||||||
And with DwarFS:
|
And with DwarFS:
|
||||||
|
|
||||||
|
```
|
||||||
real 0m50.469s
|
real 0m50.469s
|
||||||
user 9m22.597s
|
user 9m22.597s
|
||||||
sys 1m18.469s
|
sys 1m18.469s
|
||||||
|
```
|
||||||
|
|
||||||
So, frankly, not much of a difference, with DwarFS being just a bit faster.
|
So, frankly, not much of a difference, with DwarFS being just a bit faster.
|
||||||
The `dwarfs` process itself used:
|
The `dwarfs` process itself used:
|
||||||
|
|
||||||
|
```
|
||||||
real 0m56.686s
|
real 0m56.686s
|
||||||
user 0m18.857s
|
user 0m18.857s
|
||||||
sys 0m21.058s
|
sys 0m21.058s
|
||||||
|
```
|
||||||
|
|
||||||
So again, DwarFS used less raw CPU power overall, but in terms of wallclock
|
So again, DwarFS used less raw CPU power overall, but in terms of wallclock
|
||||||
time, the difference is really marginal.
|
time, the difference is really marginal.
|
||||||
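To compare the total CPU time of both runs, the user and sys times have to be added up, and for DwarFS the time spent in the `dwarfs` process has to be included as well. The following sketch does that with the numbers shown above; the helper name `cpu_seconds` is made up for illustration.

```python
def cpu_seconds(user_min: int, user_s: float, sys_min: int, sys_s: float) -> float:
    """Total CPU time (user + sys) in seconds."""
    return user_min * 60 + user_s + sys_min * 60 + sys_s


# Build run on SquashFS: user 8m10.333s, sys 4m10.056s.
squashfs_total = cpu_seconds(8, 10.333, 4, 10.056)

# Build run on DwarFS: user 9m22.597s, sys 1m18.469s.
dwarfs_build = cpu_seconds(9, 22.597, 1, 18.469)

# The `dwarfs` FUSE process itself: user 0m18.857s, sys 0m21.058s.
dwarfs_driver = cpu_seconds(0, 18.857, 0, 21.058)

dwarfs_total = dwarfs_build + dwarfs_driver

print(f"SquashFS: {squashfs_total:.1f} s")
print(f"DwarFS:   {dwarfs_total:.1f} s")
print(f"DwarFS uses {100 * (1 - dwarfs_total / squashfs_total):.0f}% less CPU time")
```

With these figures, DwarFS ends up at roughly 681 seconds of CPU time versus roughly 740 seconds for SquashFS, even with the FUSE driver included.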
@ -606,6 +661,7 @@ This test uses slightly less pathological input data: the root filesystem of
|
|||||||
a recent Raspberry Pi OS release. This file system also contains device inodes,
|
a recent Raspberry Pi OS release. This file system also contains device inodes,
|
||||||
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
|
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
|
$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
|
||||||
I 21:30:29.812562 scanning raspbian
|
I 21:30:29.812562 scanning raspbian
|
||||||
I 21:30:29.908984 waiting for background scanners...
|
I 21:30:29.908984 waiting for background scanners...
|
||||||
@ -640,9 +696,11 @@ so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
|
|||||||
real 0m46.711s
|
real 0m46.711s
|
||||||
user 10m39.038s
|
user 10m39.038s
|
||||||
sys 0m8.123s
|
sys 0m8.123s
|
||||||
|
```
|
||||||
|
|
||||||
Again, SquashFS uses the same compression options:
|
Again, SquashFS uses the same compression options:
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
|
$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
|
||||||
Parallel mksquashfs: Using 16 processors
|
Parallel mksquashfs: Using 16 processors
|
||||||
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
|
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
|
||||||
@ -694,62 +752,77 @@ Again, SquashFS uses the same compression options:
|
|||||||
real 0m50.124s
|
real 0m50.124s
|
||||||
user 9m41.708s
|
user 9m41.708s
|
||||||
sys 0m1.727s
|
sys 0m1.727s
|
||||||
|
```
|
||||||
|
|
||||||
The difference in speed is almost negligible. SquashFS is just a bit
|
The difference in speed is almost negligible. SquashFS is just a bit
|
||||||
slower here. In terms of compression, the difference also isn't huge:
|
slower here. In terms of compression, the difference also isn't huge:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -lh raspbian.* *.xz
|
$ ls -lh raspbian.* *.xz
|
||||||
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
|
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
|
||||||
-rw-r--r-- 1 root root 287M Mar 4 21:31 raspbian.dwarfs
|
-rw-r--r-- 1 root root 287M Mar 4 21:31 raspbian.dwarfs
|
||||||
-rw-r--r-- 1 root root 364M Mar 4 21:33 raspbian.squashfs
|
-rw-r--r-- 1 root root 364M Mar 4 21:33 raspbian.squashfs
|
||||||
|
```
|
||||||
|
|
||||||
Interestingly, `xz` actually can't compress the whole original image
|
Interestingly, `xz` actually can't compress the whole original image
|
||||||
better than DwarFS.
|
better than DwarFS.
|
||||||
|
|
||||||
We can again try increasing the DwarFS compression level:
|
We can again try increasing the DwarFS compression level:
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9
|
$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9
|
||||||
|
|
||||||
real 0m54.161s
|
real 0m54.161s
|
||||||
user 8m40.109s
|
user 8m40.109s
|
||||||
sys 0m7.101s
|
sys 0m7.101s
|
||||||
|
```
|
||||||
|
|
||||||
Now that actually gets the DwarFS image size well below that of the
|
Now that actually gets the DwarFS image size well below that of the
|
||||||
`xz` archive:
|
`xz` archive:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -lh raspbian-9.dwarfs *.xz
|
$ ls -lh raspbian-9.dwarfs *.xz
|
||||||
-rw-r--r-- 1 root root 244M Mar 4 21:36 raspbian-9.dwarfs
|
-rw-r--r-- 1 root root 244M Mar 4 21:36 raspbian-9.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
|
-rw-r--r-- 1 mhx users 297M Mar 4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
|
||||||
|
```
|
||||||
|
|
||||||
Even if you actually build a tarball and compress that (instead of
|
Even if you actually build a tarball and compress that (instead of
|
||||||
compressing the EXT4 file system itself), `xz` isn't quite able to
|
compressing the EXT4 file system itself), `xz` isn't quite able to
|
||||||
match the DwarFS image size:
|
match the DwarFS image size:
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
|
$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
|
||||||
100 % 246.9 MiB / 1,037.2 MiB = 0.238 13 MiB/s 1:18
|
100 % 246.9 MiB / 1,037.2 MiB = 0.238 13 MiB/s 1:18
|
||||||
|
|
||||||
real 1m18.226s
|
real 1m18.226s
|
||||||
user 6m35.381s
|
user 6m35.381s
|
||||||
sys 0m2.205s
|
sys 0m2.205s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -lh raspbian.tar.xz
|
$ ls -lh raspbian.tar.xz
|
||||||
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
|
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
|
||||||
|
```
|
||||||
|
|
||||||
DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
|
DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
|
||||||
that allows extraction of a filesystem image without the FUSE driver.
|
that allows extraction of a filesystem image without the FUSE driver.
|
||||||
So here's a comparison of the extraction speed:
|
So here's a comparison of the extraction speed:
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo tar xf raspbian.tar.xz -C out1
|
$ time sudo tar xf raspbian.tar.xz -C out1
|
||||||
|
|
||||||
real 0m12.846s
|
real 0m12.846s
|
||||||
user 0m12.313s
|
user 0m12.313s
|
||||||
sys 0m1.616s
|
sys 0m1.616s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2
|
$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2
|
||||||
|
|
||||||
real 0m3.825s
|
real 0m3.825s
|
||||||
user 0m13.234s
|
user 0m13.234s
|
||||||
sys 0m1.382s
|
sys 0m1.382s
|
||||||
|
```
|
||||||
|
|
||||||
So `dwarfsextract` is almost 4 times faster thanks to using multiple
|
So `dwarfsextract` is almost 4 times faster thanks to using multiple
|
||||||
worker threads for decompression. It's writing about 300 MiB/s in this
|
worker threads for decompression. It's writing about 300 MiB/s in this
|
||||||
@ -759,14 +832,18 @@ Another nice feature of `dwarfsextract` is that it allows you to directly
|
|||||||
output data in an archive format, so you could create a tarball from
|
output data in an archive format, so you could create a tarball from
|
||||||
your image without extracting the files to disk:
|
your image without extracting the files to disk:
|
||||||
|
|
||||||
|
```
|
||||||
$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz
|
$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz
|
||||||
|
```
|
||||||
|
|
||||||
This has the interesting side-effect that the resulting tarball will
|
This has the interesting side-effect that the resulting tarball will
|
||||||
likely be smaller than the one built straight from the directory:
|
likely be smaller than the one built straight from the directory:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -lh raspbian*.tar.xz
|
$ ls -lh raspbian*.tar.xz
|
||||||
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
|
-rw-r--r-- 1 mhx users 247M Mar 4 21:40 raspbian.tar.xz
|
||||||
-rw-r--r-- 1 mhx users 240M Mar 4 23:52 raspbian2.tar.xz
|
-rw-r--r-- 1 mhx users 240M Mar 4 23:52 raspbian2.tar.xz
|
||||||
|
```
|
||||||
|
|
||||||
That's because `dwarfsextract` writes files in inode-order, and by
|
That's because `dwarfsextract` writes files in inode-order, and by
|
||||||
default inodes are ordered by similarity for the best possible
|
default inodes are ordered by similarity for the best possible
|
||||||
@ -784,14 +861,17 @@ When I first read about `lrzip`, I was pretty certain it would easily
|
|||||||
beat DwarFS. So let's take a look. `lrzip` operates on a single file,
|
beat DwarFS. So let's take a look. `lrzip` operates on a single file,
|
||||||
so it's necessary to first build a tarball:
|
so it's necessary to first build a tarball:
|
||||||
|
|
||||||
|
```
|
||||||
$ time tar cf perl-install.tar install
|
$ time tar cf perl-install.tar install
|
||||||
|
|
||||||
real 2m9.568s
|
real 2m9.568s
|
||||||
user 0m3.757s
|
user 0m3.757s
|
||||||
sys 0m26.623s
|
sys 0m26.623s
|
||||||
|
```
|
||||||
|
|
||||||
Now we can run `lrzip`:
|
Now we can run `lrzip`:
|
||||||
|
|
||||||
|
```
|
||||||
$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
|
$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
|
||||||
The following options are in effect for this COMPRESSION.
|
The following options are in effect for this COMPRESSION.
|
||||||
Threading is ENABLED. Number of CPUs detected: 16
|
Threading is ENABLED. Number of CPUs detected: 16
|
||||||
@ -814,12 +894,15 @@ Now we can run `lrzip`:
|
|||||||
real 57m32.472s
|
real 57m32.472s
|
||||||
user 81m44.104s
|
user 81m44.104s
|
||||||
sys 4m50.221s
|
sys 4m50.221s
|
||||||
|
```
|
||||||
|
|
||||||
That definitely took a while. This is about an order of magnitude
|
That definitely took a while. This is about an order of magnitude
|
||||||
slower than `mkdwarfs` and it barely makes use of the 8 cores.
|
slower than `mkdwarfs` and it barely makes use of the 8 cores.
|
||||||
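To put a number on "barely makes use of the 8 cores", one can divide the CPU time by the wall-clock time reported by `time`. The following is a minimal sketch in Python; the helper names `to_seconds` and `busy_cores` are made up for illustration, and the three values are copied from the `lrzip` run above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '57m32.472s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def busy_cores(real: str, user: str, sys: str) -> float:
    """Average number of cores kept busy: (user + sys) / real."""
    return (to_seconds(user) + to_seconds(sys)) / to_seconds(real)


# Values taken from the `lrzip -vL9` run shown above.
lrzip_cores = busy_cores("57m32.472s", "81m44.104s", "4m50.221s")
print(f"lrzip kept about {lrzip_cores:.1f} cores busy on average")
```

A value close to 1 means the run was effectively single-threaded; a value close to the core count means the machine was saturated.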
|
|
||||||
|
```
|
||||||
$ ll -h perl-install.tar.lrzip
|
$ ll -h perl-install.tar.lrzip
|
||||||
-rw-r--r-- 1 mhx users 500M Mar 6 21:16 perl-install.tar.lrzip
|
-rw-r--r-- 1 mhx users 500M Mar 6 21:16 perl-install.tar.lrzip
|
||||||
|
```
|
||||||
|
|
||||||
This is a surprisingly disappointing result. The archive is 65% larger
|
This is a surprisingly disappointing result. The archive is 65% larger
|
||||||
than a DwarFS image at `-l9` that takes less than 4 minutes to build.
|
than a DwarFS image at `-l9` that takes less than 4 minutes to build.
|
||||||
@ -828,6 +911,7 @@ unpacking the archive first.
|
|||||||
|
|
||||||
That being said, it *is* better than just using `xz` on the tarball:
|
That being said, it *is* better than just using `xz` on the tarball:
|
||||||
|
|
||||||
|
```
|
||||||
$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
|
$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
|
||||||
perl-install.tar (1/1)
|
perl-install.tar (1/1)
|
||||||
100 % 4,317.0 MiB / 49.0 GiB = 0.086 24 MiB/s 34:55
|
100 % 4,317.0 MiB / 49.0 GiB = 0.086 24 MiB/s 34:55
|
||||||
@ -835,9 +919,12 @@ That being said, it *is* better than just using `xz` on the tarball:
|
|||||||
real 34m55.450s
|
real 34m55.450s
|
||||||
user 543m50.810s
|
user 543m50.810s
|
||||||
sys 0m26.533s
|
sys 0m26.533s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install.tar.xz -h
|
$ ll perl-install.tar.xz -h
|
||||||
-rw-r--r-- 1 mhx users 4.3G Mar 6 22:59 perl-install.tar.xz
|
-rw-r--r-- 1 mhx users 4.3G Mar 6 22:59 perl-install.tar.xz
|
||||||
|
```
|
||||||
|
|
||||||
### With zpaq
|
### With zpaq
|
||||||
|
|
||||||
@ -850,10 +937,13 @@ can be used.
|
|||||||
|
|
||||||
Anyway, how does it fare in terms of speed and compression performance?
|
Anyway, how does it fare in terms of speed and compression performance?
|
||||||
|
|
||||||
|
```
|
||||||
$ time zpaq a perl-install.zpaq install -m5
|
$ time zpaq a perl-install.zpaq install -m5
|
||||||
|
```
|
||||||
|
|
||||||
After a few million lines of output that (I think) cannot be turned off:
|
After a few million lines of output that (I think) cannot be turned off:
|
||||||
|
|
||||||
|
```
|
||||||
2258234 +added, 0 -removed.
|
2258234 +added, 0 -removed.
|
||||||
|
|
||||||
0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
|
0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
|
||||||
@ -862,30 +952,34 @@ After a few million lines of output that (I think) cannot be turned off:
|
|||||||
real 47m8.104s
|
real 47m8.104s
|
||||||
user 714m44.286s
|
user 714m44.286s
|
||||||
sys 3m6.751s
|
sys 3m6.751s
|
||||||
|
```
|
||||||
|
|
||||||
So it's an order of magnitude slower than `mkdwarfs` and uses 14 times
|
So it's an order of magnitude slower than `mkdwarfs` and uses 14 times
|
||||||
as much CPU resources as `mkdwarfs -l9`. The resulting archive is pretty
|
as much CPU resources as `mkdwarfs -l9`. The resulting archive is pretty
|
||||||
close in size to the default configuration DwarFS image, but it's more
|
close in size to the default configuration DwarFS image, but it's more
|
||||||
than 50% bigger than the image produced by `mkdwarfs -l9`.
|
than 50% bigger than the image produced by `mkdwarfs -l9`.
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install*.*
|
$ ll perl-install*.*
|
||||||
-rw-r--r-- 1 mhx users 490227707 Mar 7 01:38 perl-install.zpaq
|
-rw-r--r-- 1 mhx users 490227707 Mar 7 01:38 perl-install.zpaq
|
||||||
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
|
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
||||||
|
```
|
||||||
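The size claims can be checked directly from the byte counts in the listing. This is only a sketch; `percent_larger` is a hypothetical helper, and the sizes are copied from the `ll` output above.

```python
# File sizes in bytes, copied from the listing above.
sizes = {
    "perl-install.zpaq": 490227707,
    "perl-install-l9.dwarfs": 315482627,
    "perl-install.dwarfs": 447230618,
}


def percent_larger(a: int, b: int) -> float:
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


zpaq = sizes["perl-install.zpaq"]
vs_l9 = percent_larger(zpaq, sizes["perl-install-l9.dwarfs"])
vs_default = percent_larger(zpaq, sizes["perl-install.dwarfs"])

print(f"zpaq vs. mkdwarfs -l9:     {vs_l9:.1f}% larger")
print(f"zpaq vs. mkdwarfs default: {vs_default:.1f}% larger")
```

The first figure comes out a little above 55%, which matches "more than 50% bigger"; the second is just under 10%, which is what "pretty close in size" refers to.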
|
|
||||||
What's *really* surprising is how slow it is to extract the `zpaq`
|
What's *really* surprising is how slow it is to extract the `zpaq`
|
||||||
archive again:
|
archive again:
|
||||||
|
|
||||||
|
```
|
||||||
$ time zpaq x perl-install.zpaq
|
$ time zpaq x perl-install.zpaq
|
||||||
2798.097 seconds (all OK)
|
2798.097 seconds (all OK)
|
||||||
|
|
||||||
real 46m38.117s
|
real 46m38.117s
|
||||||
user 711m18.734s
|
user 711m18.734s
|
||||||
sys 3m47.876s
|
sys 3m47.876s
|
||||||
|
```
|
||||||
|
|
||||||
That's 700 times slower than extracting the DwarFS image.
|
That's 700 times slower than extracting the DwarFS image.
|
||||||
|
|
||||||
|
|
||||||
### With wimlib
|
### With wimlib
|
||||||
|
|
||||||
[wimlib](https://wimlib.net/) is a really interesting project that is
|
[wimlib](https://wimlib.net/) is a really interesting project that is
|
||||||
@ -896,6 +990,7 @@ quite a rich set of features, so it's definitely worth taking a look at.
|
|||||||
|
|
||||||
I first tried `wimcapture` on the perl dataset:
|
I first tried `wimcapture` on the perl dataset:
|
||||||
|
|
||||||
|
```
|
||||||
$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
|
$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
|
||||||
Scanning "install"
|
Scanning "install"
|
||||||
47 GiB scanned (1927501 files, 330733 directories)
|
47 GiB scanned (1927501 files, 330733 directories)
|
||||||
@ -905,12 +1000,15 @@ I first tried `wimcapture` on the perl dataset:
|
|||||||
real 15m23.310s
|
real 15m23.310s
|
||||||
user 174m29.274s
|
user 174m29.274s
|
||||||
sys 0m42.921s
|
sys 0m42.921s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install.*
|
$ ll perl-install.*
|
||||||
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
-rw-r--r-- 1 mhx users 447230618 Mar 3 20:28 perl-install.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
|
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-l9.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
|
-rw-r--r-- 1 mhx users 4748902400 Mar 3 20:10 perl-install.squashfs
|
||||||
-rw-r--r-- 1 mhx users 1016981520 Mar 6 21:12 perl-install.wim
|
-rw-r--r-- 1 mhx users 1016981520 Mar 6 21:12 perl-install.wim
|
||||||
|
```
|
||||||
|
|
||||||
So wimlib is definitely much better than squashfs, in terms of both
|
So wimlib is definitely much better than squashfs, in terms of both
|
||||||
compression ratio and speed. DwarFS is however about 3 times faster to
|
compression ratio and speed. DwarFS is however about 3 times faster to
|
||||||
@ -921,43 +1019,52 @@ When switching to LZMA compression, the DwarFS file system is more than
|
|||||||
What's a bit surprising is that mounting a *wim* file takes quite a bit
|
What's a bit surprising is that mounting a *wim* file takes quite a bit
|
||||||
of time:
|
of time:
|
||||||
|
|
||||||
|
```
|
||||||
$ time wimmount perl-install.wim mnt
|
$ time wimmount perl-install.wim mnt
|
||||||
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.
|
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.
|
||||||
|
|
||||||
real 0m2.038s
|
real 0m2.038s
|
||||||
user 0m1.764s
|
user 0m1.764s
|
||||||
sys 0m0.242s
|
sys 0m0.242s
|
||||||
|
```
|
||||||
|
|
||||||
Mounting the DwarFS image takes almost no time in comparison:
|
Mounting the DwarFS image takes almost no time in comparison:
|
||||||
|
|
||||||
|
```
|
||||||
$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
|
$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
|
||||||
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)
|
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)
|
||||||
|
|
||||||
real 0m0.003s
|
real 0m0.003s
|
||||||
user 0m0.003s
|
user 0m0.003s
|
||||||
sys 0m0.000s
|
sys 0m0.000s
|
||||||
|
```
|
||||||
|
|
||||||
That's just because it immediately forks into the background by default and
|
That's just because it immediately forks into the background by default and
|
||||||
initializes the file system in the background. However, even when
|
initializes the file system in the background. However, even when
|
||||||
running it in the foreground, initializing the file system takes only
|
running it in the foreground, initializing the file system takes only
|
||||||
about 60 milliseconds:
|
about 60 milliseconds:
|
||||||
|
|
||||||
|
```
|
||||||
$ dwarfs perl-install.dwarfs mnt -f
|
$ dwarfs perl-install.dwarfs mnt -f
|
||||||
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
|
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
|
||||||
I 00:25:03.248061 file system initialized [60.95ms]
|
I 00:25:03.248061 file system initialized [60.95ms]
|
||||||
|
```
|
||||||
|
|
||||||
If you actually build the DwarFS file system with uncompressed metadata,
|
If you actually build the DwarFS file system with uncompressed metadata,
|
||||||
mounting is basically instantaneous:
|
mounting is basically instantaneous:
|
||||||
|
|
||||||
|
```
|
||||||
$ dwarfs perl-install-meta.dwarfs mnt -f
|
$ dwarfs perl-install-meta.dwarfs mnt -f
|
||||||
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
|
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
|
||||||
I 00:27:52.671066 file system initialized [2.879ms]
|
I 00:27:52.671066 file system initialized [2.879ms]
|
||||||
|
```
|
||||||
|
|
||||||
I've tried running the benchmark where all 1139 `perl` executables
|
I've tried running the benchmark where all 1139 `perl` executables
|
||||||
print their version with the wimlib image, but after about 10 minutes,
|
print their version with the wimlib image, but after about 10 minutes,
|
||||||
it still hadn't finished the first run (with the DwarFS image, one run
|
it still hadn't finished the first run (with the DwarFS image, one run
|
||||||
took slightly more than 2 seconds). I then tried the following instead:
|
took slightly more than 2 seconds). I then tried the following instead:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
|
$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
|
||||||
real 0m0.802s
|
real 0m0.802s
|
||||||
real 0m0.652s
|
real 0m0.652s
|
||||||
@ -972,6 +1079,7 @@ took slightly more than 2 seconds). I then tried the following instead:
|
|||||||
real 0m1.809s
|
real 0m1.809s
|
||||||
real 0m1.790s
|
real 0m1.790s
|
||||||
real 0m2.115s
|
real 0m2.115s
|
||||||
|
```
|
||||||
|
|
||||||
Judging from that, it would have probably taken about half an hour
|
Judging from that, it would have probably taken about half an hour
|
||||||
for a single run, which makes at least the `--solid` wim image pretty
|
for a single run, which makes at least the `--solid` wim image pretty
|
||||||
@ -982,6 +1090,7 @@ that DwarFS actually organizes data internally. However, judging by the
|
|||||||
warning when mounting a solid image, it's probably not ideal when using
|
warning when mounting a solid image, it's probably not ideal when using
|
||||||
the image as a mounted file system. So I tried again without `--solid`:
|
the image as a mounted file system. So I tried again without `--solid`:
|
||||||
|
|
||||||
|
```
|
||||||
$ time wimcapture --unix-data install perl-install-nonsolid.wim
|
$ time wimcapture --unix-data install perl-install-nonsolid.wim
|
||||||
Scanning "install"
|
Scanning "install"
|
||||||
47 GiB scanned (1927501 files, 330733 directories)
|
47 GiB scanned (1927501 files, 330733 directories)
|
||||||
@ -991,25 +1100,31 @@ the image as a mounted file system. So I tried again without `--solid`:
|
|||||||
real 8m39.034s
|
real 8m39.034s
|
||||||
user 64m58.575s
|
user 64m58.575s
|
||||||
sys 0m32.003s
|
sys 0m32.003s
|
||||||
|
```
|
||||||
|
|
||||||
This is still more than 3 minutes slower than `mkdwarfs`. However, it
|
This is still more than 3 minutes slower than `mkdwarfs`. However, it
|
||||||
yields an image that's almost 10 times the size of the DwarFS image
|
yields an image that's almost 10 times the size of the DwarFS image
|
||||||
and comparable in size to the SquashFS image:
|
and comparable in size to the SquashFS image:
|
||||||
|
|
||||||
|
```
|
||||||
$ ll perl-install-nonsolid.wim -h
|
$ ll perl-install-nonsolid.wim -h
|
||||||
-rw-r--r-- 1 mhx users 4.6G Mar 6 23:24 perl-install-nonsolid.wim
|
-rw-r--r-- 1 mhx users 4.6G Mar 6 23:24 perl-install-nonsolid.wim
|
||||||
|
```
|
||||||
|
|
||||||
This *still* takes surprisingly long to mount:
|
This *still* takes surprisingly long to mount:
|
||||||
|
|
||||||
|
```
|
||||||
$ time wimmount perl-install-nonsolid.wim mnt
|
$ time wimmount perl-install-nonsolid.wim mnt
|
||||||
|
|
||||||
real 0m1.603s
|
real 0m1.603s
|
||||||
user 0m1.327s
|
user 0m1.327s
|
||||||
sys 0m0.275s
|
sys 0m0.275s
|
||||||
|
```
|
||||||
|
|
||||||
However, it's really usable as a file system, even though it's about
|
However, it's really usable as a file system, even though it's about
|
||||||
6-7 times slower than the DwarFS image:
|
6-7 times slower than the DwarFS image:
|
||||||
|
|
||||||
|
```
|
||||||
$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
||||||
Benchmark #1: dwarfs
|
Benchmark #1: dwarfs
|
||||||
Time (mean ± σ): 1.149 s ± 0.019 s [User: 2.147 s, System: 0.739 s]
|
Time (mean ± σ): 1.149 s ± 0.019 s [User: 2.147 s, System: 0.739 s]
|
||||||
@ -1022,7 +1137,7 @@ However, it's really usable as a file system, even though it's about
|
|||||||
Summary
|
Summary
|
||||||
'dwarfs' ran
|
'dwarfs' ran
|
||||||
6.56 ± 0.12 times faster than 'wimlib'
|
6.56 ± 0.12 times faster than 'wimlib'
|
||||||
|
```
|
||||||
|
|
||||||
### With Cromfs
|
### With Cromfs
|
||||||
|
|
||||||
@ -1035,6 +1150,7 @@ Here's a run on the Perl dataset, with the block size set to 16 MiB to
|
|||||||
match the default of DwarFS, and with additional options suggested to
|
match the default of DwarFS, and with additional options suggested to
|
||||||
speed up compression:
|
speed up compression:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
|
$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
|
||||||
Writing perl-install.cromfs...
|
Writing perl-install.cromfs...
|
||||||
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
|
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
|
||||||
@ -1050,6 +1166,7 @@ speed up compression:
|
|||||||
real 29m9.634s
|
real 29m9.634s
|
||||||
user 201m37.816s
|
user 201m37.816s
|
||||||
sys 2m15.005s
|
sys 2m15.005s
|
||||||
|
```
|
||||||
|
|
||||||
So it processed 21 MiB out of 48 GiB in half an hour, using almost
|
So it processed 21 MiB out of 48 GiB in half an hour, using almost
|
||||||
twice as much CPU resources as DwarFS for the *whole* file system.
|
twice as much CPU resources as DwarFS for the *whole* file system.
|
||||||
@ -1062,6 +1179,7 @@ I then tried once more with a smaller version of the Perl dataset.
|
|||||||
This only has 20 versions (instead of 1139) of Perl, and obviously
|
This only has 20 versions (instead of 1139) of Perl, and obviously
|
||||||
a lot less redundancy:
|
a lot less redundancy:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
|
$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
|
||||||
Writing perl-install.cromfs...
|
Writing perl-install.cromfs...
|
||||||
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
|
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
|
||||||
@ -1092,9 +1210,11 @@ a lot less redundancy:
|
|||||||
real 27m38.833s
|
real 27m38.833s
|
||||||
user 277m36.208s
|
user 277m36.208s
|
||||||
sys 11m36.945s
|
sys 11m36.945s
|
||||||
|
```
|
||||||
|
|
||||||
And repeating the same task with `mkdwarfs`:
|
And repeating the same task with `mkdwarfs`:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
|
$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
|
||||||
21:13:38.131724 scanning install-small
|
21:13:38.131724 scanning install-small
|
||||||
21:13:38.320139 waiting for background scanners...
|
21:13:38.320139 waiting for background scanners...
|
||||||
@ -1129,13 +1249,16 @@ And repeating the same task with `mkdwarfs`:
|
|||||||
real 0m33.007s
|
real 0m33.007s
|
||||||
user 3m43.324s
|
user 3m43.324s
|
||||||
sys 0m4.015s
|
sys 0m4.015s
|
||||||
|
```
|
||||||
|
|
||||||
So `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 75 times
|
So `mkdwarfs` is about 50 times faster than `mkcromfs` and uses 75 times
|
||||||
less CPU resources. At the same time, the DwarFS file system is 30% smaller:
|
less CPU resources. At the same time, the DwarFS file system is 30% smaller:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -l perl-install-small.*fs
|
$ ls -l perl-install-small.*fs
|
||||||
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
|
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
|
||||||
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
|
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
|
||||||
|
```
|
||||||
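The three ratios quoted above follow from the `time` output and the file listing. Here is a rough sketch that reproduces them; the `Run` class is purely illustrative, and "CPU resources" is assumed to mean user plus system time.

```python
from dataclasses import dataclass


@dataclass
class Run:
    real: float  # wall-clock seconds
    user: float  # user CPU seconds
    sys: float   # system CPU seconds
    size: int    # resulting image size in bytes

    @property
    def cpu(self) -> float:
        # Assumption: "CPU resources" means user + sys time.
        return self.user + self.sys


# Numbers copied from the small-dataset runs and the listing above.
mkcromfs = Run(real=27 * 60 + 38.833, user=277 * 60 + 36.208,
               sys=11 * 60 + 36.945, size=35328512)
mkdwarfs = Run(real=33.007, user=3 * 60 + 43.324,
               sys=4.015, size=25175016)

speedup = mkcromfs.real / mkdwarfs.real
cpu_ratio = mkcromfs.cpu / mkdwarfs.cpu
saving = 1 - mkdwarfs.size / mkcromfs.size

print(f"wall-clock speedup: {speedup:.0f}x")
print(f"CPU time ratio:     {cpu_ratio:.0f}x")
print(f"size reduction:     {saving:.0%}")
```

With these inputs the speedup is roughly 50, the CPU ratio lands in the mid-70s, and the size reduction is just under 30%.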
|
|
||||||
I noticed that the `blockifying` step that took ages for the full dataset
|
I noticed that the `blockifying` step that took ages for the full dataset
|
||||||
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
|
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
|
||||||
@ -1145,6 +1268,7 @@ behaviour that's slowing down `mkcromfs`.
|
|||||||
In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
|
In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
|
||||||
LZMA compression (which is what `mkcromfs` uses by default):
|
LZMA compression (which is what `mkcromfs` uses by default):
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
|
$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
|
||||||
21:16:21.874975 scanning install-small
|
21:16:21.874975 scanning install-small
|
||||||
21:16:22.092201 waiting for background scanners...
|
21:16:22.092201 waiting for background scanners...
|
||||||
@ -1179,11 +1303,14 @@ LZMA compression (which is what `mkcromfs` uses by default):
|
|||||||
real 0m48.683s
|
real 0m48.683s
|
||||||
user 2m24.905s
|
user 2m24.905s
|
||||||
sys 0m3.292s
|
sys 0m3.292s
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -l perl-install-small*.*fs
|
$ ls -l perl-install-small*.*fs
|
||||||
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
|
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
|
-rw-r--r-- 1 mhx users 35328512 Dec 8 14:25 perl-install-small.cromfs
|
||||||
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
|
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs
|
||||||
|
```
|
||||||
|
|
||||||
It takes about 15 seconds longer to build the DwarFS file system with LZMA
|
It takes about 15 seconds longer to build the DwarFS file system with LZMA
|
||||||
compression (this is still 35 times faster than Cromfs), but reduces the
|
compression (this is still 35 times faster than Cromfs), but reduces the
|
||||||
@ -1203,6 +1330,7 @@ supports LZ4 compression.
|
|||||||
|
|
||||||
I was feeling lucky and decided to run it on the full Perl dataset:
|
I was feeling lucky and decided to run it on the full Perl dataset:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkfs.erofs perl-install.erofs install -zlz4hc,9 -d2
|
$ time mkfs.erofs perl-install.erofs install -zlz4hc,9 -d2
|
||||||
mkfs.erofs 1.2
|
mkfs.erofs 1.2
|
||||||
c_version: [ 1.2]
|
c_version: [ 1.2]
|
||||||
@ -1213,17 +1341,21 @@ I was feeling lucky and decided to run it on the full Perl dataset:
|
|||||||
real 912m42.601s
|
real 912m42.601s
|
||||||
user 903m2.777s
|
user 903m2.777s
|
||||||
sys 1m52.812s
|
sys 1m52.812s
|
||||||
|
```
|
||||||
|
|
||||||
As you can tell, after more than 15 hours I just gave up. In those
|
As you can tell, after more than 15 hours I just gave up. In those
|
||||||
15 hours, `mkfs.erofs` had produced a 13 GiB output file:
|
15 hours, `mkfs.erofs` had produced a 13 GiB output file:
|
||||||
|
|
||||||
|
```
|
||||||
$ ll -h perl-install.erofs
|
$ ll -h perl-install.erofs
|
||||||
-rw-r--r-- 1 mhx users 13G Dec 9 14:42 perl-install.erofs
|
-rw-r--r-- 1 mhx users 13G Dec 9 14:42 perl-install.erofs
|
||||||
|
```
|
||||||
|
|
||||||
I don't think this would have been very useful to compare with DwarFS.
|
I don't think this would have been very useful to compare with DwarFS.
|
||||||
|
|
||||||
Just as for Cromfs, I re-ran with the smaller Perl dataset:
|
Just as for Cromfs, I re-ran with the smaller Perl dataset:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkfs.erofs perl-install-small.erofs install-small -zlz4hc,9 -d2
|
$ time mkfs.erofs perl-install-small.erofs install-small -zlz4hc,9 -d2
|
||||||
mkfs.erofs 1.2
|
mkfs.erofs 1.2
|
||||||
c_version: [ 1.2]
|
c_version: [ 1.2]
|
||||||
@ -1233,20 +1365,24 @@ Just as for Cromfs, I re-ran with the smaller Perl dataset:
|
|||||||
real 0m27.844s
|
real 0m27.844s
|
||||||
user 0m20.570s
|
user 0m20.570s
|
||||||
sys 0m1.848s
|
sys 0m1.848s
|
||||||
|
```
|
||||||
|
|
||||||
That was surprisingly quick, which makes me think that, again, there
|
That was surprisingly quick, which makes me think that, again, there
|
||||||
might be some accidentally quadratic complexity hiding in `mkfs.erofs`.
|
might be some accidentally quadratic complexity hiding in `mkfs.erofs`.
|
||||||
The output file it produced is an order of magnitude larger than the
|
The output file it produced is an order of magnitude larger than the
|
||||||
DwarFS image:
|
DwarFS image:
|
||||||
|
|
||||||
|
```
|
||||||
$ ls -l perl-install-small.*fs
|
$ ls -l perl-install-small.*fs
|
||||||
-rw-r--r-- 1 mhx users 26928161 Dec 8 15:05 perl-install-small.dwarfs
|
-rw-r--r-- 1 mhx users 26928161 Dec 8 15:05 perl-install-small.dwarfs
|
||||||
-rw-r--r-- 1 mhx users 296488960 Dec 9 14:45 perl-install-small.erofs
|
-rw-r--r-- 1 mhx users 296488960 Dec 9 14:45 perl-install-small.erofs
|
||||||
|
```
|
||||||
|
|
||||||
Admittedly, this isn't a fair comparison. EROFS has a fixed block size
|
Admittedly, this isn't a fair comparison. EROFS has a fixed block size
|
||||||
of 4 KiB, and it uses LZ4 compression. If we tweak DwarFS to the same
|
of 4 KiB, and it uses LZ4 compression. If we tweak DwarFS to the same
|
||||||
parameters, we get:
|
parameters, we get:
|
||||||
|
|
||||||
|
```
|
||||||
$ time mkdwarfs -i install-small -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
|
$ time mkdwarfs -i install-small -o perl-install-small-lz4.dwarfs -C lz4hc:level=9 -S 12
|
||||||
21:21:18.136796 scanning install-small
|
21:21:18.136796 scanning install-small
|
||||||
21:21:18.376998 waiting for background scanners...
|
21:21:18.376998 waiting for background scanners...
|
||||||
@ -1281,6 +1417,7 @@ parameters, we get:
|
|||||||
real 0m9.075s
|
real 0m9.075s
|
||||||
user 0m37.718s
|
user 0m37.718s
|
||||||
sys 0m2.427s
|
sys 0m2.427s
|
||||||
|
```
|
||||||
|
|
||||||
It finishes in less than half the time and produces an output image
|
It finishes in less than half the time and produces an output image
|
||||||
that's half the size of the EROFS image.
|
that's half the size of the EROFS image.
|
||||||
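The block sizes used in these comparisons are given in two different forms: `mkcromfs -f 16777216` takes a size in bytes, while `mkdwarfs -S 12` is assumed here to take the base-2 logarithm of the block size (which is what pairing `-S 12` with a 4 KiB block suggests). A small sketch for converting between the two, with hypothetical helper names:

```python
def block_size_bytes(bits: int) -> int:
    """Block size in bytes for a given number of block size bits."""
    return 1 << bits


def block_size_bits(size: int) -> int:
    """Number of block size bits for a power-of-two size in bytes."""
    if size <= 0 or size & (size - 1):
        raise ValueError(f"{size} is not a power of two")
    return size.bit_length() - 1


print(block_size_bytes(12))       # 4 KiB, the EROFS block size
print(block_size_bits(16777216))  # bits for the 16 MiB block size
```

Under that assumption, `-S 12` corresponds to 4096-byte blocks and a 16 MiB block corresponds to 24 bits.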
|
@ -1,11 +1,9 @@
|
|||||||
dwarfs-format(5) -- DwarFS File System Format v2.3
|
# dwarfs-format(5) -- DwarFS File System Format v2.3
|
||||||
==================================================
|
|
||||||
|
|
||||||
## DESCRIPTION
|
## DESCRIPTION
|
||||||
|
|
||||||
This document describes the DwarFS file system format, version 2.3.
|
This document describes the DwarFS file system format, version 2.3.
|
||||||
|
|
||||||
|
|
||||||
## FILE STRUCTURE
|
## FILE STRUCTURE
|
||||||
|
|
||||||
A DwarFS file system image is just a sequence of blocks. Each block has the
|
A DwarFS file system image is just a sequence of blocks. Each block has the
|
||||||
@ -65,26 +63,24 @@ A couple of notes:
|
|||||||
larger than the one it supports. However, a new program will still
|
larger than the one it supports. However, a new program will still
|
||||||
read all file systems with a smaller minor version number.
|
read all file systems with a smaller minor version number.
|
||||||
|
|
||||||
|
|
||||||
### Section Types
|
### Section Types
|
||||||
|
|
||||||
There are currently 3 different section types.
|
There are currently 3 different section types.
|
||||||
|
|
||||||
* `BLOCK` (0):
|
- `BLOCK` (0):
|
||||||
A block of data. This is where all file data is stored. There can be
|
A block of data. This is where all file data is stored. There can be
|
||||||
an arbitrary number of blocks of this type.
|
an arbitrary number of blocks of this type.
|
||||||
|
|
||||||
* `METADATA_V2_SCHEMA` (7):
|
- `METADATA_V2_SCHEMA` (7):
|
||||||
The schema used to lay out the `METADATA_V2` block contents. This is
|
The schema used to lay out the `METADATA_V2` block contents. This is
|
||||||
stored in "compact" thrift encoding.
|
stored in "compact" thrift encoding.
|
||||||
|
|
||||||
* `METADATA_V2` (8):
|
- `METADATA_V2` (8):
|
||||||
This section contains the bulk of the metadata. It's essentially just
|
This section contains the bulk of the metadata. It's essentially just
|
||||||
a collection of bit-packed arrays and structures. The exact layout of
|
a collection of bit-packed arrays and structures. The exact layout of
|
||||||
each list and structure depends on the actual data and is stored
|
each list and structure depends on the actual data and is stored
|
||||||
separately in `METADATA_V2_SCHEMA`.
|
separately in `METADATA_V2_SCHEMA`.
|
||||||
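For tooling that inspects images, the three section types could be modelled as a small enumeration. This is only a sketch: the numeric values are the ones listed above, while the `SectionType` class and the `describe` helper are illustrative names and not part of DwarFS.

```python
from enum import IntEnum


class SectionType(IntEnum):
    """Section type identifiers as listed in this document."""
    BLOCK = 0
    METADATA_V2_SCHEMA = 7
    METADATA_V2 = 8


def describe(type_id: int) -> str:
    """Return the section type name, or a note for unknown identifiers."""
    try:
        return SectionType(type_id).name
    except ValueError:
        return f"unknown section type {type_id}"


for type_id in (0, 7, 8, 3):
    print(type_id, describe(type_id))
```

Treating unknown identifiers as a reportable condition rather than an error is a design choice of this sketch, not something the format description prescribes.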
|
|
||||||
|
|
||||||
## METADATA FORMAT
|
## METADATA FORMAT
|
||||||
|
|
||||||
Here is a high-level overview of how all the bits and pieces relate
|
Here is a high-level overview of how all the bits and pieces relate
|
||||||
@ -169,17 +165,12 @@ list. The index into this list is the `inode_num` from `dir_entries`,
|
|||||||
but you can perform direct lookups based on the inode number as well.
|
but you can perform direct lookups based on the inode number as well.
|
||||||
The `inodes` list is strictly in the following order:
|
The `inodes` list is strictly in the following order:
|
||||||
|
|
||||||
* directory inodes (`S_IFDIR`)
|
- directory inodes (`S_IFDIR`)
|
||||||
|
- symlink inodes (`S_IFLNK`)
|
||||||
* symlink inodes (`S_IFLNK`)
|
- regular *unique* file inodes (`S_IFREG`)
|
||||||
|
- regular *shared* file inodes (`S_IFREG`)
|
||||||
* regular *unique* file inodes (`S_IFREG`)
|
- character/block device inodes (`S_IFCHR`, `S_IFBLK`)
|
||||||
|
- socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)
|
||||||
* regular *shared* file inodes (`S_IFREG`)
|
|
||||||
|
|
||||||
* character/block device inodes (`S_IFCHR`, `S_IFBLK`)
|
|
||||||
|
|
||||||
* socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)
|
|
||||||
|
|
||||||
The offsets can thus be found by using a binary search with a
|
The offsets can thus be found by using a binary search with a
|
||||||
predicate on the inode mode. The shared file offset can be found
|
predicate on the inode mode. The shared file offset can be found
|
||||||
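As a rough illustration of the binary search mentioned above, the following Python sketch locates the first index of each inode type in a list of modes that follows the documented ordering. The helper names `inode_rank` and `first_index_of_rank` and the sample `modes` list are hypothetical and not taken from the DwarFS sources. Note that unique and shared regular files have the same mode, so this predicate alone cannot separate them.

```python
import stat


def inode_rank(mode: int) -> int:
    """Map an inode mode to its position in the documented inode ordering."""
    fmt = stat.S_IFMT(mode)
    if fmt == stat.S_IFDIR:
        return 0  # directories come first
    if fmt == stat.S_IFLNK:
        return 1  # then symlinks
    if fmt == stat.S_IFREG:
        return 2  # then regular files (unique and shared look identical here)
    if fmt in (stat.S_IFCHR, stat.S_IFBLK):
        return 3  # then character/block devices
    return 4      # sockets and pipes come last


def first_index_of_rank(modes, rank: int) -> int:
    """Binary search for the first inode whose rank is >= the given rank."""
    lo, hi = 0, len(modes)
    while lo < hi:
        mid = (lo + hi) // 2
        if inode_rank(modes[mid]) < rank:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Hypothetical sample list that already follows the documented ordering.
modes = [
    stat.S_IFDIR | 0o755,
    stat.S_IFDIR | 0o755,
    stat.S_IFLNK | 0o777,
    stat.S_IFREG | 0o644,
    stat.S_IFREG | 0o644,
    stat.S_IFREG | 0o600,
    stat.S_IFCHR | 0o600,
    stat.S_IFIFO | 0o644,
]

# Start offset of each group: dirs, symlinks, regular files, devices, ipc.
offsets = [first_index_of_rank(modes, r) for r in range(5)]
```

Because the list is partitioned by type, each offset costs O(log n) comparisons rather than a full scan.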
|
@ -1,5 +1,4 @@
|
|||||||
dwarfs(1) -- mount highly compressed read-only file system
|
# dwarfs(1) -- mount highly compressed read-only file system
|
||||||
==========================================================
|
|
||||||
|
|
||||||
## SYNOPSIS
|
## SYNOPSIS
|
||||||
|
|
||||||
@ -14,14 +13,16 @@ but it has some distinct features.
|
|||||||
Other than that, it's pretty straightforward to use. Once you've created a
|
Other than that, it's pretty straightforward to use. Once you've created a
|
||||||
file system image using mkdwarfs(1), you can mount it with:
|
file system image using mkdwarfs(1), you can mount it with:
|
||||||
|
|
||||||
|
```
|
||||||
dwarfs image.dwarfs /path/to/mountpoint
|
dwarfs image.dwarfs /path/to/mountpoint
|
||||||
|
```
|
||||||
|
|
||||||
## OPTIONS
|
## OPTIONS
|
||||||
|
|
||||||
In addition to the regular FUSE options, `dwarfs` supports the following
|
In addition to the regular FUSE options, `dwarfs` supports the following
|
||||||
options:
|
options:
|
||||||
|
|
||||||
* `-o cachesize=`*value*:
|
- `-o cachesize=`*value*:
|
||||||
Size of the block cache, in bytes. You can append suffixes
|
Size of the block cache, in bytes. You can append suffixes
|
||||||
(`k`, `m`, `g`) to specify the size in KiB, MiB and GiB,
|
(`k`, `m`, `g`) to specify the size in KiB, MiB and GiB,
|
||||||
respectively. Note that this is not the upper memory limit
|
respectively. Note that this is not the upper memory limit
|
||||||
@ -31,12 +32,12 @@ options:
|
|||||||
with it, which can use a significant amount of additional
|
with it, which can use a significant amount of additional
|
||||||
memory. For more details, see mkdwarfs(1).
|
memory. For more details, see mkdwarfs(1).
|
||||||
|
|
||||||
* `-o workers=`*value*:
|
- `-o workers=`*value*:
|
||||||
Number of worker threads to use for decompressing blocks.
|
Number of worker threads to use for decompressing blocks.
|
||||||
If you have a lot of CPUs, increasing this number can help
|
If you have a lot of CPUs, increasing this number can help
|
||||||
speed up access to files in the filesystem.
|
speed up access to files in the filesystem.
|
||||||
|
|
||||||
* `-o decratio=`*value*:
|
- `-o decratio=`*value*:
|
||||||
The ratio over which a block is fully decompressed. Blocks
|
The ratio over which a block is fully decompressed. Blocks
|
||||||
are only decompressed partially, so each block has to carry
|
are only decompressed partially, so each block has to carry
|
||||||
the decompressor state with it until it is fully decompressed.
|
the decompressor state with it until it is fully decompressed.
|
||||||
@ -49,18 +50,18 @@ options:
|
|||||||
we keep the partially decompressed block, but if we've
|
we keep the partially decompressed block, but if we've
|
||||||
decompressed more than 80%, we'll fully decompress it.
|
decompressed more than 80%, we'll fully decompress it.
|
||||||
|
|
||||||
* `-o offset=`*value*|`auto`:
|
- `-o offset=`*value*|`auto`:
|
||||||
Specify the byte offset at which the filesystem is located in
|
Specify the byte offset at which the filesystem is located in
|
||||||
the image, or use `auto` to detect the offset automatically.
|
the image, or use `auto` to detect the offset automatically.
|
||||||
This is only useful for images that have some header located
|
This is only useful for images that have some header located
|
||||||
before the actual filesystem data.
|
before the actual filesystem data.
|
||||||
|
|
||||||
* `-o mlock=none`|`try`|`must`:
|
- `-o mlock=none`|`try`|`must`:
|
||||||
Set this to `try` or `must` instead of the default `none` to
|
Set this to `try` or `must` instead of the default `none` to
|
||||||
try or require `mlock()`ing of the file system metadata into
|
try or require `mlock()`ing of the file system metadata into
|
||||||
memory.
|
memory.
|
||||||
|
|
||||||
* `-o enable_nlink`:
|
- `-o enable_nlink`:
|
||||||
Set this option if you want correct hardlink counts for regular
|
Set this option if you want correct hardlink counts for regular
|
||||||
files. If this is not specified, the hardlink count will be 1.
|
files. If this is not specified, the hardlink count will be 1.
|
||||||
Enabling this will slow down the initialization of the fuse
|
Enabling this will slow down the initialization of the fuse
|
||||||
@ -70,7 +71,7 @@ options:
|
|||||||
will also consume more memory to hold the hardlink count table.
|
will also consume more memory to hold the hardlink count table.
|
||||||
This will be 4 bytes for every regular file inode.
|
This will be 4 bytes for every regular file inode.
|
||||||
|
|
||||||
* `-o readonly`:
|
- `-o readonly`:
|
||||||
Show all file system entries as read-only. By default, DwarFS
|
Show all file system entries as read-only. By default, DwarFS
|
||||||
will preserve the original writeability, which is obviously a
|
will preserve the original writeability, which is obviously a
|
||||||
lie as it's a read-only file system. However, this is needed
|
lie as it's a read-only file system. However, this is needed
|
||||||
@ -80,7 +81,7 @@ options:
|
|||||||
overlays and want the file system to reflect its read-only
|
overlays and want the file system to reflect its read-only
|
||||||
state, you can set this option.
|
state, you can set this option.
|
||||||
|
|
||||||
* `-o (no_)cache_image`:
|
- `-o (no_)cache_image`:
|
||||||
By default, `dwarfs` tries to ensure that the compressed file
|
By default, `dwarfs` tries to ensure that the compressed file
|
||||||
system image will not be cached by the kernel (i.e. the default
|
system image will not be cached by the kernel (i.e. the default
|
||||||
is `-o no_cache_image`). This will reduce the memory consumption
|
is `-o no_cache_image`). This will reduce the memory consumption
|
||||||
@ -91,7 +92,7 @@ options:
|
|||||||
`-o cache_image` to keep the compressed image data in the kernel
|
`-o cache_image` to keep the compressed image data in the kernel
|
||||||
cache.
|
cache.
|
||||||
|
|
||||||
* `-o (no_)cache_files`:
|
- `-o (no_)cache_files`:
|
||||||
By default, files in the mounted file system will be cached by
|
By default, files in the mounted file system will be cached by
|
||||||
the kernel (i.e. the default is `-o cache_files`). This will
|
the kernel (i.e. the default is `-o cache_files`). This will
|
||||||
significantly improve performance when accessing the same files
|
significantly improve performance when accessing the same files
|
||||||
@ -103,14 +104,14 @@ options:
|
|||||||
though it's likely that the kernel will already do the right thing
|
though it's likely that the kernel will already do the right thing
|
||||||
even when the cache is enabled.
|
even when the cache is enabled.
|
||||||
|
|
||||||
* `-o debuglevel=`*name*:
|
- `-o debuglevel=`*name*:
|
||||||
Use this for different levels of verbosity along with either
|
Use this for different levels of verbosity along with either
|
||||||
the `-f` or `-d` FUSE options. This can give you some insight
|
the `-f` or `-d` FUSE options. This can give you some insight
|
||||||
into what the file system driver is doing internally, but it's
|
into what the file system driver is doing internally, but it's
|
||||||
mainly meant for debugging and the `debug` and `trace` levels
|
mainly meant for debugging and the `debug` and `trace` levels
|
||||||
in particular will slow down the driver.
|
in particular will slow down the driver.
|
||||||
|
|
||||||
* `-o tidy_strategy=`*name*:
|
- `-o tidy_strategy=`*name*:
|
||||||
Use one of the following strategies to tidy the block cache:
|
Use one of the following strategies to tidy the block cache:
|
||||||
|
|
||||||
- `none`:
|
- `none`:
|
||||||
@ -128,14 +129,14 @@ options:
|
|||||||
cache is traversed and all blocks that have been fully or
|
cache is traversed and all blocks that have been fully or
|
||||||
partially swapped out by the kernel will be removed.
|
partially swapped out by the kernel will be removed.
|
||||||
|
|
||||||
* `-o tidy_interval=`*time*:
|
- `-o tidy_interval=`*time*:
|
||||||
Used only if `tidy_strategy` is not `none`. This is the interval
|
Used only if `tidy_strategy` is not `none`. This is the interval
|
||||||
at which the cache tidying thread wakes up to look for blocks
|
at which the cache tidying thread wakes up to look for blocks
|
||||||
that can be removed from the cache. This must be an integer value.
|
that can be removed from the cache. This must be an integer value.
|
||||||
Suffixes `ms`, `s`, `m`, `h` are supported. If no suffix is given,
|
Suffixes `ms`, `s`, `m`, `h` are supported. If no suffix is given,
|
||||||
the value will be assumed to be in seconds.
|
the value will be assumed to be in seconds.
|
||||||
|
|
||||||
* `-o tidy_max_age=`*time*:
|
- `-o tidy_max_age=`*time*:
|
||||||
Used only if `tidy_strategy` is `time`. A block will be removed
|
Used only if `tidy_strategy` is `time`. A block will be removed
|
||||||
from the cache if it hasn't been used for this time span. This must
|
from the cache if it hasn't been used for this time span. This must
|
||||||
be an integer value. Suffixes `ms`, `s`, `m`, `h` are supported.
|
be an integer value. Suffixes `ms`, `s`, `m`, `h` are supported.
|
||||||
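To make the time syntax accepted by `tidy_interval` and `tidy_max_age` concrete, here is a minimal Python sketch of how such values could be interpreted. It is only an illustration of the rules stated above (integer value, optional `ms`/`s`/`m`/`h` suffix, seconds when no suffix is given). The function name `parse_time_ms` is hypothetical and this is not the parser used by `dwarfs`.

```python
import re

# Milliseconds per documented suffix.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}


def parse_time_ms(text: str) -> int:
    """Convert a value such as '500ms', '30s', '5m' or '2h' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)?", text)
    if match is None:
        # Non-integer values such as '1.5s' are rejected.
        raise ValueError(f"invalid time value: {text!r}")
    value, unit = match.groups()
    # No suffix means the value is in seconds.
    return int(value) * _UNIT_MS[unit or "s"]
```

For example, under these assumptions `-o tidy_interval=5m` and `-o tidy_interval=300` would describe the same interval.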
@ -145,7 +146,7 @@ There's two particular FUSE options that you'll likely need at some
|
|||||||
point, e.g. when trying to set up an `overlayfs` mount on top of
|
point, e.g. when trying to set up an `overlayfs` mount on top of
|
||||||
a DwarFS image:
|
a DwarFS image:
|
||||||
|
|
||||||
* `-o allow_root` and `-o allow_other`:
|
- `-o allow_root` and `-o allow_other`:
|
||||||
These will ensure that the mounted file system can be read by
|
These will ensure that the mounted file system can be read by
|
||||||
either `root` or any other user in addition to the user that
|
either `root` or any other user in addition to the user that
|
||||||
started the fuse driver. So if you're running `dwarfs` as a
|
started the fuse driver. So if you're running `dwarfs` as a
|
||||||
@ -193,27 +194,33 @@ set of Perl versions back.
|
|||||||
|
|
||||||
Here's what you need to do:
|
Here's what you need to do:
|
||||||
|
|
||||||
* Create a set of directories. In my case, these are all located
|
- Create a set of directories. In my case, these are all located
|
||||||
in `/tmp/perl` as this was the original install location.
|
in `/tmp/perl` as this was the original install location.
|
||||||
|
|
||||||
|
```
|
||||||
cd /tmp/perl
|
cd /tmp/perl
|
||||||
mkdir install-ro
|
mkdir install-ro
|
||||||
mkdir install-rw
|
mkdir install-rw
|
||||||
mkdir install-work
|
mkdir install-work
|
||||||
mkdir install
|
mkdir install
|
||||||
|
```
|
||||||
|
|
||||||
* Mount the DwarFS image. `-o allow_root` is needed to make sure
|
- Mount the DwarFS image. `-o allow_root` is needed to make sure
|
||||||
`overlayfs` has access to the mounted file system. In order
|
`overlayfs` has access to the mounted file system. In order
|
||||||
to use `-o allow_root`, you may have to uncomment or add
|
to use `-o allow_root`, you may have to uncomment or add
|
||||||
`user_allow_other` in `/etc/fuse.conf`.
|
`user_allow_other` in `/etc/fuse.conf`.
|
||||||
|
|
||||||
|
```
|
||||||
dwarfs perl-install.dwarfs install-ro -o allow_root
|
dwarfs perl-install.dwarfs install-ro -o allow_root
|
||||||
|
```
|
||||||
|
|
||||||
* Now set up `overlayfs`.
|
- Now set up `overlayfs`.
|
||||||
|
|
||||||
|
```
|
||||||
sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
|
sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
|
||||||
|
```
|
||||||
|
|
||||||
* That's it. You should now be able to access a writeable version
|
- That's it. You should now be able to access a writeable version
|
||||||
of your DwarFS image in `install`.
|
of your DwarFS image in `install`.
|
||||||
|
|
||||||
You can go even further than that. Say you have different sets of
|
You can go even further than that. Say you have different sets of
|
||||||
@ -223,7 +230,9 @@ the read-write directory after unmounting the `overlayfs`, and
|
|||||||
selectively add this by passing a colon-separated list to the
|
selectively add this by passing a colon-separated list to the
|
||||||
`lowerdir` option when setting up the `overlayfs` mount:
|
`lowerdir` option when setting up the `overlayfs` mount:
|
||||||
|
|
||||||
|
```
|
||||||
sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
|
sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
|
||||||
|
```
|
||||||
|
|
||||||
If you want *this* merged overlay to be writable, just add in the
|
If you want *this* merged overlay to be writable, just add in the
|
||||||
`upperdir` and `workdir` options from before again.
|
`upperdir` and `workdir` options from before again.
|
||||||
|
@ -1,5 +1,4 @@
|
|||||||
dwarfsck(1) -- check DwarFS image
|
# dwarfsck(1) -- check DwarFS image
|
||||||
=================================
|
|
||||||
|
|
||||||
## SYNOPSIS
|
## SYNOPSIS
|
||||||
|
|
||||||
@ -15,42 +14,42 @@ with a non-zero exit code.
|
|||||||
|
|
||||||
## OPTIONS
|
## OPTIONS
|
||||||
|
|
||||||
* `-i`, `--input=`*file*:
|
- `-i`, `--input=`*file*:
|
||||||
Path to the filesystem image.
|
Path to the filesystem image.
|
||||||
|
|
||||||
* `-d`, `--detail=`*value*:
|
- `-d`, `--detail=`*value*:
|
||||||
Level of filesystem information detail. The default is 2. Higher values
|
Level of filesystem information detail. The default is 2. Higher values
|
||||||
mean more output. Values larger than 6 will currently not provide any
|
mean more output. Values larger than 6 will currently not provide any
|
||||||
further detail.
|
further detail.
|
||||||
|
|
||||||
* `-O`, `--image-offset=`*value*|`auto`:
|
- `-O`, `--image-offset=`*value*|`auto`:
|
||||||
Specify the byte offset at which the filesystem is located in the image.
|
Specify the byte offset at which the filesystem is located in the image.
|
||||||
Use `auto` to detect the offset automatically. This is also the default.
|
Use `auto` to detect the offset automatically. This is also the default.
|
||||||
This is only useful for images that have some header located before the
|
This is only useful for images that have some header located before the
|
||||||
actual filesystem data.
|
actual filesystem data.
|
||||||
|
|
||||||
* `-H`, `--print-header`:
|
- `-H`, `--print-header`:
|
||||||
Print the header located before the filesystem image to stdout. If no
|
Print the header located before the filesystem image to stdout. If no
|
||||||
header is present, the program will exit with a non-zero exit code.
|
header is present, the program will exit with a non-zero exit code.
|
||||||
|
|
||||||
* `-n`, `--num-workers=`*value*:
|
- `-n`, `--num-workers=`*value*:
|
||||||
Number of worker threads used for integrity checking.
|
Number of worker threads used for integrity checking.
|
||||||
|
|
||||||
* `--check-integrity`:
|
- `--check-integrity`:
|
||||||
In addition to performing a fast checksum check, also perform a (much
|
In addition to performing a fast checksum check, also perform a (much
|
||||||
slower) verification of the embedded SHA-512/256 hashes.
|
slower) verification of the embedded SHA-512/256 hashes.
|
||||||
|
|
||||||
* `--json`:
|
- `--json`:
|
||||||
Print a simple JSON representation of the filesystem metadata. Please
|
Print a simple JSON representation of the filesystem metadata. Please
|
||||||
note that the format is *not* stable.
|
note that the format is *not* stable.
|
||||||
|
|
||||||
* `--export-metadata=`*file*:
|
- `--export-metadata=`*file*:
|
||||||
Export all filesystem metadata in JSON format.
|
Export all filesystem metadata in JSON format.
|
||||||
|
|
||||||
* `--log-level=`*name*:
|
- `--log-level=`*name*:
|
||||||
Specify a logging level.
|
Specify a logging level.
|
||||||
|
|
||||||
* `--help`:
|
- `--help`:
|
||||||
Show program help, including option defaults.
|
Show program help, including option defaults.
|
||||||
|
|
||||||
## AUTHOR
|
## AUTHOR
|
||||||
|
@ -1,9 +1,8 @@
|
|||||||
dwarfsextract(1) -- extract DwarFS image
|
# dwarfsextract(1) -- extract DwarFS image
|
||||||
========================================
|
|
||||||
|
|
||||||
## SYNOPSIS
|
## SYNOPSIS
|
||||||
|
|
||||||
`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]<br>
|
`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]
|
||||||
`dwarfsextract` `-i` *image* -f *format* [`-o` *file*] [*options*...]
|
`dwarfsextract` `-i` *image* -f *format* [`-o` *file*] [*options*...]
|
||||||
|
|
||||||
## DESCRIPTION
|
## DESCRIPTION
|
||||||
@ -35,32 +34,32 @@ to disk:
|
|||||||
|
|
||||||
## OPTIONS
|
## OPTIONS
|
||||||
|
|
||||||
* `-i`, `--input=`*file*:
|
- `-i`, `--input=`*file*:
|
||||||
Path to the source filesystem.
|
Path to the source filesystem.
|
||||||
|
|
||||||
* `-o`, `--output=`*directory*|*file*:
|
- `-o`, `--output=`*directory*|*file*:
|
||||||
If no format is specified, this is the directory to which the contents
|
If no format is specified, this is the directory to which the contents
|
||||||
of the filesystem should be extracted. If a format is specified, this
|
of the filesystem should be extracted. If a format is specified, this
|
||||||
is the name of the output archive. This option can be omitted, in which
|
is the name of the output archive. This option can be omitted, in which
|
||||||
case the default is to extract the files to the current directory, or
|
case the default is to extract the files to the current directory, or
|
||||||
to write the archive data to stdout.
|
to write the archive data to stdout.
|
||||||
|
|
||||||
* `-O`, `--image-offset=`*value*|`auto`:
|
- `-O`, `--image-offset=`*value*|`auto`:
|
||||||
Specify the byte offset at which the filesystem is located in the image.
|
Specify the byte offset at which the filesystem is located in the image.
|
||||||
Use `auto` to detect the offset automatically. This is also the default.
|
Use `auto` to detect the offset automatically. This is also the default.
|
||||||
This is only useful for images that have some header located before the
|
This is only useful for images that have some header located before the
|
||||||
actual filesystem data.
|
actual filesystem data.
|
||||||
|
|
||||||
* `-f`, `--format=`*format*:
|
- `-f`, `--format=`*format*:
|
||||||
The archive format to produce. If this is left empty or unspecified,
|
The archive format to produce. If this is left empty or unspecified,
|
||||||
files will be extracted to the output directory (or the current directory
|
files will be extracted to the output directory (or the current directory
|
||||||
if no output directory is specified). For a full list of supported formats,
|
if no output directory is specified). For a full list of supported formats,
|
||||||
see libarchive-formats(5).
|
see libarchive-formats(5).
|
||||||
|
|
||||||
* `-n`, `--num-workers=`*value*:
|
- `-n`, `--num-workers=`*value*:
|
||||||
Number of worker threads used for extracting the filesystem.
|
Number of worker threads used for extracting the filesystem.
|
||||||
|
|
||||||
* `-s`, `--cache-size=`*value*:
|
- `-s`, `--cache-size=`*value*:
|
||||||
Size of the block cache, in bytes. You can append suffixes (`k`, `m`, `g`)
|
Size of the block cache, in bytes. You can append suffixes (`k`, `m`, `g`)
|
||||||
to specify the size in KiB, MiB and GiB, respectively. Note that this is
|
to specify the size in KiB, MiB and GiB, respectively. Note that this is
|
||||||
not the upper memory limit of the process, as there may be blocks in
|
not the upper memory limit of the process, as there may be blocks in
|
||||||
@ -68,10 +67,10 @@ to disk:
|
|||||||
fully decompressed yet will carry decompressor state along with it, which
|
fully decompressed yet will carry decompressor state along with it, which
|
||||||
can use a significant amount of additional memory.
|
can use a significant amount of additional memory.
|
||||||
|
|
||||||
* `--log-level=`*name*:
|
- `--log-level=`*name*:
|
||||||
Specify a logging level.
|
Specify a logging level.
|
||||||
|
|
||||||
* `--help`:
|
- `--help`:
|
||||||
Show program help, including option defaults.
|
Show program help, including option defaults.
|
||||||
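The size suffixes described for `--cache-size` can be illustrated with a short Python sketch. It assumes only what the option text states (`k`, `m`, `g` meaning KiB, MiB and GiB, and a plain number meaning bytes). The function name `parse_size_bytes` is hypothetical, and whether the real tool accepts upper-case suffixes or other forms is not covered here.

```python
import re

# Binary shift per documented suffix: KiB = 2**10, MiB = 2**20, GiB = 2**30.
_SIZE_SHIFT = {"": 0, "k": 10, "m": 20, "g": 30}


def parse_size_bytes(text: str) -> int:
    """Convert a value such as '4096', '64k', '512m' or '1g' to bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", text)
    if match is None:
        raise ValueError(f"invalid size value: {text!r}")
    value, suffix = match.groups()
    return int(value) << _SIZE_SHIFT[suffix]
```

With this reading, `-s 512m` requests a block cache of 536870912 bytes, which, as noted above, is not an upper bound on the memory used by the process.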
|
|
||||||
## AUTHOR
|
## AUTHOR
|
||||||
|
@ -1,9 +1,8 @@
|
|||||||
mkdwarfs(1) -- create highly compressed read-only file systems
|
# mkdwarfs(1) -- create highly compressed read-only file systems
|
||||||
==============================================================
|
|
||||||
|
|
||||||
## SYNOPSIS
|
## SYNOPSIS
|
||||||
|
|
||||||
`mkdwarfs` `-i` *path* `-o` *file* [*options*...]<br>
|
`mkdwarfs` `-i` *path* `-o` *file* [*options*...]
|
||||||
`mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...]
|
`mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...]
|
||||||
|
|
||||||
## DESCRIPTION
|
## DESCRIPTION
|
||||||
@ -26,17 +25,17 @@ After that, you can mount it with dwarfs(1):
|
|||||||
|
|
||||||
There are two mandatory options for specifying the input and output:
|
There are two mandatory options for specifying the input and output:
|
||||||
|
|
||||||
* `-i`, `--input=`*path*|*file*:
|
- `-i`, `--input=`*path*|*file*:
|
||||||
Path to the root directory containing the files from which you want to
|
Path to the root directory containing the files from which you want to
|
||||||
build a filesystem. If the `--recompress` option is given, this argument
|
build a filesystem. If the `--recompress` option is given, this argument
|
||||||
is the source filesystem.
|
is the source filesystem.
|
||||||
|
|
||||||
* `-o`, `--output=`*file*:
|
- `-o`, `--output=`*file*:
|
||||||
File name of the output filesystem.
|
File name of the output filesystem.
|
||||||
|
|
||||||
Most other options are concerned with compression tuning:
|
Most other options are concerned with compression tuning:
|
||||||
|
|
||||||
* `-l`, `--compress-level=`*value*:
|
- `-l`, `--compress-level=`*value*:
|
||||||
Compression level to use for the filesystem. **If you are unsure, please
|
Compression level to use for the filesystem. **If you are unsure, please
|
||||||
stick to the default level of 7.** This is intended to provide some
|
stick to the default level of 7.** This is intended to provide some
|
||||||
sensible defaults and will depend on which compression libraries were
|
sensible defaults and will depend on which compression libraries were
|
||||||
@ -53,7 +52,7 @@ Most other options are concerned with compression tuning:
|
|||||||
`--window-step` and `--order`. See the output of `mkdwarfs --help` for
|
`--window-step` and `--order`. See the output of `mkdwarfs --help` for
|
||||||
a table listing the exact defaults used for each compression level.
|
a table listing the exact defaults used for each compression level.
|
||||||
|
|
||||||
* `-S`, `--block-size-bits=`*value*:
|
- `-S`, `--block-size-bits=`*value*:
|
||||||
The block size used for the compressed filesystem. The actual block size
|
The block size used for the compressed filesystem. The actual block size
|
||||||
is two to the power of this value. Larger block sizes will offer better
|
is two to the power of this value. Larger block sizes will offer better
|
||||||
overall compression ratios, but will be slower and consume more memory
|
overall compression ratios, but will be slower and consume more memory
|
||||||
@ -61,7 +60,7 @@ Most other options are concerned with compression tuning:
|
|||||||
least partially decompressed into memory. Values between 20 and 26, i.e.
|
least partially decompressed into memory. Values between 20 and 26, i.e.
|
||||||
between 1MiB and 64MiB, usually work quite well.
|
between 1MiB and 64MiB, usually work quite well.
|
||||||
|
|
||||||
* `-N`, `--num-workers=`*value*:
|
- `-N`, `--num-workers=`*value*:
|
||||||
Number of worker threads used for building the filesystem. This defaults
|
Number of worker threads used for building the filesystem. This defaults
|
||||||
to the number of processors available on your system. Use this option if
|
to the number of processors available on your system. Use this option if
|
||||||
you want to limit the resources used by `mkdwarfs`.
|
you want to limit the resources used by `mkdwarfs`.
|
||||||
@ -75,7 +74,7 @@ Most other options are concerned with compression tuning:
|
|||||||
individual filesystem blocks in the background. Ordering, segmenting
|
individual filesystem blocks in the background. Ordering, segmenting
|
||||||
and block building are, again, single-threaded and run independently.
|
and block building are, again, single-threaded and run independently.
|
||||||
|
|
||||||
* `-B`, `--max-lookback-blocks=`*value*:
|
- `-B`, `--max-lookback-blocks=`*value*:
|
||||||
Specify how many of the most recent blocks to scan for duplicate segments.
|
Specify how many of the most recent blocks to scan for duplicate segments.
|
||||||
By default, only the current block will be scanned. The larger this number,
|
By default, only the current block will be scanned. The larger this number,
|
||||||
the more duplicate segments will likely be found, which may further improve
|
the more duplicate segments will likely be found, which may further improve
|
||||||
@ -84,7 +83,7 @@ Most other options are concerned with compression tuning:
|
|||||||
files can now potentially span multiple filesystem blocks. Passing `-B0`
|
files can now potentially span multiple filesystem blocks. Passing `-B0`
|
||||||
will completely disable duplicate segment search.
|
will completely disable duplicate segment search.
|
||||||
|
|
||||||
* `-W`, `--window-size=`*value*:
|
- `-W`, `--window-size=`*value*:
|
||||||
Window size of cyclic hash used for segmenting. This is again an exponent
|
Window size of cyclic hash used for segmenting. This is again an exponent
|
||||||
to a base of two. Cyclic hashes are used by `mkdwarfs` for finding
|
to a base of two. Cyclic hashes are used by `mkdwarfs` for finding
|
||||||
identical segments across multiple files. This is done on top of duplicate
|
identical segments across multiple files. This is done on top of duplicate
|
||||||
@ -101,7 +100,7 @@ Most other options are concerned with compression tuning:
|
|||||||
size will grow. Passing `-W0` will completely disable duplicate segment
|
size will grow. Passing `-W0` will completely disable duplicate segment
|
||||||
search.
|
search.
|
||||||
|
|
||||||
* `-w`, `--window-step=`*value*:
|
- `-w`, `--window-step=`*value*:
|
||||||
This option specifies how often cyclic hash values are stored for lookup.
|
This option specifies how often cyclic hash values are stored for lookup.
|
||||||
It is specified relative to the window size, as a base-2 exponent that
|
It is specified relative to the window size, as a base-2 exponent that
|
||||||
divides the window size. To give a concrete example, if `--window-size=16`
|
divides the window size. To give a concrete example, if `--window-size=16`
|
||||||
@ -114,7 +113,7 @@ Most other options are concerned with compression tuning:
|
|||||||
If you use a larger value for this option, the increments become *smaller*,
|
If you use a larger value for this option, the increments become *smaller*,
|
||||||
and `mkdwarfs` will be slightly slower and use more memory.
|
and `mkdwarfs` will be slightly slower and use more memory.
|
||||||
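Several of the options above (`--block-size-bits`, `--window-size`, `--window-step`) are base-2 exponents rather than byte counts. The following Python sketch shows the arithmetic as described in the option text. The function names are hypothetical, and the step formula assumes that the step is simply the window size divided by two to the power of `--window-step`.

```python
def block_size_bytes(block_size_bits: int) -> int:
    """Actual block size: two to the power of --block-size-bits."""
    return 1 << block_size_bits


def window_size_bytes(window_size: int) -> int:
    """Cyclic hash window in bytes (a value of 0 disables segment search)."""
    return 1 << window_size


def window_step_bytes(window_size: int, window_step: int) -> int:
    """Distance in bytes between stored hash values.

    Assumes the step is the window size divided by 2**window_step, so a
    larger --window-step yields a smaller increment.
    """
    return window_size_bytes(window_size) >> window_step


MIB = 1 << 20
low, high = block_size_bytes(20) // MIB, block_size_bytes(26) // MIB  # 1 and 64 MiB
step_w1 = window_step_bytes(16, 1)  # half of a 65536-byte window
step_w2 = window_step_bytes(16, 2)  # a quarter of a 65536-byte window
```

Halving the step doubles the number of hash values that need to be stored, which matches the note that larger `--window-step` values use more memory.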
|
|
||||||
* `--bloom-filter-size`=*value*:
|
- `--bloom-filter-size`=*value*:
|
||||||
The segmenting algorithm uses a bloom filter to determine quickly if
|
The segmenting algorithm uses a bloom filter to determine quickly if
|
||||||
there is *no* match at a given position. This will filter out more than
|
there is *no* match at a given position. This will filter out more than
|
||||||
90% of bad matches quickly with the default bloom filter size. The default
|
90% of bad matches quickly with the default bloom filter size. The default
|
||||||
@ -123,7 +122,7 @@ Most other options are concerned with compression tuning:
|
|||||||
be able to see some improvement. If you're tight on memory, then decreasing
|
be able to see some improvement. If you're tight on memory, then decreasing
|
||||||
this will potentially save a few MiBs.
|
this will potentially save a few MiBs.
|
||||||
|
|
||||||
* `-L`, `--memory-limit=`*value*:
|
- `-L`, `--memory-limit=`*value*:
|
||||||
Approximately how much memory you want `mkdwarfs` to use during filesystem
|
Approximately how much memory you want `mkdwarfs` to use during filesystem
|
||||||
creation. Note that currently this will only affect the block manager
|
creation. Note that currently this will only affect the block manager
|
||||||
component, i.e. the number of filesystem blocks that are in flight but
|
component, i.e. the number of filesystem blocks that are in flight but
|
||||||
@ -134,24 +133,24 @@ Most other options are concerned with compression tuning:
|
|||||||
algorithms, so if you're short on memory it might be worth tweaking the
|
algorithms, so if you're short on memory it might be worth tweaking the
|
||||||
compression options.
|
compression options.
|
||||||
|
|
||||||
* `-C`, `--compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
- `-C`, `--compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
||||||
The compression algorithm and configuration used for file system data.
|
The compression algorithm and configuration used for file system data.
|
||||||
The value for this option is a colon-separated list. The first item is
|
The value for this option is a colon-separated list. The first item is
|
||||||
the compression algorithm, the remaining items are its options. Options
|
the compression algorithm, the remaining items are its options. Options
|
||||||
can be either boolean or have a value. For details on which algori`thms
|
can be either boolean or have a value. For details on which algorithms
|
||||||
and options are available, see the output of `mkdwarfs --help`. `zstd`
|
and options are available, see the output of `mkdwarfs --help`. `zstd`
|
||||||
will give you the best compression while still keeping decompression
|
will give you the best compression while still keeping decompression
|
||||||
*very* fast. `lzma` will compress even better, but decompression will
|
*very* fast. `lzma` will compress even better, but decompression will
|
||||||
be around ten times slower.
|
be around ten times slower.
|
||||||
|
|
||||||
* `--schema-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
- `--schema-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
||||||
The compression algorithm and configuration used for the metadata schema.
|
The compression algorithm and configuration used for the metadata schema.
|
||||||
Takes the same arguments as `--compression` above. The schema is *very*
|
Takes the same arguments as `--compression` above. The schema is *very*
|
||||||
small, in the hundreds of bytes, so this is only relevant for extremely
|
small, in the hundreds of bytes, so this is only relevant for extremely
|
||||||
small file systems. The default (`zstd`) has been shown to give considerably
|
small file systems. The default (`zstd`) has been shown to give considerably
|
||||||
better results than any other algorithm.
|
better results than any other algorithm.
|
||||||
|
|
||||||
* `--metadata-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
- `--metadata-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
|
||||||
The compression algorithm and configuration used for the metadata.
|
The compression algorithm and configuration used for the metadata.
|
||||||
Takes the same arguments as `--compression` above. The metadata has been
|
Takes the same arguments as `--compression` above. The metadata has been
|
||||||
optimized for very little redundancy and leaving it uncompressed, the
|
optimized for very little redundancy and leaving it uncompressed, the
|
||||||
@ -161,7 +160,7 @@ Most other options are concerned with compression tuning:
|
|||||||
care about mount time, you can safely choose `lzma` compression here, as
|
care about mount time, you can safely choose `lzma` compression here, as
|
||||||
the data will only have to be decompressed once when mounting the image.
|
the data will only have to be decompressed once when mounting the image.
|
||||||
|
|
||||||
* `--recompress`[`=all`|`=block`|`=metadata`|`=none`]:
|
- `--recompress`[`=all`|`=block`|`=metadata`|`=none`]:
|
||||||
Take an existing DwarFS file system and recompress it using different
|
Take an existing DwarFS file system and recompress it using different
|
||||||
compression algorithms. If no argument or `all` is given, all sections
|
compression algorithms. If no argument or `all` is given, all sections
|
||||||
in the file system image will be recompressed. Note that *only* the
|
in the file system image will be recompressed. Note that *only* the
|
||||||
@ -177,7 +176,7 @@ Most other options are concerned with compression tuning:
|
|||||||
metadata to uncompressed metadata without having to rebuild or recompress
|
metadata to uncompressed metadata without having to rebuild or recompress
|
||||||
all the other data.
|
all the other data.
|
||||||
|
|
||||||
* `-P`, `--pack-metadata=auto`|`none`|[`all`|`chunk_table`|`directories`|`shared_files`|`names`|`names_index`|`symlinks`|`symlinks_index`|`force`|`plain`[`,`...]]:
|
- `-P`, `--pack-metadata=auto`|`none`|[`all`|`chunk_table`|`directories`|`shared_files`|`names`|`names_index`|`symlinks`|`symlinks_index`|`force`|`plain`[`,`...]]:
|
||||||
Which metadata information to store in packed format. This is primarily
|
Which metadata information to store in packed format. This is primarily
|
||||||
useful when storing metadata uncompressed, as it allows for smaller
|
useful when storing metadata uncompressed, as it allows for smaller
|
||||||
metadata block size without having to turn on compression. Keep in mind,
|
metadata block size without having to turn on compression. Keep in mind,
|
||||||
@ -189,34 +188,34 @@ Most other options are concerned with compression tuning:
|
|||||||
systems that contain hundreds of thousands of files.
|
systems that contain hundreds of thousands of files.
|
||||||
See [Metadata Packing](#metadata-packing) for more details.
|
See [Metadata Packing](#metadata-packing) for more details.
|
||||||
|
|
||||||
* `--set-owner=`*uid*:
|
- `--set-owner=`*uid*:
|
||||||
Set the owner for all entities in the file system. This can reduce the
|
Set the owner for all entities in the file system. This can reduce the
|
||||||
size of the file system. If the input only has a single owner already,
|
size of the file system. If the input only has a single owner already,
|
||||||
setting this won't make any difference.
|
setting this won't make any difference.
|
||||||
|
|
||||||
* `--set-group=`*gid*:
|
- `--set-group=`*gid*:
|
||||||
Set the group for all entities in the file system. This can reduce the
|
Set the group for all entities in the file system. This can reduce the
|
||||||
size of the file system. If the input only has a single group already,
|
size of the file system. If the input only has a single group already,
|
||||||
setting this won't make any difference.
|
setting this won't make any difference.
|
||||||
|
|
||||||
* `--set-time=`*time*|`now`:
|
- `--set-time=`*time*|`now`:
|
||||||
Set the time stamps for all entities to this value. This can significantly
|
Set the time stamps for all entities to this value. This can significantly
|
||||||
reduce the size of the file system. You can pass either a unix time stamp
|
reduce the size of the file system. You can pass either a unix time stamp
|
||||||
or `now`.
|
or `now`.
|
||||||
|
|
||||||
* `--keep-all-times`:
|
- `--keep-all-times`:
|
||||||
As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
|
As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
|
||||||
the `mtime` field in order to save metadata space. If you want to save
|
the `mtime` field in order to save metadata space. If you want to save
|
||||||
`atime` and `ctime` as well, use this option.
|
`atime` and `ctime` as well, use this option.
|
||||||
|
|
||||||
* `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:
|
- `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:
|
||||||
Specify the resolution with which time stamps are stored. By default,
|
Specify the resolution with which time stamps are stored. By default,
|
||||||
time stamps are stored with second resolution. You can specify "odd"
|
time stamps are stored with second resolution. You can specify "odd"
|
||||||
resolutions as well, e.g. something like 15 second resolution is
|
resolutions as well, e.g. something like 15 second resolution is
|
||||||
entirely possible. Moving from second to minute resolution, for example,
|
entirely possible. Moving from second to minute resolution, for example,
|
||||||
will save roughly 6 bits per file system entry in the metadata block.
|
will save roughly 6 bits per file system entry in the metadata block.
|
||||||
|
|
||||||
* `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:
|
- `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:
|
||||||
The order in which inodes will be written to the file system. Choosing `none`,
|
The order in which inodes will be written to the file system. Choosing `none`,
|
||||||
the inodes will be stored in the order in which they are discovered. With
|
the inodes will be stored in the order in which they are discovered. With
|
||||||
`path`, they will be sorted asciibetically by path name of the first file
|
`path`, they will be sorted asciibetically by path name of the first file
|
||||||
@ -243,35 +242,35 @@ Most other options are concerned with compression tuning:
|
|||||||
Last but not least, if scripting support is built into `mkdwarfs`, you can
|
Last but not least, if scripting support is built into `mkdwarfs`, you can
|
||||||
choose `script` to let the script determine the order.
|
choose `script` to let the script determine the order.
|
||||||
|
|
||||||
* `--remove-empty-dirs`:
|
- `--remove-empty-dirs`:
|
||||||
Removes all empty directories from the output file system, recursively.
|
Removes all empty directories from the output file system, recursively.
|
||||||
This is particularly useful when using scripts that filter out a lot of
|
This is particularly useful when using scripts that filter out a lot of
|
||||||
file system entries.
|
file system entries.
|
||||||
|
|
||||||
* `--with-devices`:
|
- `--with-devices`:
|
||||||
Include character and block devices in the output file system. These are
|
Include character and block devices in the output file system. These are
|
||||||
not included by default, and due to security measures in FUSE, they will
|
not included by default, and due to security measures in FUSE, they will
|
||||||
never work in the mounted file system. However, they can still be copied
|
never work in the mounted file system. However, they can still be copied
|
||||||
out of the mounted file system, for example using `rsync`.
|
out of the mounted file system, for example using `rsync`.
|
||||||
|
|
||||||
* `--with-specials`:
|
- `--with-specials`:
|
||||||
Include named fifos and sockets in the output file system. These are not
|
Include named fifos and sockets in the output file system. These are not
|
||||||
included by default.
|
included by default.
|
||||||
|
|
||||||
* `--header=`*file*:
|
- `--header=`*file*:
|
||||||
Read header from file and place it before the output filesystem image.
|
Read header from file and place it before the output filesystem image.
|
||||||
Can be used with `--recompress` to add or replace a header.
|
Can be used with `--recompress` to add or replace a header.
|
||||||
|
|
||||||
* `--remove-header`:
|
- `--remove-header`:
|
||||||
Remove header from a filesystem image. Only useful with `--recompress`.
|
Remove header from a filesystem image. Only useful with `--recompress`.
|
||||||
|
|
||||||
* `--log-level=`*name*:
|
- `--log-level=`*name*:
|
||||||
Specify a logging level.
|
Specify a logging level.
|
||||||
|
|
||||||
* `--no-progress`:
|
- `--no-progress`:
|
||||||
Don't show progress output while building filesystem.
|
Don't show progress output while building filesystem.
|
||||||
|
|
||||||
* `--progress=none`|`simple`|`ascii`|`unicode`:
|
- `--progress=none`|`simple`|`ascii`|`unicode`:
|
||||||
Choosing `none` is equivalent to specifying `--no-progress`. `simple`
|
Choosing `none` is equivalent to specifying `--no-progress`. `simple`
|
||||||
will print a single line of progress information whenever the progress
|
will print a single line of progress information whenever the progress
|
||||||
has significantly changed, but at most once every 2 seconds. This is
|
has significantly changed, but at most once every 2 seconds. This is
|
||||||
@ -281,14 +280,14 @@ Most other options are concerned with compression tuning:
|
|||||||
you can switch to `ascii`, which is like `unicode`, but looks less
|
you can switch to `ascii`, which is like `unicode`, but looks less
|
||||||
fancy.
|
fancy.
|
||||||
|
|
||||||
* `--help`:
|
- `--help`:
|
||||||
Show program help, including defaults, compression level detail and
|
Show program help, including defaults, compression level detail and
|
||||||
supported compression algorithms.
|
supported compression algorithms.
|
||||||
|
|
||||||
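The "roughly 6 bits" figure quoted for `--time-resolution` above can be checked with a little arithmetic. The following is a minimal sketch, not part of `mkdwarfs`: the helper `bits_saved` is a hypothetical name, and it assumes the saving comes purely from a time stamp needing fewer distinct values at a coarser resolution. The actual metadata layout may pack values differently.

```python
import math


def bits_saved(old_resolution_sec: int, new_resolution_sec: int) -> float:
    """Approximate bits saved per time stamp when coarsening the resolution.

    Assumption (not taken from the DwarFS source): a time stamp stored at a
    resolution that is N times coarser needs log2(N) fewer bits.
    """
    return math.log2(new_resolution_sec / old_resolution_sec)


# Resolutions named in the option description, expressed in seconds.
resolutions = [("15 seconds", 15), ("min", 60), ("hour", 3600), ("day", 86400)]

for name, seconds in resolutions:
    saved = bits_saved(1, seconds)
    print(f"{name}: about {saved:.1f} bits saved per time stamp")
```

Moving from second to minute resolution gives log2(60), about 5.9 bits per time stamp, which matches the estimate above. With `--keep-all-times`, three time stamps are stored per entry, so under the same assumption the saving would apply to each of them.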
If experimental Python support was compiled into `mkdwarfs`, you can use the
|
If experimental Python support was compiled into `mkdwarfs`, you can use the
|
||||||
following option to enable customizations via the scripting interface:
|
following option to enable customizations via the scripting interface:
|
||||||
|
|
||||||
* `--script=`*file*[`:`*class*[`(`arguments`...)`]]:
|
- `--script=`*file*[`:`*class*[`(`arguments`...)`]]:
|
||||||
Specify the Python script to load. The class name is optional if there's
|
Specify the Python script to load. The class name is optional if there's
|
||||||
a class named `mkdwarfs` in the script. It is also possible to pass
|
a class named `mkdwarfs` in the script. It is also possible to pass
|
||||||
arguments to the constructor.
|
arguments to the constructor.
|
||||||
@ -342,28 +341,28 @@ However, there are several options to choose from that allow you to
|
|||||||
further reduce metadata size without having to compress the metadata.
|
further reduce metadata size without having to compress the metadata.
|
||||||
These options are controlled by the `--pack-metadata` option.
|
These options are controlled by the `--pack-metadata` option.
|
||||||
|
|
||||||
* `auto`:
|
- `auto`:
|
||||||
This is the default. It will enable both `names` and `symlinks`.
|
This is the default. It will enable both `names` and `symlinks`.
|
||||||
|
|
||||||
* `none`:
|
- `none`:
|
||||||
Don't enable any packing. However, string tables (i.e. names and
|
Don't enable any packing. However, string tables (i.e. names and
|
||||||
symlinks) will still be stored in "compact" rather than "plain"
|
symlinks) will still be stored in "compact" rather than "plain"
|
||||||
format. In order to force storage in plain format, use `plain`.
|
format. In order to force storage in plain format, use `plain`.
|
||||||
|
|
||||||
* `all`:
|
- `all`:
|
||||||
Enable all packing options. This does *not* force packing of
|
Enable all packing options. This does *not* force packing of
|
||||||
string tables (i.e. names and symlinks) if the packing would
|
string tables (i.e. names and symlinks) if the packing would
|
||||||
actually increase the size, which can happen if the string tables
|
actually increase the size, which can happen if the string tables
|
||||||
are actually small. In order to force string table packing, use
|
are actually small. In order to force string table packing, use
|
||||||
`all,force`.
|
`all,force`.
|
||||||
|
|
||||||
* `chunk_table`:
|
- `chunk_table`:
|
||||||
Delta-compress chunk tables. This can reduce the size of the
|
Delta-compress chunk tables. This can reduce the size of the
|
||||||
chunk tables for large file systems and help compression, however,
|
chunk tables for large file systems and help compression, however,
|
||||||
it will likely require a lot of memory when unpacking the tables
|
it will likely require a lot of memory when unpacking the tables
|
||||||
again. Only use this if you know what you're doing.
|
again. Only use this if you know what you're doing.
|
||||||
|
|
||||||
* `directories`:
|
- `directories`:
|
||||||
Pack directories table by storing first entry pointers delta-
|
Pack directories table by storing first entry pointers delta-
|
||||||
compressed and completely removing parent directory pointers.
|
compressed and completely removing parent directory pointers.
|
||||||
The parent directory pointers can be rebuilt by tree traversal
|
The parent directory pointers can be rebuilt by tree traversal
|
||||||
@ -372,12 +371,12 @@ These options are controlled by the `--pack-metadata` option.
|
|||||||
will likely require a lot of memory when unpacking the tables
|
will likely require a lot of memory when unpacking the tables
|
||||||
again. Only use this if you know what you're doing.
|
again. Only use this if you know what you're doing.
|
||||||
|
|
||||||
* `shared_files`:
|
- `shared_files`:
|
||||||
Pack shared files table. This is only useful if the filesystem
|
Pack shared files table. This is only useful if the filesystem
|
||||||
contains lots of non-hardlinked duplicates. It gets more efficient
|
contains lots of non-hardlinked duplicates. It gets more efficient
|
||||||
the more copies of a file are in the filesystem.
|
the more copies of a file are in the filesystem.
|
||||||
|
|
||||||
* `names`,`symlinks`:
|
- `names`,`symlinks`:
|
||||||
Compress the names and symlink targets using the
|
Compress the names and symlink targets using the
|
||||||
[fsst](https://github.com/cwida/fsst) compression scheme. This
|
[fsst](https://github.com/cwida/fsst) compression scheme. This
|
||||||
compresses each individual entry separately using a small,
|
compresses each individual entry separately using a small,
|
||||||
@ -392,17 +391,17 @@ These options are controlled by the `--pack-metadata` option.
|
|||||||
than the uncompressed strings. If this is the case, the strings
|
than the uncompressed strings. If this is the case, the strings
|
||||||
will be stored uncompressed, unless `force` is also specified.
|
will be stored uncompressed, unless `force` is also specified.
|
||||||
|
|
||||||
* `names_index`,`symlinks_index`:
|
- `names_index`,`symlinks_index`:
|
||||||
Delta-compress the names and symlink targets indices. The same
|
Delta-compress the names and symlink targets indices. The same
|
||||||
caveats apply as for `chunk_table`.
|
caveats apply as for `chunk_table`.
|
||||||
|
|
||||||
* `force`:
|
- `force`:
|
||||||
Forces the compression of the `names` and `symlinks` tables,
|
Forces the compression of the `names` and `symlinks` tables,
|
||||||
even if that would make them use more memory than the
|
even if that would make them use more memory than the
|
||||||
uncompressed tables. This is really only useful for testing
|
uncompressed tables. This is really only useful for testing
|
||||||
and development.
|
and development.
|
||||||
|
|
||||||
* `plain`:
|
- `plain`:
|
||||||
Store string tables in "plain" format. The plain format uses
|
Store string tables in "plain" format. The plain format uses
|
||||||
Frozen thrift arrays and was used in earlier metadata versions.
|
Frozen thrift arrays and was used in earlier metadata versions.
|
||||||
It is useful for debugging, but wastes up to one byte per string.
|
It is useful for debugging, but wastes up to one byte per string.
|
||||||
@ -430,7 +429,6 @@ further compress the block. So if you're really desperately trying
|
|||||||
to reduce the image size, enabling `all` packing would be an option
|
to reduce the image size, enabling `all` packing would be an option
|
||||||
at the cost of using a lot more memory when using the filesystem.
|
at the cost of using a lot more memory when using the filesystem.
|
||||||
|
|
||||||
|
|
||||||
## INTERNAL OPERATION
|
## INTERNAL OPERATION
|
||||||
|
|
||||||
Internally, `mkdwarfs` runs in two completely separate phases. The first
|
Internally, `mkdwarfs` runs in two completely separate phases. The first
|
||||||
|