mirror of
https://github.com/mhx/dwarfs.git
synced 2025-09-14 14:59:52 -04:00
docs: README overhaul
parent
d2a1c00f04
commit
dbf84a290a
169
README.md
@@ -10,7 +10,7 @@
|
|||||||
|
|
||||||
The **D**eduplicating **W**arp-speed **A**dvanced **R**ead-only **F**ile **S**ystem.
|
The **D**eduplicating **W**arp-speed **A**dvanced **R**ead-only **F**ile **S**ystem.
|
||||||
|
|
||||||
A fast high compression read-only file system for Linux and Windows.
|
A fast high-compression read-only file system for Linux and Windows.
|
||||||
|
|
||||||
## Table of contents
|
## Table of contents
|
||||||
|
|
||||||
@@ -59,7 +59,7 @@ A fast high compression read-only file system for Linux and Windows.
|
|||||||

|

|
||||||
|
|
||||||
DwarFS is a read-only file system with a focus on achieving **very
|
DwarFS is a read-only file system with a focus on achieving **very
|
||||||
high compression ratios** in particular for very redundant data.
|
high compression ratios**, particularly for highly redundant data.
|
||||||
|
|
||||||
This probably doesn't sound very exciting, because if it's redundant,
|
This probably doesn't sound very exciting, because if it's redundant,
|
||||||
it *should* compress well. However, I found that other read-only,
|
it *should* compress well. However, I found that other read-only,
|
||||||
@@ -67,10 +67,10 @@ compressed file systems don't do a very good job at making use of
|
|||||||
this redundancy. See [here](#comparison) for a comparison with other
|
this redundancy. See [here](#comparison) for a comparison with other
|
||||||
compressed file systems.
|
compressed file systems.
|
||||||
|
|
||||||
DwarFS also **doesn't compromise on speed** and for my use cases I've
|
DwarFS also **doesn't compromise on speed**; in my use cases, it
|
||||||
found it to be on par with or perform better than SquashFS. For my
|
performs on par with, or better than, SquashFS. For my primary use
|
||||||
primary use case, **DwarFS compression is an order of magnitude better
|
case, **DwarFS compression is an order of magnitude better than
|
||||||
than SquashFS compression**, it's **6 times faster to build the file
|
SquashFS compression**, it's **6 times faster to build the file
|
||||||
system**, it's typically faster to access files on DwarFS and it uses
|
system**, it's typically faster to access files on DwarFS and it uses
|
||||||
less CPU resources.
|
less CPU resources.
|
||||||
|
|
||||||
@@ -83,7 +83,7 @@ So there's redundancy in both the video and audio data, but as the streams
|
|||||||
are interleaved and identical blocks are typically very far apart, it's
|
are interleaved and identical blocks are typically very far apart, it's
|
||||||
challenging to make use of that redundancy for compression. SquashFS
|
challenging to make use of that redundancy for compression. SquashFS
|
||||||
essentially fails to compress the source data at all, whereas DwarFS is
|
essentially fails to compress the source data at all, whereas DwarFS is
|
||||||
able to reduce the size by almost a factor of 3, which is close to the
|
able to reduce the size to nearly one-third, which is close to the
|
||||||
theoretical maximum:
|
theoretical maximum:
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -143,7 +143,7 @@ around for when I happened to need them.
|
|||||||
|
|
||||||
Up until then, I had been using [Cromfs](https://bisqwit.iki.fi/source/cromfs.html)
|
Up until then, I had been using [Cromfs](https://bisqwit.iki.fi/source/cromfs.html)
|
||||||
for squeezing them into a manageable size. However, I was getting more
|
for squeezing them into a manageable size. However, I was getting more
|
||||||
and more annoyed by the time it took to build the filesystem image
|
and more annoyed by the time it took to build the file system image
|
||||||
and, to make things worse, more often than not it was crashing after
|
and, to make things worse, more often than not it was crashing after
|
||||||
about an hour or so.
|
about an hour or so.
|
||||||
|
|
||||||
@@ -177,21 +177,25 @@ some rudimentary docs as well.
|
|||||||
### Note to Package Maintainers
|
### Note to Package Maintainers
|
||||||
|
|
||||||
DwarFS should usually build fine with minimal changes out of the box.
|
DwarFS should usually build fine with minimal changes out of the box.
|
||||||
If it doesn't, please file a issue. I've set up
|
If it doesn't, please file an issue. I've set up
|
||||||
[CI jobs](https://github.com/mhx/dwarfs/actions/workflows/build.yml)
|
[CI jobs](actions/workflows/build.yml)
|
||||||
using Docker images for Ubuntu ([22.04](https://github.com/mhx/dwarfs/blob/main/.docker/Dockerfile.ubuntu-2204)
|
using Docker images for Ubuntu ([22.04](.docker/Dockerfile.ubuntu-2204)
|
||||||
and [24.04](https://github.com/mhx/dwarfs/blob/main/.docker/Dockerfile.ubuntu)),
|
and [24.04](.docker/Dockerfile.ubuntu)),
|
||||||
[Fedora Rawhide](https://github.com/mhx/dwarfs/blob/main/.docker/Dockerfile.fedora)
|
[Fedora Rawhide](.docker/Dockerfile.fedora),
|
||||||
and [Arch](https://github.com/mhx/dwarfs/blob/main/.docker/Dockerfile.arch)
|
[Arch Linux](.docker/Dockerfile.arch), and
|
||||||
|
[Debian](.docker/Dockerfile.debian),
|
||||||
|
as well as a setup script for [FreeBSD](.github/scripts/freebsd_setup_base.sh),
|
||||||
that can help with determining an up-to-date set of dependencies.
|
that can help with determining an up-to-date set of dependencies.
|
||||||
Note that building from the release tarball requires less dependencies
|
Note that building from the release tarball requires less dependencies
|
||||||
than building from the git repository, notably the `ronn` tool as well
|
than building from the git repository, notably the `ronn` tool as well
|
||||||
as Python and the `mistletoe` Python module are not required when
|
as Python and the `mistletoe` Python module are not required when
|
||||||
building from the release tarball.
|
building from the release tarball. Also, the release tarball build
|
||||||
|
doesn't require to build the thrift compiler, which makes the build
|
||||||
|
a lot faster.
|
||||||
|
|
||||||
There are some things to be aware of:
|
There are some things to be aware of:
|
||||||
|
|
||||||
- There's a tendency to try and unbundle the [folly](https://github.com/facebook/folly/)
|
- There's a tendency to try to unbundle the [folly](https://github.com/facebook/folly/)
|
||||||
and [fbthrift](https://github.com/facebook/fbthrift) libraries that
|
and [fbthrift](https://github.com/facebook/fbthrift) libraries that
|
||||||
are included as submodules and are built along with DwarFS.
|
are included as submodules and are built along with DwarFS.
|
||||||
While I agree with the sentiment, it's unfortunately a bad idea.
|
While I agree with the sentiment, it's unfortunately a bad idea.
|
||||||
@@ -209,13 +213,13 @@ There are some things to be aware of:
|
|||||||
fbthrift headers are required to build against DwarFS' libraries.
|
fbthrift headers are required to build against DwarFS' libraries.
|
||||||
|
|
||||||
- Similar issues can arise when using a system-installed version
|
- Similar issues can arise when using a system-installed version
|
||||||
of GoogleTest. GoogleTest itself recommends that it is being
|
of GoogleTest. GoogleTest recommends downloading it as part of
|
||||||
downloaded as part of the build. However, you can use the system
|
the build. However, you can use the system-installed version by
|
||||||
installed version by passing `-DPREFER_SYSTEM_GTEST=ON` to the
|
passing `-DPREFER_SYSTEM_GTEST=ON` to the `cmake` call. Use at
|
||||||
`cmake` call. Use at your own risk.
|
your own risk.
|
||||||
|
|
||||||
- For other bundled libraries (namely `fmt`, `parallel-hashmap`,
|
- For other bundled libraries (namely `fmt`, `parallel-hashmap`,
|
||||||
`range-v3`), the system installed version is used as long as it
|
`range-v3`), the system-installed version is used as long as it
|
||||||
meets the minimum required version. Otherwise, the preferred
|
meets the minimum required version. Otherwise, the preferred
|
||||||
version is fetched during the build.
|
version is fetched during the build.
|
||||||
|
|
||||||
@@ -233,18 +237,33 @@ In addition to the binary tarballs, there's a **universal binary**
|
|||||||
available for each architecture. These universal binaries contain
|
available for each architecture. These universal binaries contain
|
||||||
*all* tools (`mkdwarfs`, `dwarfsck`, `dwarfsextract` and the `dwarfs`
|
*all* tools (`mkdwarfs`, `dwarfsck`, `dwarfsextract` and the `dwarfs`
|
||||||
FUSE driver) in a single executable. These executables are compressed
|
FUSE driver) in a single executable. These executables are compressed
|
||||||
using [upx](https://github.com/upx/upx), so they are much smaller than
|
using [upx](https://github.com/upx/upx) where possible, and using a
|
||||||
the individual tools combined. However, it also means the binaries need
|
custom self-extractor on all other platforms. This means they are much
|
||||||
to be decompressed each time they are run, which can have a significant
|
smaller than the individual tools combined. However, it also means the
|
||||||
overhead. If that is an issue, you can either stick to the "classic"
|
binaries need to be decompressed each time they are run, which can add
|
||||||
individual binaries or you can decompress the universal binary, e.g.:
|
significant overhead. If that is an issue, you can either stick to the
|
||||||
|
"classic" individual binaries or you can decompress the universal binary.
|
||||||
|
For upx compressed binaries, you can use:
|
||||||
|
|
||||||
```
|
```
|
||||||
upx -d dwarfs-universal-0.7.0-Linux-aarch64
|
$ upx -d dwarfs-universal-0.7.0-Linux-aarch64
|
||||||
```
|
```
|
||||||
|
|
||||||
The universal binaries can be run through symbolic links named after
|
For the binaries that use the custom self-extractor, you can use:
|
||||||
the proper tool. e.g.:
|
|
||||||
|
```
|
||||||
|
$ ./dwarfs-universal-riscv64 --extract-wrapped-binary dwarfs-universal
|
||||||
|
```
|
||||||
|
|
||||||
|
Note that both self-extractors need at least Linux kernel 3.17 to work
|
||||||
|
properly. If you want to use the FUSE driver, you'll need to install
|
||||||
|
the fuse3 tools for your distribution. If you want to run the binaries
|
||||||
|
on an older kernel, you can unpack the universal binary (unpacking does
|
||||||
|
*not* require kernel 3.17). If you're stuck with fuse2, you must use the
|
||||||
|
individual `dwarfs2` driver instead of the universal binary.
|
||||||
|
|
||||||
|
You can run the universal binaries via symbolic links named after
|
||||||
|
the tool. For example:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ ln -s dwarfs-universal-0.7.0-Linux-aarch64 mkdwarfs
|
$ ln -s dwarfs-universal-0.7.0-Linux-aarch64 mkdwarfs
|
||||||
@@ -289,10 +308,13 @@ space-efficient, memory-mappable and well defined format. It's also
|
|||||||
included as a submodule, and we only build the compiler and a very
|
included as a submodule, and we only build the compiler and a very
|
||||||
reduced library that contains just enough for DwarFS to work.
|
reduced library that contains just enough for DwarFS to work.
|
||||||
|
|
||||||
Other than that, DwarFS really only depends on FUSE3 and on a set
|
Beyond that, DwarFS depends on FUSE3 and a set of compression
|
||||||
of compression libraries that Folly already depends on (namely
|
libraries (namely [lz4](https://github.com/lz4/lz4),
|
||||||
[lz4](https://github.com/lz4/lz4), [zstd](https://github.com/facebook/zstd)
|
[zstd](https://github.com/facebook/zstd),
|
||||||
and [liblzma](https://github.com/kobolabs/liblzma)).
|
[brotli](https://github.com/google/brotli),
|
||||||
|
[xz](https://github.com/tukaani-project/xz), and
|
||||||
|
[flac](https://github.com/xiph/flac)). Except for `zstd`, these
|
||||||
|
are all optional.
|
||||||
|
|
||||||
The dependency on [googletest](https://github.com/google/googletest)
|
The dependency on [googletest](https://github.com/google/googletest)
|
||||||
will be automatically resolved if you build with tests.
|
will be automatically resolved if you build with tests.
|
||||||
@@ -392,7 +414,7 @@ $ ctest -j
|
|||||||
```
|
```
|
||||||
|
|
||||||
All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
|
All binaries use [jemalloc](https://github.com/jemalloc/jemalloc)
|
||||||
as a memory allocator by default, as it is typically uses much less
|
as a memory allocator by default, as it typically uses much less
|
||||||
system memory compared to the `glibc` or `tcmalloc` allocators.
|
system memory compared to the `glibc` or `tcmalloc` allocators.
|
||||||
To disable the use of `jemalloc`, pass `-DUSE_JEMALLOC=0` on the
|
To disable the use of `jemalloc`, pass `-DUSE_JEMALLOC=0` on the
|
||||||
`cmake` command line.
|
`cmake` command line.
|
||||||
@@ -484,12 +506,11 @@ pages using the `--man` option to each binary, e.g.:
|
|||||||
$ mkdwarfs --man
|
$ mkdwarfs --man
|
||||||
```
|
```
|
||||||
|
|
||||||
The [dwarfs](doc/dwarfs.md) manual page also shows an example for setting
|
The [dwarfs](doc/dwarfs.md) manual page also shows an example for setting up DwarFS
|
||||||
up DwarFS with [overlayfs](https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt)
|
with [overlayfs](https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html)
|
||||||
in order to create a writable file system mount on top a read-only
|
in order to create a writable file system mount on top of a read-only DwarFS image.
|
||||||
DwarFS image.
|
|
||||||
|
|
||||||
A description of the DwarFS filesystem format can be found in
|
A description of the DwarFS file system format can be found in
|
||||||
[dwarfs-format](doc/dwarfs-format.md).
|
[dwarfs-format](doc/dwarfs-format.md).
|
||||||
|
|
||||||
A high-level overview of the internal operation of `mkdwarfs` is shown
|
A high-level overview of the internal operation of `mkdwarfs` is shown
|
||||||
@@ -511,7 +532,7 @@ There are five individual libraries:
|
|||||||
- `dwarfs_reader` contains all code required to read data from a
|
- `dwarfs_reader` contains all code required to read data from a
|
||||||
DwarFS image. The interfaces are defined in [`dwarfs/reader/`](include/dwarfs/reader).
|
DwarFS image. The interfaces are defined in [`dwarfs/reader/`](include/dwarfs/reader).
|
||||||
|
|
||||||
- `dwarfs_extractor` contains the ccode required to extract a DwarFS
|
- `dwarfs_extractor` contains the code required to extract a DwarFS
|
||||||
image using [`libarchive`](https://libarchive.org/). The interfaces
|
image using [`libarchive`](https://libarchive.org/). The interfaces
|
||||||
are defined in [`dwarfs/utility/filesystem_extractor.h`](include/dwarfs/utility/filesystem_extractor.h).
|
are defined in [`dwarfs/utility/filesystem_extractor.h`](include/dwarfs/utility/filesystem_extractor.h).
|
||||||
|
|
||||||
@@ -536,7 +557,7 @@ decades, my experience with Windows development is rather limited and
|
|||||||
I'd expect there to definitely be bugs and rough edges in the Windows
|
I'd expect there to definitely be bugs and rough edges in the Windows
|
||||||
code.
|
code.
|
||||||
|
|
||||||
The Windows version of the DwarFS filesystem driver relies on the awesome
|
The Windows version of the DwarFS file system driver relies on the awesome
|
||||||
[WinFsp](https://github.com/winfsp/winfsp) project and its `winfsp-x64.dll`
|
[WinFsp](https://github.com/winfsp/winfsp) project and its `winfsp-x64.dll`
|
||||||
must be discoverable by the `dwarfs.exe` driver.
|
must be discoverable by the `dwarfs.exe` driver.
|
||||||
|
|
||||||
@@ -549,9 +570,9 @@ There are a few things worth pointing out, though:
|
|||||||
|
|
||||||
- DwarFS supports both hardlinks and symlinks on Windows, just as it
|
- DwarFS supports both hardlinks and symlinks on Windows, just as it
|
||||||
does on Linux. However, creating hardlinks and symlinks seems to
|
does on Linux. However, creating hardlinks and symlinks seems to
|
||||||
require admin privileges on Windows, so if you want to e.g. extract
|
require admin privileges on Windows, so if, for example, you want to
|
||||||
a DwarFS image that contains links of some sort, you might run into
|
extract a DwarFS image that contains links of some sort, you might
|
||||||
errors if you don't have the right privileges.
|
run into errors if you don't have the right privileges.
|
||||||
|
|
||||||
- Due to a [problem](https://github.com/winfsp/winfsp/issues/454) in
|
- Due to a [problem](https://github.com/winfsp/winfsp/issues/454) in
|
||||||
WinFsp, symlinks cannot currently point outside of the mounted file
|
WinFsp, symlinks cannot currently point outside of the mounted file
|
||||||
@@ -593,7 +614,7 @@ You'll need to install:
|
|||||||
if it's not, you'll need to set `WINFSP_PATH` when running CMake via
|
if it's not, you'll need to set `WINFSP_PATH` when running CMake via
|
||||||
`cmake/win.bat`.
|
`cmake/win.bat`.
|
||||||
|
|
||||||
Now you need to clone `vcpkg` and `dwarfs`:
|
Clone `vcpkg` and `dwarfs`:
|
||||||
|
|
||||||
```
|
```
|
||||||
> cd %HOMEPATH%
|
> cd %HOMEPATH%
|
||||||
@@ -638,9 +659,9 @@ $ brew install dwarfs
|
|||||||
$ brew test dwarfs
|
$ brew test dwarfs
|
||||||
```
|
```
|
||||||
|
|
||||||
The macOS version of the DwarFS filesystem driver relies on the awesome
|
The macOS version of the DwarFS file system driver relies on the awesome
|
||||||
[macFUSE](https://osxfuse.github.io/) project and is available from
|
[macFUSE](https://macfuse.io) project and is available via gromgit's
|
||||||
gromgit's [homebrew-fuse tap](https://github.com/gromgit/homebrew-fuse):
|
[homebrew-fuse tap](https://github.com/gromgit/homebrew-fuse):
|
||||||
|
|
||||||
```
|
```
|
||||||
$ brew tap gromgit/homebrew-fuse
|
$ brew tap gromgit/homebrew-fuse
|
||||||
@@ -652,7 +673,7 @@ $ brew install dwarfs-fuse-mac
|
|||||||
### Astrophotography
|
### Astrophotography
|
||||||
|
|
||||||
Astrophotography can generate huge amounts of raw image data. During a
|
Astrophotography can generate huge amounts of raw image data. During a
|
||||||
single night, it's not unlikely to end up with a few dozens of gigabytes
|
single night, it's not unlikely to end up with a few dozen gigabytes
|
||||||
of data. With most dedicated astrophotography cameras, this data ends up
|
of data. With most dedicated astrophotography cameras, this data ends up
|
||||||
in the form of FITS images. These are usually uncompressed, don't compress
|
in the form of FITS images. These are usually uncompressed, don't compress
|
||||||
very well with standard compression algorithms, and while there are certain
|
very well with standard compression algorithms, and while there are certain
|
||||||
@@ -861,7 +882,7 @@ The source directory contained **1139 different Perl installations**
|
|||||||
from 284 distinct releases, a total of 47.65 GiB of data in 1,927,501
|
from 284 distinct releases, a total of 47.65 GiB of data in 1,927,501
|
||||||
files and 330,733 directories. The source directory was freshly
|
files and 330,733 directories. The source directory was freshly
|
||||||
unpacked from a tar archive to an XFS partition on a 970 EVO Plus 2TB
|
unpacked from a tar archive to an XFS partition on a 970 EVO Plus 2TB
|
||||||
NVME drive, so most of its contents were likely cached.
|
NVMe drive, so most of its contents were likely cached.
|
||||||
|
|
||||||
I'm using the same compression type and compression level for
|
I'm using the same compression type and compression level for
|
||||||
SquashFS that is the default setting for DwarFS:
|
SquashFS that is the default setting for DwarFS:
|
||||||
@@ -993,7 +1014,7 @@ the SquashFS image. The DwarFS image is only 0.6% of the original file size.
|
|||||||
|
|
||||||
So, why not use `lzma` instead of `zstd` by default? The reason is that `lzma`
|
So, why not use `lzma` instead of `zstd` by default? The reason is that `lzma`
|
||||||
is about an order of magnitude slower to decompress than `zstd`. If you're
|
is about an order of magnitude slower to decompress than `zstd`. If you're
|
||||||
only accessing data on your compressed filesystem occasionally, this might
|
only accessing data on your compressed file system occasionally, this might
|
||||||
not be a big deal, but if you use it extensively, `zstd` will result in
|
not be a big deal, but if you use it extensively, `zstd` will result in
|
||||||
better performance.
|
better performance.
|
||||||
|
|
||||||
@@ -1025,7 +1046,7 @@ $ ll perl-install*.*fs
|
|||||||
```
|
```
|
||||||
|
|
||||||
Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
|
Even this is *still* not entirely fair, as it uses a feature (`-B3`) that allows
|
||||||
DwarFS to reference file chunks from up to two previous filesystem blocks.
|
DwarFS to reference file chunks from up to two previous file system blocks.
|
||||||
|
|
||||||
But the point is that this is really where SquashFS tops out, as it doesn't
|
But the point is that this is really where SquashFS tops out, as it doesn't
|
||||||
support larger block sizes or back-referencing. And as you'll see below, the
|
support larger block sizes or back-referencing. And as you'll see below, the
|
||||||
@@ -1040,7 +1061,7 @@ system with the best possible compression (`-l 9`):
|
|||||||
|
|
||||||
```
|
```
|
||||||
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
|
$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
|
||||||
I 20:28:03.246534 filesystem rewrittenwithout errors [148.3s]
|
I 20:28:03.246534 filesystem rewritten without errors [148.3s]
|
||||||
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
|
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
|
||||||
filesystem: 4.261 GiB in 273 blocks (0 chunks, 0 inodes)
|
filesystem: 4.261 GiB in 273 blocks (0 chunks, 0 inodes)
|
||||||
compressed filesystem: 273/273 blocks/372.7 MiB written
|
compressed filesystem: 273/273 blocks/372.7 MiB written
|
||||||
@@ -1058,13 +1079,13 @@ $ ll perl-*.dwarfs
|
|||||||
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
-rw-r--r-- 1 mhx users 315482627 Mar 3 21:23 perl-install-lzma.dwarfs
|
||||||
```
|
```
|
||||||
|
|
||||||
Note that while the recompressed filesystem is smaller than the original image,
|
Note that while the recompressed file system is smaller than the original image,
|
||||||
it is still a lot bigger than the filesystem we previously build with `-l9`.
|
it is still a lot bigger than the file system we previously build with `-l9`.
|
||||||
The reason is that the recompressed image still uses the same block size, and
|
The reason is that the recompressed image still uses the same block size, and
|
||||||
the block size cannot be changed by recompressing.
|
the block size cannot be changed by recompressing.
|
||||||
|
|
||||||
In terms of how fast the file system is when using it, a quick test
|
In terms of how fast the file system is when using it, a quick test
|
||||||
I've done is to freshly mount the filesystem created above and run
|
I've done is to freshly mount the file system created above and run
|
||||||
each of the 1139 `perl` executables to print their version.
|
each of the 1139 `perl` executables to print their version.
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -1144,7 +1165,7 @@ So you might want to consider using `zstd` instead of `lzma` if you'd
|
|||||||
like to optimize for file system performance. It's also the default
|
like to optimize for file system performance. It's also the default
|
||||||
compression used by `mkdwarfs`.
|
compression used by `mkdwarfs`.
|
||||||
|
|
||||||
Now here's a comparison with the SquashFS filesystem:
|
Now here's a comparison with the SquashFS file system:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
|
||||||
@@ -1249,7 +1270,7 @@ time, the difference is really marginal.
|
|||||||
|
|
||||||
### With SquashFS & xz
|
### With SquashFS & xz
|
||||||
|
|
||||||
This test uses slightly less pathological input data: the root filesystem of
|
This test uses slightly less pathological input data: the root file system of
|
||||||
a recent Raspberry Pi OS release. This file system also contains device inodes,
|
a recent Raspberry Pi OS release. This file system also contains device inodes,
|
||||||
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
|
so in order to preserve those, we pass `--with-devices` to `mkdwarfs`:
|
||||||
|
|
||||||
@@ -1397,7 +1418,7 @@ $ ls -lh raspbian.tar.xz
|
|||||||
```
|
```
|
||||||
|
|
||||||
DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
|
DwarFS also comes with the [dwarfsextract](doc/dwarfsextract.md) tool
|
||||||
that allows extraction of a filesystem image without the FUSE driver.
|
that allows extraction of a file system image without the FUSE driver.
|
||||||
So here's a comparison of the extraction speed:
|
So here's a comparison of the extraction speed:
|
||||||
|
|
||||||
```
|
```
|
||||||
@@ -1959,7 +1980,7 @@ $ ls -l perl-install-small.*fs
|
|||||||
I noticed that the `blockifying` step that took ages for the full dataset
|
I noticed that the `blockifying` step that took ages for the full dataset
|
||||||
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
|
with `mkcromfs` ran substantially faster (in terms of MiB/second) on the
|
||||||
smaller dataset, which makes me wonder if there's some quadratic complexity
|
smaller dataset, which makes me wonder if there's some quadratic complexity
|
||||||
behaviour that's slowing down `mkcromfs`.
|
behavior that's slowing down `mkcromfs`.
|
||||||
|
|
||||||
In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
|
In order to be completely fair, I also ran `mkdwarfs` with `-l 9` to enable
|
||||||
LZMA compression (which is what `mkcromfs` uses by default):
|
LZMA compression (which is what `mkcromfs` uses by default):
|
||||||
@@ -2017,8 +2038,8 @@ it crashed right upon trying to list the directory after mounting.
|
|||||||
|
|
||||||
### With EROFS
|
### With EROFS
|
||||||
|
|
||||||
[EROFS](https://github.com/erofs/erofs-utils) is a read-only compressed
|
[EROFS](https://github.com/erofs/erofs-utils) is another read-only
|
||||||
file system that has been added to the Linux kernel recently.
|
compressed file system included in the Linux kernel.
|
||||||
Its goals are different from those of DwarFS, though. It is designed to
|
Its goals are different from those of DwarFS, though. It is designed to
|
||||||
be lightweight (which DwarFS is definitely not) and to run on constrained
|
be lightweight (which DwarFS is definitely not) and to run on constrained
|
||||||
hardware like embedded devices or smartphones. It is not designed to provide
|
hardware like embedded devices or smartphones. It is not designed to provide
|
||||||
@@ -2073,7 +2094,7 @@ faster than `mkfs.erofs`.
|
|||||||
Actually using the file system images, here's how DwarFS performs:
|
Actually using the file system images, here's how DwarFS performs:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ dwarfs perl-install-1M.dwarfs mnt -oworkers=8
|
$ dwarfs perl-install-1M.dwarfs mnt -o workers=8
|
||||||
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
|
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
|
||||||
50392172594 bytes (50 GB, 47 GiB) copied, 19 s, 2.7 GB/s
|
50392172594 bytes (50 GB, 47 GiB) copied, 19 s, 2.7 GB/s
|
||||||
0+1662649 records in
|
0+1662649 records in
|
||||||
@@ -2181,7 +2202,7 @@ DwarFS can get close to the throughput of EROFS by using `zstd` instead
|
|||||||
of `lzma` compression:
|
of `lzma` compression:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ dwarfs perl-install-1M-zstd.dwarfs mnt -oworkers=8
|
$ dwarfs perl-install-1M-zstd.dwarfs mnt -o workers=8
|
||||||
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
|
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
|
||||||
49224202357 bytes (49 GB, 46 GiB) copied, 16 s, 3.1 GB/s
|
49224202357 bytes (49 GB, 46 GiB) copied, 16 s, 3.1 GB/s
|
||||||
0+1529018 records in
|
0+1529018 records in
|
||||||
@@ -2251,7 +2272,7 @@ sys 0m0.610s
|
|||||||
```
|
```
|
||||||
|
|
||||||
Turns out that `tar --zstd` is easily winning the compression speed
|
Turns out that `tar --zstd` is easily winning the compression speed
|
||||||
test. Looking at the file sizes did actually blow my mind just a bit:
|
test. Looking at the file sizes did genuinely surprise me:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ ll zerotest.* --sort=size
|
$ ll zerotest.* --sort=size
|
||||||
@@ -2429,7 +2450,7 @@ To enable the performance monitor, you pass a list of components for which
|
|||||||
you want to collect latency metrics, e.g.:
|
you want to collect latency metrics, e.g.:
|
||||||
|
|
||||||
```
|
```
|
||||||
$ dwarfs test.dwarfs mnt -f -operfmon=fuse
|
$ dwarfs test.dwarfs mnt -f -o perfmon=fuse
|
||||||
```
|
```
|
||||||
|
|
||||||
When the driver exits, you will see output like this:
|
When the driver exits, you will see output like this:
|
||||||
@@ -2526,14 +2547,18 @@ typically want to run on your "performance" cores.
|
|||||||
|
|
||||||
### Specifying file system offset and size
|
### Specifying file system offset and size
|
||||||
|
|
||||||
You can specify the byte offset at which the filesystem is located in the file using the `-o offset=N` option.
|
You can specify the byte offset at which the file system is located in the
|
||||||
This can be useful when mounting images where there is some preceding data before the filesystem or when mounting merged/concatenated images.
|
file using the `-o offset=N` option. This can be useful when mounting images
|
||||||
When combined with the `-o imagesize=N` option you can mount merged filesystems, i.e. multiple filesystems stored in a single file.
|
where there is some preceding data before the file system or when mounting
|
||||||
|
merged/concatenated images. When combined with the `-o imagesize=N` option
|
||||||
|
you can mount merged file systems, i.e. multiple file systems stored in a
|
||||||
|
single file.
|
||||||
|
|
||||||
Here is an example, you have two filesystems concatenated into a single file and you want to mount both of them, you can achieve this by running
|
Here is an example, you have two file systems concatenated into a single
|
||||||
|
file and you want to mount both of them, you can achieve this by running:
|
||||||
```sh
|
```sh
|
||||||
dwarfs merged.dwarfs /mnt/fs1 -oimagesize=9231
|
dwarfs merged.dwarfs /mnt/fs1 -o imagesize=9231
|
||||||
dwarfs merged.dwarfs /mnt/fs2 -ooffset=9231,imagesize=7999
|
dwarfs merged.dwarfs /mnt/fs2 -o offset=9231,imagesize=7999
|
||||||
```
|
```
|
||||||
|
|
||||||
## Stargazers over Time
|
## Stargazers over Time
|
||||||
|