Markdown cleanup

Marcus Holland-Moritz 2021-10-27 00:57:28 +02:00
parent 9ad4dd655f
commit 569966b752
6 changed files with 1318 additions and 1185 deletions

README.md: 1427 changes (diff suppressed because it is too large)
@@ -1,11 +1,9 @@
-dwarfs-format(5) -- DwarFS File System Format v2.3
-==================================================
+# dwarfs-format(5) -- DwarFS File System Format v2.3

 ## DESCRIPTION

 This document describes the DwarFS file system format, version 2.3.

 ## FILE STRUCTURE

 A DwarFS file system image is just a sequence of blocks. Each block has the
@@ -65,25 +63,23 @@ A couple of notes:
   larger than the one it supports. However, a new program will still
   read all file systems with a smaller minor version number.

 ### Section Types

 There are currently 3 different section types.

-* `BLOCK` (0):
+- `BLOCK` (0):
   A block of data. This is where all file data is stored. There can be
   an arbitrary number of blocks of this type.

-* `METADATA_V2_SCHEMA` (7):
+- `METADATA_V2_SCHEMA` (7):
   The schema used to layout the `METADATA_V2` block contents. This is
   stored in "compact" thrift encoding.

-* `METADATA_V2` (8):
-  This section contains the bulk of the metadata. It's essentially just
-  a collection of bit-packed arrays and structures. The exact layout of
-  each list and structure depends on the actual data and is stored
-  separately in `METADATA_V2_SCHEMA`.
+- `METADATA_V2` (8):
+  This section contains the bulk of the metadata. It's essentially just
+  a collection of bit-packed arrays and structures. The exact layout of
+  each list and structure depends on the actual data and is stored
+  separately in `METADATA_V2_SCHEMA`.

 ## METADATA FORMAT
@@ -169,17 +165,12 @@ list. The index into this list is the `inode_num` from `dir_entries`,
 but you can perform direct lookups based on the inode number as well.

 The `inodes` list is strictly in the following order:

-* directory inodes (`S_IFDIR`)
-* symlink inodes (`S_IFLNK`)
-* regular *unique* file inodes (`S_IREG`)
-* regular *shared* file inodes (`S_IREG`)
-* character/block device inodes (`S_IFCHR`, `S_IFBLK`)
-* socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)
+- directory inodes (`S_IFDIR`)
+- symlink inodes (`S_IFLNK`)
+- regular *unique* file inodes (`S_IREG`)
+- regular *shared* file inodes (`S_IREG`)
+- character/block device inodes (`S_IFCHR`, `S_IFBLK`)
+- socket/pipe inodes (`S_IFSOCK`, `S_IFIFO`)

 The offsets can thus be found by using a binary search with a
 predicate on the inode mode. The shared file offset can be found
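The type-ordered `inodes` list lookup described here can be sketched as follows. This is a minimal illustration in Python; the `TYPE_RANK` table and `first_inode_of_type` helper are hypothetical names, not part of the DwarFS code or metadata API, and a real implementation would bisect the packed metadata array directly instead of materializing a rank list.

```python
import bisect
import stat

# Rank of each inode type in the strict ordering of the `inodes` list.
# These names are illustrative; they only mirror the ordering documented above.
TYPE_RANK = {
    stat.S_IFDIR: 0,   # directory inodes
    stat.S_IFLNK: 1,   # symlink inodes
    stat.S_IFREG: 2,   # regular file inodes (unique, then shared)
    stat.S_IFCHR: 3,   # character device inodes
    stat.S_IFBLK: 3,   # block device inodes
    stat.S_IFSOCK: 4,  # socket inodes
    stat.S_IFIFO: 4,   # pipe inodes
}

def first_inode_of_type(modes, ifmt):
    """Index of the first inode with the given file type.

    `modes` holds an st_mode-style value for each entry, in `inodes` order.
    Because the list is strictly ordered by type, a binary search on a
    predicate over the inode mode yields each type section's offset.
    """
    ranks = [TYPE_RANK[stat.S_IFMT(m)] for m in modes]
    return bisect.bisect_left(ranks, TYPE_RANK[stat.S_IFMT(ifmt)])
```

For example, given two directories, one symlink, two regular files and a character device, the symlink section starts at index 2 and the regular-file section at index 3.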


@@ -1,5 +1,4 @@
-dwarfs(1) -- mount highly compressed read-only file system
-==========================================================
+# dwarfs(1) -- mount highly compressed read-only file system

 ## SYNOPSIS
@@ -14,103 +13,105 @@ but it has some distinct features.
 Other than that, it's pretty straightforward to use. Once you've created a
 file system image using mkdwarfs(1), you can mount it with:

-    dwarfs image.dwarfs /path/to/mountpoint
+```
+dwarfs image.dwarfs /path/to/mountpoint
+```

 ## OPTIONS

 In addition to the regular FUSE options, `dwarfs` supports the following
 options:

-* `-o cachesize=`*value*:
+- `-o cachesize=`*value*:
   Size of the block cache, in bytes. You can append suffixes
   (`k`, `m`, `g`) to specify the size in KiB, MiB and GiB,
   respectively. Note that this is not the upper memory limit
   of the process, as there may be blocks in flight that are
   not stored in the cache. Also, each block that hasn't been
   fully decompressed yet will carry decompressor state along
   with it, which can use a significant amount of additional
   memory. For more details, see mkdwarfs(1).

-* `-o workers=`*value*:
+- `-o workers=`*value*:
   Number of worker threads to use for decompressing blocks.
   If you have a lot of CPUs, increasing this number can help
   speed up access to files in the filesystem.

-* `-o decratio=`*value*:
+- `-o decratio=`*value*:
   The ratio over which a block is fully decompressed. Blocks
   are only decompressed partially, so each block has to carry
   the decompressor state with it until it is fully decompressed.
   However, if a certain fraction of the block has already been
   decompressed, it may be beneficial to just decompress the rest
   and free the decompressor state. This value determines the
   ratio at which we fully decompress the block rather than
   keeping a partially decompressed block. A value of 0.8 means
   that as long as we've decompressed less than 80% of the block,
   we keep the partially decompressed block, but if we've
   decompressed more than 80%, we'll fully decompress it.

-* `-o offset=`*value*|`auto`:
+- `-o offset=`*value*|`auto`:
   Specify the byte offset at which the filesystem is located in
   the image, or use `auto` to detect the offset automatically.
   This is only useful for images that have some header located
   before the actual filesystem data.

-* `-o mlock=none`|`try`|`must`:
+- `-o mlock=none`|`try`|`must`:
   Set this to `try` or `must` instead of the default `none` to
   try or require `mlock()`ing of the file system metadata into
   memory.

-* `-o enable_nlink`:
+- `-o enable_nlink`:
   Set this option if you want correct hardlink counts for regular
   files. If this is not specified, the hardlink count will be 1.
   Enabling this will slow down the initialization of the fuse
   driver as the hardlink counts will be determined by a full
   file system scan (it only takes about a millisecond to scan
   through 100,000 files, so this isn't dramatic). The fuse driver
   will also consume more memory to hold the hardlink count table.
   This will be 4 bytes for every regular file inode.

-* `-o readonly`:
+- `-o readonly`:
   Show all file system entries as read-only. By default, DwarFS
   will preserve the original writeability, which is obviously a
   lie as it's a read-only file system. However, this is needed
   for overlays to work correctly, as otherwise directories are
   seen as read-only by the overlay and it'll be impossible to
   create new files even in a writeable overlay. If you don't use
   overlays and want the file system to reflect its read-only
   state, you can set this option.

-* `-o (no_)cache_image`:
+- `-o (no_)cache_image`:
   By default, `dwarfs` tries to ensure that the compressed file
   system image will not be cached by the kernel (i.e. the default
   is `-o no_cache_image`). This will reduce the memory consumption
   of the FUSE driver to slightly more than the `cachesize`, plus
   the size of the metadata block. This usually isn't a problem,
   especially when the image is stored on an SSD, but if you want
   to maximize performance it can be beneficial to use
   `-o cache_image` to keep the compressed image data in the kernel
   cache.

-* `-o (no_)cache_files`:
+- `-o (no_)cache_files`:
   By default, files in the mounted file system will be cached by
   the kernel (i.e. the default is `-o cache_files`). This will
   significantly improve performance when accessing the same files
   over and over again, especially if the data from these files has
   been (partially) evicted from the block cache. By setting the
   `-o no_cache_files` option, you can force the fuse driver to not
   use the kernel cache for file data. If you're short on memory and
   only infrequently accessing files, this can be worth trying, even
   though it's likely that the kernel will already do the right thing
   even when the cache is enabled.

-* `-o debuglevel=`*name*:
+- `-o debuglevel=`*name*:
   Use this for different levels of verbosity along with either
   the `-f` or `-d` FUSE options. This can give you some insight
   into what the file system driver is doing internally, but it's
   mainly meant for debugging and the `debug` and `trace` levels
   in particular will slow down the driver.

-* `-o tidy_strategy=`*name*:
+- `-o tidy_strategy=`*name*:
   Use one of the following strategies to tidy the block cache:

   - `none`:
@@ -128,14 +129,14 @@ options:
   cache is traversed and all blocks that have been fully or
   partially swapped out by the kernel will be removed.

-* `-o tidy_interval=`*time*:
+- `-o tidy_interval=`*time*:
   Used only if `tidy_strategy` is not `none`. This is the interval
   at which the cache tidying thread wakes up to look for blocks
   that can be removed from the cache. This must be an integer value.
   Suffixes `ms`, `s`, `m`, `h` are supported. If no suffix is given,
   the value will be assumed to be in seconds.

-* `-o tidy_max_age=`*time*:
+- `-o tidy_max_age=`*time*:
   Used only if `tidy_strategy` is `time`. A block will be removed
   from the cache if it hasn't been used for this time span. This must
   be an integer value. Suffixes `ms`, `s`, `m`, `h` are supported.
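The size and time suffix conventions used by `cachesize`, `tidy_interval` and `tidy_max_age` can be sketched as below. This is an illustrative parser, not code from the dwarfs driver; note that `m` means MiB for sizes but minutes for times, so the two kinds of values have to be parsed separately.

```python
import re

# Size suffixes as documented for -o cachesize: k/m/g are KiB/MiB/GiB.
_SIZE_SUFFIX = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
# Time suffixes as documented for -o tidy_interval / -o tidy_max_age.
_TIME_SUFFIX = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}

def parse_size(text):
    """Parse a cachesize-style value, e.g. '512m', into bytes."""
    m = re.fullmatch(r"(\d+)([kmg]?)", text)
    if not m:
        raise ValueError(f"bad size value: {text!r}")
    return int(m.group(1)) * _SIZE_SUFFIX.get(m.group(2), 1)

def parse_time(text):
    """Parse a tidy_interval-style value, e.g. '500ms', into seconds.

    The number must be an integer; with no suffix, seconds are assumed.
    """
    m = re.fullmatch(r"(\d+)(ms|s|m|h)?", text)
    if not m:
        raise ValueError(f"bad time value: {text!r}")
    return int(m.group(1)) * _TIME_SUFFIX[m.group(2) or "s"]
```

For example, `parse_size("512m")` gives 536870912 bytes and `parse_time("2m")` gives 120 seconds.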
@@ -145,14 +146,14 @@ There's two particular FUSE options that you'll likely need at some
 point, e.g. when trying to set up an `overlayfs` mount on top of
 a DwarFS image:

-* `-o allow_root` and `-o allow_other`:
+- `-o allow_root` and `-o allow_other`:
   These will ensure that the mounted file system can be read by
   either `root` or any other user in addition to the user that
   started the fuse driver. So if you're running `dwarfs` as a
   non-privileged user, you want `-o allow_root` in case `root`
   needs access, for example when you're trying to use `overlayfs`
   along with `dwarfs`. If you're running `dwarfs` as `root`, you
   need `allow_other`.

 ## TIPS & TRICKS
@@ -193,28 +194,34 @@ set of Perl versions back.
 Here's what you need to do:

-* Create a set of directories. In my case, these are all located
-  in `/tmp/perl` as this was the original install location.
+- Create a set of directories. In my case, these are all located
+  in `/tmp/perl` as this was the original install location.

-      cd /tmp/perl
-      mkdir install-ro
-      mkdir install-rw
-      mkdir install-work
-      mkdir install
+  ```
+  cd /tmp/perl
+  mkdir install-ro
+  mkdir install-rw
+  mkdir install-work
+  mkdir install
+  ```

-* Mount the DwarFS image. `-o allow_root` is needed to make sure
+- Mount the DwarFS image. `-o allow_root` is needed to make sure
   `overlayfs` has access to the mounted file system. In order
   to use `-o allow_root`, you may have to uncomment or add
   `user_allow_other` in `/etc/fuse.conf`.

-      dwarfs perl-install.dwarfs install-ro -o allow_root
+  ```
+  dwarfs perl-install.dwarfs install-ro -o allow_root
+  ```

-* Now set up `overlayfs`.
+- Now set up `overlayfs`.

-      sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
+  ```
+  sudo mount -t overlay overlay -o lowerdir=install-ro,upperdir=install-rw,workdir=install-work install
+  ```

-* That's it. You should now be able to access a writeable version
+- That's it. You should now be able to access a writeable version
   of your DwarFS image in `install`.

 You can go even further than that. Say you have different sets of
 modules that you regularly want to layer on top of the base DwarFS
@@ -223,7 +230,9 @@ the read-write directory after unmounting the `overlayfs`, and
 selectively add this by passing a colon-separated list to the
 `lowerdir` option when setting up the `overlayfs` mount:

-    sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
+```
+sudo mount -t overlay overlay -o lowerdir=install-ro:install-modules install
+```

 If you want *this* merged overlay to be writable, just add in the
 `upperdir` and `workdir` options from before again.


@@ -1,5 +1,4 @@
-dwarfsck(1) -- check DwarFS image
-=================================
+# dwarfsck(1) -- check DwarFS image

 ## SYNOPSIS
@@ -15,43 +14,43 @@ with a non-zero exit code.
 ## OPTIONS

-* `-i`, `--input=`*file*:
+- `-i`, `--input=`*file*:
   Path to the filesystem image.

-* `-d`, `--detail=`*value*:
+- `-d`, `--detail=`*value*:
   Level of filesystem information detail. The default is 2. Higher values
   mean more output. Values larger than 6 will currently not provide any
   further detail.

-* `-O`, `--image-offset=`*value*|`auto`:
+- `-O`, `--image-offset=`*value*|`auto`:
   Specify the byte offset at which the filesystem is located in the image.
   Use `auto` to detect the offset automatically. This is also the default.
   This is only useful for images that have some header located before the
   actual filesystem data.

-* `-H`, `--print-header`:
+- `-H`, `--print-header`:
   Print the header located before the filesystem image to stdout. If no
   header is present, the program will exit with a non-zero exit code.

-* `-n`, `--num-workers=`*value*:
+- `-n`, `--num-workers=`*value*:
   Number of worker threads used for integrity checking.

-* `--check-integrity`:
+- `--check-integrity`:
   In addition to performing a fast checksum check, also perform a (much
   slower) verification of the embedded SHA-512/256 hashes.

-* `--json`:
+- `--json`:
   Print a simple JSON representation of the filesystem metadata. Please
   note that the format is *not* stable.

-* `--export-metadata=`*file*:
+- `--export-metadata=`*file*:
   Export all filesystem metadata in JSON format.

-* `--log-level=`*name*:
+- `--log-level=`*name*:
   Specify a logging level.

-* `--help`:
+- `--help`:
   Show program help, including option defaults.

 ## AUTHOR


@@ -1,9 +1,8 @@
-dwarfsextract(1) -- extract DwarFS image
-========================================
+# dwarfsextract(1) -- extract DwarFS image

 ## SYNOPSIS

-`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]<br>
+`dwarfsextract` `-i` *image* [`-o` *dir*] [*options*...]
 `dwarfsextract` `-i` *image* -f *format* [`-o` *file*] [*options*...]

 ## DESCRIPTION
@@ -35,44 +34,44 @@ to disk:
 ## OPTIONS

-* `-i`, `--input=`*file*:
+- `-i`, `--input=`*file*:
   Path to the source filesystem.

-* `-o`, `--output=`*directory*|*file*:
+- `-o`, `--output=`*directory*|*file*:
   If no format is specified, this is the directory to which the contents
   of the filesystem should be extracted. If a format is specified, this
   is the name of the output archive. This option can be omitted, in which
   case the default is to extract the files to the current directory, or
   to write the archive data to stdout.

-* `-O`, `--image-offset=`*value*|`auto`:
+- `-O`, `--image-offset=`*value*|`auto`:
   Specify the byte offset at which the filesystem is located in the image.
   Use `auto` to detect the offset automatically. This is also the default.
   This is only useful for images that have some header located before the
   actual filesystem data.

-* `-f`, `--format=`*format*:
+- `-f`, `--format=`*format*:
   The archive format to produce. If this is left empty or unspecified,
   files will be extracted to the output directory (or the current directory
   if no output directory is specified). For a full list of supported formats,
   see libarchive-formats(5).

-* `-n`, `--num-workers=`*value*:
+- `-n`, `--num-workers=`*value*:
   Number of worker threads used for extracting the filesystem.

-* `-s`, `--cache-size=`*value*:
+- `-s`, `--cache-size=`*value*:
   Size of the block cache, in bytes. You can append suffixes (`k`, `m`, `g`)
   to specify the size in KiB, MiB and GiB, respectively. Note that this is
   not the upper memory limit of the process, as there may be blocks in
   flight that are not stored in the cache. Also, each block that hasn't been
   fully decompressed yet will carry decompressor state along with it, which
   can use a significant amount of additional memory.

-* `--log-level=`*name*:
+- `--log-level=`*name*:
   Specify a logging level.

-* `--help`:
+- `--help`:
   Show program help, including option defaults.

 ## AUTHOR


@ -1,9 +1,8 @@
mkdwarfs(1) -- create highly compressed read-only file systems # mkdwarfs(1) -- create highly compressed read-only file systems
==============================================================
## SYNOPSIS ## SYNOPSIS
`mkdwarfs` `-i` *path* `-o` *file* [*options*...]<br> `mkdwarfs` `-i` *path* `-o` *file* [*options*...]
`mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...] `mkdwarfs` `-i` *file* `-o` *file* `--recompress` [*options*...]
## DESCRIPTION ## DESCRIPTION
@ -26,272 +25,272 @@ After that, you can mount it with dwarfs(1):
There two mandatory options for specifying the input and output: There two mandatory options for specifying the input and output:
* `-i`, `--input=`*path*|*file*: - `-i`, `--input=`*path*|*file*:
Path to the root directory containing the files from which you want to Path to the root directory containing the files from which you want to
build a filesystem. If the `--recompress` option is given, this argument build a filesystem. If the `--recompress` option is given, this argument
is the source filesystem. is the source filesystem.
* `-o`, `--output=`*file*: - `-o`, `--output=`*file*:
File name of the output filesystem. File name of the output filesystem.
Most other options are concerned with compression tuning: Most other options are concerned with compression tuning:
* `-l`, `--compress-level=`*value*: - `-l`, `--compress-level=`*value*:
Compression level to use for the filesystem. **If you are unsure, please Compression level to use for the filesystem. **If you are unsure, please
stick to the default level of 7.** This is intended to provide some stick to the default level of 7.** This is intended to provide some
sensible defaults and will depend on which compression libraries were sensible defaults and will depend on which compression libraries were
available at build time. **The default level has been chosen to provide available at build time. **The default level has been chosen to provide
you with the best possible compression while still keeping the file you with the best possible compression while still keeping the file
system very fast to access.** Levels 8 and 9 will switch to LZMA system very fast to access.** Levels 8 and 9 will switch to LZMA
compression (when available), which will likely reduce the file system compression (when available), which will likely reduce the file system
image size, but will make it about an order of magnitude slower to image size, but will make it about an order of magnitude slower to
access, so reserve these levels for cases where you only need to access access, so reserve these levels for cases where you only need to access
the data infrequently. This `-l` option is meant to be the "easy" the data infrequently. This `-l` option is meant to be the "easy"
interface to configure `mkdwarfs`, and it will actually pick defaults interface to configure `mkdwarfs`, and it will actually pick defaults
for seven distinct options: `--block-size-bits`, `--compression`, for seven distinct options: `--block-size-bits`, `--compression`,
`--schema-compression`, `--metadata-compression`, `--window-size`, `--schema-compression`, `--metadata-compression`, `--window-size`,
`--window-step` and `--order`. See the output of `mkdwarfs --help` for `--window-step` and `--order`. See the output of `mkdwarfs --help` for
a table listing the exact defaults used for each compression level. a table listing the exact defaults used for each compression level.
* `-S`, `--block-size-bits=`*value*: - `-S`, `--block-size-bits=`*value*:
The block size used for the compressed filesystem. The actual block size The block size used for the compressed filesystem. The actual block size
is two to the power of this value. Larger block sizes will offer better is two to the power of this value. Larger block sizes will offer better
overall compression ratios, but will be slower and consume more memory overall compression ratios, but will be slower and consume more memory
when actually using the filesystem, as blocks will have to be fully or at when actually using the filesystem, as blocks will have to be fully or at
least partially decompressed into memory. Values between 20 and 26, i.e. least partially decompressed into memory. Values between 20 and 26, i.e.
between 1MiB and 64MiB, usually work quite well. between 1MiB and 64MiB, usually work quite well.
* `-N`, `--num-workers=`*value*: - `-N`, `--num-workers=`*value*:
Number of worker threads used for building the filesystem. This defaults Number of worker threads used for building the filesystem. This defaults
to the number of processors available on your system. Use this option if to the number of processors available on your system. Use this option if
you want to limit the resources used by `mkdwarfs`. you want to limit the resources used by `mkdwarfs`.
This option affects both the scanning phase and the compression phase. This option affects both the scanning phase and the compression phase.
In the scanning phase, the worker threads are used to scan files in the In the scanning phase, the worker threads are used to scan files in the
background as they are discovered. File scanning includes checksumming background as they are discovered. File scanning includes checksumming
for de-duplication as well as (optionally) checksumming for similarity for de-duplication as well as (optionally) checksumming for similarity
computation, depending on the `--order` option. File discovery itself computation, depending on the `--order` option. File discovery itself
is single-threaded and runs independently from the scanning threads. is single-threaded and runs independently from the scanning threads.
In the compression phase, the worker threads are used to compress the In the compression phase, the worker threads are used to compress the
individual filesystem blocks in the background. Ordering, segmenting individual filesystem blocks in the background. Ordering, segmenting
and block building are, again, single-threaded and run independently. and block building are, again, single-threaded and run independently.
* `-B`, `--max-lookback-blocks=`*value*: - `-B`, `--max-lookback-blocks=`*value*:
Specify how many of the most recent blocks to scan for duplicate segments. Specify how many of the most recent blocks to scan for duplicate segments.
By default, only the current block will be scanned. The larger this number, By default, only the current block will be scanned. The larger this number,
the more duplicate segments will likely be found, which may further improve the more duplicate segments will likely be found, which may further improve
compression. Impact on compression speed is minimal, but this could cause compression. Impact on compression speed is minimal, but this could cause
resulting filesystem to be slightly less efficient to use, as single small resulting filesystem to be slightly less efficient to use, as single small
files can now potentially span multiple filesystem blocks. Passing `-B0` files can now potentially span multiple filesystem blocks. Passing `-B0`
will completely disable duplicate segment search. will completely disable duplicate segment search.
- `-W`, `--window-size=`*value*:
Window size of the cyclic hash used for segmenting. This is again an
exponent to a base of two. Cyclic hashes are used by `mkdwarfs` for finding
identical segments across multiple files. This is done on top of duplicate
file detection. If a reasonable number of duplicate segments is found,
fewer blocks will be used in the filesystem and potentially less memory
will be used when accessing the filesystem. It doesn't necessarily mean
that the filesystem will be much smaller, as this removes redundancy that
can no longer be exploited by the block compression. But it shouldn't make
the resulting filesystem any bigger. This option is used along with
`--window-step` to determine how extensive the segment search will be.
The smaller the window sizes, the more segments will obviously be found.
However, this also means files will become more fragmented and thus the
filesystem can be slower to use and metadata size will grow. Passing `-W0`
will completely disable duplicate segment search.
- `-w`, `--window-step=`*value*:
This option specifies how often cyclic hash values are stored for lookup.
It is specified relative to the window size, as a base-2 exponent that
divides the window size. To give a concrete example, if `--window-size=16`
and `--window-step=1`, then a cyclic hash across 65536 bytes will be stored
at every 32768 bytes of input data. If `--window-step=2`, then a hash value
will be stored at every 16384 bytes. This means that not every possible
65536-byte duplicate segment will be detected, but it is guaranteed that
all duplicate segments of (`window_size` + `window_step`) bytes or more
will be detected (unless they span across block boundaries, of course).
If you use a larger value for this option, the increments become *smaller*,
and `mkdwarfs` will be slightly slower and use more memory.
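
The window-size/window-step arithmetic above can be sketched as follows
(a hypothetical helper for illustration, not part of `mkdwarfs`):

```python
def hash_store_interval(window_size_exp, window_step_exp):
    # A cyclic hash covers 2**window_size_exp bytes of input; a hash
    # value is stored every 2**(window_size_exp - window_step_exp)
    # bytes, i.e. the window size shifted right by the step exponent.
    return 1 << (window_size_exp - window_step_exp)

print(hash_store_interval(16, 1))  # 32768
print(hash_store_interval(16, 2))  # 16384
```

This matches the examples in the text: a 65536-byte window with step 1
stores a hash every 32768 bytes, with step 2 every 16384 bytes.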
- `--bloom-filter-size=`*value*:
The segmenting algorithm uses a bloom filter to determine quickly if
there is *no* match at a given position. This will filter out more than
90% of bad matches quickly with the default bloom filter size. The default
is pretty much where the sweet spot lies. If you have copious amounts of
RAM and CPU power, feel free to increase this by one or two and you *might*
be able to see some improvement. If you're tight on memory, then decreasing
this will potentially save a few MiBs.
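
To get a feel for why extra filter bits buy diminishing returns, here is
the textbook bloom filter false-positive estimate (generic math for
illustration, not `mkdwarfs` internals):

```python
import math

def false_positive_rate(num_bits, num_entries, num_hashes=1):
    # Classic approximation: p = (1 - e^(-k*n/m))**k for a filter of
    # m bits holding n entries, probed with k hash functions.
    ratio = num_hashes * num_entries / num_bits
    return (1.0 - math.exp(-ratio)) ** num_hashes

# Doubling the filter size (one step of --bloom-filter-size) lowers
# the false-positive rate, but each doubling helps less than the last.
small = false_positive_rate(num_bits=2**20, num_entries=2**17)
large = false_positive_rate(num_bits=2**21, num_entries=2**17)
```

With the numbers above, `small` is about 0.118 and `large` about 0.061,
so one doubling roughly halves the wasted lookups here.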
- `-L`, `--memory-limit=`*value*:
Approximately how much memory you want `mkdwarfs` to use during filesystem
creation. Note that currently this will only affect the block manager
component, i.e. the number of filesystem blocks that are in flight but
haven't been compressed and written to the output file yet. So the memory
used by `mkdwarfs` can certainly be larger than this limit, but it's a
good option when building large filesystems with expensive compression
algorithms. Also note that most memory is likely used by the compression
algorithms, so if you're short on memory it might be worth tweaking the
compression options.
- `-C`, `--compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
The compression algorithm and configuration used for file system data.
The value for this option is a colon-separated list. The first item is
the compression algorithm, the remaining items are its options. Options
can be either boolean or have a value. For details on which algorithms
and options are available, see the output of `mkdwarfs --help`. `zstd`
will give you the best compression while still keeping decompression
*very* fast. `lzma` will compress even better, but decompression will
be around ten times slower.
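
The option grammar can be illustrated with a small parser sketch (a
hypothetical model of the syntax, not the actual `mkdwarfs` parsing code):

```python
def parse_compression(spec):
    # Split "algorithm:opt1=value,opt2" into the algorithm name and an
    # option dict; options without "=value" are treated as boolean flags.
    algorithm, _, rest = spec.partition(":")
    options = {}
    for item in filter(None, rest.split(",")):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return algorithm, options

print(parse_compression("zstd:level=19"))  # ('zstd', {'level': '19'})
print(parse_compression("null"))           # ('null', {})
```

So `-C lzma:level=9,extreme` would select the `lzma` algorithm with a
`level` of 9 and the boolean `extreme` flag set.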
- `--schema-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
The compression algorithm and configuration used for the metadata schema.
Takes the same arguments as `--compression` above. The schema is *very*
small, in the hundreds of bytes, so this is only relevant for extremely
small file systems. The default (`zstd`) has been shown to give
considerably better results than any other algorithm.
- `--metadata-compression=`*algorithm*[`:`*algopt*[`=`*value*][`,`...]]:
The compression algorithm and configuration used for the metadata.
Takes the same arguments as `--compression` above. The metadata has been
optimized for very little redundancy, and leaving it uncompressed, which
is the default for all levels below 7, has the benefit that it can be
mapped to memory and used directly. This improves mount time for large
file systems compared to e.g. an lzma compressed metadata block. If you
don't care about mount time, you can safely choose `lzma` compression
here, as the data will only have to be decompressed once when mounting
the image.
- `--recompress`[`=all`|`=block`|`=metadata`|`=none`]:
Take an existing DwarFS file system and recompress it using different
compression algorithms. If no argument or `all` is given, all sections
in the file system image will be recompressed. Note that *only* the
compression algorithms, i.e. the `--compression`, `--schema-compression`
and `--metadata-compression` options, have an impact on how the new file
system is written. Other options, e.g. `--block-size-bits` or `--order`,
have no impact. If `none` is given as an argument, none of the sections
will be recompressed, but the file system is still rewritten in the
latest file system format. This is an easy way of upgrading an old file
system image to a new format. If `block` or `metadata` is given, only
the block sections (i.e. the actual file data) or the metadata sections
are recompressed. This can be useful if you want to switch from
compressed metadata to uncompressed metadata without having to rebuild
or recompress all the other data.
- `-P`, `--pack-metadata=auto`|`none`|[`all`|`chunk_table`|`directories`|`shared_files`|`names`|`names_index`|`symlinks`|`symlinks_index`|`force`|`plain`[`,`...]]:
Which metadata information to store in packed format. This is primarily
useful when storing metadata uncompressed, as it allows for a smaller
metadata block size without having to turn on compression. Keep in mind,
though, that *most* of the packed data must be unpacked into memory when
reading the file system. If you want a purely memory-mappable metadata
block, leave this at the default (`auto`), which will turn on `names` and
`symlinks` packing if these actually help save data.
Tweaking these options is mostly interesting when dealing with file
systems that contain hundreds of thousands of files.
See [Metadata Packing](#metadata-packing) for more details.
- `--set-owner=`*uid*:
Set the owner for all entities in the file system. This can reduce the
size of the file system. If the input only has a single owner already,
setting this won't make any difference.
- `--set-group=`*gid*:
Set the group for all entities in the file system. This can reduce the
size of the file system. If the input only has a single group already,
setting this won't make any difference.
- `--set-time=`*time*|`now`:
Set the time stamps for all entities to this value. This can significantly
reduce the size of the file system. You can pass either a unix time stamp
or `now`.
- `--keep-all-times`:
As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
the `mtime` field in order to save metadata space. If you want to save
`atime` and `ctime` as well, use this option.
- `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:
Specify the resolution with which time stamps are stored. By default,
time stamps are stored with second resolution. You can specify "odd"
resolutions as well, e.g. something like 15 second resolution is
entirely possible. Moving from second to minute resolution, for example,
will save roughly 6 bits per file system entry in the metadata block.
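
The "roughly 6 bits" follow directly from the resolution change; a
back-of-the-envelope check (generic math, not `mkdwarfs` code):

```python
import math

# Coarser resolution divides every stored timestamp value by the
# resolution, shrinking the value range the metadata must encode.
bits_saved = math.log2(60)  # second -> minute: ~5.9 bits per entry

def bits_needed(span_seconds, resolution_seconds):
    # Bits to encode any timestamp within a span at a given resolution
    # (hypothetical helper for illustration only).
    return math.ceil(math.log2(span_seconds / resolution_seconds))

print(round(bits_saved, 1))        # 5.9
print(bits_needed(30 * 86400, 1))  # 22 bits for a 30-day span
print(bits_needed(30 * 86400, 60)) # 16 bits at minute resolution
```
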
- `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:
The order in which inodes will be written to the file system. Choosing
`none`, the inodes will be stored in the order in which they are
discovered. With `path`, they will be sorted asciibetically by path name
of the first file representing this inode. With `similarity`, they will
be ordered using a simple, yet fast and efficient, similarity hash
function. `nilsimsa` ordering uses a more sophisticated similarity
function that is typically better than `similarity`, but is significantly
slower to compute. However, computation can happen in the background
while the file system is already being built. `nilsimsa` ordering can be
further tweaked by specifying a *limit* and *depth*. The *limit*
determines how soon an inode is considered similar enough to be added.
A *limit* of 255 means "essentially identical", whereas a *limit* of 0
means "not similar at all". The *depth* determines how many inodes can
be checked at most while searching for a similar one. To keep `nilsimsa`
ordering from becoming a bottleneck when ordering lots of small files,
the *depth* is adjusted dynamically to keep the input queue to the
segmentation/compression stages adequately filled. You can specify how
much the *depth* can be adjusted by also specifying *mindepth*. The
defaults, if you omit these values, are a *limit* of 255, a *depth* of
20000 and a *mindepth* of 1000. Note that if you want reproducible
results, you need to set *depth* and *mindepth* to the same value. Also
note that when you're compressing lots (as in hundreds of thousands) of
small files, ordering them by `similarity` instead of `nilsimsa` is
likely going to speed things up significantly without impacting
compression too much. Last but not least, if scripting support is built
into `mkdwarfs`, you can choose `script` to let the script determine the
order.
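
The *limit*/*depth* interplay can be sketched as a greedy search (a
deliberately simplified, hypothetical model of the ordering, not the
actual implementation):

```python
def order_inodes(inodes, similarity, limit=255, depth=20000):
    # Greedy sketch: for each output position, scan at most `depth`
    # remaining candidates; take the first one that is at least `limit`
    # similar to the previously written inode, else the best match seen.
    remaining = list(inodes)
    ordered = [remaining.pop(0)]
    while remaining:
        best_idx, best_sim = 0, -1
        for i, candidate in enumerate(remaining[:depth]):
            s = similarity(ordered[-1], candidate)
            if s >= limit:
                best_idx = i
                break
            if s > best_sim:
                best_idx, best_sim = i, s
        ordered.append(remaining.pop(best_idx))
    return ordered

# Toy similarity on integers: closer values are "more similar".
sim = lambda a, b: 255 - min(255, abs(a - b))
print(order_inodes([0, 100, 10, 5], sim))  # [0, 5, 10, 100]
```

A smaller *depth* bounds the inner scan, which is why lowering it trades
compression locality for ordering speed.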
- `--remove-empty-dirs`:
Removes all empty directories from the output file system, recursively.
This is particularly useful when using scripts that filter out a lot of
file system entries.
- `--with-devices`:
Include character and block devices in the output file system. These are
not included by default, and due to security measures in FUSE, they will
never work in the mounted file system. However, they can still be copied
out of the mounted file system, for example using `rsync`.
- `--with-specials`:
Include named fifos and sockets in the output file system. These are not
included by default.
- `--header=`*file*:
Read a header from the given file and place it before the output
filesystem image. Can be used with `--recompress` to add or replace a
header.
- `--remove-header`:
Remove the header from a filesystem image. Only useful with
`--recompress`.
- `--log-level=`*name*:
Specify a logging level.
- `--no-progress`:
Don't show progress output while building the filesystem.
- `--progress=none`|`simple`|`ascii`|`unicode`:
Choosing `none` is equivalent to specifying `--no-progress`. `simple`
will print a single line of progress information whenever the progress
has significantly changed, but at most once every 2 seconds. This is
also the default when the output is not a tty. `unicode` is the default
behaviour, which shows a nice progress bar and lots of additional
information. If your terminal cannot deal with unicode characters,
you can switch to `ascii`, which is like `unicode`, but looks less
fancy.
- `--help`:
Show program help, including defaults, compression level detail and
supported compression algorithms.
If experimental Python support was compiled into `mkdwarfs`, you can use the
following option to enable customizations via the scripting interface:
- `--script=`*file*[`:`*class*[`(`arguments`...)`]]:
Specify the Python script to load. The class name is optional if there's
a class named `mkdwarfs` in the script. It is also possible to pass
arguments to the constructor.
## TIPS & TRICKS
However, there are several options to choose from that allow you to
further reduce metadata size without having to compress the metadata.
These options are controlled by the `--pack-metadata` option.
- `auto`:
This is the default. It will enable both `names` and `symlinks`.
- `none`:
Don't enable any packing. However, string tables (i.e. names and
symlinks) will still be stored in "compact" rather than "plain"
format. In order to force storage in plain format, use `plain`.
- `all`:
Enable all packing options. This does *not* force packing of
string tables (i.e. names and symlinks) if the packing would
actually increase the size, which can happen if the string tables
are small. In order to force string table packing, use `all,force`.
- `chunk_table`:
Delta-compress chunk tables. This can reduce the size of the
chunk tables for large file systems and help compression; however,
it will likely require a lot of memory when unpacking the tables
again. Only use this if you know what you're doing.
- `directories`:
Pack the directories table by storing first entry pointers
delta-compressed and completely removing parent directory pointers.
The parent directory pointers can be rebuilt by tree traversal
when the filesystem is loaded. If you have a large number of
directories, this can reduce the metadata size; however, it
will likely require a lot of memory when unpacking the tables
again. Only use this if you know what you're doing.
- `shared_files`:
Pack the shared files table. This is only useful if the filesystem
contains lots of non-hardlinked duplicates. It gets more efficient
the more copies of a file are in the filesystem.
- `names`,`symlinks`:
Compress the names and symlink targets using the
[fsst](https://github.com/cwida/fsst) compression scheme. This
compresses each individual entry separately using a small,
custom symbol table, and it's surprisingly efficient. It is
not uncommon for names to account for 50-70% of the metadata,
and fsst compression typically reduces the size by a factor
of two. The entries can be decompressed individually, so no
extra memory is used when accessing the filesystem (except for
the symbol table, which is only a few hundred bytes). This is
turned on by default. For small filesystems, it's possible that
the compressed strings plus symbol table are actually larger
than the uncompressed strings. If this is the case, the strings
will be stored uncompressed, unless `force` is also specified.
- `names_index`,`symlinks_index`:
Delta-compress the names and symlink targets indices. The same
caveats apply as for `chunk_table`.
- `force`:
Forces the compression of the `names` and `symlinks` tables,
even if that would make them use more memory than the
uncompressed tables. This is really only useful for testing
and development.
- `plain`:
Store string tables in "plain" format. The plain format uses
Frozen thrift arrays and was used in earlier metadata versions.
It is useful for debugging, but wastes up to one byte per string.
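
Delta compression of the index-like tables (`chunk_table`, `names_index`,
`symlinks_index`) boils down to storing differences instead of absolute
values; a minimal sketch of the idea:

```python
from itertools import accumulate

def delta_encode(values):
    # Monotonically increasing offsets turn into small deltas that can
    # be bit-packed much more tightly than the absolute values.
    return [b - a for a, b in zip([0] + values, values)]

def delta_decode(deltas):
    # Unpacking materializes the running sums in memory, which is the
    # memory cost the caveats above warn about.
    return list(accumulate(deltas))

offsets = [0, 512, 512, 4096, 4100]
print(delta_encode(offsets))                # [0, 512, 0, 3584, 4]
print(delta_decode(delta_encode(offsets)))  # [0, 512, 512, 4096, 4100]
```
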
To give you an idea of the metadata size using different packing options,
here's the size of the metadata block for the Ubuntu 20.04.2.0 Desktop
further compress the block. So if you're really desperately trying
to reduce the image size, enabling `all` packing would be an option
at the cost of using a lot more memory when using the filesystem.
## INTERNAL OPERATION
Internally, `mkdwarfs` runs in two completely separate phases. The first