mirror of
https://github.com/mhx/dwarfs.git
synced 2025-09-09 12:28:13 -04:00
Update mkdwarfs man page
This commit is contained in:
parent
fcf872e5b5
commit
4c6580b9c7
106
doc/mkdwarfs.md
@@ -75,6 +75,54 @@ Most other options are concerned with compression tuning:
individual filesystem blocks in the background. Ordering, segmenting
and block building are, again, single-threaded and run independently.

* `-B`, `--max-lookback-blocks=`*value*:

  Specify how many of the most recent blocks to scan for duplicate segments.
  By default, only the current block will be scanned. The larger this number,
  the more duplicate segments are likely to be found, which may further improve
  compression. The impact on compression speed is minimal, but this could make
  the resulting filesystem slightly less efficient to use, as a single small
  file can now potentially span multiple filesystem blocks. Passing `-B0`
  will completely disable the duplicate segment search.

* `-W`, `--window-size=`*value*:

  Window size of the cyclic hash used for segmenting. This is again an
  exponent to a base of two. Cyclic hashes are used by `mkdwarfs` for
  finding identical segments across multiple files. This is done on top of
  duplicate file detection. If a reasonable number of duplicate segments is
  found, fewer blocks will be used in the filesystem and potentially less
  memory will be used when accessing the filesystem. It doesn't necessarily
  mean that the filesystem will be much smaller, as this removes redundancy
  that can no longer be exploited by the block compression, but it shouldn't
  make the resulting filesystem any bigger. This option is used along with
  `--window-step` to determine how extensive the segment search will be.
  The smaller the window size, the more segments will obviously be found.
  However, this also means files will become more fragmented and thus the
  filesystem can be slower to use and the metadata size will grow. Passing
  `-W0` will completely disable the duplicate segment search.

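  The mechanism can be sketched with a generic rolling hash (a simplified
  illustration of the idea; this is not the hash function `mkdwarfs`
  actually uses):

  ```python
  # Simplified polynomial rolling hash over a fixed-size window.
  # It illustrates why identical segments can be found cheaply: the
  # hash is updated in O(1) as the window slides over the input.

  BASE = 257
  MOD = (1 << 61) - 1  # a large prime modulus

  def rolling_hashes(data: bytes, window: int):
      """Yield the hash of every `window`-byte segment of `data`."""
      if len(data) < window:
          return
      h = 0
      for b in data[:window]:
          h = (h * BASE + b) % MOD
      yield h
      top = pow(BASE, window - 1, MOD)  # weight of the outgoing byte
      for i in range(window, len(data)):
          h = (h - data[i - window] * top) % MOD  # drop outgoing byte
          h = (h * BASE + data[i]) % MOD          # add incoming byte
          yield h

  # Identical windows hash identically, wherever they occur:
  hs = list(rolling_hashes(b"abcdefabcdef", 4))
  # hs[0] ("abcd" at offset 0) equals hs[6] ("abcd" at offset 6)
  ```

  Any position whose window hash matches a stored hash is a candidate
  duplicate segment; only candidates need a byte-for-byte comparison.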
* `-w`, `--window-step=`*value*:

  This option specifies how often cyclic hash values are stored for lookup.
  It is specified relative to the window size, as a base-2 exponent that
  divides the window size. To give a concrete example, if `--window-size=16`
  and `--window-step=1`, then a cyclic hash across 65536 bytes will be
  stored at every 32768 bytes of input data. If `--window-step=2`, then a
  hash value will be stored at every 16384 bytes. This means that not every
  possible 65536-byte duplicate segment will be detected, but it is
  guaranteed that all duplicate segments of (`window_size` + `window_step`)
  bytes or more will be detected (unless they span across block boundaries,
  of course). If you use a larger value for this option, the increments
  become *smaller*, and `mkdwarfs` will be slightly slower and use more
  memory.

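  The size/step arithmetic from the example above can be sketched as
  (a hypothetical helper for illustration, not part of `mkdwarfs`):

  ```python
  # Both options are base-2 exponents: --window-size gives the window
  # in bytes as 2**W, and --window-step divides that window by 2**w.

  def window_params(window_size_exp: int, window_step_exp: int):
      """Return (window bytes, hash-store interval bytes)."""
      window_bytes = 2 ** window_size_exp
      step_bytes = window_bytes >> window_step_exp  # window / 2**step
      return window_bytes, step_bytes

  # The examples from the text:
  print(window_params(16, 1))  # (65536, 32768)
  print(window_params(16, 2))  # (65536, 16384)
  ```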
* `--bloom-filter-size=`*value*:

  The segmenting algorithm uses a bloom filter to quickly determine whether
  there is *no* match at a given position. With the default bloom filter
  size, this filters out more than 90% of bad matches quickly. The default
  is pretty much where the sweet spot lies. If you have copious amounts of
  RAM and CPU power, feel free to increase this by one or two and you
  *might* see some improvement. If you're tight on memory, decreasing this
  will potentially save a few MiBs.

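  The role of the bloom filter can be illustrated with a toy implementation
  (a generic sketch of the data structure, unrelated to the actual
  `mkdwarfs` internals):

  ```python
  # Toy bloom filter: answers "definitely not present" with no false
  # negatives, at the cost of occasional false positives. This lets a
  # lookup skip the expensive hash-table probe for most non-matches.

  class BloomFilter:
      def __init__(self, size_bits: int, num_hashes: int = 3):
          self.size = size_bits
          self.bits = 0  # arbitrary-size int used as a bit array
          self.k = num_hashes

      def _positions(self, item: bytes):
          # Derive k bit positions; Python's built-in hash() is good
          # enough for an illustration (it is stable within one run).
          for i in range(self.k):
              yield hash((i, item)) % self.size

      def add(self, item: bytes):
          for pos in self._positions(item):
              self.bits |= 1 << pos

      def might_contain(self, item: bytes) -> bool:
          # False means the item was definitely never added.
          return all(self.bits >> pos & 1 for pos in self._positions(item))

  bf = BloomFilter(1 << 16)
  bf.add(b"stored segment hash")
  ```

  Doubling the filter size (increasing the exponent by one) halves the bit
  density and thus lowers the false-positive rate, which is why larger
  filters can speed up segmenting at the cost of memory.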
* `-L`, `--memory-limit=`*value*:

  Approximately how much memory you want `mkdwarfs` to use during filesystem
  creation. Note that currently this will only affect the block manager

@@ -156,6 +204,11 @@ Most other options are concerned with compression tuning:
reduce the size of the file system. You can pass either a unix time stamp
or `now`.

* `--keep-all-times`:

  As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
  the `mtime` field in order to save metadata space. If you want to save
  `atime` and `ctime` as well, use this option.

* `--time-resolution=`*sec*|`sec`|`min`|`hour`|`day`:

  Specify the resolution with which time stamps are stored. By default,
  time stamps are stored with second resolution. You can specify "odd"

@@ -163,11 +216,6 @@ Most other options are concerned with compression tuning:
entirely possible. Moving from second to minute resolution, for example,
will save roughly 6 bits per file system entry in the metadata block.

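The "roughly 6 bits" figure follows from the ratio of the two resolutions:
coarsening from seconds to minutes divides each stored timestamp by 60, and
log2(60) ≈ 5.9 bits. A quick check (plain arithmetic, not `mkdwarfs` code):

```python
import math

# Bits saved per entry when coarsening timestamp resolution from
# seconds to a coarser unit: log2 of the ratio between the units.
for unit, seconds in [("min", 60), ("hour", 3600), ("day", 86400)]:
    print(f"{unit}: ~{math.log2(seconds):.1f} bits saved per entry")
```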
* `--keep-all-times`:

  As of release 0.3.0, by default, `mkdwarfs` will only save the contents of
  the `mtime` field in order to save metadata space. If you want to save
  `atime` and `ctime` as well, use this option.

* `--order=none`|`path`|`similarity`|`nilsimsa`[`:`*limit*[`:`*depth*[`:`*mindepth*]]]|`script`:

  The order in which inodes will be written to the file system. Choosing `none`,
  the inodes will be stored in the order in which they are discovered. With

@@ -195,54 +243,6 @@ Most other options are concerned with compression tuning:
Last but not least, if scripting support is built into `mkdwarfs`, you can
choose `script` to let the script determine the order.

* `-W`, `--window-size=`*value*:

  Window size of the cyclic hash used for segmenting. This is again an
  exponent to a base of two. Cyclic hashes are used by `mkdwarfs` for
  finding identical segments across multiple files. This is done on top of
  duplicate file detection. If a reasonable number of duplicate segments is
  found, fewer blocks will be used in the filesystem and potentially less
  memory will be used when accessing the filesystem. It doesn't necessarily
  mean that the filesystem will be much smaller, as this removes redundancy
  that can no longer be exploited by the block compression, but it shouldn't
  make the resulting filesystem any bigger. This option is used along with
  `--window-step` to determine how extensive the segment search will be.
  The smaller the window size, the more segments will obviously be found.
  However, this also means files will become more fragmented and thus the
  filesystem can be slower to use and the metadata size will grow. Passing
  `-W0` will completely disable the duplicate segment search.

* `--window-step=`*value*:

  This option specifies how often cyclic hash values are stored for lookup.
  It is specified relative to the window size, as a base-2 exponent that
  divides the window size. To give a concrete example, if `--window-size=16`
  and `--window-step=1`, then a cyclic hash across 65536 bytes will be
  stored at every 32768 bytes of input data. If `--window-step=2`, then a
  hash value will be stored at every 16384 bytes. This means that not every
  possible 65536-byte duplicate segment will be detected, but it is
  guaranteed that all duplicate segments of (`window_size` + `window_step`)
  bytes or more will be detected (unless they span across block boundaries,
  of course). If you use a larger value for this option, the increments
  become *smaller*, and `mkdwarfs` will be slightly slower and use more
  memory.

* `-B`, `--max-lookback-blocks=`*value*:

  Specify how many of the most recent blocks to scan for duplicate segments.
  By default, only the current block will be scanned. The larger this number,
  the more duplicate segments are likely to be found, which may further improve
  compression. The impact on compression speed is minimal, but this could make
  the resulting filesystem slightly less efficient to use, as a single small
  file can now potentially span multiple filesystem blocks. Passing `-B0`
  will completely disable the duplicate segment search.

* `--bloom-filter-size=`*value*:

  The segmenting algorithm uses a bloom filter to quickly determine whether
  there is *no* match at a given position. With the default bloom filter
  size, this filters out more than 90% of bad matches quickly. The default
  is pretty much where the sweet spot lies. If you have copious amounts of
  RAM and CPU power, feel free to increase this by one or two and you
  *might* see some improvement. If you're tight on memory, decreasing this
  will potentially save a few MiBs.

* `--remove-empty-dirs`:

  Removes all empty directories from the output file system, recursively.
  This is particularly useful when using scripts that filter out a lot of