lrzip(1) lrzip(1)
NAME
lrzip - a large-file compression program
SYNOPSIS
lrzip [OPTIONS] <file>
lrzip -d [OPTIONS] <file>
lrunzip [OPTIONS] <file>
lrzcat [OPTIONS] <file>
lrztar [lrzip options] <directory>
lrztar -d [lrzip options] <directory>
lrzuntar [lrzip options] <directory>
LRZIP=NOCONFIG [lrzip|lrunzip] [OPTIONS] <file>
DESCRIPTION
LRZIP is a file compression program designed to do particularly well on
very large files containing long distance redundancy. lrztar is a
wrapper for LRZIP to simplify compression and decompression of
directories.
OPTIONS SUMMARY
Here is a summary of the options to lrzip.
General options:
-c, --check check integrity of file written on decompression
-d, --decompress decompress
-e, --encrypt password protected sha512/aes128 encryption on compression
-h, -?, --help show help
-H, --hash display md5 hash integrity information
-i, --info show compressed file information
-q, --quiet don't show compression progress
-t, --test test compressed file integrity
-v[v], --verbose Increase verbosity
-V, --version show version
Options affecting output:
-D, --delete delete existing files
-f, --force force overwrite of any existing files
-k, --keep-broken keep broken or damaged output files
-o, --outfile filename specify the output file name and/or path
-O, --outdir directory specify the output directory when -o is not used
-S, --suffix suffix specify compressed suffix (default '.lrz')
Options affecting compression:
-b, --bzip2 bzip2 compression
-g, --gzip gzip compression using zlib
-l, --lzo lzo compression (ultra fast)
-n, --no-compress no backend compression - prepare for other compressor
-z, --zpaq zpaq compression (best, extreme compression, extremely slow)
Low level options:
-L, --level level set lzma/bzip2/gzip compression level (1-9, default 7)
-N, --nice-level value Set nice value to value (default 19)
-p, --threads value Set processor count to override number of threads
-m, --maxram size Set maximum available ram in hundreds of MB
overrides detected amount of available ram
-T, --threshold Disable LZO compressibility testing
-U, --unlimited Use unlimited window size beyond ramsize (potentially much slower)
-w, --window size maximum compression window in hundreds of MB
default chosen by heuristic dependent on ram and chosen compression
LRZIP=NOCONFIG environment variable setting can be used to bypass lrzip.conf.
TMP environment variable will be used for storage of temporary files when needed.
TMPDIR may also be stored in lrzip.conf file.
If no filenames or "-" is specified, stdin/out will be used.
OPTIONS
General options
-c This option enables integrity checking of the file written to
disk on decompression. All decompression is already tested
internally in lrzip with either crc32 or md5 hash checking,
depending on the version of the archive. However, the file
written to disk may still be corrupted by problems outside
lrzip, such as faulty library versions, drivers or hardware
failure. Enabling this option makes lrzip perform an md5 hash
check on the file that is written to disk. When the archive has
the md5 value stored in it, the hash is compared to that value.
Otherwise it is compared to the value calculated during
decompression. This offers an extra guarantee that the file
written is the same as the one originally archived.
-d Decompress. If this option is not used then lrzip looks at the
name used to launch the program. If it contains the string
"lrunzip" then the -d option is automatically set. If it
contains the string "lrzcat" then the -d -o - options are
automatically set.
-e Encrypt. This option enables high grade password encryption
using a combination of a password hashed multiple times with
sha512, a random salt and aes128 CBC encryption. Passwords up to
500 characters long are supported, and the encryption mechanism
used virtually guarantees that two archives of the same file
created with the same password will never be identical.
Furthermore, the amount of password hashing increases with the
date the file is encrypted, raising the number of CPU cycles
required for each password attempt in accordance with Moore's
law, so that the difficulty of a brute force attack keeps pace
with the power of modern computers.
-h|-? Print an options summary page
-H This shows the md5 hash value calculated on compressing or
decompressing an lrzip archive. Since version 0.560, the md5
value is calculated on compression and stored in every archive
by default. On decompression, when an md5 value is found in the
archive, it will be calculated and used for integrity checking.
If the md5 value is not stored in the archive, it will not be
calculated unless explicitly requested with this option, or
integrity checking (see -c above) has been requested.
-i This shows information about a compressed file. It shows the
compressed size, the decompressed size, the compression ratio,
what compression was used and what hash checking will be used
for internal integrity checking. Note that the compression mode
is detected from the first block only and it will show no
compression used if the first block was incompressible, even if
later blocks were compressible. If verbose options -v or -vv are
added, a breakdown of all the internal blocks and progressively
more information pertaining to them will also be shown.
-q If this option is specified then lrzip will not show the
percentage progress while compressing. Note that compression
happens in bursts with lzma compression which is the default
compression. This means that it will progress very rapidly for
short periods and then stop for long periods.
-t This tests the compressed file integrity. It does this by
decompressing it to a temporary file and then deleting it.
-v[v] Increases verbosity. -vv will print more messages than -v.
-V Print the lrzip version number
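The check that -c performs on the written file can be pictured as
re-reading that file from disk and comparing its md5 digest with the
expected value. The Python sketch below only illustrates the idea; it is
not lrzip's own code, and the names md5_of_file and verify_written_file
are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so very large files never sit in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_written_file(path, expected_md5):
    """Return True when the file on disk hashes to the expected md5 value."""
    return md5_of_file(path) == expected_md5.lower()
```

Reading in chunks matters here because the files lrzip targets are
often larger than available ram.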
Options affecting output
-D If this option is specified then lrzip will delete the source
file after successful compression or decompression. When this
option is not specified then the source files are not deleted.
-f If this option is not specified (the default) then lrzip will not
overwrite any existing files. If you set this option then lrzip
will silently overwrite any files as needed.
-k This option will keep broken or damaged files instead of
deleting them. When compression or decompression is interrupted,
either by the user or by an error, or a decompressed file fails
an integrity check, the output is normally deleted by LRZIP.
-o Set the output file name. If this option is not set then the
output file name is chosen based on the input name and the
suffix. The -o option cannot be used if more than one file name
is specified on the command line.
-O Set the output directory for the default filename. This option
cannot be combined with -o.
-S Set the compression suffix. The default is '.lrz'.
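How -o, -O and -S interact when compressing can be sketched as a small
naming function. This is an illustration under assumptions, not lrzip's
actual logic: it assumes the default name is the input name with the
suffix appended and that -O only replaces the directory part. The
function default_output_name and its parameters are hypothetical.

```python
import os


def default_output_name(input_path, suffix=".lrz", outdir=None, outfile=None):
    """Pick an output name for compression from -o, -O and -S style settings."""
    if outfile is not None and outdir is not None:
        # The man page states that -O cannot be combined with -o.
        raise ValueError("-o and -O cannot be combined")
    if outfile is not None:
        return outfile
    # Assumption: the default name is the input name plus the suffix.
    name = input_path + suffix
    if outdir is not None:
        # Assumption: -O keeps the file name and swaps only the directory.
        name = os.path.join(outdir, os.path.basename(name))
    return name
```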
Options affecting compression
-b Bzip2 compression. Uses bzip2 compression for the 2nd stage,
much like the original rzip does.
-g Gzip compression. Uses gzip compression for the 2nd stage. Uses
libz compress and uncompress functions.
-l LZO Compression. If this option is set then lrzip will use the
ultra fast lzo compression algorithm for the 2nd stage. This
mode of compression gives bzip2 like compression at the speed it
would normally take to simply copy the file, giving excellent
compression/time value.
-n No 2nd stage compression. If this option is set then lrzip will
only perform the long distance redundancy 1st stage compression.
While this does not compress any faster than LZO compression, it
produces a smaller file that then responds better to further
compression (e.g. by another application), and substantially
reduces the time that further compression takes.
-z ZPAQ compression. Uses ZPAQ compression which is from the PAQ
family of compressors known for having some of the highest
compression ratios possible but at the cost of being extremely
slow on both compress and decompress (4x slower than lzma which
is the default).
Low level options
-L 1..9
Set the compression level from 1 to 9. The default is to use
level 7, which gives good all round compression. The compression
level is also strongly related to how much memory lrzip uses.
See the -w option for details.
-N value
The default nice value is 19. This option can be used to set the
scheduling priority of lrzip during compression or decompression.
Valid nice values are from -20 to 19. Note this does NOT speed up
or slow down compression.
-p value
Set the processor count used to determine the number of
threads to run. Normally lrzip will scale according to the
number of CPUs it detects. Using this option overrides that
value in case you wish to use fewer CPUs, either to decrease the
load on your machine or to improve compression. Setting it to 1
will maximise compression but will not attempt to use more than
one CPU.
-T Disables the LZO compressibility threshold testing when a slower
compression back-end is used. LZO testing is normally performed
for the slower back-end compression of LZMA and ZPAQ. The
reasoning is that if it is completely incompressible by LZO then
it will also be incompressible by them. Thus if a block fails to
be compressed by the very fast LZO, lrzip will not attempt to
compress that block with the slower compressor, thereby saving
time. If this option is enabled, it will bypass the LZO testing
and attempt to compress each block regardless.
-U Unlimited window size. If this option is set, and the file being
compressed does not fit into the available ram, lrzip will use a
moving second buffer as a "sliding mmap" which emulates having
infinite ram. This will provide the most possible compression in
the first rzip stage which can improve the compression of ultra
large files when they're bigger than the available ram. However
it runs progressively slower the larger the difference between
ram and the file size, so is best reserved for when the smallest
possible size is desired on a very large file, and the time
taken is not important.
-w n Set the maximum allowable compression window size to n in
hundreds of megabytes. This is the amount of memory lrzip will
search during its first stage of pre-compression and is the main
thing that will determine how much benefit lrzip will provide
over ordinary compression with the 2nd stage algorithm. If not
set (recommended), the value chosen will be determined by an
internal heuristic in lrzip which uses the most memory that is
reasonable, without any hard upper limit. It is limited to 2GB
on 32-bit machines. lrzip will always reduce the window size to
the biggest it can be without running out of memory.
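The compressibility test that -T disables can be sketched as follows.
This is a rough illustration, not lrzip's implementation: Python's
standard library has no LZO binding, so zlib at level 1 stands in for
the fast LZO probe, and lzma stands in for the slow back-end. The
function compress_block and the "stored" label are hypothetical.

```python
import lzma
import zlib


def compress_block(block, test_first=True):
    """Return (method, payload) for one block of bytes.

    With test_first=True a cheap probe runs first; a block the probe
    cannot shrink is stored as-is and the slow compressor is skipped.
    With test_first=False (the behaviour -T asks for) the slow
    compressor is always attempted.
    """
    if test_first:
        # zlib level 1 is only a stand-in for LZO in this sketch.
        probe = zlib.compress(block, 1)
        if len(probe) >= len(block):
            return ("stored", block)
    packed = lzma.compress(block)
    if len(packed) >= len(block):
        # Assumption: a block that grows under compression is kept raw.
        return ("stored", block)
    return ("lzma", packed)
```

The probe costs little compared with the slow back-end, so skipping
incompressible blocks saves time on data such as already compressed
media, at the risk of missing a block that only the slow compressor
could shrink.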
INSTALLATION
Run "make install", or simply install lrzip somewhere in your search path.
COMPRESSION ALGORITHM
LRZIP operates in two stages. The first stage finds and encodes large
chunks of duplicated data over potentially very long distances in the
input file. The second stage is to use a compression algorithm to
compress the output of the first stage. The compression algorithm can
be chosen to be optimised for extreme size (zpaq), size (lzma -
default), speed (lzo), legacy (bzip2 or gzip) or can be omitted
entirely, leaving only the first stage. A file compressed with only the
first stage will almost always improve both the compressed size and the
speed achieved by a subsequent compression program.
The key difference between lrzip and other well known compression
algorithms is its ability to take advantage of very long distance
redundancy. The well known deflate algorithm used in gzip uses a
maximum history buffer of 32k. The block sorting algorithm used in
bzip2 is limited to 900k of history. The history buffer in lrzip can be
of any size, and is not even limited by available ram.
It is quite common these days to need to compress files that contain
long distance redundancies. For example, when compressing a set of home
directories several users might have copies of the same file, or of
quite similar files. It is also common to have a single file that
contains large duplicated chunks over long distances, such as pdf files
containing repeated copies of the same image. Most compression programs
won't be able to take advantage of this redundancy, and thus might
achieve a much lower compression ratio than lrzip can achieve.
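The effect of a long distance first stage can be shown with a toy model.
The Python sketch below is not the rzip algorithm: it assumes fixed-size,
aligned chunks looked up in a dictionary, whereas the real first stage
finds matches at arbitrary offsets. zlib stands in for the second stage,
and all function names and the CHUNK size are hypothetical.

```python
import struct
import zlib

CHUNK = 4096  # toy chunk size, chosen only for this sketch


def stage_one(data):
    """Replace repeated chunks with back-references at any distance."""
    seen = {}  # chunk bytes -> offset of its first occurrence
    out = bytearray()
    for offset in range(0, len(data), CHUNK):
        chunk = data[offset:offset + CHUNK]
        first = seen.get(chunk)
        if first is None:
            seen[chunk] = offset
            # Literal record: tag, length, raw bytes.
            out += b"L" + struct.pack("<I", len(chunk)) + chunk
        else:
            # Reference record: tag, offset of earlier copy, length.
            out += b"R" + struct.pack("<QI", first, len(chunk))
    return bytes(out)


def undo_stage_one(stream):
    """Rebuild the original bytes from literal and reference records."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        tag = stream[pos:pos + 1]
        pos += 1
        if tag == b"L":
            (length,) = struct.unpack_from("<I", stream, pos)
            pos += 4
            out += stream[pos:pos + length]
            pos += length
        else:
            first, length = struct.unpack_from("<QI", stream, pos)
            pos += 12
            out += out[first:first + length]
    return bytes(out)


def two_stage_compress(data):
    """First stage removes distant duplicates, second stage is zlib."""
    return zlib.compress(stage_one(data), 9)


def two_stage_decompress(blob):
    """Invert two_stage_compress."""
    return undo_stage_one(zlib.decompress(blob))
```

For input made of a 64k random block, a second 64k random block and the
first block again, zlib alone cannot reach back past its 32k history to
the first copy, while the toy first stage replaces the repeat with short
references before zlib runs.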
FILES
LRZIP recognises a configuration file that contains default settings.
This configuration is searched for in the current directory,
/etc/lrzip, and $HOME/.lrzip. The configuration filename must be
lrzip.conf.
ENVIRONMENT
By default, lrzip will search for and use a configuration file,
lrzip.conf. If the user wishes to bypass the file, a startup ENV
variable may be set.
LRZIP=NOCONFIG [lrzip|lrunzip] [OPTIONS] <file>
which will force lrzip to ignore the configuration file.
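The lookup described in FILES, together with the LRZIP=NOCONFIG bypass,
can be sketched in Python as below. This is an illustration only: it
assumes the three locations are tried in the order they are listed and
that the first match wins, which this page does not state. The function
find_config and its parameters are hypothetical.

```python
import os


def find_config(environ, cwd, home, system_dir="/etc/lrzip"):
    """Return the path of the first lrzip.conf found, or None."""
    if environ.get("LRZIP") == "NOCONFIG":
        # The environment variable bypasses the configuration file.
        return None
    # Assumption: search order follows the order listed under FILES.
    for directory in (cwd, system_dir, os.path.join(home, ".lrzip")):
        candidate = os.path.join(directory, "lrzip.conf")
        if os.path.isfile(candidate):
            return candidate
    return None
```

A caller would typically pass os.environ, os.getcwd() and the value of
$HOME.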
HISTORY - Notes on rzip by Andrew Tridgell
The ideas behind rzip were first implemented in 1998 while I was
working on rsync. That version was too slow to be practical, and was
replaced by this version in 2003. LRZIP was created by Con Kolivas out
of the desire for better compression and/or speed, blending the lzma
and lzo compression algorithms with the rzip first stage and extending
the compression windows to scale with increasing ram sizes.
BUGS
Nil known.
SEE ALSO
lrzip.conf(5), lrunzip(1), lrzcat(1), lrztar(1), lrzuntar(1), bzip2(1),
gzip(1), lzop(1), rzip(1), zip(1)
AUTHOR and CREDITS
lrzip is being extensively bastardised from rzip by Con Kolivas.
rzip was written by Andrew Tridgell.
lzma was written by Igor Pavlov.
lzo was written by Markus Oberhumer.
zpaq was written by Matt Mahoney.
Peter Hyman added informational output, updated LZMA SDK, and added
lzma multi-threading capabilities.
If you wish to report a problem, or make a suggestion, then please
email Con at kernel@kolivas.org
lrzip is released under the GNU General Public License version 2.
Please see the file COPYING for license details.
May 2011 lrzip(1)