LZX Compression(3) DragonFly Library Functions Manual (local)
NAME
lzx_init, lzx_compress_block, lzx_finish, lzx_reset - LZX compression
SYNOPSIS
#include <sys/types.h>
#include <lzx_compress.h>
int
lzx_init(lzx_data ** lzxdp, int wsize_code, lzx_get_bytes_t get_bytes,
void *get_bytes_arg, lzx_at_eof_t at_eof, lzx_put_bytes_t put_bytes,
void *put_bytes_arg, lzx_mark_frame_t mark_frame,
void *mark_frame_arg);
int
lzx_compress_block(lzx_data *lzxd, int block_size, int subdivide);
int
lzx_finish(lzx_data *lzxd, struct lzx_results *lzxr);
void
lzx_reset(lzx_data *lzxd);
DESCRIPTION
The lzx_init(), lzx_compress_block(), and lzx_finish() functions comprise
a compression engine for Microsoft's LZX compression format.
Initializing and releasing the LZX compressor
The lzx_init() function takes a wsize_code to indicate the log (base 2)
of the window size for compression, so 15 is 32K, 16 is 64K, on up to 21
meaning 2MB. It also takes the following callback functions and their
associated arguments:
int get_bytes(void *get_bytes_arg, int n, void *buf)
The lzx_compress_block() routine calls this function when it
needs more uncompressed input to process. The number of bytes
requested is n and the bytes should be placed in the buffer
pointed to by buf. The get_bytes() function should return the
number of bytes actually provided, which must not be greater
than n and must not be 0 except at EOF.
int at_eof(void *get_bytes_arg)
Must return 0 if the end of the input data has not been reached,
positive otherwise. Note that this function takes the same
argument as get_bytes().
int put_bytes(void *put_bytes_arg, int n, void *buf)
The put_bytes() callback is called by lzx_compress_block() when
compressed bytes need to be output. The number of bytes to be
output is n and the bytes are in the buffer pointed to by buf.
int mark_frame(void *mark_frame_arg, uint32_t uncomp, uint32_t
comp)
The mark_frame() callback is called whenever LZX_FRAME_SIZE
(0x8000) uncompressed bytes have been processed. The current
locations (as of the last call to put_bytes()) in the
uncompressed and compressed data streams are provided in uncomp
and comp, respectively. This is intended for .CHM (ITSS) and
other similar files which require a "reset table" listing the
frame locations. This callback is optional; if the mark_frame
argument to lzx_init() is NULL, no function will be called at
the end of each frame.
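The callback contracts above can be illustrated with a minimal in-memory
sketch. The structure and function names (mem_src, mem_sink, frame_log,
mem_get_bytes, and so on) are hypothetical and not part of the library,
and the fixed 64-entry frame log is an arbitrary choice. The values
returned by put_bytes() and mark_frame() on success are an assumption,
since the text above does not define them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory source, shared by get_bytes() and at_eof(). */
struct mem_src {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

/* Hypothetical growable sink for compressed output. */
struct mem_sink {
    unsigned char *data;
    size_t len;
    size_t cap;
};

/* Hypothetical accumulator for reset-table entries. */
struct frame_log {
    uint32_t uncomp[64];
    uint32_t comp[64];
    size_t count;
};

/* Copies up to n bytes into buf; returns 0 only when the input is exhausted. */
static int
mem_get_bytes(void *arg, int n, void *buf)
{
    struct mem_src *src = arg;
    size_t left = src->len - src->pos;
    size_t take;

    assert(n >= 0);
    take = (size_t)n < left ? (size_t)n : left;
    if (take > 0)
        memcpy(buf, src->data + src->pos, take);
    src->pos += take;
    return (int)take;
}

/* Takes the same argument as get_bytes(); positive once all input is read. */
static int
mem_at_eof(void *arg)
{
    struct mem_src *src = arg;

    return src->pos >= src->len;
}

/* Appends n bytes to the sink. Returning n on success is an assumption. */
static int
mem_put_bytes(void *arg, int n, void *buf)
{
    struct mem_sink *sink = arg;
    size_t need;

    if (n < 0)
        return -1;
    need = sink->len + (size_t)n;
    if (need > sink->cap) {
        size_t ncap = sink->cap ? sink->cap : 256;
        unsigned char *p;

        while (ncap < need)
            ncap *= 2;
        p = realloc(sink->data, ncap);
        if (p == NULL)
            return -1;
        sink->data = p;
        sink->cap = ncap;
    }
    if (n > 0)
        memcpy(sink->data + sink->len, buf, (size_t)n);
    sink->len += (size_t)n;
    return n;
}

/* Records one frame boundary. Returning 0 on success is an assumption. */
static int
mem_mark_frame(void *arg, uint32_t uncomp, uint32_t comp)
{
    struct frame_log *log = arg;

    if (log->count >= 64)
        return -1;
    log->uncomp[log->count] = uncomp;
    log->comp[log->count] = comp;
    log->count++;
    return 0;
}
```

In this sketch a pointer to the same mem_src would be passed as
get_bytes_arg, so that get_bytes() and at_eof() see the same read
position.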
The lzx_init() function allocates an opaque structure, a pointer to which
will be returned in lzxdp. A pointer to this structure may be passed to
the other LZX compression functions. The function returns a negative
value on error, 0 otherwise.
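The relationship between wsize_code and the window size can be sketched
as follows. The helper names wsize_code_valid() and wsize_code_to_bytes()
are hypothetical and not part of the library; the limits 15 and 21 are
taken from the description of lzx_init() above.

```c
#include <assert.h>

/* Limits from the text above: 15 (32K) through 21 (2MB). */
enum { WSIZE_CODE_MIN = 15, WSIZE_CODE_MAX = 21 };

/* Hypothetical helper: nonzero if the code is in the documented range. */
static int
wsize_code_valid(int wsize_code)
{
    return wsize_code >= WSIZE_CODE_MIN && wsize_code <= WSIZE_CODE_MAX;
}

/* Hypothetical helper: window size in bytes, i.e. 2 to the power wsize_code. */
static unsigned long
wsize_code_to_bytes(int wsize_code)
{
    assert(wsize_code_valid(wsize_code));
    return 1UL << wsize_code;
}
```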
The lzx_finish() function writes out any unflushed data, releases all
memory held by the compressor (including the lzxd structure) and
optionally fills in the lzx_results structure, a pointer to which is
passed in as lzxr (NULL if results are not required).
Running the compressor
The lzx_compress_block() function takes the opaque pointer returned by
lzx_init(), a block_size, and a flag which says whether or not to
subdivide the block. If the subdivide flag is set, blocks may be
subdivided to increase compression ratio based on the entropy of the data
at a given point. Otherwise, just one block is created. The function
returns a negative value on error, 0 otherwise.
Note:
The block size must not be larger than the window size. While the
compressor will create apparently-valid LZX files if this restriction
is violated, some decompressors will not handle them.
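One way a caller might honor the restriction in the note is to bound the
requested block size before each call. This is only a sketch:
clamp_block_size() is a hypothetical helper, not a library function, and
it assumes the window size is 2 to the power wsize_code as described for
lzx_init().

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the requested block size, reduced to the
 * window size if it would otherwise exceed it.
 */
static int
clamp_block_size(int requested, int wsize_code)
{
    int window;

    assert(wsize_code >= 15 && wsize_code <= 21);
    assert(requested > 0);
    window = 1 << wsize_code;
    return requested > window ? window : requested;
}
```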
The lzx_reset() function may be called after any block in order to reset
all compression state except the number of compressed and uncompressed
bytes processed. This forces the one-bit Intel preprocessing header to
be output again, the Lempel-Ziv window to be cleared, and the Huffman
tables to be reset to zero length. It should only be called on a frame
boundary; the results of calling it elsewhere or during a callback are
undefined.
To compress data, simply call lzx_compress_block() and optionally
lzx_reset() repeatedly, handling the various callbacks described above,
until your data is exhausted.
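The loop described above might look like the following sketch. To keep
it runnable without the library, the compressor is reached through
function pointers; in real use compress_block would be a thin wrapper
around lzx_compress_block() and at_eof would be the same callback given
to lzx_init(). All names here (compress_all, fake_input, and the fake_*
stand-ins) are hypothetical, and the stand-ins only count bytes; they
perform no compression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical adapter types mirroring the calls described above. */
typedef int (*compress_block_fn)(void *lzxd, int block_size, int subdivide);
typedef int (*at_eof_fn)(void *get_bytes_arg);

/*
 * Calls compress_block until at_eof reports the end of the input.
 * Returns the number of blocks requested, or -1 on the first error.
 */
static int
compress_all(void *lzxd, compress_block_fn compress_block,
    at_eof_fn at_eof, void *src, int block_size, int subdivide)
{
    int blocks = 0;

    assert(compress_block != NULL && at_eof != NULL);
    while (at_eof(src) == 0) {
        if (compress_block(lzxd, block_size, subdivide) < 0)
            return -1;
        blocks++;
    }
    return blocks;
}

/* Stand-in state so the sketch can run without the real compressor. */
struct fake_input {
    long remaining;
};

static int
fake_at_eof(void *arg)
{
    struct fake_input *in = arg;

    return in->remaining <= 0;
}

/* Stand-in for lzx_compress_block(): consumes block_size bytes of input. */
static int
fake_compress_block(void *lzxd, int block_size, int subdivide)
{
    struct fake_input *in = lzxd;

    (void)subdivide;
    in->remaining -= block_size;
    return 0;
}
```

After the loop ends, a real caller would still call lzx_finish() to
flush pending output and release the compressor.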
ERRORS
The functions return a negative number on error.
The callbacks are intended to return a negative result on error, but this
is not yet understood by the compressor.
BUGS
The compressor is currently unable to output an uncompressed block, so
incompressible data may expand more than is necessary (though still not
more than the 6144 bytes permitted by the CAB standard).
There is no well-defined set of error codes.
There is no way for the callbacks to report an error and abort the
compression.
The algorithm for splitting blocks is suboptimal.
AUTHOR
Matthew T. Russotto
REFERENCES
LZXFMT.DOC -- Microsoft LZX Data Compression Format (part of Microsoft
Cabinet SDK)
Comments in cabextract.c, concerning errors in LZXFMT.DOC (part of
cabextract, at http://www.kyz.uklinux.net/cabextract.php3)
CHM file format documentation
(http://www.speakeasy.net/~russotto/chmformat.html)
SEE ALSO
cabextract(1)
LOCAL May 27, 2002 LOCAL