rwflowappend(8) SiLK Tool Suite rwflowappend(8)
NAME
rwflowappend - Append incremental SiLK Flow files to hourly files
SYNOPSIS
rwflowappend --incoming-directory=DIR_PATH --root-directory=DIR_PATH
--error-directory=DIR_PATH [--archive-directory=DIR_PATH]
[--flat-archive] [--post-command=COMMAND]
[--hour-file-command=COMMAND] [--threads=N]
[--reject-hours-past=NUM] [--reject-hours-future=NUM]
[--no-file-locking] [--polling-interval=NUM]
[--byte-order=ENDIAN] [--pad-header]
[--compression-method=COMP_METHOD]
[--site-config-file=FILENAME]
{ --log-destination=DESTINATION
| --log-pathname=FILE_PATH
| --log-directory=DIR_PATH [--log-basename=LOG_BASENAME]
[--log-post-rotate=COMMAND] }
[--log-level=LEVEL] [--log-sysfacility=NUMBER]
[--pidfile=FILE_PATH] [--no-chdir] [--no-daemon]
rwflowappend --help
rwflowappend --version
DESCRIPTION
rwflowappend is a daemon that watches a directory for files that
contain small numbers of SiLK Flow records---these files are called
incremental files---as generated by rwflowpack(8) when it is run with
--output-mode=incremental-files or --output-mode=sending. rwflowappend
appends these SiLK Flow records to the hourly files stored in the SiLK
data repository whose directory tree root is specified by the
--root-directory switch.
The directory that rwflowappend watches for incremental files is
specified by --incoming-directory. Once rwflowappend processes an
incremental file, the file is deleted unless the --archive-directory
switch is specified, in which case the incremental file is moved to
that directory.
If a fatal write error occurs (for example, the disk containing the
data repository becomes full), rwflowappend exits. Before exiting,
rwflowappend attempts to truncate the hourly file to the size it had
when it was opened, and rwflowappend moves the incremental file it was
reading to the directory specified by --error-directory.
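The recovery step described above can be illustrated with a short Python sketch. It is not rwflowappend's implementation, and the function name append_or_rollback is hypothetical. It only shows the general technique of recording a file's size when it is opened and truncating back to that size if a write fails.

```python
import os

def append_or_rollback(path, chunks):
    """Append byte chunks to path; on a write error, restore the original size.

    Hypothetical helper for illustration only, not rwflowappend's code.
    """
    # Unbuffered, so no pending data is held back when truncating.
    with open(path, "ab", buffering=0) as f:
        original_size = os.fstat(f.fileno()).st_size  # size when opened
        try:
            for chunk in chunks:
                f.write(chunk)
            return True
        except OSError:
            # Discard any partial data so the file ends where it started.
            f.truncate(original_size)
            return False
```

A caller that receives False would then move the incremental file aside, as rwflowappend does with --error-directory.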
Running rwflowappend separately from rwflowpack is useful when you wish
to copy the packed SiLK Flow records from the machine doing the packing
to multiple machines for use by analysts. Almost any network file
transport protocol may be used to move the files from the packing
machine to the destination machine where rwflowappend is running,
though we have written rwsender(8) and rwreceiver(8) to perform
this task.
Separate rwflowpack and rwflowappend processes are also recommended if
you want another process (such as the Analysis Pipeline
<http://tools.netsa.cert.org/analysis-pipeline/>) to process the SiLK
Flow records as they are generated.
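As a rough illustration of how the required switches fit together, the following Python sketch assembles an rwflowappend command line. All directory paths are hypothetical, and the helper build_command is not part of SiLK. The switch names are taken from the SYNOPSIS.

```python
import shlex

def build_command(incoming, root, error, log_dir, archive=None):
    """Assemble an rwflowappend argument list from hypothetical paths."""
    argv = [
        "rwflowappend",
        f"--incoming-directory={incoming}",
        f"--root-directory={root}",
        f"--error-directory={error}",
        f"--log-directory={log_dir}",  # one logging switch is required
    ]
    if archive is not None:
        # Optional: keep processed incremental files instead of deleting them.
        argv.append(f"--archive-directory={archive}")
    return argv

argv = build_command("/var/silk/incoming", "/data",
                     "/var/silk/error", "/var/log/silk")
print(shlex.join(argv))
```

The resulting list could be handed to a process launcher or printed for use in a start-up script.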
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an
exact match for an option. A parameter to an option may be specified
as --arg=param or --arg param, though the first form is required for
options that take optional parameters.
General Configuration
The following switches are required:
--incoming-directory=DIR_PATH
Watch this directory for new incremental files to append to the
hourly files. rwflowappend ignores any files in this directory
that are empty or whose names begin with a dot ("."). In addition,
new files are only considered when their size is constant for one
polling-interval after they are first noticed.
--root-directory=DIR_PATH
Append to existing hourly files and create new hourly files in the
directory tree rooted at this location. The directory tree has the
same subdirectory structure as that created by rwflowpack.
--error-directory=DIR_PATH
Store in this directory incremental files that were NOT
successfully appended to an hourly file.
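The file-selection rules given for --incoming-directory can be sketched in Python as follows. This is an illustration under stated assumptions, not rwflowappend's code: the helper ready_files is hypothetical, and it assumes two snapshots of the directory (file name mapped to size in bytes) taken one polling interval apart.

```python
def ready_files(previous, current):
    """Return the names considered ready, given two directory snapshots.

    previous and current map file name -> size in bytes and are taken
    one polling interval apart. Hypothetical helper for illustration.
    """
    ready = []
    for name, size in sorted(current.items()):
        if name.startswith("."):        # names beginning with a dot are ignored
            continue
        if size == 0:                   # empty files are ignored
            continue
        if previous.get(name) == size:  # size constant for one interval
            ready.append(name)
    return ready
```

A file that is still growing, or that was first seen in the current snapshot, is left for a later poll.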
The following switches are optional:
--archive-directory=DIR_PATH
Move each incremental file to DIR_PATH or a subdirectory of it
after rwflowappend has successfully appended the incremental file
to an hourly file. If this switch is not provided, the incremental
files are deleted once they are successfully appended to an hourly
file. When the --flat-archive switch is also provided, incremental
files are moved into the top of DIR_PATH; when --flat-archive is
not given, each incremental file is moved to a subdirectory of
DIR_PATH that mirrors the path of the hourly file to which the
incremental file was appended. Removing files from the archive-
directory is not the job of rwflowappend; the system administrator
should implement a separate process to clean this directory. This
switch is required when the --post-command switch is present.
--flat-archive
When archiving incremental files via --archive-directory, move the
files into the top of the archive-directory, not into
subdirectories of it. This switch has no effect if
--archive-directory is not also specified. This switch can be used
to allow another process to watch for new files appearing in the
archive-directory.
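The difference between the two archive layouts can be sketched in Python. The helper archive_destination and all paths are hypothetical, and the sketch assumes that "mirrors the path of the hourly file" means the hourly file's directory relative to the root directory; the layout rwflowappend actually produces may differ in detail.

```python
import posixpath

def archive_destination(archive_dir, root_dir, hourly_file,
                        incremental_name, flat=False):
    """Return where an incremental file would be archived (illustrative)."""
    if flat:
        # --flat-archive: everything goes into the top of the archive.
        return posixpath.join(archive_dir, incremental_name)
    # Assumption: mirror the hourly file's directory relative to the root.
    rel_dir = posixpath.dirname(posixpath.relpath(hourly_file, root_dir))
    return posixpath.join(archive_dir, rel_dir, incremental_name)
```

With a flat layout, a watcher needs to monitor only one directory, which is the use case the --flat-archive description mentions.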
--post-command=COMMAND
Run COMMAND on each incremental file after rwflowappend has
successfully appended it to an hourly file and moved it into the
archive-directory. Each occurrence of the string %s in COMMAND is
replaced with the full path to the incremental file in the archive-
directory, and each occurrence of "%%" is replaced with "%". If
any other character follows "%", rwflowappend exits with an error.
When using this feature, the --archive-directory must be specified.
See also the rwpollexec(8) daemon.
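The substitution rules for COMMAND can be sketched in Python; the same rules are documented for --hour-file-command and --log-post-rotate. The helper expand_command is hypothetical and is not rwflowappend's code. Treating a "%" at the very end of COMMAND as an error is an assumption, since the text only covers the case where another character follows.

```python
def expand_command(template, path):
    """Expand %s to path and %% to a literal %; reject anything else."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        nxt = template[i + 1:i + 2]  # empty string when "%" is the last char
        if nxt == "s":
            out.append(path)
        elif nxt == "%":
            out.append("%")
        else:
            # rwflowappend exits with an error here; the sketch raises instead.
            raise ValueError(f"invalid sequence %{nxt} in command")
        i += 2
    return "".join(out)
```

For example, the template "gzip %s" with the path "/archive/file" expands to "gzip /archive/file".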
--hour-file-command=COMMAND
Run COMMAND upon creation of a new hourly file. The string %s in
COMMAND is replaced with the full path to the hourly file, and the
string "%%" is replaced with "%". If any other character follows
"%", rwflowappend exits with an error.
--threads=N
Invoke rwflowappend with N threads reading the incremental files
and writing to the repository. When this switch is not provided,
rwflowappend runs with a single thread. Since SiLK 3.8.2.
--reject-hours-past=NUM
Reject incremental files containing records whose starting hour
occurs more than this number of hours in the past relative to the
current hour. Incremental files that violate this value are moved
into the error directory. Times are compared using the starting
hour of the flow record and the current hour. For example, flow
records that start at 18:02:56 and 18:58:04 are considered 1 hour
in the past whether the current time is 19:01:47 or 19:59:33. When
performing live data collection, it is not uncommon to get flows
one to two hours in the past due to the flow generator's active
timeout (often 30 minutes) and the time to transfer the flow
records through the collection system. The default is to accept
all incremental files.
--reject-hours-future=NUM
Similar to --reject-hours-past, but reject incremental files
containing records whose starting hour occurs more than this number
of hours in the future relative to the current hour. Future-dated
flow records are rare, but can occur due to time drift at the
sensor. The default is to accept all incremental files.
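The hour comparison used by --reject-hours-past and --reject-hours-future can be sketched in Python. The helpers hours_in_past and is_rejected are hypothetical names, and the sketch tests a single record start time rather than a whole incremental file. Both times are truncated to the hour before they are compared, as the description above states.

```python
from datetime import datetime

def hours_in_past(record_start, now):
    """Whole hours between the record's starting hour and the current hour."""
    start_hour = record_start.replace(minute=0, second=0, microsecond=0)
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    return int((current_hour - start_hour).total_seconds() // 3600)

def is_rejected(record_start, now, reject_past=None, reject_future=None):
    """Apply the two limits; None means the limit is not set (accept all)."""
    delta = hours_in_past(record_start, now)
    if reject_past is not None and delta > reject_past:
        return True
    if reject_future is not None and -delta > reject_future:
        return True
    return False

# The example from the text: both records are 1 hour in the past.
print(hours_in_past(datetime(2016, 2, 19, 18, 2, 56),
                    datetime(2016, 2, 19, 19, 59, 33)))
```

With a limit of 1, a record that starts at 18:02:56 is accepted at any time during the 19:00 hour and rejected once the 20:00 hour begins.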
--no-file-locking
Do not use advisory write locks. Normally, rwflowappend obtains a
write lock on an hourly file prior to writing records to it. The
write lock prevents two instances of rwflowappend from writing to
the same hourly file simultaneously. However, attempting to use a
write lock on some file systems causes rwflowappend to exit with an
error, and this switch can be used when writing data to these file
systems.
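For readers unfamiliar with advisory write locks, the following Python sketch shows the general pattern on a Unix system. It is not rwflowappend's implementation, and the helper append_with_lock is hypothetical. On a file system that does not support locking, the lock call raises an error, which corresponds to the failure this switch is meant to avoid.

```python
import fcntl

def append_with_lock(path, data):
    """Append data to path while holding an exclusive advisory lock."""
    with open(path, "ab") as f:
        # Blocks until no other cooperating process holds the lock.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(data)
            f.flush()  # push the data out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

The lock is advisory: it only protects against other writers that also request it, which is why every writer to the same hourly file must either lock or be kept apart by the administrator.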
--polling-interval=NUM
Check the incoming directory for new incremental files every NUM
seconds. The default polling interval is 15 seconds.
--byte-order=ENDIAN
Set the byte order for newly created SiLK Flow files. When
appending records to an existing file, the byte order of the file
is maintained. The argument is one of the following:
"as-is"
Maintain the byte order of the incremental files (i.e., the
byte order specified to rwflowpack). This is the default.
"native"
Use the byte order of the machine where rwflowappend is
running.
"big"
Use network byte order (big endian) for the flow files.
"little"
Write the flow files in little endian format.
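The way the --byte-order argument interacts with new and existing files can be summarized in a Python sketch. The helper resolve_byte_order is hypothetical; it returns "big" or "little" and takes the byte order of the incremental file and, when appending, that of the existing hourly file.

```python
import sys

def resolve_byte_order(option, incremental_order, existing_order=None):
    """Return the byte order ("big" or "little") an hourly file would use."""
    if existing_order is not None:
        # Appending: the existing file's byte order is maintained.
        return existing_order
    if option == "as-is":
        return incremental_order  # default: follow the incremental file
    if option == "native":
        return sys.byteorder      # byte order of the local machine
    if option in ("big", "little"):
        return option
    raise ValueError(f"unknown byte order: {option}")
```

The option therefore matters only when a new hourly file is created.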
--compression-method=COMP_METHOD
Specify how to compress newly created hourly files. When this
switch is not given, newly created hourly files maintain the
compression method used by the incremental file (i.e., the
compression method specified to rwflowpack). When appending to an
existing hourly file, the compression method of the file is
maintained. The valid values for COMP_METHOD are determined by
which external libraries were found when SiLK was compiled. To see
the available compression methods and the default method, use the
--help or --version switch. SiLK can support the following
COMP_METHOD values when the required libraries are available.
none
Do not compress the output using an external library.
zlib
Use the zlib(3) library for compressing the output. Using zlib
produces the smallest output files at the cost of speed.
lzo1x
Use the lzo1x algorithm from the LZO real time compression
library for compression. This compression provides good
compression with less memory and CPU overhead.
best
Use lzo1x if available, otherwise use zlib.
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME.
When this switch is not provided, rwflowappend searches for the
site configuration file in the locations specified in the "FILES"
section.
Logging and Daemon Configuration
One of the following mutually-exclusive switches is required:
--log-destination=DESTINATION
Specify the destination where logging messages are written. When
DESTINATION begins with a slash "/", it is treated as a file system
path and all log messages are written to that file; there is no log
rotation. When DESTINATION does not begin with "/", it must be one
of the following strings:
"none"
Messages are not written anywhere.
"stdout"
Messages are written to the standard output.
"stderr"
Messages are written to the standard error.
"syslog"
Messages are written using the syslog(3) facility.
"both"
Messages are written to the syslog facility and to the standard
error (this option is not available on all platforms).
--log-directory=DIR_PATH
Use DIR_PATH as the directory where the log files are written.
DIR_PATH must be a complete directory path. The log files have the
form
DIR_PATH/LOG_BASENAME-YYYYMMDD.log
where YYYYMMDD is the current date and LOG_BASENAME is the
application name or the value passed to the --log-basename switch
when provided. The log files are rotated: At midnight local time,
a new log is opened, the previous file is closed, and the command
specified by --log-post-rotate is invoked on the previous day's log
file. (Old log files are not removed by rwflowappend; the
administrator should use another tool to remove them.) When this
switch is provided, a process-ID file (PID) is also written in this
directory unless the --pidfile switch is provided.
--log-pathname=FILE_PATH
Use FILE_PATH as the complete path to the log file. The log file
is not rotated.
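The log file naming scheme used with --log-directory can be sketched in Python. The helper log_file_path is hypothetical, the directory is an example, and the default basename "rwflowappend" assumes that the application name is used unchanged.

```python
import posixpath
from datetime import date

def log_file_path(log_dir, day, basename="rwflowappend"):
    """Build DIR_PATH/LOG_BASENAME-YYYYMMDD.log for the given date."""
    return posixpath.join(log_dir, f"{basename}-{day:%Y%m%d}.log")

print(log_file_path("/var/log/silk", date(2016, 2, 19)))
```

Passing a different basename corresponds to using the --log-basename switch.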
The following set of switches is optional:
--log-level=LEVEL
Set the severity of messages that will be logged. The levels from
most severe to least are: "emerg", "alert", "crit", "err",
"warning", "notice", "info", "debug". The default is "info".
--log-sysfacility=NUMBER
Set the facility that syslog(3) uses for logging messages. This
switch takes a number as an argument. The default is a value that
corresponds to "LOG_USER" on the system where rwflowappend is
running. This switch produces an error unless
--log-destination=syslog is specified.
--log-basename=LOG_BASENAME
Use LOG_BASENAME in place of the application name in the name of
log files in the log directory. See the description of the
--log-directory switch. This switch does not affect the name of
the process-ID file.
--log-post-rotate=COMMAND
Run COMMAND on the previous day's log file after log rotation.
When this switch is not specified, the previous day's log file is
compressed with gzip(1). When the switch is specified and COMMAND
is the empty string, no action is taken on the log file. Each
occurrence of the string %s in COMMAND will be replaced with the
full path to the log file, and each occurrence of "%%" will be
replaced with "%". If any other character follows "%",
rwflowappend exits with an error. Specifying this switch without
also using --log-directory is an error.
--pidfile=FILE_PATH
Set the complete path to the file in which rwflowappend writes its
process ID (PID) when it is running as a daemon. No PID file is
written when --no-daemon is given. When this switch is not
present, no PID file is written unless the --log-directory switch
is specified, in which case the PID is written to
rwflowappend.pid in that log directory.
--no-chdir
Do not change directory to the root directory. When rwflowappend
becomes a daemon process, it changes its current directory to the
root directory so as to avoid potentially running on a mounted file
system. Specifying --no-chdir prevents this behavior, which may be
useful during debugging. The application does not change its
directory when --no-daemon is given.
--no-daemon
Force rwflowappend to run in the foreground---it does not become a
daemon process. This may be useful during debugging.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was
configured, then exit the application.
ENVIRONMENT
SILK_CONFIG_FILE
This environment variable is used as the value for the
--site-config-file when that switch is not provided.
SILK_PATH
This environment variable gives the root of the install tree. When
searching for configuration files, rwflowappend may use this
environment variable. See the "FILES" section for details.
FILES
${SILK_CONFIG_FILE}
ROOT_DIRECTORY/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/local/share/silk/silk.conf
/usr/local/share/silk.conf
Possible locations for the SiLK site configuration file which are
checked when the --site-config-file switch is not provided, where
ROOT_DIRECTORY/ is the directory specified to the --root-directory
switch.
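The search for the site configuration file can be sketched in Python. This is an illustration only: the helpers candidate_config_paths and find_config are hypothetical, and the sketch assumes that the locations are tried in the order listed above and that the first existing file is used.

```python
import os
import posixpath

def candidate_config_paths(root_directory, env):
    """List the possible silk.conf locations in the order given in FILES."""
    paths = []
    if env.get("SILK_CONFIG_FILE"):
        paths.append(env["SILK_CONFIG_FILE"])
    paths.append(posixpath.join(root_directory, "silk.conf"))
    silk_path = env.get("SILK_PATH")
    if silk_path:
        paths.append(posixpath.join(silk_path, "share/silk/silk.conf"))
        paths.append(posixpath.join(silk_path, "share/silk.conf"))
    paths.append("/usr/local/share/silk/silk.conf")
    paths.append("/usr/local/share/silk.conf")
    return paths

def find_config(root_directory, env, exists=os.path.isfile):
    """Return the first candidate that exists, or None (assumed behavior)."""
    for path in candidate_config_paths(root_directory, env):
        if exists(path):
            return path
    return None
```

Calling find_config("/data", os.environ) would check the real file system; the exists parameter allows the lookup to be exercised without one.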
SEE ALSO
rwflowpack(8), rwreceiver(8), rwsender(8), rwpollexec(8), rwfilter(1),
silk(7), gzip(1), syslog(3), zlib(3), The SiLK Installation Handbook
NOTES
rwflowappend does not check the integrity of an hourly file before
appending records to it.
Prior to SiLK 3.6.0 when a write error occurred, rwflowappend could
leave a partially written record or compressed block in the hourly
file. If a partially written compressed block remained and additional
compressed blocks were appended, these compressed blocks could not be
read by other SiLK tools. If a partially written record remained and
additional records were appended, SiLK tools would read the unaligned
data as if it were aligned and produce garbage records. Although SiLK
3.6.0 works around the issue on write errors, similar issues can occur
if rwflowappend is suddenly killed (e.g., by "kill -9").
When a write error occurs, rwflowappend may leave a zero byte file in
the data repository. Such files do not affect the exit status of
rwfilter(1), though rwfilter warns about being unable to read the
header from the file.
As of SiLK 3.1.0, rwflowappend obtains an advisory write lock on the
hourly file it is writing, allowing multiple rwflowappend processes to
write to the same hourly file. File locking may be disabled by using
the --no-file-locking switch. If this switch is enabled, the
administrator must ensure that multiple rwflowappend processes do not
attempt to write to the same hourly file simultaneously.
SiLK 3.11.0.1 2016-02-19 rwflowappend(8)