rwfglob(1) SiLK Tool Suite rwfglob(1)
NAME
rwfglob - Print files that rwfilter's File Selection switches will
access
SYNOPSIS
rwfglob { [--class=CLASS] [--type={all | TYPE[,TYPE ...]}]
| [--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]] }
[--sensors=SENSOR[,SENSOR ...]]
[--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]]
[--data-rootdir=ROOT_DIRECTORY] [--site-config-file=FILENAME]
[--print-missing-files] [--no-block-check] [--no-file-names]
[--no-summary]
rwfglob [--data-rootdir=ROOT_DIRECTORY]
[--site-config-file=FILENAME] --help
rwfglob --version
DESCRIPTION
rwfglob accepts the normal File Selection options of rwfilter(1) and
prints, to the standard output, the names of the files that would
normally be accessed, one file name per line. At the end, a summary is
printed, to the standard output, of the number of files that rwfglob
found. To suppress the printing of the file names and/or the summary,
specify the --no-file-names and/or --no-summary switches, respectively.
By default, rwfglob only prints the names of files that exist. When
the --print-missing-files switch is provided, rwfglob prints, to the
standard error, the names of files that it did not find, one file name
per line, preceded by the text 'Missing '.
For each file it finds, rwfglob will check the size of the file and the
number of blocks allocated to the file. If the block count is zero but
the file size is non-zero, rwfglob treats the file as existing but as
residing on tape. The names of these files are printed to the standard
output, but each name is preceded by the text ' \t*** ON_TAPE ***'
where '\t' represents a tab character. The summary line will include
the number of files that rwfglob believes are on tape. To suppress
this check and to remove the count from the summary line, use the
--no-block-check switch.
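A minimal sketch of separating on-tape names from ordinary ones in a script. The function names on_tape_names and on_disk_names are hypothetical helpers, not part of SiLK. The exact whitespace around the marker is an assumption, so the patterns match only the '*** ON_TAPE ***' text itself. The summary line is assumed to begin with "globbed ", as in the EXAMPLES section.

```shell
# Hypothetical helpers; each reads rwfglob's standard output on stdin.

# Print only the names that were flagged as residing on tape,
# with the marker (and any whitespace after it) removed.
on_tape_names() {
    sed -n 's/^.*\*\*\* ON_TAPE \*\*\*[[:space:]]*//p'
}

# Print only the names that were not flagged, dropping the summary line.
on_disk_names() {
    grep -v -e '\*\*\* ON_TAPE \*\*\*' -e '^globbed '
}

# Possible use (requires a SiLK installation):
#   rwfglob --start-date=2003/10/11 --sensors=2 | on_tape_names
```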
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an
exact match for an option. A parameter to an option may be specified
as --arg=param or --arg param, though the first form is required for
options that take optional parameters.
Selection Switches
These switches are the same as those used by rwfilter to select
the files to process. At least one of these switches must be provided.
--class=CLASS
The --class switch is used to specify a group of files to print.
Only a single class may be selected with the --class switch; for
multiple classes, use the --flowtypes switch. Classes are defined
in the silk.conf(5) site configuration file. If the --class option
is not given, the default-class as specified in silk.conf is used.
To see the available classes and the default class, either examine
the output from rwfglob --help or invoke rwsiteinfo(1) with the
switch --fields=class,default-class.
--type={all | TYPE[,TYPE ...]}
The --type predicate further specifies data within the selected
CLASS by listing the TYPEs of traffic to process. The switch takes
a comma-separated list of types or the keyword "all" which
specifies all types for the specified CLASS. Types are defined in
silk.conf, they typically refer to the direction of the flow, and
they may vary by class. When the --type switch is not specified, a
list of default types is used. The default-type list is determined
by the value of CLASS, and the default types generally include only
incoming traffic. To see the available types and the default types
for each class, examine the --help output of rwfglob or run
rwsiteinfo with --fields=class,type,default-type.
--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]
The --flowtypes predicate provides an alternate way to specify
class/type pairs. The --flowtypes switch allows a single rwfglob
invocation to print data from multiple classes. The keyword "all"
may be used for the CLASS and/or TYPE to select all classes and/or
types.
--sensors=SENSOR[,SENSOR ...]
The --sensors switch is used to select data from specific sensors.
The parameter is a comma-separated list of sensor names, sensor IDs
(integers), and/or ranges of sensor IDs. Sensors are defined in
the silk.conf(5) site configuration file, and the rwsiteinfo(1)
command can be used to print a mapping of sensor names to IDs and
classes. When the --sensors switch is not specified, the default is
to use all sensors which are valid for the specified class(es).
--start-date=YYYY/MM/DD[:HH]
--end-date=YYYY/MM/DD[:HH]
The date predicates indicate which days and hours to consider when
creating the list of files. The dates may be expressed as seconds
since the UNIX epoch or in "YYYY/MM/DD[:HH]" format, where the hour
is optional. A "T" may be used in place of the ":" to separate the
day and hour. Whether the "YYYY/MM/DD[:HH]" strings represent
times in UTC or the local timezone depends on how SiLK was compiled.
To determine how your version of SiLK was compiled, see the
"Timezone support" setting in the output from rwfglob --version.
When times are expressed in "YYYY/MM/DD[:HH]" format:
o When both --start-date and --end-date are specified to hour
precision, all hours within that time range are processed.
o When --start-date is specified to day precision, the hour
specified in --end-date (if any) is ignored, and files for all
dates between midnight on start-date and 23:59 on end-date are
processed.
o When --start-date is specified to hour precision and --end-date
is specified to day precision, the hour of the start-date is
used as the hour for the end-date.
o When --end-date is not specified and --start-date is specified
to day precision, files for that complete day are processed.
o When --end-date is not specified and --start-date is specified
to hour precision, files for that single hour are processed.
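The five rules above can be sketched as a small shell function. This is an illustration only: effective_range is a hypothetical helper, not part of SiLK, and it assumes one file per hour (as the file names in the EXAMPLES section suggest), so "23:59 on end-date" is represented by hour 23. It does no validation of its arguments.

```shell
# Hypothetical helper, not part of SiLK: print the first and last hour
# covered by a --start-date/--end-date pair in YYYY/MM/DD[:HH] form.
effective_range() {
    # Accept "T" as well as ":" between the day and the hour.
    start=$(printf '%s' "$1" | tr 'T' ':')
    end=$(printf '%s' "${2:-}" | tr 'T' ':')

    start_day=${start%%:*}
    case $start in
        *:*) start_hour=${start#*:} ;;
        *)   start_hour= ;;
    esac
    end_day=${end%%:*}
    case $end in
        *:*) end_hour=${end#*:} ;;
        *)   end_hour= ;;
    esac

    if [ -z "$start_hour" ]; then
        # Day-precision start: any hour on the end date is ignored.
        first="$start_day:00"
        last="${end_day:-$start_day}:23"
    elif [ -z "$end" ]; then
        # Hour-precision start, no end date: that single hour.
        first="$start_day:$start_hour"
        last=$first
    elif [ -z "$end_hour" ]; then
        # Hour-precision start, day-precision end: reuse the start hour.
        first="$start_day:$start_hour"
        last="$end_day:$start_hour"
    else
        # Both to hour precision: the range is used as given.
        first="$start_day:$start_hour"
        last="$end_day:$end_hour"
    fi
    printf '%s %s\n' "$first" "$last"
}
```

For instance, effective_range 2003/10/11:05 2003/10/12 prints "2003/10/11:05 2003/10/12:05".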
When at least one time is expressed as seconds since the UNIX
epoch:
o When --end-date is specified in epoch seconds, the given
--start-date and --end-date are considered to be in hour
precision.
o When --start-date is specified in epoch seconds and --end-date
is specified in "YYYY/MM/DD[:HH]" format, the start-date is
considered to be in day precision if it is divisible by 86400, and
hour precision otherwise.
o When --start-date is specified in epoch seconds and --end-date
is not given, the start-date is considered to be in hour
precision.
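A sketch of how the precision of the start-date follows from these rules. The names is_epoch and start_precision are hypothetical helpers, not part of SiLK. Treating any all-digit argument as epoch seconds is an assumption of this sketch.

```shell
# Hypothetical helpers, not part of SiLK.

# Succeed when the argument consists only of digits (assumed to mean
# "seconds since the UNIX epoch").
is_epoch() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# Print "day" or "hour": the precision at which the start-date would be
# interpreted, given the start-date and an optional end-date.
start_precision() {
    start=$1
    end=${2:-}
    if [ -n "$end" ] && is_epoch "$end"; then
        # An epoch end-date forces hour precision on both dates.
        echo hour
    elif is_epoch "$start"; then
        if [ -z "$end" ]; then
            # Epoch start-date with no end-date: hour precision.
            echo hour
        elif [ $((start % 86400)) -eq 0 ]; then
            # Epoch start-date on a day boundary, string end-date.
            echo day
        else
            echo hour
        fi
    else
        # YYYY/MM/DD[:HH] start-date: precision follows the optional hour.
        case $start in
            *[:T]*) echo hour ;;
            *)      echo day ;;
        esac
    fi
}
```

The value 1066608000 is a multiple of 86400, so start_precision 1066608000 2003/10/21 prints "day", while start_precision 1066608000 alone prints "hour".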
When neither --start-date nor --end-date is given, rwfglob prints
all files for the current day.
It is an error to specify --end-date without specifying
--start-date.
--data-rootdir=ROOT_DIRECTORY
Tell rwfglob to use ROOT_DIRECTORY as the root of the data
repository, which overrides the location given in the
SILK_DATA_ROOTDIR environment variable, which in turn overrides the
location that was compiled into rwfglob (/data).
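The precedence just described (switch, then environment variable, then compiled-in value) can be sketched in shell. The function effective_rootdir is a hypothetical helper, not part of SiLK. It assumes the compiled-in default is /data, as shown above, and it treats an empty SILK_DATA_ROOTDIR the same as an unset one, which is an assumption about rwfglob's behavior.

```shell
# Hypothetical helper, not part of SiLK: print the data root directory
# that would be chosen. The optional argument stands in for the value
# given to --data-rootdir.
effective_rootdir() {
    # Switch value first, then the environment, then the built-in default.
    printf '%s\n' "${1:-${SILK_DATA_ROOTDIR:-/data}}"
}
```

With SILK_DATA_ROOTDIR unset, effective_rootdir prints "/data".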
--site-config-file=FILENAME
Read the SiLK site configuration from the named file FILENAME.
When this switch is not provided, rwfglob searches for the site
configuration file in the locations specified in the "FILES"
section.
--print-missing-files
This option prints to the standard error the names of the files
that rwfglob expected to find but did not. The file names are
preceded by the text 'Missing '; each file name appears on a
separate line. This switch is useful for debugging, but the list
of files it produces can be misleading. For example, suppose there
is a decommissioned sensor that still appears in the silk.conf
file; rwfglob considers these data files as missing even though
their absence is expected. Use the output from this switch
judiciously.
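A minimal sketch for collecting just the missing names in a script. The function missing_names is a hypothetical helper, not part of SiLK. It relies only on the 'Missing ' prefix described above.

```shell
# Hypothetical helper, not part of SiLK: read rwfglob's standard error
# on stdin and print the missing file names without the prefix.
missing_names() {
    sed -n 's/^Missing //p'
}

# Possible use (requires a SiLK installation). The redirections send
# the standard error into the pipe and discard the standard output:
#   rwfglob --start-date=2003/10/11 --print-missing-files \
#       2>&1 >/dev/null | missing_names
```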
Application Switches
--no-block-check
This option instructs rwfglob not to check whether the file exists
on tape by checking whether the number of blocks allocated to the
file is zero. By default, rwfglob precedes a file name that has a
block count of 0 with the text ' \t*** ON_TAPE ***'.
--no-file-names
This option instructs rwfglob not to print the names of the files
that it successfully finds. By default, rwfglob prints the names
of the files it finds and a summary line showing the number of
files it found. When both this switch and --print-missing-files
are specified, rwfglob prints only the names of missing files (and
the summary).
--no-summary
This option instructs rwfglob not to print the summary line (that
is, the line that shows the number of files found). By default,
rwfglob prints the names of the files it finds and a summary line
showing the number of files it found.
--help
Print the available options and exit. The available classes and
types will be included in output; you may specify a different root
directory or site configuration file before --help to see the
classes and types available for that site.
--version
Print the version number and information about how SiLK was
configured, then exit the application.
EXAMPLES
In the following examples, the dollar sign ("$") represents the shell
prompt. The text after the dollar sign represents the command line.
Looking at a day on a single sensor:
$ rwfglob --start=2003/10/11 --sensor=2
/data/in/2003/10/11/in-GAMMA_20031011.23
/data/in/2003/10/11/in-GAMMA_20031011.22
/data/in/2003/10/11/in-GAMMA_20031011.21
/data/in/2003/10/11/in-GAMMA_20031011.20
/data/in/2003/10/11/in-GAMMA_20031011.19
/data/in/2003/10/11/in-GAMMA_20031011.18
/data/in/2003/10/11/in-GAMMA_20031011.17
/data/in/2003/10/11/in-GAMMA_20031011.16
/data/in/2003/10/11/in-GAMMA_20031011.15
/data/in/2003/10/11/in-GAMMA_20031011.14
/data/in/2003/10/11/in-GAMMA_20031011.13
/data/in/2003/10/11/in-GAMMA_20031011.12
/data/in/2003/10/11/in-GAMMA_20031011.11
/data/in/2003/10/11/in-GAMMA_20031011.10
/data/in/2003/10/11/in-GAMMA_20031011.09
/data/in/2003/10/11/in-GAMMA_20031011.08
/data/in/2003/10/11/in-GAMMA_20031011.07
/data/in/2003/10/11/in-GAMMA_20031011.06
/data/in/2003/10/11/in-GAMMA_20031011.05
/data/in/2003/10/11/in-GAMMA_20031011.04
/data/in/2003/10/11/in-GAMMA_20031011.03
/data/in/2003/10/11/in-GAMMA_20031011.02
/data/in/2003/10/11/in-GAMMA_20031011.01
/data/in/2003/10/11/in-GAMMA_20031011.00
globbed 24 files; 0 on tape
If you only want the summary, specify --no-file-names:
$ rwfglob --start-date=2003/10/11 --sensor=2 --no-file-names
globbed 24 files; 0 on tape
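To use the counts in a script, the summary line can be parsed. This is a sketch that assumes the summary has the wording shown in the examples above. The functions files_found and files_on_tape are hypothetical helpers, not part of SiLK. When --no-block-check is given the tape count is absent, and files_on_tape prints nothing.

```shell
# Hypothetical helpers; each reads rwfglob's standard output on stdin.

# Print the number of files found, taken from the summary line.
files_found() {
    sed -n 's/^globbed \([0-9][0-9]*\) file.*/\1/p'
}

# Print the number of files believed to be on tape, if reported.
files_on_tape() {
    sed -n 's/^globbed [0-9][0-9]* file[^;]*; \([0-9][0-9]*\) on tape.*/\1/p'
}

# Possible use (requires a SiLK installation):
#   rwfglob --start-date=2003/10/11 --sensors=2 --no-file-names | files_found
```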
ENVIRONMENT
SILK_CONFIG_FILE
This environment variable is used as the value for the
--site-config-file when that switch is not provided.
SILK_DATA_ROOTDIR
This environment variable specifies the root directory of the data
repository. This value overrides the compiled-in value, and
rwfglob uses it unless the --data-rootdir switch is specified. In
addition, rwfglob may use this value when searching for the SiLK
site configuration file. See the "FILES" section for details.
SILK_PATH
This environment variable gives the root of the install tree. When
searching for configuration files, rwfglob may use this environment
variable. See the "FILES" section for details.
TZ When a SiLK installation is built to use the local timezone (to
determine if this is the case, check the "Timezone support" value
in the output from rwfglob --version), the value of the TZ
environment variable determines the timezone in which rwfglob
parses timestamps. (The dates in the file names that rwfglob returns
are always in UTC.) If the TZ environment variable is not set, the
default timezone is used. Setting TZ to 0 or the empty string
causes timestamps to be parsed as UTC. The value of the TZ
environment variable is ignored when the SiLK installation uses
UTC. For system information on the TZ variable, see tzset(3) or
environ(7).
FILES
${SILK_CONFIG_FILE}
ROOT_DIRECTORY/silk.conf
${SILK_PATH}/share/silk/silk.conf
${SILK_PATH}/share/silk.conf
/usr/local/share/silk/silk.conf
/usr/local/share/silk.conf
Possible locations for the SiLK site configuration file which are
checked when the --site-config-file switch is not provided, where
ROOT_DIRECTORY/ is the directory rwfglob is using as the root of
the data repository.
${SILK_DATA_ROOTDIR}/
/data/
Locations for the root directory of the data repository when the
--data-rootdir switch is not specified.
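The search for the site configuration file can be approximated in shell. This is a sketch under stated assumptions: find_site_config is a hypothetical helper, not part of SiLK; the candidates are tried in the order listed above, which this page implies but does not state; and a candidate that does not exist is simply skipped, which may differ from how rwfglob treats a nonexistent ${SILK_CONFIG_FILE}.

```shell
# Hypothetical helper, not part of SiLK: print the first existing
# silk.conf candidate. The optional argument stands in for the value
# given to --data-rootdir.
find_site_config() {
    # Root of the data repository: argument, environment, then /data.
    root=${1:-${SILK_DATA_ROOTDIR:-/data}}
    for candidate in \
        "${SILK_CONFIG_FILE:-}" \
        "$root/silk.conf" \
        ${SILK_PATH:+"$SILK_PATH/share/silk/silk.conf"} \
        ${SILK_PATH:+"$SILK_PATH/share/silk.conf"} \
        /usr/local/share/silk/silk.conf \
        /usr/local/share/silk.conf
    do
        # Skip empty entries (unset variables) and files that are absent.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

The function returns a non-zero status when none of the candidates exists.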
SEE ALSO
rwfilter(1), rwsiteinfo(1), silk.conf(5), silk(7), tzset(3), environ(7)
BUGS
The --print-missing-files option needs to be smarter about what files
are really missing.
The output of --print-missing-files goes to the standard error, while
all other output goes to the standard output. To redirect the output
of --print-missing-files to the standard output, use the following in a
Bourne-compatible shell:
$ rwfglob --print-missing-files ... 2>&1
The block count check is of unknown portability across different tape-
farm systems.
SiLK 3.11.0.1 2016-02-19 rwfglob(1)