rwbagbuild(1)                   SiLK Tool Suite                  rwbagbuild(1)

NAME

rwbagbuild - Create a binary Bag from non-flow data.

SYNOPSIS

rwbagbuild { --set-input=SETFILE | --bag-input=TEXTFILE }
      [--delimiter=C] [--default-count=DEFAULTCOUNT]
      [--key-type=FIELD_TYPE] [--counter-type=FIELD_TYPE]
      [--note-add=TEXT] [--note-file-add=FILE]
      [--compression-method=COMP_METHOD]
      [--output-path=OUTPUTFILE]

rwbagbuild --help

rwbagbuild --version

DESCRIPTION

rwbagbuild builds a binary Bag file from an IPset file or from textual input.

When creating a Bag from an IPset, the count associated with each IP address is the value given by the --default-count switch, or 1 if the switch is not provided.

The textual input read from the argument to the --bag-input switch is processed a line at a time. Comments begin with a '#' character and continue to the end of the line; they are stripped from each line. Any line that is blank or contains only whitespace is ignored. All other lines must contain a valid key or key-count pair; whitespace around the key and count is ignored.

If the delimiter character (specified by the --delimiter switch, with pipe ('|') as its default) is not present, the line must contain only an IP address or an integer key.

If the delimiter is present, the line must contain an IP address or integer key before the delimiter and an integer count after the delimiter. These lines may have a second delimiter after the integer count; the second delimiter and any text to the right of it are ignored.

When the --default-count switch is specified, its value is used as the count for each key, and the count value parsed from each line, if any, is ignored. Otherwise, the parsed count is used, or 1 is used as the count if no delimiter was present.

For each key-count pair, the key is inserted into the Bag with its count or, if the key is already present in the Bag, its total count is incremented by the count from this line. When using the --default-count switch, the count for a key that appears in the input N times is the product of N and DEFAULTCOUNT.

The IP address or integer key must be expressed in one of the following formats. rwbagbuild complains if the key field contains a mixture of IPv6 addresses and integer values.

o   Dotted decimal (all 4 octets are required):

        10.1.2.4

o   An unsigned 32-bit integer:

        167838212

o   An IPv6 address in canonical form (when SiLK has been compiled with IPv6 support):

        2001:db8:a:1::2:4
        ::ffff:10.1.2.4

o   Any of the above with a CIDR designation (for dotted decimal, all four octets are still required):

        10.1.2.4/31
        167838212/31
        2001:db8:a:1::2:4/127
        ::ffff:10.1.2.4/31

o   SiLK IP Wildcard notation. A SiLK IP Wildcard can represent multiple IPv4 or IPv6 addresses. An IP Wildcard contains an IP in its canonical form, except each part of the IP (where part is an octet for IPv4 or a hexadectet for IPv6) may be a single value, a range, a comma-separated list of values and ranges, or the letter "x" to signify all values for that part of the IP (that is, "0-255" for IPv4). You may not specify a CIDR suffix when using the IP Wildcard notation.

        10.x.1-2.4,5
        2001:db8:a:x::1-2:4,5

If an IP address or count cannot be parsed, or if a line contains a delimiter character but no count, rwbagbuild prints an error and exits.
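The line-processing rules above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not rwbagbuild's actual implementation: the function name build_bag_from_text is hypothetical, keys are kept as plain strings (no IP, CIDR, or IP Wildcard parsing is attempted), errors are raised as ValueError instead of being printed, and the handling of a whitespace delimiter is an assumption based on the --delimiter description.

```python
from collections import Counter


def build_bag_from_text(lines, delimiter="|", default_count=None):
    """Model of the documented text-input rules (hypothetical helper).

    Keys stay as strings; the real tool parses them as IPs or integers.
    """
    if len(delimiter) != 1 or delimiter in "#\n":
        raise ValueError("delimiter must be one character, not '#' or newline")
    bag = Counter()
    for lineno, raw in enumerate(lines, start=1):
        # Strip the comment first, then the surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank and whitespace-only lines are ignored
        if delimiter.isspace():
            # Assumption: any run of whitespace separates key and count.
            fields = line.split()
        else:
            fields = [field.strip() for field in line.split(delimiter)]
        key = fields[0]
        if not key:
            raise ValueError(f"line {lineno}: missing key")
        if len(fields) == 1:
            count = 1  # no delimiter present: the count is 1
        else:
            # Only the first field after the key is used; anything after
            # a second delimiter is ignored.
            if not fields[1]:
                raise ValueError(f"line {lineno}: delimiter but no count")
            count = int(fields[1])  # ValueError if the count is unparsable
        if default_count is not None:
            count = default_count  # overrides any count parsed from the line
        bag[key] += count  # repeated keys accumulate
    return dict(bag)


text = """\
# key|count
10.0.0.1|5
10.0.0.2          # no delimiter: count is 1
10.0.0.1|3|ignored
"""
print(build_bag_from_text(text.splitlines()))
# {'10.0.0.1': 8, '10.0.0.2': 1}
print(build_bag_from_text(text.splitlines(), default_count=2))
# {'10.0.0.1': 4, '10.0.0.2': 2}
```

The second call shows the N times DEFAULTCOUNT rule: 10.0.0.1 appears twice, so its count is 4 regardless of the counts written on its lines.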

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

The following two switches control the type of input; one and only one must be provided:

--set-input=SETFILE
    Create a Bag from an IPset. SETFILE is a filename, a named pipe, or the keyword "stdin" or "-" to read the IPset from the standard input. Each count is 1 when the --default-count switch is not specified. (IPsets are typically created by rwset(1) or rwsetbuild(1).)

--bag-input=TEXTFILE
    Create a Bag from a delimited text file. TEXTFILE is a filename, a named pipe, or the keyword "stdin" or "-" to read the text from the standard input. See the "DESCRIPTION" section for the syntax of the TEXTFILE.

The remaining switches are optional:

--delimiter=C
    The delimiter to expect between the key and the count on each line of the TEXTFILE read by the --bag-input switch. The default delimiter is the vertical pipe ('|'). The delimiter is ignored if the --set-input switch is specified. When the delimiter is a whitespace character, any amount of whitespace may surround and separate the key and counter. Since '#' is used to denote comments and newline is used to denote records, neither is a valid delimiter character.

--default-count=DEFAULTCOUNT
    Override the counts of all values in the input text or IPset with the value of DEFAULTCOUNT. DEFAULTCOUNT must be a positive integer.

--key-type=FIELD_TYPE
    Write an entry into the header of the Bag file that specifies the key contains FIELD_TYPE values. When this switch is not specified, the key type of the Bag is set to "custom". The FIELD_TYPE is case insensitive. The supported FIELD_TYPEs are:

    sIPv4           source IP address, IPv4 only
    dIPv4           destination IP address, IPv4 only
    sPort           source port
    dPort           destination port
    protocol        IP protocol
    packets         packets, see also "sum-packets"
    bytes           bytes, see also "sum-bytes"
    flags           bitwise OR of TCP flags
    sTime           starting time of the flow record, seconds resolution
    duration        duration of the flow record, seconds resolution
    eTime           ending time of the flow record, seconds resolution
    sensor          sensor ID
    input           SNMP input
    output          SNMP output
    nhIPv4          next hop IP address, IPv4 only
    initialFlags    TCP flags on first packet in the flow
    sessionFlags    bitwise OR of TCP flags on all packets in the flow except the first
    attributes      flow attributes set by the flow generator
    application     guess as to the content of the flow, as set by the flow generator
    class           class of the sensor
    type            type of the sensor
    icmpTypeCode    an encoded version of the ICMP type and code, where the type is in the upper byte and the code is in the lower byte
    sIPv6           source IP, IPv6
    dIPv6           destination IP, IPv6
    nhIPv6          next hop IP, IPv6
    records         count of flows
    sum-packets     sum of packet counts
    sum-bytes       sum of byte counts
    sum-duration    sum of duration values
    any-IPv4        a generic IPv4 address
    any-IPv6        a generic IPv6 address
    any-port        a generic port
    any-snmp        a generic SNMP value
    any-time        a generic time value, in seconds resolution
    custom          a number

--counter-type=FIELD_TYPE
    Write an entry into the header of the Bag file that specifies the counter contains FIELD_TYPE values. When this switch is not specified, the counter type of the Bag is set to "custom". The supported FIELD_TYPEs are the same as those for the key.

--note-add=TEXT
    Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.

--note-file-add=FILENAME
    Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.

--compression-method=COMP_METHOD
    Specify how to compress the output. When this switch is not given, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.

    none
        Do not compress the output using an external library.

    zlib
        Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.

    lzo1x
        Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.

    best
        Use lzo1x if available, otherwise use zlib. Only compress the output when writing to a file.

--output-path=OUTPUTFILE
    Redirect output to OUTPUTFILE. OUTPUTFILE is a filename, a named pipe, or the keyword "stdout" or "-" to write the bag to the standard output.

--help
    Print the available options and exit.

--version
    Print the version number and information about how SiLK was configured, then exit the application.
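When rwbagbuild is driven from a script, the constraints stated above (exactly one input switch, a positive DEFAULTCOUNT, a delimiter other than '#' or newline) can be checked before the tool is run. The following is a minimal sketch under that assumption. The helper name rwbagbuild_argv and its parameter names are hypothetical; only switches documented in this page are emitted, and the helper builds the argument list without executing anything.

```python
def rwbagbuild_argv(set_input=None, bag_input=None, delimiter=None,
                    default_count=None, key_type=None, counter_type=None,
                    notes=(), output_path=None):
    """Assemble an rwbagbuild command line (hypothetical helper)."""
    # One and only one of the two input switches must be provided.
    if (set_input is None) == (bag_input is None):
        raise ValueError("give exactly one of set_input or bag_input")
    # '#' starts a comment and newline ends a record, so neither is valid.
    if delimiter is not None and (len(delimiter) != 1 or delimiter in "#\n"):
        raise ValueError("delimiter must be one character, not '#' or newline")
    if default_count is not None and default_count < 1:
        raise ValueError("DEFAULTCOUNT must be a positive integer")

    argv = ["rwbagbuild"]
    if set_input is not None:
        argv.append(f"--set-input={set_input}")
    else:
        argv.append(f"--bag-input={bag_input}")
    optional = [
        ("--delimiter", delimiter),
        ("--default-count", default_count),
        ("--key-type", key_type),
        ("--counter-type", counter_type),
        ("--output-path", output_path),
    ]
    argv += [f"{name}={value}" for name, value in optional if value is not None]
    # --note-add may be repeated, once per annotation.
    argv += [f"--note-add={text}" for text in notes]
    return argv


argv = rwbagbuild_argv(bag_input="myproto.txt", key_type="protocol",
                       default_count=1, output_path="myproto1.bag")
print(" ".join(argv))
```

The returned list can be handed to subprocess.run(argv, check=True) on a system where SiLK is installed; using the --arg=param form for every switch avoids any ambiguity about which token is the parameter.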

EXAMPLES

In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash ("\") is used to indicate a wrapped line.

Create a bag with IP addresses as keys from a text file

Assume the file mybag.txt contains the following lines, where each line contains an IP address, a comma as a delimiter, a count, and ends with a newline.

    192.168.0.1,5
    192.168.0.2,500
    192.168.0.3,3
    192.168.0.4,14
    192.168.0.5,5

To build a bag with it:

    $ rwbagbuild --bag-input=mybag.txt --delimiter=, > mybag.bag

Use rwbagcat(1) to view its contents:

    $ rwbagcat mybag.bag
    192.168.0.1|     5|
    192.168.0.2|   500|
    192.168.0.3|     3|
    192.168.0.4|    14|
    192.168.0.5|     5|

Create a bag with protocols as keys from a text file

To create a Bag of protocol data from the text file myproto.txt:

     1|     4|
     6|   138|
    17|   131|

use

    $ rwbagbuild --key-type=protocol --bag-input=myproto.txt > myproto.bag
    $ rwbagcat myproto.bag
     1|     4|
     6|   138|
    17|   131|

When the --key-type switch is specified, rwbagcat knows the keys should be printed as integers, and rwfileinfo(1) shows the type of the key:

    $ rwfileinfo --fields=bag myproto.bag
    myproto.bag:
      bag          key: protocol @ 4 octets; counter: custom @ 8 octets

Without the --key-type switch, rwbagbuild assumes the integers in myproto.txt represent IP addresses:

    $ rwbagbuild --bag-input=myproto.txt | rwbagcat
     0.0.0.1|     4|
     0.0.0.6|   138|
    0.0.0.17|   131|

Although the --integer-keys switch on rwbagcat forces it to print keys as integers, it is generally better to use the --key-type switch when creating the bag.

    $ rwbagbuild --bag-input=myproto.txt | rwbagcat --integer-keys
     1|     4|
     6|   138|
    17|   131|

Create a bag and override the existing counter

To ignore the counts that exist in myproto.txt and set the count for each protocol to 1, use the --default-count switch, which overrides the existing value:

    $ rwbagbuild --key-type=protocol --bag-input=myproto.txt \
          --default-count=1 --output-path=myproto1.bag
    $ rwbagcat myproto1.bag
     1|     1|
     6|     1|
    17|     1|

Create a bag with IP addresses as keys from an IPset file

Given the IPset myset.set, create a bag where every entry in the bag has a count of 3:

    $ rwbagbuild --set-input=myset.set --default-count=3 \
          --out=mybag2.bag

Create a bag from multiple input files

Suppose we have three IPset files, A.set, B.set, and C.set:

    $ rwsetcat A.set
    10.0.0.1
    10.0.0.2
    $ rwsetcat B.set
    10.0.0.2
    10.0.0.3
    $ rwsetcat C.set
    10.0.0.1
    10.0.0.2
    10.0.0.4

We want to create a bag file from these IPset files where the count for each IP address is the number of files that IP appears in.

rwbagbuild accepts a single file as an argument, so we cannot do the following:

    $ rwbagbuild --set-input=A.set --set-input=B.set ...      # WRONG!

(Even if we could repeat the --set-input switch, specifying it multiple times would be annoying if we had 300 files instead of only 3.)

The IPset files are (mathematical) sets, so if we join them together first with rwsettool(1) and then run rwbagbuild, each IP address gets a count of 1:

    $ rwsettool --union A.set B.set C.set   \
      | rwbagbuild --set-input=-            \
      | rwbagcat
    10.0.0.1|     1|
    10.0.0.2|     1|
    10.0.0.3|     1|
    10.0.0.4|     1|

When rwbagbuild is processing textual input, it sums the counters for keys that appear in the input multiple times. We can use rwsetcat(1) to convert each IPset file to text and feed that as a single textual stream to rwbagbuild. Use the --cidr-blocks switch on rwsetcat to reduce the amount of input that rwbagbuild must process. This is probably the best approach to the problem:

    $ rwsetcat --cidr-blocks *.set | rwbagbuild --bag-input=- > total1.bag
    $ rwbagcat total1.bag
    10.0.0.1|     2|
    10.0.0.2|     3|
    10.0.0.3|     1|
    10.0.0.4|     1|

A less efficient solution is to convert each IPset to a bag and then use rwbagtool(1) to add the bags together:

    $ for i in *.set ; do
          rwbagbuild --set-input="$i" --output-path="/tmp/$i.bag" ;
      done
    $ rwbagtool --add /tmp/*.set.bag > total2.bag
    $ rm /tmp/*.set.bag

There is no need to create a bag file for each IPset; we can get by with only two bag files: the final bag file, total3.bag, and a temporary file, tmp.bag. We initialize total3.bag to an empty bag. As we loop over each IPset, rwbagbuild converts the IPset to a bag on its standard output, rwbagtool creates tmp.bag by adding its standard input to total3.bag, and we rename tmp.bag to total3.bag:

    $ rwbagbuild --bag-input=/dev/null --output-path=total3.bag
    $ for i in *.set ; do
          rwbagbuild --set-input="$i" \
          | rwbagtool --output-path=tmp.bag --add total3.bag stdin ;
          /bin/mv tmp.bag total3.bag ;
      done
    $ rwbagcat total3.bag
    10.0.0.1|     2|
    10.0.0.2|     3|
    10.0.0.3|     1|
    10.0.0.4|     1|
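The difference between the union approach and the concatenated-text approach in the last example comes down to set semantics versus multiset semantics. The following Python sketch models that difference with the contents of A.set, B.set, and C.set shown above; it does not call any SiLK tool, and it works on individual addresses rather than the CIDR blocks that rwsetcat --cidr-blocks would print.

```python
from collections import Counter

# The IPset contents shown by rwsetcat in the example.
ipsets = {
    "A.set": {"10.0.0.1", "10.0.0.2"},
    "B.set": {"10.0.0.2", "10.0.0.3"},
    "C.set": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
}

# Union first, then count: the union is itself a set, so duplicates
# collapse and every address ends up with a count of 1.
union_first = Counter(set().union(*ipsets.values()))

# Concatenate the per-file listings, then count: an address that appears
# in N files contributes N lines, so its summed count is N.
concatenated = Counter(ip for members in ipsets.values() for ip in members)

# Columns: address | count after union | count after concatenation
for ip in sorted(concatenated):
    print(f"{ip}|{union_first[ip]}|{concatenated[ip]}|")
```

The last column matches the counts in total1.bag and total3.bag, and the middle column matches the output of the rwsettool --union pipeline.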

ENVIRONMENT

SILK_CLOBBER
    The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.

SEE ALSO

rwbag(1), rwbagcat(1), rwbagtool(1), rwfileinfo(1), rwset(1), rwsetbuild(1), rwsetcat(1), rwsettool(1), silk(7), zlib(3)

BUGS

The --default-count switch is poorly named.

SiLK 3.11.0.1                     2016-02-19                     rwbagbuild(1)
