DragonFly On-Line Manual Pages

Search: Section:  


xlcleaner(1)(TDH)                                            xlcleaner(1)(TDH)

NAME

xlcleaner(1) - clean up Excel/spreadsheet tab-delimited export files

SYNOPSIS

cat tabdata | xlcleaner options

DESCRIPTION

xlcleaner cleans up the irregularities and annoyances often present in MS Excel tab-delimited export files, including rounding of numeric values to reasonable precisions. The tab-delimited export text file is read from standard input and the cleaned-up result is written to standard output. Default behavior is to convert tab-delimited files to files that use the null field symbol (which is = by default or may be changed in your project config file). Incoming lines that hold nothing but white space are omitted. Blanks that are embedded in datafields are converted to underscores, unless -b is used.

EXAMPLE

cat tabdata | xlcleaner -1 > out

META OPTIONS

-1 Equivalent to -s -r 0 -t -f . Useful in converting tab delimited data exported from spreadsheets.

OPTIONS

-r D Rounding option. Round numeric values to a reasonable # of decimal places. D may be used to alter the normal behavior. If D = 0 nothing happens. If D = 1 rounding will go to 1 additional decimal place. D = -1 rounding will go to one less decimal place than normal. -t Create tab-delimited output, overriding any previously specified field delimitation character. Also, if -c is being used, comment lines will be copied to output verbatim. Unless -b is used, embedded blanks will still be converted to underscores. -f The first non-comment, non-empty line is examined; the number of fields (non-blank or blank) is counted and this value is used to set the number of fields per record. All subsequent records will be truncated to that number of fields. A warning will be issued for any discarded fields having content other than blank. If your data set has blank fields at the end of the first record it may be best to remove these manually with a text editor before running xlcleaner -f. If -s is in effect, blank fields (spacer columns) will be included in the initial count and in the truncation, but then will be screened out before writing the result. -n N Same as -f but N (the number of fields to process) may be specified directly. -s skip spacer columns. Spacer columns are empty "columns" or fields. They are often created by spreadsheet users for spacing purposes. This option works as follows: The first non-empty non-comment line is examined. Any empty fields in this line are considered "spacers". Those fields will then be screened out when the result output is written. A warning will be issued if any content is encountered in a spacer field that is being omitted. For example, if fields 5 and 9 of the first line are empty, and -s is in effect, then fields 5 and 9 of all records will be screened out. -c SYM Define SYM as a comment symbol present in the input. Any line whose first non-whitespace content is this symbol will be copied to output with a DMS comment symbol (//) added at the beginning. -p N Add padding of N blanks to end of each result record. -b don't convert blanks that are embedded within data fields, to underscores. -m SYM Set null data code to SYM, overriding any previously defined null data code. 22-SEP-2003 TDH scg@jax.org xlcleaner(1)(TDH)

Search: Section: