DragonFly On-Line Manual Pages


pjoin(1)(TDH)                                                    pjoin(1)(TDH)

NAME

pjoin(1) - relational join on two files

SYNOPSIS

pjoin [options] file1 keyfields1 file2 keyfields2

DESCRIPTION

pjoin performs a relational join on two sets of whitespace-delimited tabular data records. file1 and file2 are the input files. One of them may be - to indicate that data are to be read from standard input. The result is written to standard output and uses the same field delimitation as the input files. Comment lines (beginning with //) and blank lines are skipped on input and do not appear in the output.

keyfields1 specifies one or more fields in file1 to consider when performing the join; it may be a single field specifier or a list of field specifiers delimited by commas. keyfields2 does the same for file2.

Fields may be specified by number, e.g. 2 specifies the second field. If a file has a field name header, field names may be used instead, e.g. id. In that case the -h1 and/or -h2 options must be given so that pjoin knows to expect a field name header in file1 or file2 respectively.
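The input rules above can be illustrated with a short Python sketch. It is not taken from pjoin itself: the function names read_records and resolve_keyfields are hypothetical, and the sketch assumes that a comment line has // in its very first column and that field numbers are 1-based, as the "2 specifies the second field" wording suggests.

```python
def read_records(lines, has_header=False):
    """Split whitespace-delimited lines into fields.

    Blank lines and lines beginning with // are skipped.
    Assumption: the // must be in the first column.
    """
    rows = [line.split() for line in lines
            if line.strip() and not line.startswith("//")]
    # With a field name header, the first surviving line holds the names.
    header = rows.pop(0) if has_header and rows else None
    return header, rows


def resolve_keyfields(spec, header=None):
    """Turn a spec such as "1,2" or "id" into zero-based column indexes."""
    indexes = []
    for item in spec.split(","):
        if item.isdigit():
            indexes.append(int(item) - 1)        # field numbers are 1-based
        elif header is not None:
            indexes.append(header.index(item))   # look the name up in the header
        else:
            raise ValueError(f"field name {item!r} needs a header line")
    return indexes
```

Raising an error for a field name without a header mirrors the requirement that -h1 or -h2 be given before names can be used; how pjoin itself reports that case is not documented here.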

OPTIONS

-l
     Do a left join. Prevents records from being omitted from the left side (file1) as the result of the join. Missing records will be filled with placeholder characters.

-r
     Do a right join. Prevents records from being omitted from the right side (file2) as the result of the join. Missing records will be filled with placeholder characters. Note: -l and -r may both be used to produce a loss-less join.

-i
     Make comparisons case-insensitive. Normally they are case-sensitive.

-dup1
     Allow multiple instances in file1. The matching record from file2 will be replicated for each instance.

-dup2
     Allow multiple instances in file2. The matching record from file1 will be replicated for each instance.

-q
     Quick option. Do not sort input; assume inputs are already in sort order. Normally the inputs are piped through an appropriate sort(1) command to sort on the key fields. pjoin will not give correct results if inputs are unsorted.

-rml
     Remove the lefthand portion of the result, leaving only the records from file2.

-rmr
     Remove the righthand portion of the result, leaving only the records from file1.

-H
     Both file1 and file2 have field name headers, and a field name header will also be written to the output. Equivalent to -h1 -h2 -ho.

-h1
     Indicates that file1 has a field name header. This allows fields in file1 to be specified by name.

-h2
     Indicates that file2 has a field name header. This allows fields in file2 to be specified by name.

-ho
     Field name header will be written as the first line of output. At least one of the input files must have a field name header. If one of the files did not have a field name header, placeholder fill characters will be written for that portion of the field name header.

-t
     Indicates that input and output are tab delimited. Normally the join result uses a space between the left side and right side; with -t a tab is used instead.

-fc
     Set the placeholder fill character to c. Normally it is -.
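The effect of -l, -r, -i and the fill character can be sketched in Python. This is an illustration of the join semantics only, not pjoin's implementation: the function name join and its parameters are hypothetical, keys in the right-hand input are assumed to be unique (no -dup2), each missing field is padded with a single fill character, and unmatched right-hand records are simply appended at the end rather than placed in sort order.

```python
def join(left, right, lkeys, rkeys, keep_left=False, keep_right=False,
         ignore_case=False, fill="-"):
    """Join two lists of field lists on zero-based key columns."""
    def key(row, cols):
        values = tuple(row[c] for c in cols)
        # -i: compare keys without regard to case
        return tuple(v.lower() for v in values) if ignore_case else values

    lwidth = len(left[0]) if left else 0
    rwidth = len(right[0]) if right else 0
    rindex = {key(r, rkeys): r for r in right}   # assumes unique right-hand keys
    matched = set()
    out = []
    for l in left:
        k = key(l, lkeys)
        if k in rindex:
            out.append(l + rindex[k])
            matched.add(k)
        elif keep_left:
            # -l: keep the left record and pad the missing right side
            out.append(l + [fill] * rwidth)
    if keep_right:
        # -r: keep unmatched right records and pad the missing left side
        for r in right:
            if key(r, rkeys) not in matched:
                out.append([fill] * lwidth + r)
    return out
```

Calling join with keep_left=True and keep_right=True corresponds to the loss-less join obtained by combining -l and -r.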

NOTES

Using -l and -r together results in a "loss-less" join.

-dup1 and -dup2 cannot be used together.

-rmr and -rml cannot be used together.

EXAMPLE

Suppose file1 looks like this:

     001 A red
     001 B red
     002 A blue
     003 C yellow

and file2 looks like this:

     001 A Jean
     002 A Jan

We could perform an ordinary join by issuing the command:

     pjoin file1 1,2 file2 1,2

     001 A red 001 A Jean
     002 A blue 002 A Jan

Or we could perform a left join by issuing this command:

     pjoin -l file1 1,2 file2 1,2

     001 A red 001 A Jean
     001 B red --- - ----
     002 A blue 002 A Jan
     003 C yellow --- - ---
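The ordinary join in this example can be reproduced with a small merge over sorted inputs, the kind of single pass that a sorted-input requirement makes possible. The Python below is a sketch under stated assumptions, not pjoin's source: merge_join is a hypothetical name, keys are assumed unique on both sides, and keys are compared as plain strings.

```python
def merge_join(left, right, lkeys, rkeys):
    """Inner join of two row lists that are already sorted on their keys."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk = [left[i][c] for c in lkeys]
        rk = [right[j][c] for c in rkeys]
        if lk < rk:
            i += 1          # left key is behind: advance the left input
        elif lk > rk:
            j += 1          # right key is behind: advance the right input
        else:
            # keys match: emit left fields followed by right fields
            out.append(" ".join(left[i] + right[j]))
            i += 1
            j += 1
    return out


file1 = [line.split() for line in
         ["001 A red", "001 B red", "002 A blue", "003 C yellow"]]
file2 = [line.split() for line in ["001 A Jean", "002 A Jan"]]

# Key fields 1,2 on both sides become zero-based columns 0 and 1.
for record in merge_join(file1, file2, [0, 1], [0, 1]):
    print(record)
# prints:
# 001 A red 001 A Jean
# 002 A blue 002 A Jan
```

Because each input is read once in order, unsorted input would cause matching records to be skipped, which is consistent with the warning attached to the -q option.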

AUTHOR

Steve Grubb, with portions developed by Sandra Reynolds and Marv Newhouse.

22-SEP-2003                     TDH scg@jax.org                    pjoin(1)(TDH)
