AF(1) Amberfish AF(1)
NAME
af - Amberfish text retrieval software
SYNOPSIS
af [-i] [options] [file ...]
af [-s] [options]
af [-L] [options]
af [-l] [options]
af [--fetch] [file] [begin] [end]
af [--version]
DESCRIPTION
The program af is a text-based interface to the Amberfish functions for
indexing and searching documents. A simple indexing command looks like
this:
af -iCv -d mydb *.txt
This creates a new database, mydb, containing an index of the files
matching *.txt. To speed up searching, an optional "linearize" step
can be run afterward (it can take a long time):
af -L -d mydb
Here is a typical search command:
af -s -d mydb -Q 'cat & (dog | mouse)'
COMMAND OPTIONS
Only one of these options can be used at a time.
-i, --index
Index documents. The documents are named as file ... on the command
line or, if -F is used, listed on standard input.
-s, --search
Search an indexed database.
-L, --linearize
Linearize an indexed database. Linearizing is an optional step
that can be done after indexing. The advantages of linearizing
are that it reduces searching time and slightly reduces the size
of the index files. The disadvantages are that the linearizing
process is very slow, especially when used on large databases,
and it prevents any additional documents from being added to the
database.
-l, --list
List the documents contained in a database.
--fetch
Output a portion of a file. This command takes no other
options. The file name file, starting offset begin, and ending
offset end are given, in that order, at the end of the command line.
--version
Print the af version number.
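As a rough sketch of how --fetch might be invoked, the helper below assembles the command line from a file name and two offsets and prints it instead of running it. The name fetch_cmd and the file notes.txt are hypothetical. This page does not say whether the offsets are byte offsets or whether end is inclusive, so the values 0 and 200 are placeholders only.

```shell
#!/bin/sh
# fetch_cmd is a hypothetical helper, not part of Amberfish.
# It only prints the command line; nothing is executed.
fetch_cmd() {
    file=$1 begin=$2 end=$3
    # --fetch takes no other options; the three arguments follow in order.
    printf 'af --fetch %s %s %s\n' "$file" "$begin" "$end"
}

fetch_cmd notes.txt 0 200
```

The printed line can be pasted into a shell once the offsets are known, for example from a search result.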
GENERAL OPTIONS
These options are available with most command options (--fetch takes
no other options).
-d, --db dbname
Use dbname as the database name. With some command options such
as -s, this option can be supplied multiple times to specify
multiple databases.
-v, --verbose
Show verbose output. This option can be supplied multiple times
to increase verbosity.
-D, --debug
Show extremely verbose (debugging) output. Using this option
once is equivalent to -vvvvv, and it can be supplied multiple
times to increase verbosity further.
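The note that -d can be repeated with -s can be illustrated with a small sketch. The helper below builds a search command with one -d per database and prints it without running af. The name search_cmd and the database names db2005 and db2006 are assumptions made for the example.

```shell
#!/bin/sh
# search_cmd is a hypothetical helper, not part of Amberfish.
# First argument: the Boolean query. Remaining arguments: database names.
search_cmd() {
    query=$1
    shift
    cmd="af -s"
    for db in "$@"; do
        cmd="$cmd -d $db"      # one -d option per database
    done
    # Print the command; single quotes protect & | ( ) from the shell.
    printf "%s -Q '%s'\n" "$cmd" "$query"
}

search_cmd 'cat & dog' db2005 db2006
```

Adding -v (or several, such as -vv) to the printed command would request more verbose output, as described above.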
INDEX OPTIONS
The following options can only be used together with the indexing (-i)
command.
-C, --create
Create a new database, overwriting any existing one with the
same name.
-m, --memory maximum
Set the maximum amount of memory in megabytes to use for
indexing. More memory speeds up indexing.
--phrase
Enable phrase searching. This can only be used together with
-C.
--split delimiter
Parse input files into multiple documents at points where the
specified delimiter string is found.
-t, --doctype=text, --doctype=xml
Set the document type. The default is text. Specifying xml
enables functions related to searching and retrieving within
nested tags in XML documents.
--dlevel level
The maximum resolution (levels of descent) for retrieval of
nested documents. The default value is 1; increasing it
lengthens indexing time significantly. Use this for XML instead
of --split to subdivide documents. Note that this only affects the
resolution of elements returned from searches and is unrelated
to nested queries, which have a much higher (fixed) resolution.
--no-stem
Do not perform stemming. This can only be used together with
-C. Normally, stemming is automatically enabled if Amberfish
was compiled with the stemming function. This option disables
stemming even if it is available. Note that the stemming
function is not distributed with this package and must be
installed manually.
-F
Read the list of documents to be indexed from standard input,
rather than from the end of the command line.
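When the set of files is too large for the command line, -F lets the list arrive on standard input. The sketch below is one way this might be wired up with find. It assumes the list is one path per line, which this page does not state. The helper names list_docs and index_dir are hypothetical, and af is invoked only if it is installed.

```shell
#!/bin/sh
# list_docs is a hypothetical helper: it prints the regular files under
# directory $1 whose names end in suffix $2, one path per line, sorted.
list_docs() {
    find "$1" -type f -name "*$2" | sort
}

# index_dir is a hypothetical helper: index the files into database $3.
# Note that -C overwrites any existing database with the same name.
index_dir() {
    if command -v af >/dev/null 2>&1; then
        list_docs "$1" "$2" | af -i -C -d "$3" -F
    else
        # af is not installed: show the list that would have been sent.
        echo "af not found; files that would be indexed:" >&2
        list_docs "$1" "$2"
    fi
}
```

A call such as index_dir ./docs .txt mydb would then create mydb from every .txt file under ./docs (both names are placeholders).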
SEARCH OPTIONS
The following options can only be used together with the searching (-s)
command.
-Q, --query-boolean query_string
Search for the specified Boolean query string.
-n, --numhits x
Output a maximum of x results.
--skiphits x
Do not output the first x results.
--totalhits
Output the total number of results.
--style=list, --style=lineage, --style=trec
Set the style of printed result sets. The default is list. Use the
lineage style with XML to see hierarchical results. For the
trec style, it is assumed that the indexed file names are the
document numbers and that --skiphits is not used (because rank
always starts at 1).
--trec-tag run_tag
Output TREC results with the specified run tag. (This is to be
used with --style=trec.)
--trec-topic topic_number
Output TREC results with the specified topic number. (This is
to be used with --style=trec.)
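The --numhits and --skiphits options together suggest a simple way to page through results. The sketch below computes the skip count for a given page and prints the resulting command without running it. It assumes that --numhits limits the results printed after the skipped ones, which this page does not spell out. The helper name page_cmd is hypothetical.

```shell
#!/bin/sh
# page_cmd is a hypothetical helper, not part of Amberfish.
# Arguments: database, query, page number (starting at 1), results per page.
page_cmd() {
    db=$1 query=$2 page=$3 per_page=$4
    # Results that belong to earlier pages are skipped.
    skip=$(( (page - 1) * per_page ))
    printf "af -s -d %s -Q '%s' -n %s --skiphits %s\n" \
        "$db" "$query" "$per_page" "$skip"
}

page_cmd mydb 'cat & (dog | mouse)' 3 10
```

For page 3 at 10 results per page the helper prints a command with -n 10 --skiphits 20. As noted above, --skiphits should not be combined with --style=trec.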
LINEARIZE OPTIONS
The following options can only be used together with the linearize (-L)
command.
-m, --memory maximum
Set the maximum amount of memory in megabytes to use for
linearizing. More memory speeds up linearizing.
--no-linear-buffer
Do not use a memory buffer to speed up linearizing. This option
will be removed once the linearization buffer code proves to be
reliable.
AUTHOR
Nassib Nassar; see http://www.etymon.com/ for updates.
Copyright (C) 1998-2006 Etymon Systems, Inc.