AF(1)                              Amberfish                             AF(1)

NAME
       af - Amberfish text retrieval software

SYNOPSIS
       af [-i] [options] [file ...]
       af [-s] [options]
       af [-L] [options]
       af [-l] [options]
       af [--fetch] [file] [begin] [end]
       af [--version]

DESCRIPTION
       The program af is a text-based interface to the Amberfish functions for
       indexing and searching documents.  A simple indexing command looks
       like this:

       af -iCv -d mydb *.txt

       This creates a new database, mydb, containing an index of the files
       matching *.txt.  To speed up searching, the database can optionally be
       linearized (this step can take a long time to run):

       af -L -d mydb

       Here is a typical search command:

       af -s -d mydb -Q 'cat & (dog | mouse)'

COMMAND OPTIONS
       Only one of these options can be used at a time.

       -i, --index
              Index documents.  The documents are given either as file ...
              arguments or, if -F is used, as a list read from standard input.

       -s, --search
              Search an indexed database.

       -L, --linearize
              Linearize an indexed database.  Linearizing is an optional step
              that can be done after indexing.  The advantages of linearizing
              are that it reduces searching time and slightly reduces the size
              of the index files.  The disadvantages are that the linearizing
              process is very slow, especially when used on large databases,
              and it prevents any additional documents from being added to the
              database.

       -l, --list
              List the documents contained in a database.

       --fetch
              Output a portion of a file.  This command takes no other
              options.  The file name file, starting offset begin, and ending
              offset end are specified at the end of the command line.

       --version
              Print the af version number.

GENERAL OPTIONS
       These options are available with most command options.  The exception
       noted above is --fetch, which takes no other options.

       -d, --db dbname
              Use dbname as the database name.  With some command options such
              as -s, this option can be supplied multiple times to specify
              multiple databases.

       -v, --verbose
              Show verbose output.  This option can be supplied multiple times
              to increase verbosity.

       -D, --debug
              Show extremely verbose (debugging) output.  Using this option
              once is equivalent to -vvvvv, and it can be supplied multiple
              times to increase verbosity further.
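
       As an illustration of supplying -d more than once, the following sketch
       builds the repeated option from a list of database names and passes it
       to a search.  The names news and mail are hypothetical, and the query
       is the one from the DESCRIPTION section.  The af call is skipped when
       af is not installed.

```shell
# Start with an empty argument list, then append "-d NAME" per database.
set --
for db in news mail; do        # hypothetical database names
    set -- "$@" -d "$db"
done

# Show the options that were built: -d news -d mail
printf '%s\n' "$*"

# Search all listed databases at once (only if af is available).
if command -v af >/dev/null 2>&1; then
    af -s "$@" -Q 'cat & (dog | mouse)'
fi
```

       Building the options in the positional parameters keeps each database
       name a separate argument, so names containing spaces are passed intact.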

INDEX OPTIONS
       The following options can only be used together with the indexing (-i)
       command.

       -C, --create
              Create a new database, overwriting any existing one with the
              same name.

       -m, --memory maximum
              Set the maximum amount of memory in megabytes to use for
              indexing.  More memory speeds up indexing.

       --phrase
              Enable phrase searching.  This can only be used together with
              -C.

       --split delimiter
              Parse input files into multiple documents at points where the
              specified delimiter string is found.

       -t, --doctype=text, --doctype=xml
              Set the document type.  The default is text.  Specifying xml
              enables functions related to searching and retrieving within
              nested tags in XML documents.

       --dlevel level
              The maximum resolution (levels of descent) for retrieval of
              nested documents.  The default value is 1; increasing it
              lengthens indexing time significantly.  Use this for XML instead
              of --split to subdivide documents.  Note that this affects only
              the resolution of elements returned from searches and is
              unrelated to nested queries, which have a much higher (fixed)
              resolution.

       --no-stem
              Do not perform stemming.  This can only be used together with
              -C.  Normally, stemming is automatically enabled if Amberfish
              was compiled with the stemming function.  This option disables
              stemming even if it is available.  Note that the stemming
              function is not distributed with this package and must be
              installed manually.

       -F     Read the list of documents to be indexed from standard input,
              rather than from the end of the command line.
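
       The -F option is convenient when the set of files is produced by
       another program.  The sketch below generates a file list with find and
       pipes it to af.  It assumes that the list holds one path per line,
       which this page does not state; the scratch directory, file names, and
       database name mydb are hypothetical.  The af call is skipped when af is
       not installed.

```shell
# Hypothetical scratch directory holding two small text files.
workdir=$(mktemp -d)
printf 'the cat sat\n' > "$workdir/a.txt"
printf 'the dog ran\n' > "$workdir/b.txt"

# Collect the documents to index, one path per line (assumed list format).
file_list=$(cd "$workdir" && find . -name '*.txt' | sort)
printf '%s\n' "$file_list"

# With -F, af reads the list from standard input instead of from arguments.
if command -v af >/dev/null 2>&1; then
    ( cd "$workdir" && printf '%s\n' "$file_list" | af -iC -d mydb -F )
fi
```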

SEARCH OPTIONS
       The following options can only be used together with the searching (-s)
       command.

       -Q, --query-boolean query_string
              Search for the specified Boolean query string.

       -n, --numhits x
              Output a maximum of x results.

       --skiphits x
              Do not output the first x results.

       --totalhits
              Output the total number of results.

       --style=list, --style=lineage, --style=trec
              Set the style of printed result sets.  The default is list.
              Use the lineage style with XML to see hierarchical results.
              For the trec style, it is assumed that the indexed file names
              are the document numbers and that --skiphits is not used
              (because rank always starts at 1).

       --trec-tag run_tag
              Output TREC results with the specified run tag.  (This is to be
              used with --style=trec.)

       --trec-topic topic_number
              Output TREC results with the specified topic number.  (This is
              to be used with --style=trec.)
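
       The --numhits and --skiphits options can be combined to page through a
       result set.  The sketch below computes the number of results to skip
       for a given page.  The page number, page size, and database name mydb
       are hypothetical values chosen for illustration.  The af call is
       skipped when af is not installed.

```shell
page=3          # hypothetical page to display, counting from 1
per_page=10     # hypothetical number of results per page

# Results on earlier pages are skipped: (page - 1) * per_page.
skip=$(( (page - 1) * per_page ))
printf 'skipping %d results, showing up to %d\n' "$skip" "$per_page"

# Request one page of results (only if af is available).
if command -v af >/dev/null 2>&1; then
    af -s -d mydb -Q 'cat & (dog | mouse)' \
        --numhits "$per_page" --skiphits "$skip"
fi
```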

LINEARIZE OPTIONS
       The following options can only be used together with the linearize (-L)
       command.

       -m, --memory maximum
              Set the maximum amount of memory in megabytes to use for
              linearizing.  More memory speeds up linearizing.

       --no-linear-buffer
              Do not use a memory buffer to speed up linearizing.  This option
              will be removed once the linearization buffer code proves to be
              reliable.

AUTHOR
       Nassib Nassar; see http://www.etymon.com/ for updates.

       Copyright (C) 1998-2006 Etymon Systems, Inc.

                                                                         AF(1)