FASTS/TFASTSv3(1)      DragonFly General Commands Manual     FASTS/TFASTSv3(1)

NAME
       fasts3, fasts3_t - compare several short peptide sequences against a
       protein database using a modified fasta algorithm.

       tfasts3, tfasts3_t - compare short peptides against a translated DNA
       database.

DESCRIPTION
       fasts3 and tfasts3 are designed to compare a set of (presumably
       non-contiguous) peptides to a protein (fasts3) or translated DNA
       (tfasts3) database.  They are designed particularly for short peptide
       data from mass-spec analysis of protein digests.  Unlike the
       traditional fasta3 search, which uses a single protein or DNA sequence
       as the query, fasts3 and tfasts3 work with a query sequence of the
       form:
            >tests from mgstm1
            MLLE,
            MILGYW,
            MGDAP,
            MLCYNP
       This sequence indicates that four peptides are to be used.  When this
       sequence is compared against mgstm1.aa (included with the
       distribution), the result is:
     testf    MILGYW----------MLLE------------MGDAP-----------
              ::::::          ::::            :::::
     GT8.7  MPMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEK
                    10        20        30        40        50

     testf  --------------------------------------------------

     GT8.7  FKLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIV
                    60        70        80        90       100

                           20
     testf  ------------MLCYNP
                        ::::::
     GT8.7  ENQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAG
                   110       120       130       140       150

OPTIONS
       fasts3 and tfasts3 can accept a query sequence from the unix "stdin"
       data stream.  This makes it much easier to use fasta3 and its relatives
       as part of a WWW page. To indicate that stdin is to be used, use "-" or
       "@" as the query sequence file name.

       -b #   number of best scores to show (must be < -E cutoff)

       -d #   number of best alignments to show (must be < -E cutoff)

       -D     turn on debugging mode.  Enables checks on sequence alphabet
              that cause problems with tfastx3, tfasty3, tfasta3.

       -E #   Expectation value limit for displaying scores and alignments.
              Expectation values for fasts3 and tfasts3 are not as accurate as
              those for the other fasta3 programs.

       -H     turn off histogram display

       -i     compare against only the reverse complement of the library
              sequence.

       -L     report long sequence description in alignments

       -m 0,1,2,3,4,5,6,9,10
              alignment display options

       -N #   break long library sequences into blocks of # residues.  Useful
              for bacterial genomes, which have only one sequence entry.  -N
              2000 works well for bacterial genomes.

       -O file
              send output to file

       -q/-Q  quiet option; do not prompt for input

       -R file
              save all scores to statistics file

       -S #   offset substitution matrix values by a constant #

       -s name
              specify substitution matrix.  BLOSUM50 is used by default;
              PAM120, PAM250, and BLOSUM62 can be specified by setting -s
              P120, P250, or BL62.  With this version, many more scoring
              matrices are available, including BLOSUM80 (BL80), and MDM_10,
              MDM_20, MDM_40 (M10, M20, M40). Alternatively, BLASTP1.4 format
              scoring matrix files can be specified.

       -T #   (threaded, parallel only) number of threads or workers to use
              (set by default to 4 at compile time).

       -t #   Translation table - tfasts3 can use the BLAST translation tables.
              See
              http://www.ncbi.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c/.

       -w #   line width for similarity score, sequence alignment, output.

       -x "#,#"
              offsets query, library sequence for numbering alignments

       -z #   Specify statistical calculation. Default is -z 1, which uses
              regression against the length of the library sequence. -z 0
              disables statistics.  -z 2 uses the ln() length correction. -z 3
              uses Altschul and Gish's statistical estimates for specific
              protein BLOSUM scoring matrices and gap penalties. -z 4 uses an
              alternate regression method.

       -Z db_size
              Set the apparent database size used for expectation value
              calculations.

       -3     (TFASTS3 only) use only forward frame translations
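
       The -3 and -i options select which translations of a DNA library
       sequence are searched.  The Python sketch below illustrates the idea
       only; it is not how tfasts3 is implemented.  It assumes the standard
       genetic code (other codes would be chosen with -t), and all function
       names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, frame=0):
    """Translate one reading frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def translated_frames(dna, forward=True, reverse=True):
    """Return a dict of frame label -> protein for the requested strands."""
    frames = {}
    if forward:
        for f in range(3):
            frames[f"+{f + 1}"] = translate(dna, f)
    if reverse:
        rc = reverse_complement(dna)
        for f in range(3):
            frames[f"-{f + 1}"] = translate(rc, f)
    return frames


# Forward frames only, in the spirit of -3.
forward_only = translated_frames("ATGGCCTAA", reverse=False)
# Reverse-complement frames only, in the spirit of -i.
reverse_only = translated_frames("ATGGCCTAA", forward=False)
```

       Restricting the search to one strand halves the number of
       translated frames that each peptide has to be compared against.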

ENVIRONMENT VARIABLES
       FASTLIBS
              location of library choice file (-l FASTLIBS)

       SMATRIX
              default scoring matrix (-s SMATRIX)

       SRCH_URL
              the format string used to define the option to re-search the
              database.

       REF_URL
              the format string used to define the option to look up the
              library sequence in Entrez, or some other database.
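
       A search can be driven from a script by passing the query on stdin
       (query file name "-") and setting SMATRIX in the environment.  The
       Python sketch below rests on several assumptions: the helper names
       are hypothetical, the positional order (options, query file, library
       file) follows the usual fasta3 convention rather than anything stated
       on this page, and SMATRIX is assumed to accept the same names as -s.

```python
import os
import subprocess


def build_fasts3_call(library, program="fasts3", evalue=None, matrix=None):
    """Return (argv, env) for a non-interactive search reading from stdin."""
    argv = [program, "-q"]              # -q: do not prompt for input
    if evalue is not None:
        argv += ["-E", str(evalue)]     # -E: expectation value limit
    argv += ["-", library]              # "-" means: read the query from stdin
    env = dict(os.environ)
    if matrix is not None:
        env["SMATRIX"] = matrix         # default scoring matrix (assumed format)
    return argv, env


def run_fasts3(query_text, library, **options):
    """Run the search and return its standard output (needs fasts3 installed)."""
    argv, env = build_fasts3_call(library, **options)
    done = subprocess.run(argv, input=query_text, env=env,
                          capture_output=True, text=True, check=True)
    return done.stdout


argv, env = build_fasts3_call("mgstm1.aa", evalue=10, matrix="BL62")
```

       Building the argument list separately from running it makes the
       command easy to inspect or log before the program is started.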

AUTHOR
       Bill Pearson
       wrp@virginia.EDU

                                     local                   FASTS/TFASTSv3(1)