AFLEX(1)               DragonFly General Commands Manual              AFLEX(1)

NAME
       aflex - fast lexical analyzer generator for Ada

SYNOPSIS
       aflex [ -bdfipstvEILT -Sskeleton_file ] [ filename ]

DESCRIPTION
       aflex is a version of the Unix tool lex, but it is written in Ada and
       generates scanners in Ada.  It is upwardly compatible with the UCI tool
       alex, but is much faster and generates smaller scanners.

OPTIONS
       Command line options are given in a different format than in the old
       UCI alex.  Aflex options are as follows:

       -t     Write the scanner output to the standard output rather than to a
              file.  The default name of the scanner file for base.l is
              base.a.  Note that this option is not as useful with aflex
              because in addition to the scanner file there are files for the
              externally visible dfa functions (base_dfa.a) and the external
              IO functions (base_io.a).

       -b     Generate backtracking information to aflex.backtrack.  This is a
              list of scanner states which require backtracking and the input
              characters on which they do so.  By adding rules one can remove
              backtracking states.  If all backtracking states are eliminated
              and -f is used, the generated scanner will run faster (see the
              -p flag).  Only users who wish to squeeze every last cycle out
              of their scanners need worry about this option.

       -d     makes the generated scanner run in debug mode.  Whenever a
              pattern is recognized the scanner will write to stderr a line of
              the form:

                  --accepting rule #n

              Rules are numbered sequentially with the first one being 1.
              Rule #0 is executed when the scanner backtracks; Rule #(n+1)
              (where n is the number of rules) indicates the default action;
              Rule #(n+2) indicates that the input buffer is empty and needs
              to be refilled and then the scan restarted.  Rules beyond (n+2)
              are end-of-file actions.
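
              The numbering above can be decoded mechanically.  The following
              Python sketch is illustrative only and is not part of aflex;
              the names parse_debug_line and describe_rule are hypothetical,
              and the debug line is assumed to look exactly as shown above.

```python
def parse_debug_line(line):
    """Extract the rule number from a '--accepting rule #n' line."""
    prefix = "--accepting rule #"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError("not a debug line: %r" % line)
    return int(line[len(prefix):])


def describe_rule(number, rule_count):
    """Classify a rule number; rule_count is n, the number of rules."""
    if number < 0:
        raise ValueError("rule numbers are never negative")
    if number == 0:
        return "backtrack"
    if number <= rule_count:
        return "user rule %d" % number
    if number == rule_count + 1:
        return "default action"
    if number == rule_count + 2:
        return "refill input buffer and restart scan"
    return "end-of-file action"
```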

       -f     has the same effect as lex's -f flag (do not compress the
              scanner tables); the mnemonic changes from fast compilation to
              (take your pick) full table or fast scanner.  The actual
              compilation takes longer, since aflex is I/O bound writing out
              the big table.  The compilation of the Ada file containing the
              scanner is also likely to take a long time because of the large
              arrays generated.

       -i     instructs aflex to generate a case-insensitive scanner.  The
              case of letters given in the aflex input patterns will be
              ignored, and the rules will be matched regardless of case.  The
              matched text given in yytext will have the preserved case (i.e.,
              it will not be folded).
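
              The effect can be imitated with any regular expression library.
              The Python sketch below only models the behavior described
              above (case ignored while matching, original case kept in the
              matched text); the pattern and the name match_text are
              hypothetical and this is not how aflex implements -i.

```python
import re

# The pattern is written in lower case, as it might be in an input file.
PATTERN = re.compile(r"begin", re.IGNORECASE)


def match_text(source):
    """Return the matched text with its original case, or None."""
    m = PATTERN.match(source)
    return m.group(0) if m else None
```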

       -p     generates a performance report to stderr.  The report consists
              of comments regarding features of the aflex input file which
              will cause a loss of performance in the resulting scanner.  Note
              that the use of the ^ operator and the -I flag entail minor
              performance penalties.

       -s     causes the default rule (that unmatched scanner input is echoed
              to stdout) to be suppressed.  If the scanner encounters input
              that does not match any of its rules, it aborts with an error.
              This option is useful for finding holes in a scanner's rule set.
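
              The difference between the default rule and -s can be modelled
              as follows.  This Python sketch is a toy longest-match scanner
              written for illustration, not aflex's algorithm; the rule set,
              the function name scan and the error text are assumptions.

```python
import re

# Two toy rules: runs of digits and runs of white space.
RULES = [re.compile(r"[0-9]+"), re.compile(r"[ \t\n]+")]


def scan(text, suppress_default=False):
    """Return (tokens, echoed); raise if suppress_default and no rule fits."""
    tokens, echoed, pos = [], [], 0
    while pos < len(text):
        best = None
        for rule in RULES:
            m = rule.match(text, pos)
            if m and (best is None or m.end() > best.end()):
                best = m  # keep the longest match
        if best is not None:
            tokens.append(best.group(0))
            pos = best.end()
        elif suppress_default:
            # Models -s: unmatched input is an error.
            raise RuntimeError("scanner jammed at offset %d" % pos)
        else:
            # Models the default rule: echo one unmatched character.
            echoed.append(text[pos])
            pos += 1
    return tokens, "".join(echoed)
```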

       -v     has the same meaning as for lex (print to stderr a summary of
              statistics of the generated scanner).  Many more statistics are
              printed, though, and the summary spans several lines.  Most of
              the statistics are meaningless to the casual aflex user, but the
              first line identifies the version of aflex, which is useful for
              figuring out where you stand with respect to patches and new
              releases.

       -E     instructs aflex to generate additional information about each
              token, including line and column numbers.  This is needed for
              the advanced automatic error correction option in ayacc.

       -I     instructs aflex to generate an interactive scanner.  Normally,
              scanners generated by aflex always look ahead one character
              before deciding that a rule has been matched.  At the cost of
              some scanning overhead, aflex will generate a scanner which only
              looks ahead when needed.  Such scanners are called interactive
              because if you want to write a scanner for an interactive system
              such as a command shell, you will probably want the user's input
              to be terminated with a newline, and without -I the user will
              have to type a character in addition to the newline in order to
              have the newline recognized.  This leads to dreadful interactive
              performance.

              If all this seems too confusing, here's the general rule: if a
              human will be typing in input to your scanner, use -I, otherwise
              don't; if you don't care about how fast your scanners run and
              don't want to make any assumptions about the input to your
              scanner, always use -I.

              Note, -I cannot be used in conjunction with full tables, i.e.,
              the -f flag.
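
              The cost of unconditional look-ahead can be seen in a small
              model.  The Python sketch below is a simplification made up for
              illustration (the name scan_line is hypothetical); it counts
              how many characters must be pulled from the input before a
              newline-terminated line is reported.

```python
def scan_line(chars, interactive):
    """Return (line, chars_pulled) for the first newline-terminated line."""
    it = iter(chars)
    pulled = 0
    text = []
    for ch in it:
        pulled += 1
        text.append(ch)
        if ch == "\n":
            if not interactive:
                # Always look one character past the match.  At a
                # terminal this read would block until the user types
                # something more; here None just means end of input.
                if next(it, None) is not None:
                    pulled += 1
            break
    return "".join(text), pulled
```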

       -L     instructs aflex not to generate --#line comments (see below).

       -T     makes aflex run in trace mode.  It will generate a lot of
              messages to stdout concerning the form of the input and the
              resultant non-deterministic and deterministic finite automatons.
              This option is mostly for use in maintaining aflex.

       -Sskeleton_file
              overrides the default internal skeleton from which aflex
              constructs its scanners.  You'll probably never need this option
              unless you are doing aflex maintenance or development.

INCOMPATIBILITIES WITH LEX
       aflex is fully compatible with lex with the following exceptions:

       -      Source file format:

              The input specification file for aflex must use the following
              format.

                        definitions section
                        %%
                        rules section
                        %%
                        user defined section
                        ##
                        user defined section

       -      lex's %r (Ratfor scanners) and %t (translation table) options
              are not supported.

       -      The do-nothing -n flag is not supported.

       -      When definitions are expanded, aflex encloses them in
              parentheses.  With lex, the following

                  NAME    [A-Z][A-Z0-9]*
                  %%
                  foo{NAME}?      text_io.put_line( "Found it" );
                  %%

              will not match the string "foo" because when the macro is
              expanded the rule is equivalent to "foo[A-Z][A-Z0-9]*?" and the
              precedence is such that the '?' is associated with "[A-Z0-9]*".
              With aflex, the rule will be expanded to "foo([A-Z][A-Z0-9]*)?"
              and so the string "foo" will match.  Note that because of this,
              the ^, $, <s>, and / operators cannot be used in a definition.
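
              The two expansion strategies can be compared with a short
              Python sketch.  It is illustrative only: the function expand
              and the DEFINITIONS table are hypothetical, and Python's re
              module stands in for the lex pattern language.

```python
import re

DEFINITIONS = {"NAME": "[A-Z][A-Z0-9]*"}


def expand(rule, parenthesize):
    """Replace each {NAME} reference by its definition."""
    def substitute(m):
        body = DEFINITIONS[m.group(1)]
        return "(" + body + ")" if parenthesize else body
    return re.sub(r"\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, rule)


lex_style = expand("foo{NAME}?", parenthesize=False)
aflex_style = expand("foo{NAME}?", parenthesize=True)

# Python reads "*?" as a non-greedy star rather than as an optional
# star, but in both readings the leading [A-Z] stays mandatory, so
# the unparenthesized form still cannot match plain "foo".
lex_matches_foo = re.fullmatch(lex_style, "foo") is not None
aflex_matches_foo = re.fullmatch(aflex_style, "foo") is not None
```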

       -      Input can be controlled by redefining the YY_INPUT function.
              YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".
              Its action is to place up to max_size characters in the
              character buffer "buf" and return in the integer variable
              "result" either the number of characters read or the constant
              YY_NULL to indicate EOF.  The default YY_INPUT reads from
              Standard_Input.

              You can also add things like keeping track of the input line
              number this way, but don't expect your scanner to go very
              fast.
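
              The contract of YY_INPUT can be sketched in Python as follows.
              This is a model, not the Ada interface: make_yy_input is a
              hypothetical helper, the value 0 for YY_NULL is an assumption,
              and the result is returned rather than stored in a "result"
              variable.  The line counter tracks lines read into the buffer,
              which may run ahead of the scanner's position.

```python
import io

YY_NULL = 0  # assumed value of the EOF marker


def make_yy_input(stream):
    """Build a YY_INPUT-like function that also counts input lines."""
    state = {"line": 1}

    def yy_input(buf, max_size):
        # Place up to max_size characters in buf (a list of characters).
        chunk = stream.read(max_size)
        if not chunk:
            return YY_NULL
        buf[:] = chunk
        state["line"] += chunk.count("\n")
        return len(chunk)

    return yy_input, state


yy_input, state = make_yy_input(io.StringIO("a\nbc\n"))
buf = []
first = yy_input(buf, 3)   # reads "a\nb"
```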

       -      Yytext is a function returning a vstring.

       -      aflex reads only one input file, while lex's input is made up of
              the concatenation of its input files.

       -      The following lex constructs are not supported:

              - REJECT

              - %T      -- character set tables

              - %x      -- changes to internal array sizes (see below)

ENHANCEMENTS
       -      Exclusive start-conditions can be declared by using %x instead
              of %s.  These start-conditions have the property that when they
              are active, no other rules are active.  Thus a set of rules
              governed by the same exclusive start condition describe a
              scanner which is independent of any of the other rules in the
              aflex input.  This feature makes it easy to specify "mini-
              scanners" which scan portions of the input that are
              syntactically different from the rest (e.g., comments).

       -      End-of-file rules.  The special rule "<<EOF>>" indicates actions
              which are to be taken when an end-of-file is encountered and
              yywrap() returns non-zero (i.e., indicates no further files to
              process).  The action can either call text_io.set_input() to
              switch to a new file to process, in which case the action
              should finish with YY_NEW_FILE (this is a branch, so subsequent
              code in the action won't be executed), or it should finish with
              a return statement.  <<EOF>> rules may not be used with other
              patterns; they may only be qualified with a list of start
              conditions.  If an unqualified <<EOF>> rule is given, it applies
              only to the INITIAL start condition, and not to %s start
              conditions.  These rules are useful for catching things like
              unclosed comments.  An example:

                  %x quote
                  %%
                  ...
                  <quote><<EOF>>   {
                        error( "unterminated quote" );
                        }
                  <<EOF>>          {
                        set_input( next_file );
                        YY_NEW_FILE;
                        }
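
              How start conditions select the active rules can be modelled in
              a few lines of Python.  The sketch assumes the usual lex
              convention that rules without a start condition stay active in
              %s (inclusive) states; the state names, the rule table and the
              function active_rules are hypothetical.

```python
EXCLUSIVE = {"quote"}    # as if declared with %x
INCLUSIVE = {"header"}   # as if declared with %s

# Each rule is (start conditions or None, pattern).
RULES = [
    (None, "[a-z]+"),
    (("quote",), '[^"]+'),
    (("header",), "^From:"),
]


def active_rules(current):
    """Return the patterns that are active in start condition current."""
    active = []
    for conditions, pattern in RULES:
        if conditions is None:
            # Unqualified rules are switched off in exclusive states.
            if current not in EXCLUSIVE:
                active.append(pattern)
        elif current in conditions:
            active.append(pattern)
    return active
```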

       -      aflex dynamically resizes its internal tables, so directives
              like "%a 3000" are not needed when specifying large scanners.

       -      aflex generates --#line comments mapping lines in the output to
              their origin in the input file.

       -      All actions must be enclosed by curly braces.

       -      Comments may be put in the first section of the input by
              preceding them with '#'.

       -      Ada style comments are supported instead of C style comments.

       -      All template files are internalized.

       -      The input source file must end with a ".l" extension.

FILES
       The names of the files containing the generated scanner, IO, and DFA
       packages are based on the basename of the input file.  For example,
       if the input file is called scan.l then the scanner file is called
       scan.a, the DFA package is in scan_dfa.a, and scan_io.a is the IO
       package file.  All of these file names may be changed by modifying
       the external_file_manager package (see the porting notes for more
       information).
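
       The default naming scheme can be written down as a small Python
       sketch.  It is illustrative only: output_files is a hypothetical
       helper, and it assumes that any directory part of the input name is
       dropped.

```python
import os


def output_files(input_name):
    """Return the default output file names for an aflex input file."""
    base, ext = os.path.splitext(os.path.basename(input_name))
    if ext != ".l":
        raise ValueError("aflex input files must end with .l")
    return {
        "scanner": base + ".a",
        "dfa": base + "_dfa.a",
        "io": base + "_io.a",
    }
```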

       aflex.backtrack
              backtracking information for -b

SEE ALSO
       lex(1)

       M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator, Computing
       Science Technical Report 39, Bell Telephone Laboratories, Murray Hill,
       NJ, 1975.

       Military Standard Ada Programming Language   (ANSI/MIL-STD-1815A-1983),
       American National Standards Institute, January 1983.

       T. Nguyen and K. Forester, Alex - An Ada Lexical Analysis Generator,
       Arcadia Document UCI-88-17, University of California, Irvine, 1988.

       D. Taback and D. Tolani, Ayacc User's Manual, Arcadia Document
       UCI-85-10, University of California, Irvine, 1986.

AUTHOR
       John Self.  Based on the tool flex written and designed by Vern Paxson.
       It reimplements the functionality of the tool alex designed by Thieu Q.
       Nguyen.

       Send requests for aflex information to alex-info@ics.uci.edu
       Send bug reports for aflex to alex-bugs@ics.uci.edu

DIAGNOSTICS
       aflex scanner jammed - a scanner compiled with -s has encountered an
       input string which wasn't matched by any of its rules.

       old-style lex command ignored - the aflex input contains a lex command
       (e.g., "%n 1000") which is being ignored.

BUGS
       Some trailing context patterns cannot be properly matched and generate
       warning messages ("Dangerous trailing context").  These are patterns
       where the ending of the first part of the rule matches the beginning of
       the second part, such as "zx*/xy*", where the 'x*' matches the 'x' at
       the beginning of the trailing context.  (Lex doesn't get these patterns
       right either.)

       Variable trailing context (where both the leading and trailing parts do
       not have a fixed length) entails a substantial performance loss.

       For some trailing context rules, parts which are actually fixed-length
       are not recognized as such, leading to the above-mentioned performance
       loss.  In particular, parts using '|' or {n} are always considered
       variable-length.

       Nulls are not allowed in aflex inputs or in the inputs to scanners
       generated by aflex.  Their presence generates fatal errors.

       Pushing back definitions enclosed in ()'s can result in nasty,
       difficult-to-understand problems like:

            {DIG}  [0-9] -- a digit

       in which the pushed-back text is "([0-9] -- a digit)".

       Due to both buffering of input and read-ahead, you cannot intermix
       calls to text_io routines, such as text_io.get(), with aflex rules
       and expect it to work.  Call input() instead.
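
       Why direct reads go wrong can be shown with a toy buffered reader.
       The Python sketch below is an illustration, not aflex's IO package:
       the class ScannerInput and its buffer size are hypothetical, and
       io.StringIO stands in for the input file.

```python
import io


class ScannerInput:
    """Reads the stream in chunks, the way a buffering scanner does."""

    def __init__(self, stream, buffer_size=8):
        self.stream = stream
        self.buffer_size = buffer_size
        self.buffer = ""

    def input(self):
        """Return the next character as the scanner sees it, or None."""
        if not self.buffer:
            self.buffer = self.stream.read(self.buffer_size)
        if not self.buffer:
            return None
        ch, self.buffer = self.buffer[0], self.buffer[1:]
        return ch


stream = io.StringIO("abcdefghijkl")
scanner = ScannerInput(stream)
first = scanner.input()    # the scanner has buffered "abcdefgh"
direct = stream.read(1)    # a direct read skips the buffered text
second = scanner.input()   # the scanner continues from its buffer
```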

       There are still more features that could be implemented (especially
       REJECT).  Also, the speed of the compressed scanners could be improved.

       The utility needs more complete documentation.

Version 1.4                      10 March 1994                        AFLEX(1)