AFLEX(1)                     General Commands Manual                    AFLEX(1)



NAME
       aflex - fast lexical analyzer generator for Ada

SYNOPSIS
       aflex [ -bdfipstvEILT -Sskeleton_file ] [ filename ]

DESCRIPTION
       aflex is a version of the Unix tool lex, but it is written in Ada and
       generates scanners in Ada.  It is upwardly compatible with the UCI tool
       alex, but is much faster and generates smaller scanners.

OPTIONS
       Command line options are given in a different format than in the old UCI
       alex.  Aflex options are as follows:

       -t     Write the scanner output to the standard output rather than to a
              file.  The default name of the scanner file for base.l is base.a.
              Note that this option is not as useful with aflex because, in
              addition to the scanner file, there are files for the externally
              visible dfa functions (base_dfa.a) and the external IO functions
              (base_io.a).

       -b     Generate backtracking information to aflex.backtrack.  This is a
              list of scanner states which require backtracking and the input
              characters on which they do so.  By adding rules one can remove
              backtracking states.  If all backtracking states are eliminated
              and -f is used, the generated scanner will run faster (see the -p
              flag).  Only users who wish to squeeze every last cycle out of
              their scanners need worry about this option.

       -d     makes the generated scanner run in debug mode.  Whenever a pattern
              is recognized the scanner will write to stderr a line of the form:

                  --accepting rule #n

              Rules are numbered sequentially with the first one being 1.  Rule
              #0 is executed when the scanner backtracks; Rule #(n+1) (where n
              is the number of rules) indicates the default action; Rule #(n+2)
              indicates that the input buffer is empty and needs to be refilled
              and then the scan restarted.  Rules beyond (n+2) are end-of-file
              actions.

       -f     has the same effect as lex's -f flag (do not compress the scanner
              tables); the mnemonic changes from fast compilation to (take your
              pick) full table or fast scanner.  The actual compilation takes
              longer, since aflex is I/O bound writing out the big table.  The
              compilation of the Ada file containing the scanner is also likely
              to take a long time because of the large arrays generated.

       -i     instructs aflex to generate a case-insensitive scanner.  The case
              of letters given in the aflex input patterns will be ignored, and
              the rules will be matched regardless of case.  The matched text
              given in yytext will have the preserved case (i.e., it will not be
              folded).

       -p     generates a performance report to stderr.  The report consists of
              comments regarding features of the aflex input file which will
              cause a loss of performance in the resulting scanner.  Note that
              the use of the ^ operator and the -I flag entail minor performance
              penalties.

       -s     causes the default rule (that unmatched scanner input is echoed to
              stdout) to be suppressed.  If the scanner encounters input that
              does not match any of its rules, it aborts with an error.  This
              option is useful for finding holes in a scanner's rule set.

       -v     has the same meaning as for lex (print to stderr a summary of
              statistics of the generated scanner).  Many more statistics are
              printed, though, and the summary spans several lines.  Most of the
              statistics are meaningless to the casual aflex user, but the first
              line identifies the version of aflex, which is useful for figuring
              out where you stand with respect to patches and new releases.

       -E     instructs aflex to generate additional information about each
              token, including line and column numbers.  This is needed for the
              advanced automatic error correction option in ayacc.

       -I     instructs aflex to generate an interactive scanner.  Normally,
              scanners generated by aflex always look ahead one character before
              deciding that a rule has been matched.  At the cost of some
              scanning overhead, aflex will generate a scanner which only looks
              ahead when needed.  Such scanners are called interactive because
              if you want to write a scanner for an interactive system such as a
              command shell, you will probably want the user's input to be
              terminated with a newline, and without -I the user will have to
              type a character in addition to the newline in order to have the
              newline recognized.  This leads to dreadful interactive
              performance.

              If all this seems too confusing, here's the general rule: if a
              human will be typing in input to your scanner, use -I, otherwise
              don't; if you don't care about how fast your scanners run and
              don't want to make any assumptions about the input to your
              scanner, always use -I.

              Note that -I cannot be used in conjunction with full tables, i.e.,
              the -f flag.

       -L     instructs aflex not to generate --#line comments (see
              ENHANCEMENTS below).

       -T     makes aflex run in trace mode.  It will generate a lot of messages
              to stdout concerning the form of the input and the resultant
              nondeterministic and deterministic finite automata.  This option
              is mostly for use in maintaining aflex.

       -Sskeleton_file
              overrides the default internal skeleton from which aflex
              constructs its scanners.  You'll probably never need this option
              unless you are doing aflex maintenance or development.
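
       The rule numbering reported by -d can be summarized in a few lines.  The
       following is a minimal sketch in Python, not part of aflex; the function
       name and the returned labels are hypothetical and only restate the
       numbering described under -d above.

```python
def describe_debug_rule(rule_number, user_rule_count):
    """Interpret the n in an "--accepting rule #n" line.

    user_rule_count is the number of rules in the aflex input.
    The labels returned here are illustrative, not aflex output.
    """
    if rule_number < 0:
        raise ValueError("rule numbers are non-negative")
    if rule_number == 0:
        return "backtrack"            # scanner backtracked
    if rule_number <= user_rule_count:
        return "user rule"            # rules are numbered from 1
    if rule_number == user_rule_count + 1:
        return "default action"
    if rule_number == user_rule_count + 2:
        return "buffer refill"        # input buffer empty, scan restarts
    return "end-of-file action"       # anything beyond n+2
```

       For a scanner with five rules, #6 would then be the default action and
       #7 a buffer refill.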

INCOMPATIBILITIES WITH LEX
       aflex is fully compatible with lex with the following exceptions:

       -      Source file format:

              The input specification file for aflex must use the following
              format.


                        definitions section
                        %%
                        rules section
                        %%
                        user defined section
                        ##
                        user defined section


       -      lex's %r (Ratfor scanners) and %t (translation table) options are
              not supported.

       -      The do-nothing -n flag is not supported.

       -      When definitions are expanded, aflex encloses them in parentheses.
              With lex, the following

                  NAME    [A-Z][A-Z0-9]*
                  %%
                  foo{NAME}?      text_io.put_line( "Found it" );
                  %%

              will not match the string "foo" because when the macro is expanded
              the rule is equivalent to "foo[A-Z][A-Z0-9]*?" and the precedence
              is such that the '?' is associated with "[A-Z0-9]*".  With aflex,
              the rule will be expanded to "foo([A-Z][A-Z0-9]*)?" and so the
              string "foo" will match.  Note that because of this, the ^, $,
              <s>, and / operators cannot be used in a definition.

       -      Input can be controlled by redefining the YY_INPUT function.
              YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".
              Its action is to place up to max_size characters in the character
              buffer "buf" and return in the integer variable "result" either
              the number of characters read or the constant YY_NULL to indicate
              EOF.  The default YY_INPUT reads from Standard_Input.

              You can also add things like keeping track of the input line
              number this way, but don't expect your scanner to go very fast.

       -      Yytext is a function returning a vstring.

       -      aflex reads only one input file, while lex's input is made up of
              the concatenation of its input files.

       -      The following lex constructs are not supported:

              - REJECT

              - %T -- character set tables

              - %x -- changes to internal array sizes (see below)
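
       The effect of parenthesizing definitions, described in the list above,
       can be checked with any regular expression engine.  The sketch below
       uses Python's re module rather than aflex; the expand() helper is
       hypothetical and only models textual substitution of {NAME}.  Python
       reads "*?" as a lazy star, which accepts the same strings as the lex
       reading when the whole input must match, so the outcome for "foo" is
       the same.

```python
import re

def expand(rule, definitions, parenthesize):
    """Substitute {NAME} references in a rule, optionally wrapping each in ()."""
    def substitute(match):
        body = definitions[match.group(1)]
        return "(" + body + ")" if parenthesize else body
    return re.sub(r"\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, rule)

definitions = {"NAME": "[A-Z][A-Z0-9]*"}
rule = "foo{NAME}?"

# Plain textual expansion, as the man page describes for lex.
lex_style = expand(rule, definitions, parenthesize=False)
# Expansion wrapped in parentheses, as the man page describes for aflex.
aflex_style = expand(rule, definitions, parenthesize=True)

# Does each expanded pattern accept exactly the string "foo"?
lex_matches_foo = re.fullmatch(lex_style, "foo") is not None
aflex_matches_foo = re.fullmatch(aflex_style, "foo") is not None
```

       Here lex_matches_foo is False because the leading [A-Z] is still
       required, while aflex_matches_foo is True because the whole definition
       has become optional.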


ENHANCEMENTS
       -      Exclusive start-conditions can be declared by using %x instead of
              %s.  These start-conditions have the property that when they are
              active, no other rules are active.  Thus a set of rules governed
              by the same exclusive start condition describes a scanner which is
              independent of any of the other rules in the aflex input.  This
              feature makes it easy to specify "mini-scanners" which scan
              portions of the input that are syntactically different from the
              rest (e.g., comments).

       -      End-of-file rules.  The special rule "<<EOF>>" indicates actions
              which are to be taken when an end-of-file is encountered and
              yywrap() returns non-zero (i.e., indicates no further files to
              process).  The action can either call text_io.set_input() to
              switch to a new file to process, in which case the action should
              finish with YY_NEW_FILE (this is a branch, so subsequent code in
              the action won't be executed), or it should finish with a return
              statement.  <<EOF>> rules may not be used with other patterns;
              they may only be qualified with a list of start conditions.  If an
              unqualified <<EOF>> rule is given, it applies only to the INITIAL
              start condition, and not to %s start conditions.  These rules are
              useful for catching things like unclosed comments.  An example:

                  %x quote
                  %%
                  ...
                  <quote><<EOF>>   {
                        error( "unterminated quote" );
                        }
                  <<EOF>>          {
                        set_input( next_file );
                        YY_NEW_FILE;
                        }


       -      aflex dynamically resizes its internal tables, so directives like
              "%a 3000" are not needed when specifying large scanners.

       -      aflex generates --#line comments mapping lines in the output to
              their origin in the input file.

       -      All actions must be enclosed by curly braces.

       -      Comments may be put in the first section of the input by preceding
              them with '#'.

       -      Ada style comments are supported instead of C style comments.

       -      All template files are internalized.

       -      The input source file must end with a ".l" extension.

FILES
       The names of the files containing the generated scanner, IO, and DFA
       packages are based on the basename of the input file.  For example, if
       the input file is called scan.l then the scanner file is called scan.a,
       the DFA package is in scan_dfa.a, and scan_io.a is the IO package file.
       All of these file names may be changed by modifying the
       external_file_manager package (see the porting notes for more
       information).
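
       The default naming scheme can be written out as a short sketch.  This is
       Python, not part of aflex; the function name and dictionary keys are
       hypothetical, and the sketch assumes the default names, the ".l"
       extension requirement stated under ENHANCEMENTS, and that any directory
       part of the input path is kept as is.

```python
import os

def aflex_output_files(input_path):
    """Return the default output file names for an aflex input file."""
    base, ext = os.path.splitext(input_path)
    if ext != ".l":
        # The man page states the input file must end with ".l".
        raise ValueError("aflex input file must end with .l")
    return {
        "scanner": base + ".a",      # generated scanner
        "dfa": base + "_dfa.a",      # externally visible DFA package
        "io": base + "_io.a",        # external IO package
    }

names = aflex_output_files("scan.l")
```

       For scan.l this yields scan.a, scan_dfa.a, and scan_io.a, matching the
       example above.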

       aflex.backtrack
              backtracking information for -b

SEE ALSO
       lex(1)

       M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator, Computing
       Science Technical Report 39, Bell Telephone Laboratories, Murray Hill,
       NJ, 1975.

       Military Standard Ada Programming Language (ANSI/MIL-STD-1815A-1983),
       American National Standards Institute, January 1983.

       T. Nguyen and K. Forester, Alex - An Ada Lexical Analysis Generator,
       Arcadia Document UCI-88-17, University of California, Irvine, 1988.

       D. Taback and D. Tolani, Ayacc User's Manual, Arcadia Document UCI-85-10,
       University of California, Irvine, 1986.

AUTHOR
       John Self.  Based on the tool flex written and designed by Vern Paxson.
       It reimplements the functionality of the tool alex designed by Thieu Q.
       Nguyen.

       Send requests for aflex information to alex-info@ics.uci.edu
       Send bug reports for aflex to alex-bugs@ics.uci.edu

DIAGNOSTICS
       aflex scanner jammed - a scanner compiled with -s has encountered an
       input string which wasn't matched by any of its rules.

       old-style lex command ignored - the aflex input contains a lex command
       (e.g., "%n 1000") which is being ignored.

BUGS
       Some trailing context patterns cannot be properly matched and generate
       warning messages ("Dangerous trailing context").  These are patterns
       where the ending of the first part of the rule matches the beginning of
       the second part, such as "zx*/xy*", where the 'x*' matches the 'x' at the
       beginning of the trailing context.  (Lex doesn't get these patterns right
       either.)
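
       Why such patterns are ambiguous can be seen by listing every place where
       the input could be split between the rule and its trailing context.  The
       sketch below is Python and says nothing about how aflex works
       internally; candidate_splits() is a hypothetical helper that simply
       tries each split point.

```python
import re

def candidate_splits(head, trail, text):
    """Return every k where text[:k] fully matches head and
    text[k:] begins with a match of trail."""
    head_re = re.compile(head)
    trail_re = re.compile(trail)
    return [
        k
        for k in range(len(text) + 1)
        if head_re.fullmatch(text[:k]) and trail_re.match(text[k:])
    ]

# "zx*/xy*": the x* of the rule overlaps the leading x of the context.
overlapping = candidate_splits("zx*", "xy*", "zxxy")
# "zx*/y": no overlap, so only one split point is possible.
distinct = candidate_splits("zx*", "y", "zxxy")
```

       For the input "zxxy", the overlapping pattern can end the match after
       "z" or after "zx", while the non-overlapping pattern has the single
       split after "zxx".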

       Variable trailing context (where both the leading and trailing parts do
       not have a fixed length) entails a substantial performance loss.

       For some trailing context rules, parts which are actually fixed-length
       are not recognized as such, leading to the above-mentioned performance
       loss.  In particular, parts using '|' or {n} are always considered
       variable-length.

       Nulls are not allowed in aflex inputs or in the inputs to scanners
       generated by aflex.  Their presence generates fatal errors.

       Pushing back definitions enclosed in ()'s can result in nasty, difficult-
       to-understand problems like:

            {DIG}  [0-9] -- a digit

       Here the pushed-back text is "([0-9] -- a digit)".

       Due to both buffering of input and read-ahead, you cannot intermix calls
       to text_io routines, such as text_io.get(), with aflex rules and expect
       it to work.  Call input() instead.

       There are still more features that could be implemented (especially
       REJECT).  Also, the speed of the compressed scanners could be improved.

       The utility needs more complete documentation.



Version 1.4                       10 March 1994                         AFLEX(1)