| .TH FLEX 1 "26 May 1990" "Version 2.3"
 | |
| .SH NAME
 | |
| flex, lex - fast lexical analyzer generator
 | |
| .SH SYNOPSIS
 | |
| .B flex
 | |
| .B [-bcdfinpstvFILT8 -C[efmF] -Sskeleton]
 | |
| .I [filename ...]
 | |
| .SH DESCRIPTION
 | |
| .I flex
 | |
| is a tool for generating
 | |
| .I scanners:
 | |
| programs which recognize lexical patterns in text.
 | |
| .I flex
 | |
| reads
 | |
| the given input files, or its standard input if no file names are given,
 | |
| for a description of a scanner to generate.  The description is in
 | |
| the form of pairs
 | |
| of regular expressions and C code, called
 | |
| .I rules.  flex
 | |
| generates as output a C source file,
 | |
| .B lex.yy.c,
 | |
| which defines a routine
 | |
| .B yylex().
 | |
| This file is compiled and linked with the
 | |
| .B -lfl
 | |
| library to produce an executable.  When the executable is run,
 | |
| it analyzes its input for occurrences
 | |
| of the regular expressions.  Whenever it finds one, it executes
 | |
| the corresponding C code.
 | |
| .LP
 | |
| For full documentation, see
 | |
| .B flexdoc(1).
 | |
| This manual entry is intended for use as a quick reference.
 | |
| .SH OPTIONS
 | |
| .I flex
 | |
| has the following options:
 | |
| .TP
 | |
| .B -b
 | |
| Generate backtracking information to
 | |
| .I lex.backtrack.
 | |
| This is a list of scanner states which require backtracking
 | |
| and the input characters on which they do so.  By adding rules one
 | |
| can remove backtracking states.  If all backtracking states
 | |
| are eliminated and
 | |
| .B -f
 | |
| or
 | |
| .B -F
 | |
| is used, the generated scanner will run faster.
 | |
| .TP
 | |
| .B -c
 | |
| is a do-nothing, deprecated option included for POSIX compliance.
 | |
| .IP
 | |
| .B NOTE:
 | |
| in previous releases of
 | |
| .I flex
 | |
| .B -c
 | |
| specified table-compression options.  This functionality is
 | |
| now given by the
 | |
| .B -C
 | |
| flag.  To ease the impact of this change, when
 | |
| .I flex
 | |
| encounters
 | |
| .B -c,
 | |
| it currently issues a warning message and assumes that
 | |
| .B -C
 | |
| was desired instead.  In the future this "promotion" of
 | |
| .B -c
 | |
| to
 | |
| .B -C
 | |
| will go away in the name of full POSIX compliance (unless
 | |
| the POSIX meaning is removed first).
 | |
| .TP
 | |
| .B -d
 | |
| makes the generated scanner run in
 | |
| .I debug
 | |
| mode.  Whenever a pattern is recognized and the global
 | |
| .B yy_flex_debug
 | |
| is non-zero (which is the default), the scanner will
 | |
| write to
 | |
| .I stderr
 | |
| a line of the form:
 | |
| .nf
 | |
| 
 | |
|     --accepting rule at line 53 ("the matched text")
 | |
| 
 | |
| .fi
 | |
| The line number refers to the location of the rule in the file
 | |
| defining the scanner (i.e., the file that was fed to flex).  Messages
 | |
| are also generated when the scanner backtracks, accepts the
 | |
| default rule, reaches the end of its input buffer (or encounters
 | |
| a NUL; the two look the same as far as the scanner's concerned),
 | |
| or reaches an end-of-file.
 | |
| .TP
 | |
| .B -f
 | |
| specifies (take your pick)
 | |
| .I full table
 | |
| or
 | |
| .I fast scanner.
 | |
| No table compression is done.  The result is large but fast.
 | |
| This option is equivalent to
 | |
| .B -Cf
 | |
| (see below).
 | |
| .TP
 | |
| .B -i
 | |
| instructs
 | |
| .I flex
 | |
| to generate a
 | |
| .I case-insensitive
 | |
| scanner.  The case of letters given in the
 | |
| .I flex
 | |
| input patterns will
 | |
| be ignored, and tokens in the input will be matched regardless of case.  The
 | |
| matched text given in
 | |
| .I yytext
 | |
| will have the preserved case (i.e., it will not be folded).
 | |
| .TP
 | |
| .B -n
 | |
| is another do-nothing, deprecated option included only for
 | |
| POSIX compliance.
 | |
| .TP
 | |
| .B -p
 | |
| generates a performance report to stderr.  The report
 | |
| consists of comments regarding features of the
 | |
| .I flex
 | |
| input file which will cause a loss of performance in the resulting scanner.
 | |
| .TP
 | |
| .B -s
 | |
| causes the
 | |
| .I default rule
 | |
| (that unmatched scanner input is echoed to
 | |
| .I stdout)
 | |
| to be suppressed.  If the scanner encounters input that does not
 | |
| match any of its rules, it aborts with an error.
 | |
| .TP
 | |
| .B -t
 | |
| instructs
 | |
| .I flex
 | |
| to write the scanner it generates to standard output instead
 | |
| of
 | |
| .B lex.yy.c.
 | |
| .TP
 | |
| .B -v
 | |
| specifies that
 | |
| .I flex
 | |
| should write to
 | |
| .I stderr
 | |
| a summary of statistics regarding the scanner it generates.
 | |
| .TP
 | |
| .B -F
 | |
| specifies that the
 | |
| .I fast
 | |
| scanner table representation should be used.  This representation is
 | |
| about as fast as the full table representation
 | |
| .RB ( \-f ),
 | |
| and for some sets of patterns will be considerably smaller (and for
 | |
| others, larger).  See
 | |
| .B flexdoc(1)
 | |
| for details.
 | |
| .IP
 | |
| This option is equivalent to
 | |
| .B -CF
 | |
| (see below).
 | |
| .TP
 | |
| .B -I
 | |
| instructs
 | |
| .I flex
 | |
| to generate an
 | |
| .I interactive
 | |
| scanner, that is, a scanner which stops immediately rather than
 | |
| looking ahead if it knows
 | |
| that the currently scanned text cannot be part of a longer rule's match.
 | |
| Again, see
 | |
| .B flexdoc(1)
 | |
| for details.
 | |
| .IP
 | |
| Note,
 | |
| .B -I
 | |
| cannot be used in conjunction with
 | |
| .I full
 | |
| or
 | |
| .I fast tables,
 | |
| i.e., the
 | |
| .B -f, -F, -Cf,
 | |
| or
 | |
| .B -CF
 | |
| flags.
 | |
| .TP
 | |
| .B -L
 | |
| instructs
 | |
| .I flex
 | |
| not to generate
 | |
| .B #line
 | |
| directives in
 | |
| .B lex.yy.c.
 | |
| The default is to generate such directives so error
 | |
| messages in the actions will be correctly
 | |
| located with respect to the original
 | |
| .I flex
 | |
| input file, and not to
 | |
| the fairly meaningless line numbers of
 | |
| .B lex.yy.c.
 | |
| .TP
 | |
| .B -T
 | |
| makes
 | |
| .I flex
 | |
| run in
 | |
| .I trace
 | |
| mode.  It will generate a lot of messages to
 | |
| .I stdout
 | |
| concerning
 | |
| the form of the input and the resultant non-deterministic and deterministic
 | |
| finite automata.  This option is mostly for use in maintaining
 | |
| .I flex.
 | |
| .TP
 | |
| .B -8
 | |
| instructs
 | |
| .I flex
 | |
| to generate an 8-bit scanner.
 | |
| On some sites, this is the default.  On others, the default
 | |
| is 7-bit characters.  To see which is the case, check the verbose
 | |
| .B (-v)
 | |
| output for "equivalence classes created".  If the denominator of
 | |
| the number shown is 128, then by default
 | |
| .I flex
 | |
| is generating 7-bit characters.  If it is 256, then the default is
 | |
| 8-bit characters.
 | |
| .TP 
 | |
| .B -C[efmF]
 | |
| controls the degree of table compression.
 | |
| .IP
 | |
| .B -Ce
 | |
| directs
 | |
| .I flex
 | |
| to construct
 | |
| .I equivalence classes,
 | |
| i.e., sets of characters
 | |
| which have identical lexical properties.
 | |
| Equivalence classes usually give
 | |
| dramatic reductions in the final table/object file sizes (typically
 | |
| a factor of 2-5) and are pretty cheap performance-wise (one array
 | |
| look-up per character scanned).
 | |
| .IP
 | |
| .B -Cf
 | |
| specifies that the
 | |
| .I full
 | |
| scanner tables should be generated -
 | |
| .I flex
 | |
| should not compress the
 | |
| tables by taking advantage of similar transition functions for
 | |
| different states.
 | |
| .IP
 | |
| .B -CF
 | |
| specifies that the alternate fast scanner representation (described in
 | |
| .B flexdoc(1))
 | |
| should be used.
 | |
| .IP
 | |
| .B -Cm
 | |
| directs
 | |
| .I flex
 | |
| to construct
 | |
| .I meta-equivalence classes,
 | |
| which are sets of equivalence classes (or characters, if equivalence
 | |
| classes are not being used) that are commonly used together.  Meta-equivalence
 | |
| classes are often a big win when using compressed tables, but they
 | |
| have a moderate performance impact (one or two "if" tests and one
 | |
| array look-up per character scanned).
 | |
| .IP
 | |
| A lone
 | |
| .B -C
 | |
| specifies that the scanner tables should be compressed but neither
 | |
| equivalence classes nor meta-equivalence classes should be used.
 | |
| .IP
 | |
| The options
 | |
| .B -Cf
 | |
| or
 | |
| .B -CF
 | |
| and
 | |
| .B -Cm
 | |
| do not make sense together - there is no opportunity for meta-equivalence
 | |
| classes if the table is not being compressed.  Otherwise the options
 | |
| may be freely mixed.
 | |
| .IP
 | |
| The default setting is
 | |
| .B -Cem,
 | |
| which specifies that
 | |
| .I flex
 | |
| should generate equivalence classes
 | |
| and meta-equivalence classes.  This setting provides the highest
 | |
| degree of table compression.  You can obtain
 | |
| faster-executing scanners at the cost of larger tables, with
 | |
| the following generally being true:
 | |
| .nf
 | |
| 
 | |
|     slowest & smallest
 | |
|           -Cem
 | |
|           -Cm
 | |
|           -Ce
 | |
|           -C
 | |
|           -C{f,F}e
 | |
|           -C{f,F}
 | |
|     fastest & largest
 | |
| 
 | |
| .fi
 | |
| .IP
 | |
| .B -C
 | |
| options are not cumulative; whenever the flag is encountered, the
 | |
| previous -C settings are forgotten.
 | |
| .TP
 | |
| .B -Sskeleton_file
 | |
| overrides the default skeleton file from which
 | |
| .I flex
 | |
| constructs its scanners.  You'll never need this option unless you are doing
 | |
| .I flex
 | |
| maintenance or development.
 | |
| .SH SUMMARY OF FLEX REGULAR EXPRESSIONS
 | |
| The patterns in the input are written using an extended set of regular
 | |
| expressions.  These are:
 | |
| .nf
 | |
| 
 | |
|     x          match the character 'x'
 | |
|     .          any character except newline
 | |
|     [xyz]      a "character class"; in this case, the pattern
 | |
|                  matches either an 'x', a 'y', or a 'z'
 | |
|     [abj-oZ]   a "character class" with a range in it; matches
 | |
|                  an 'a', a 'b', any letter from 'j' through 'o',
 | |
|                  or a 'Z'
 | |
|     [^A-Z]     a "negated character class", i.e., any character
 | |
|                  but those in the class.  In this case, any
 | |
|                  character EXCEPT an uppercase letter.
 | |
|     [^A-Z\\n]   any character EXCEPT an uppercase letter or
 | |
|                  a newline
 | |
|     r*         zero or more r's, where r is any regular expression
 | |
|     r+         one or more r's
 | |
|     r?         zero or one r's (that is, "an optional r")
 | |
|     r{2,5}     anywhere from two to five r's
 | |
|     r{2,}      two or more r's
 | |
|     r{4}       exactly 4 r's
 | |
|     {name}     the expansion of the "name" definition
 | |
|                (see above)
 | |
|     "[xyz]\\"foo"
 | |
|                the literal string: [xyz]"foo
 | |
|     \\X         if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
 | |
|                  then the ANSI-C interpretation of \\x.
 | |
|                  Otherwise, a literal 'X' (used to escape
 | |
|                  operators such as '*')
 | |
|     \\123       the character with octal value 123
 | |
|     \\x2a       the character with hexadecimal value 2a
 | |
|     (r)        match an r; parentheses are used to override
 | |
|                  precedence (see below)
 | |
| 
 | |
| 
 | |
|     rs         the regular expression r followed by the
 | |
|                  regular expression s; called "concatenation"
 | |
| 
 | |
| 
 | |
|     r|s        either an r or an s
 | |
| 
 | |
| 
 | |
|     r/s        an r but only if it is followed by an s.  The
 | |
|                  s is not part of the matched text.  This type
 | |
|                  of pattern is called "trailing context".
 | |
|     ^r         an r, but only at the beginning of a line
 | |
|     r$         an r, but only at the end of a line.  Equivalent
 | |
|                  to "r/\\n".
 | |
| 
 | |
| 
 | |
|     <s>r       an r, but only in start condition s (see
 | |
|                below for discussion of start conditions)
 | |
|     <s1,s2,s3>r
 | |
|                same, but in any of start conditions s1,
 | |
|                s2, or s3
 | |
| 
 | |
| 
 | |
|     <<EOF>>    an end-of-file
 | |
|     <s1,s2><<EOF>>
 | |
|                an end-of-file when in start condition s1 or s2
 | |
| 
 | |
| .fi
 | |
| The regular expressions listed above are grouped according to
 | |
| precedence, from highest precedence at the top to lowest at the bottom.
 | |
| Those grouped together have equal precedence.
 | |
| .LP
 | |
| Some notes on patterns:
 | |
| .IP -
 | |
| Negated character classes
 | |
| .I match newlines
 | |
| unless "\\n" (or an equivalent escape sequence) is one of the
 | |
| characters explicitly present in the negated character class
 | |
| (e.g., "[^A-Z\\n]").
 | |
| .IP -
 | |
| A rule can have at most one instance of trailing context (the '/' operator
 | |
| or the '$' operator).  The start condition, '^', and "<<EOF>>" patterns
 | |
| can only occur at the beginning of a pattern, and, as well as with '/' and '$',
 | |
| cannot be grouped inside parentheses.  The following are all illegal:
 | |
| .nf
 | |
| 
 | |
|     foo/bar$
 | |
|     foo|(bar$)
 | |
|     foo|^bar
 | |
|     <sc1>foo<sc2>bar
 | |
| 
 | |
| .fi
 | |
| .SH SUMMARY OF SPECIAL ACTIONS
 | |
| In addition to arbitrary C code, the following can appear in actions:
 | |
| .IP -
 | |
| .B ECHO
 | |
| copies yytext to the scanner's output.
 | |
| .IP -
 | |
| .B BEGIN
 | |
| followed by the name of a start condition places the scanner in the
 | |
| corresponding start condition.
 | |
| .IP -
 | |
| .B REJECT
 | |
| directs the scanner to proceed on to the "second best" rule which matched the
 | |
| input (or a prefix of the input).
 | |
| .B yytext
 | |
| and
 | |
| .B yyleng
 | |
| are set up appropriately.  Note that
 | |
| .B REJECT
 | |
| is a particularly expensive feature in terms of scanner performance;
 | |
| if it is used in
 | |
| .I any
 | |
| of the scanner's actions it will slow down
 | |
| .I all
 | |
| of the scanner's matching.  Furthermore,
 | |
| .B REJECT
 | |
| cannot be used with the
 | |
| .I -f
 | |
| or
 | |
| .I -F
 | |
| options.
 | |
| .IP
 | |
| Note also that unlike the other special actions,
 | |
| .B REJECT
 | |
| is a
 | |
| .I branch;
 | |
| code immediately following it in the action will
 | |
| .I not
 | |
| be executed.
 | |
| .IP -
 | |
| .B yymore()
 | |
| tells the scanner that the next time it matches a rule, the corresponding
 | |
| token should be
 | |
| .I appended
 | |
| onto the current value of
 | |
| .B yytext
 | |
| rather than replacing it.
 | |
| .IP -
 | |
| .B yyless(n)
 | |
| returns all but the first
 | |
| .I n
 | |
| characters of the current token back to the input stream, where they
 | |
| will be rescanned when the scanner looks for the next match.
 | |
| .B yytext
 | |
| and
 | |
| .B yyleng
 | |
| are adjusted appropriately (e.g.,
 | |
| .B yyleng
 | |
| will now be equal to
 | |
| .I n
 | |
| ).
 | |
| .IP -
 | |
| .B unput(c)
 | |
| puts the character
 | |
| .I c
 | |
| back onto the input stream.  It will be the next character scanned.
 | |
| .IP -
 | |
| .B input()
 | |
| reads the next character from the input stream (this routine is called
 | |
| .B yyinput()
 | |
| if the scanner is compiled using
 | |
| .B C++).
 | |
| .IP -
 | |
| .B yyterminate()
 | |
| can be used in lieu of a return statement in an action.  It terminates
 | |
| the scanner and returns a 0 to the scanner's caller, indicating "all done".
 | |
| .IP
 | |
| By default,
 | |
| .B yyterminate()
 | |
| is also called when an end-of-file is encountered.  It is a macro and
 | |
| may be redefined.
 | |
| .IP -
 | |
| .B YY_NEW_FILE
 | |
| is an action available only in <<EOF>> rules.  It means "Okay, I've
 | |
| set up a new input file, continue scanning".
 | |
| .IP -
 | |
| .B yy_create_buffer( file, size )
 | |
| takes a
 | |
| .I FILE
 | |
| pointer and an integer
 | |
| .I size.
 | |
| It returns a YY_BUFFER_STATE
 | |
| handle to a new input buffer large enough to accommodate
 | |
| .I size
 | |
| characters and associated with the given file.  When in doubt, use
 | |
| .B YY_BUF_SIZE
 | |
| for the size.
 | |
| .IP -
 | |
| .B yy_switch_to_buffer( new_buffer )
 | |
| switches the scanner's processing to scan for tokens from
 | |
| the given buffer, which must be a YY_BUFFER_STATE.
 | |
| .IP -
 | |
| .B yy_delete_buffer( buffer )
 | |
| deletes the given buffer.
 | |
| .SH VALUES AVAILABLE TO THE USER
 | |
| .IP -
 | |
| .B char *yytext
 | |
| holds the text of the current token.  It may not be modified.
 | |
| .IP -
 | |
| .B int yyleng
 | |
| holds the length of the current token.  It may not be modified.
 | |
| .IP -
 | |
| .B FILE *yyin
 | |
| is the file which by default
 | |
| .I flex
 | |
| reads from.  It may be redefined but doing so only makes sense before
 | |
| scanning begins.  Changing it in the middle of scanning will have
 | |
| unexpected results since
 | |
| .I flex
 | |
| buffers its input.  Once scanning terminates because an end-of-file
 | |
| has been seen,
 | |
| .B
 | |
| void yyrestart( FILE *new_file )
 | |
| may be called to point
 | |
| .I yyin
 | |
| at the new input file.
 | |
| .IP -
 | |
| .B FILE *yyout
 | |
| is the file to which
 | |
| .B ECHO
 | |
| actions are done.  It can be reassigned by the user.
 | |
| .IP -
 | |
| .B YY_CURRENT_BUFFER
 | |
| returns a
 | |
| .B YY_BUFFER_STATE
 | |
| handle to the current buffer.
 | |
| .SH MACROS THE USER CAN REDEFINE
 | |
| .IP -
 | |
| .B YY_DECL
 | |
| controls how the scanning routine is declared.
 | |
| By default, it is "int yylex()", or, if prototypes are being
 | |
| used, "int yylex(void)".  This definition may be changed by redefining
 | |
| the "YY_DECL" macro.  Note that
 | |
| if you give arguments to the scanning routine using a
 | |
| K&R-style/non-prototyped function declaration, you must terminate
 | |
| the definition with a semi-colon (;).
 | |
| .IP -
 | |
| The nature of how the scanner
 | |
| gets its input can be controlled by redefining the
 | |
| .B YY_INPUT
 | |
| macro.
 | |
| YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".  Its
 | |
| action is to place up to
 | |
| .I max_size
 | |
| characters in the character array
 | |
| .I buf
 | |
| and return in the integer variable
 | |
| .I result
 | |
| either the
 | |
| number of characters read or the constant YY_NULL (0 on Unix systems)
 | |
| to indicate EOF.  The default YY_INPUT reads from the
 | |
| global file-pointer "yyin".
 | |
| A sample redefinition of YY_INPUT (in the definitions
 | |
| section of the input file):
 | |
| .nf
 | |
| 
 | |
|     %{
 | |
|     #undef YY_INPUT
 | |
|     #define YY_INPUT(buf,result,max_size) \\
 | |
|         { \\
 | |
|         int c = getchar(); \\
 | |
|         result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \\
 | |
|         }
 | |
|     %}
 | |
| 
 | |
| .fi
 | |
| .IP -
 | |
| When the scanner receives an end-of-file indication from YY_INPUT,
 | |
| it then checks the
 | |
| .B yywrap()
 | |
| function.  If
 | |
| .B yywrap()
 | |
| returns false (zero), then it is assumed that the
 | |
| function has gone ahead and set up
 | |
| .I yyin
 | |
| to point to another input file, and scanning continues.  If it returns
 | |
| true (non-zero), then the scanner terminates, returning 0 to its
 | |
| caller.
 | |
| .IP
 | |
| The default
 | |
| .B yywrap()
 | |
| always returns 1.  Presently, to redefine it you must first
 | |
| "#undef yywrap", as it is currently implemented as a macro.  It is
 | |
| likely that
 | |
| .B yywrap()
 | |
| will soon be defined to be a function rather than a macro.
 | |
| .IP -
 | |
| .B YY_USER_ACTION
 | |
| can be redefined to provide an action
 | |
| which is always executed prior to the matched rule's action.
 | |
| .IP -
 | |
| The macro
 | |
| .B YY_USER_INIT
 | |
| may be redefined to provide an action which is always executed before
 | |
| the first scan.
 | |
| .IP -
 | |
| In the generated scanner, the actions are all gathered in one large
 | |
| switch statement and separated using
 | |
| .B YY_BREAK,
 | |
| which may be redefined.  By default, it is simply a "break", to separate
 | |
| each rule's action from the following rule's.
 | |
| .SH FILES
 | |
| .TP
 | |
| .I flex.skel
 | |
| skeleton scanner.
 | |
| .TP
 | |
| .I lex.yy.c
 | |
| generated scanner (called
 | |
| .I lexyy.c
 | |
| on some systems).
 | |
| .TP
 | |
| .I lex.backtrack
 | |
| backtracking information for
 | |
| .B -b
 | |
| flag (called
 | |
| .I lex.bck
 | |
| on some systems).
 | |
| .TP
 | |
| .B -lfl
 | |
| library with which to link the scanners.
 | |
| .SH "SEE ALSO"
 | |
| .LP
 | |
| flexdoc(1), lex(1), yacc(1), sed(1), awk(9).
 | |
| .LP
 | |
| M. E. Lesk and E. Schmidt,
 | |
| .I LEX - Lexical Analyzer Generator
 | |
| .SH DIAGNOSTICS
 | |
| .I reject_used_but_not_detected undefined
 | |
| or
 | |
| .LP
 | |
| .I yymore_used_but_not_detected undefined -
 | |
| These errors can occur at compile time.  They indicate that the
 | |
| scanner uses
 | |
| .B REJECT
 | |
| or
 | |
| .B yymore()
 | |
| but that
 | |
| .I flex
 | |
| failed to notice the fact, meaning that
 | |
| .I flex
 | |
| scanned the first two sections looking for occurrences of these actions
 | |
| and failed to find any, but somehow you snuck some in (via a #include
 | |
| file, for example).  Make an explicit reference to the action in your
 | |
| .I flex
 | |
| input file.  (Note that previously
 | |
| .I flex
 | |
| supported a
 | |
| .B %used/%unused
 | |
| mechanism for dealing with this problem; this feature is still supported
 | |
| but now deprecated, and will go away soon unless the author hears from
 | |
| people who can argue compellingly that they need it.)
 | |
| .LP
 | |
| .I flex scanner jammed -
 | |
| a scanner compiled with
 | |
| .B -s
 | |
| has encountered an input string which wasn't matched by
 | |
| any of its rules.
 | |
| .LP
 | |
| .I flex input buffer overflowed -
 | |
| a scanner rule matched a string long enough to overflow the
 | |
| scanner's internal input buffer (16K bytes - controlled by
 | |
| .B YY_BUF_MAX
 | |
| in "flex.skel").
 | |
| .LP
 | |
| .I scanner requires -8 flag -
 | |
| Your scanner specification includes recognizing 8-bit characters and
 | |
| you did not specify the -8 flag (and your site has not installed flex
 | |
| with -8 as the default).
 | |
| .LP
 | |
| .I
 | |
| fatal flex scanner internal error--end of buffer missed -
 | |
| This can occur in a scanner which is reentered after a long-jump
 | |
| has jumped out (or over) the scanner's activation frame.  Before
 | |
| reentering the scanner, use:
 | |
| .nf
 | |
| 
 | |
|     yyrestart( yyin );
 | |
| 
 | |
| .fi
 | |
| .LP
 | |
| .I too many %t classes! -
 | |
| You managed to put every single character into its own %t class.
 | |
| .I flex
 | |
| requires that at least one of the classes share characters.
 | |
| .SH AUTHOR
 | |
| Vern Paxson, with the help of many ideas and much inspiration from
 | |
| Van Jacobson.  Original version by Jef Poskanzer.
 | |
| .LP
 | |
| See flexdoc(1) for additional credits and the address to send comments to.
 | |
| .SH DEFICIENCIES / BUGS
 | |
| .LP
 | |
| Some trailing context
 | |
| patterns cannot be properly matched and generate
 | |
| warning messages ("Dangerous trailing context").  These are
 | |
| patterns where the ending of the
 | |
| first part of the rule matches the beginning of the second
 | |
| part, such as "zx*/xy*", where the 'x*' matches the 'x' at
 | |
| the beginning of the trailing context.  (Note that the POSIX draft
 | |
| states that the text matched by such patterns is undefined.)
 | |
| .LP
 | |
| For some trailing context rules, parts which are actually fixed-length are
 | |
| not recognized as such, leading to a loss of performance.
 | |
| In particular, parts using '|' or {n} (such as "foo{3}") are always
 | |
| considered variable-length.
 | |
| .LP
 | |
| Combining trailing context with the special '|' action can result in
 | |
| .I fixed
 | |
| trailing context being turned into the more expensive
 | |
| .I variable
 | |
| trailing context.  For example, this happens in the following example:
 | |
| .nf
 | |
| 
 | |
|     %%
 | |
|     abc      |
 | |
|     xyz/def
 | |
| 
 | |
| .fi
 | |
| .LP
 | |
| Use of unput() invalidates yytext and yyleng.
 | |
| .LP
 | |
| Use of unput() to push back more text than was matched can
 | |
| result in the pushed-back text matching a beginning-of-line ('^')
 | |
| rule even though it didn't come at the beginning of the line
 | |
| (though this is rare!).
 | |
| .LP
 | |
| Pattern-matching of NUL's is substantially slower than matching other
 | |
| characters.
 | |
| .LP
 | |
| .I flex
 | |
| does not generate correct #line directives for code internal
 | |
| to the scanner; thus, bugs in
 | |
| .I flex.skel
 | |
| yield bogus line numbers.
 | |
| .LP
 | |
| Due to both buffering of input and read-ahead, you cannot intermix
 | |
| calls to <stdio.h> routines, such as, for example,
 | |
| .B getchar(),
 | |
| with
 | |
| .I flex
 | |
| rules and expect it to work.  Call
 | |
| .B input()
 | |
| instead.
 | |
| .LP
 | |
| The total table entries listed by the
 | |
| .B -v
 | |
| flag excludes the number of table entries needed to determine
 | |
| what rule has been matched.  The number of entries is equal
 | |
| to the number of DFA states if the scanner does not use
 | |
| .B REJECT,
 | |
| and somewhat greater than the number of states if it does.
 | |
| .LP
 | |
| .B REJECT
 | |
| cannot be used with the
 | |
| .I -f
 | |
| or
 | |
| .I -F
 | |
| options.
 | |
| .LP
 | |
| Some of the macros, such as
 | |
| .B yywrap(),
 | |
| may in the future become functions which live in the
 | |
| .B -lfl
 | |
| library.  This will doubtless break a lot of code, but may be
 | |
| required for POSIX-compliance.
 | |
| .LP
 | |
| The
 | |
| .I flex
 | |
| internal algorithms need documentation.
 | |
| .\" ref. to awk(9) man page corrected -- ASW 2005-01-15
 | 
