Non-keyword tokens often need some form of conversion: strings of digits are
converted into integers, and so on. This conversion often involves
scanning the token a second time, for instance to convert escapes
such as \n into the characters they denote. Writing this scanner by hand
is easy, but frustrating.
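To give an idea of what such a second scan looks like when written by hand, here is a minimal sketch in C. The function name unescape is hypothetical, and the set of escapes it knows (\n and \t, with any other escaped character copied as is) is an assumption made for this illustration, not the rule of any particular language.

```c
/* Rewrite the escapes found in `in` into `out`, which must hold at
   least strlen(in) + 1 bytes.  Only the escapes for newline and tab
   are translated; any other escaped character is copied unchanged,
   which also covers an escaped backslash or double quote.  A lone
   backslash at the very end of the input is copied literally.  */
void unescape(const char *in, char *out)
{
    while (*in != '\0') {
        if (in[0] == '\\' && in[1] != '\0') {
            switch (in[1]) {
            case 'n': *out++ = '\n'; break;
            case 't': *out++ = '\t'; break;
            default:  *out++ = in[1]; break;
            }
            in += 2;            /* skip the backslash and its operand */
        } else {
            *out++ = *in++;     /* ordinary character */
        }
    }
    *out = '\0';
}
```

Even this small loop has to handle the end of the input and unknown escapes explicitly, which is the kind of bookkeeping that makes hand-written scanners tedious.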
Sometimes one is limited by the theory itself: imagine your language
supports nested comments. It is easily proven that a language of
balanced parentheses (1) cannot be described by regular expressions. Indeed, this would imply
the existence of an FSR with, say, q states. If we
overflow its memory with more than q opening parentheses, two different
nesting depths must land in the same state, and it
completely loses count. Therefore there cannot be such an FSR,
hence no regular expression, and we are stuck! Nevertheless it would
have been very easy to write a scanner that solely tracks /* and */
and throws away any other string.
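A hand-written scanner of that kind needs only a depth counter, which is precisely the unbounded memory a finite automaton lacks. The following is a minimal sketch in C, assuming C-style /* and */ delimiters. The name strip_nested_comments is hypothetical, and returning -1 for an unterminated comment is a choice made for this illustration.

```c
#include <stddef.h>

/* Copy `in` to `out`, dropping nested comments delimited by the
   two-character sequences slash-star and star-slash.  `out` must hold
   at least strlen(in) + 1 bytes.  A closer met outside any comment is
   copied as ordinary text.  Returns 0 on success, -1 if a comment is
   still open when the input ends.  */
int strip_nested_comments(const char *in, char *out)
{
    size_t depth = 0;           /* number of currently open comments */

    while (*in != '\0') {
        if (in[0] == '/' && in[1] == '*') {
            ++depth;            /* opener: one level deeper */
            in += 2;
        } else if (depth > 0 && in[0] == '*' && in[1] == '/') {
            --depth;            /* closer: back up one level */
            in += 2;
        } else {
            if (depth == 0)
                *out++ = *in;   /* keep text outside comments */
            ++in;
        }
    }
    *out = '\0';
    return depth == 0 ? 0 : -1;
}
```

On the input "a /* b /* c */ d */ e" the function leaves "a  e" (with two spaces) in out and returns 0. The counter can grow as far as size_t allows, whereas a regular expression has no counter at all.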
Our scanners are nothing but automata, such as the one in Example 6.9. We could solve the two problems above simply if we could join the corresponding FSRs to a new initial state whose outgoing branches are labelled with conditions:
                if (in_body)   ,-----------------.
             ,---------------->| body scanner    |
            /                  `-----------------'
           /
     ,---./   if (in_comment)  ,-----------------.
  -->|   |-------------------->| comment scanner |
     `---'\                    `-----------------'
           \
            \  if (in_string)  ,-----------------.
             `---------------->| string scanner  |
                               `-----------------'
Example 6.13: A Condition Driven FSR Combination
These are called start conditions. They allow small scanners to be
combined into a bigger one. The default start condition is named
INITIAL; the Flex directive %x start-condition... introduces others.
To set the current start condition,
i.e., to select the eligible branch at the next run of the
automaton, use BEGIN start-condition. This is not a form
of return or goto: execution proceeds normally in the
current action.
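To make the semantics of BEGIN concrete, here is a hand-written C emulation of the dispatch pictured in Example 6.13. It is only a sketch of the idea, not the code Flex generates. The names scanner, scanner_begin, scanner_step and count_kept are hypothetical, and so is the toy language, in which # starts a comment running to the end of the line and " delimits strings.

```c
/* Hypothetical start conditions; INITIAL plays the role of the default. */
enum condition { INITIAL, COMMENT, STRING };

struct scanner {
    enum condition start;   /* condition selecting the next sub-scanner */
    int kept;               /* characters counted outside comments and strings */
};

/* Counterpart of BEGIN: a mere assignment, neither return nor goto. */
void scanner_begin(struct scanner *s, enum condition c)
{
    s->start = c;
}

/* One run of the automaton on character c: the branch is chosen by s->start. */
void scanner_step(struct scanner *s, int c)
{
    switch (s->start) {
    case INITIAL:
        if (c == '#')
            scanner_begin(s, COMMENT);
        else if (c == '"')
            scanner_begin(s, STRING);
        else
            s->kept++;
        break;
    case COMMENT:
        if (c == '\n') {
            scanner_begin(s, INITIAL);
            s->kept++;      /* still runs: the action continues after the switch of condition */
        }
        break;
    case STRING:
        if (c == '"')
            scanner_begin(s, INITIAL);
        break;
    }
}

/* Feed a whole string through the scanner and report the count. */
int count_kept(const char *text)
{
    struct scanner s = { INITIAL, 0 };
    for (; *text != '\0'; ++text)
        scanner_step(&s, (unsigned char)*text);
    return s.kept;
}
```

The statement following scanner_begin in the COMMENT branch is executed, so the newline ending a comment is counted. The new condition only takes effect at the next call to scanner_step.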
Finally, to complete the description of the rules with their conditions, use either

    <start-condition, ...>pattern action

or

    <start-condition, ...>{
      pattern-1 action-1
      pattern-2 action-2
    }