EelcoVisser.
Scannerless Generalized-LR parsing. Technical Report P9707, Programming Research Group, University of Amsterdam, July 1997.
Abstract
Current deterministic parsing techniques have a number of
problems. These include the limitations of parser generators for
deterministic languages and the complex interface between scanner and
parser. Scannerless parsing is a parsing technique in which lexical
and context-free syntax are integrated into one grammar and are all
handled by a single context-free analysis phase. This approach has a
number of advantages including discarding of the scanner and lexical
disambiguation by means of the context in which a lexical token
occurs. Scannerless parsing generates a number of interesting problems
as well. Integrated grammars do not fit the requirements of the
conventional deterministic parsing techniques. A plain context-free
grammar formalism leads to unwieldy grammars, if all lexical
information is included. Lexical disambiguation needs to be
reformulated for use in context-free parsing.
The scannerless generalized-LR parsing approach presented in this
paper solves these problems. Grammar normalization is used to support
an expressive grammar formalism without complicating the underlying
machinery. Follow restrictions are used to express longest match
lexical disambiguation. Reject productions are used to express the
prefer keywords rule for lexical disambiguation. The SLR(1) parser
generation algorithm is adapted to implement disambiguation by general
priority and associativity declarations and to interpret follow
restrictions. Generalized-LR parsing is used to provide dynamic
lookahead and to support parsing of arbitrary context-free grammars
including ambiguous ones. An adaptation of the GLR algorithm supports
the interpretation of grammars with reject productions.
Further Reading
CategoryPaper