Deriving Lex In SDF
XT -- A Bundle of Program Transformation Tools
-----------------------------------------------------------------------------
RECOVERY OF SYNTAX DEFINITION FOR LEX
-----------------------------------------------------------------------------
No syntax definition for LEX was available in the grammar base. To
further automate the translation of LEX/YACC grammars to SDF2 syntax
definitions, such a definition is needed. This file reports the steps
I took to obtain a syntax definition of LEX in SDF2.
-- Eelco Visser 2001/09/29
-----------------------------------------------------------------------------
[Step 1] Locate the sources
ftp://ftp.gnu.org/non-gnu/flex/
[Step 2] Inspect the source
cd flex-2.5.4
less parse.y
[Step 3] Copy source to grammar base
cp parse.y ~/res/XT/gb/grammars/lex.0
[Step 3b] Parse the YACC (Bison) source
> parse -l yacc -i lex.y -I -o lex.af
=> Error: charliteral '\n' not recognized. Repair the syntax definition of
YACC before continuing.
[Step 4] Translate to AbstractSDF
> parse -l yacc -i lex.y -I -o lex.af
yacc2sdf -i lex.af -o lex.asdf
[Step 5] Pretty-print syntax definition and inspect
> sdf-bracket -i lex.asdf | pp -a -l sdf -o lex.def -v 2.1
less lex.def
[Step 6] Regularize the syntax definition
> parse -l yacc -i lex.y -I -o lex.af
yacc2sdf -i lex.af -o lex.asdf
sdf-regularize -i lex.asdf -o lex.reg.asdf
sdf-bracket -i lex.reg.asdf | pp -a -l sdf -o lex.def -v 2.1
less lex.def
[Step 7] Generate constructors
> parse -l yacc -i lex.y -I -o lex.af
yacc2sdf -i lex.af -o lex.asdf
sdf-regularize -i lex.asdf -o lex.reg.asdf
sdf-cons -i lex.reg.asdf -o lex.reg.cons.asdf
sdf-bracket -i lex.reg.cons.asdf | pp -a -l sdf -o lex.def -v 2.1
less lex.def
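Steps 4 to 7 rerun the same chain each time with one more tool added. A
minimal sketch of the full chain as a shell function follows. The names
derive_sdf, run and DRY_RUN are hypothetical helpers and not part of XT; the
tool invocations and flags are copied from the steps above.

```shell
#!/bin/sh
# Sketch only: chains steps 4-7 for one grammar.
# `derive_sdf`, `run` and DRY_RUN are hypothetical names, not XT tools.

# Execute a command, or only print it to stderr when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*" >&2
  else
    "$@"
  fi
}

# derive_sdf BASE: expects BASE.y and writes BASE.def,
# stopping at the first tool that fails.
derive_sdf() {
  base=$1
  run parse -l yacc -i "$base.y" -I -o "$base.af" &&
  run yacc2sdf -i "$base.af" -o "$base.asdf" &&
  run sdf-regularize -i "$base.asdf" -o "$base.reg.asdf" &&
  run sdf-cons -i "$base.reg.asdf" -o "$base.reg.cons.asdf" &&
  run sdf-bracket -i "$base.reg.cons.asdf" |
    run pp -a -l sdf -o "$base.def" -v 2.1
}
```

With the XT tools on the PATH, `derive_sdf lex` would reproduce step 7;
`DRY_RUN=1 derive_sdf lex` only lists the commands.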
[Step 8] Edit the definition to define lexical syntax and improve constructors
[Step 9] Unpack the definition to create separate SDF modules. Running
make check can then generate lex.def and automatically parse various example
files. Before unpacking, rename the modules Lexical and Generated to
Lex-Symbols and Lex, respectively.
> unpack-sdf lex.def
[Step 10] Further edit the modules to define lexical syntax.
=> It turns out to be rather hard to parse complete lex files; the lexical
syntax is very tricky. Instead I decide to reduce the problem by editing lex
files by hand, removing all irrelevant material and leaving only definitions
of the form |name re| and rules of the form |re return id;|. Newlines
cannot be used as general layout, because they delimit definitions
and rules, so no superfluous newlines are allowed.
=> Parsing succeeds for stratego.mod.l, a modified version of the Stratego
lexical syntax in LEX.
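The restricted file shape from step 10 can be checked mechanically before
parsing. The sketch below is a hypothetical helper (check_reduced_lex is not
part of XT) written with awk. It assumes that a line ending in
|return id;| is a rule, that a line starting with a name followed by
whitespace is a definition, and that a |%%| separator line may still be
present; none of these details are stated in the log.

```shell
#!/bin/sh
# Hypothetical checker for hand-reduced lex files; not an XT tool.
# Assumptions: rules end in "return <id>;", definitions start with a
# name followed by whitespace, and a "%%" separator line is tolerated.
check_reduced_lex() {
  awk '
    # Newlines are delimiters, so an empty line is an error.
    /^[ \t]*$/ { printf "line %d: superfluous newline\n", NR; bad = 1; next }
    # Assumed section separator.
    /^%%$/ { next }
    # Rule: <re> return <id>;
    /[ \t]return[ \t]+[A-Za-z_][A-Za-z0-9_]*;$/ { rules++; next }
    # Definition: <name> <re>
    /^[A-Za-z_][A-Za-z0-9_]*[ \t]+[^ \t]/ { defs++; next }
    { printf "line %d: neither a definition nor a rule\n", NR; bad = 1 }
    END { printf "%d definitions, %d rules\n", defs, rules; exit bad }
  ' "$1"
}
```

Calling `check_reduced_lex stratego.mod.l` prints a count of definitions and
rules and returns a non-zero status if any line does not fit the shape.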
[Step 11] Improve the syntax definition to get good abstract syntax. Start by
unfolding literals.
> parse -l sdf -v 2.1 -I -i lex.def -o lex.adef
unfold-literal -i lex.adef -o lex.unf.adef
sdf-bracket -i lex.unf.adef | pp -a -l sdf -o lex.def -v 2.1
less lex.def
=> Automatic application does not work, so I do it manually (come back and
repair unfold-literal later).
[Step 12] The abstract syntax looks good. Install the parse table so that it
can be used with the parse tool of the grammar base.
> make install
parse -l lex -i data/stratego.mod.l -I
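Once the parse table is installed, the same invocation can be repeated over
several reduced lex files. A minimal sketch follows; parse_all and the PARSE
override variable are hypothetical, and the sketch assumes that parse exits
with a non-zero status when parsing fails, which the log does not confirm.

```shell
#!/bin/sh
# Sketch only: run the step 12 invocation over several files.
# `parse_all` and PARSE are hypothetical names, not part of XT.
# Assumption: parse returns a non-zero exit status on a parse failure.
parse_all() {
  failed=0
  for f in "$@"; do
    # Discard the parse result; only the exit status is of interest here.
    if "${PARSE:-parse}" -l lex -i "$f" -I > /dev/null; then
      echo "ok:   $f"
    else
      echo "FAIL: $f"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `parse_all data/stratego.mod.l` reports one line per file and
returns a non-zero status if any file failed to parse.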
[Step 13] Project finished.
=> Future work:
- parse full lex definitions
- generate a signature from the syntax definition