TWEAST: A Simple and Effective Technique to Implement Concrete-Syntax AST Rewriting Using Partial Parsing

Abstract

ASTs are commonly used to represent an input/output program in compilers and language processing tools. Many of the tasks of these tools consist in generating and rewriting ASTs. Such an approach can become tedious and hard to maintain for complex operations, namely program transformation, optimization, instrumentation, etc. On the other hand, concrete syntax provides a natural and simpler representation of programs, but it is not usually available as a direct feature of the aforementioned tools. We propose a simple technique to implement AST generation and rewriting in general purpose languages using concrete syntax. Our approach relies on extensions made in the scanner and the parser and the use of objects supporting partial parsing called Text With Embedded Abstract Syntax Trees (TWEASTS). A compiler for a simple language (Tiger) written in C++ serves as an example, featuring transformations in concrete syntax: syntactic desugaring, optimization, code instrumentation such as bounds-checking, etc. Extensions of this technique to provide a full-fledged concrete-syntax rewriting framework are presented as well.

Bibtex (lrde.bib)

@InProceedings{	  demaille.09.sac,
  author	= {Akim Demaille and Roland Levillain and Beno\^it Sigoure},
  title		= {{TWEAST}: A Simple and Effective Technique to Implement
		  Concrete-Syntax {AST} Rewriting Using Partial Parsing},
  booktitle	= {Proceedings of the 24th Annual ACM Symposium on Applied
		  Computing (SAC'09)},
  pages		= {1924--1929},
  year		= 2009,
  address	= {Waikiki Beach, Honolulu, Hawaii, USA},
  month		= mar,
  abstract	= {ASTs are commonly used to represent an input/output
		  program in compilers and language processing tools. Many of
		  the tasks of these tools consist in generating and
		  rewriting ASTs. Such an approach can become tedious and
		  hard to maintain for complex operations, namely program
		  transformation, optimization, instrumentation, etc. On the
		  other hand, \emph{concrete syntax} provides a natural and
		  simpler representation of programs, but it is not usually
		  available as a direct feature of the aforementioned tools.
		  We propose a simple technique to implement AST generation
		  and rewriting in general purpose languages using concrete
		  syntax. Our approach relies on extensions made in the
		  scanner and the parser and the use of objects supporting
		  partial parsing called Text With Embedded Abstract Syntax
		  Trees (TWEASTS). A compiler for a simple language (Tiger)
		  written in \Cxx serves as an example, featuring
		  transformations in concrete syntax: syntactic desugaring,
		  optimization, code instrumentation such as bounds-checking,
		  etc. Extensions of this technique to provide a full-fledged
		  concrete-syntax rewriting framework are presented as well.}
}