C Transformers
Overview
CTransformers is a generic framework for C source-to-source transformations, built on the Stratego/XT framework.
Current work
Parsing
We have an ISO C99 grammar written in SDF. This grammar is ambiguous, so we use a generalized parser (SGLR) to produce a parse forest from the input source code. This parse forest must be disambiguated afterwards.
Disambiguation
The ambiguities listed below result from using the standard C99 grammar (ISO/IEC 9899). We handle them with attribute evaluation in the grammar: we wrote a new tool, sdf-attribute, that lets us write attributes in the SDF grammar with Stratego code.
Typedef vs. variable
#ifdef FLAG
typedef int a;
#else
int a;
#endif
// the following code is context-dependent:
int f() {
  int b;
  a * b; // multiplication or variable declaration of type int*?
}
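The following is a minimal sketch, in C, of the kind of scoped type table that such a disambiguation needs. It is not the sdf-attribute implementation (which is written as Stratego attributes); every name in it (`type_table`, `declare`, `lookup`, `is_declaration`) is hypothetical and only illustrates the idea that `a * b;` is a declaration exactly when `a` currently names a type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout: this is not the sdf-attribute API. */
enum id_kind { ID_UNKNOWN, ID_TYPEDEF, ID_VARIABLE };

struct binding {
    const char *name;
    enum id_kind kind;
    int depth;                 /* scope depth at which the name was declared */
};

#define MAX_BINDINGS 64

struct type_table {
    struct binding items[MAX_BINDINGS];
    int count;
    int depth;                 /* current scope depth, 0 = file scope */
};

void table_init(struct type_table *t) { t->count = 0; t->depth = 0; }

void scope_enter(struct type_table *t) { t->depth++; }

/* Leaving a scope discards every binding declared inside it. */
void scope_leave(struct type_table *t)
{
    assert(t->depth > 0);
    while (t->count > 0 && t->items[t->count - 1].depth == t->depth)
        t->count--;
    t->depth--;
}

void declare(struct type_table *t, const char *name, enum id_kind kind)
{
    assert(t->count < MAX_BINDINGS);
    t->items[t->count].name = name;
    t->items[t->count].kind = kind;
    t->items[t->count].depth = t->depth;
    t->count++;
}

/* The innermost declaration wins, so search from the most recent binding. */
enum id_kind lookup(const struct type_table *t, const char *name)
{
    for (int i = t->count - 1; i >= 0; i--)
        if (strcmp(t->items[i].name, name) == 0)
            return t->items[i].kind;
    return ID_UNKNOWN;
}

/* "lhs * rhs;" is read as a declaration only if lhs currently names a type. */
int is_declaration(const struct type_table *t, const char *lhs)
{
    return lookup(t, lhs) == ID_TYPEDEF;
}
```

With `a` declared as a typedef at file scope, `is_declaration(&t, "a")` holds inside `f`; if an inner scope redeclares `a` as a variable, the same query fails until that scope is left.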
Cast vs. function call
#ifdef FLAG
void a(int);
#else
typedef long a;
#endif
// the following code is context-dependent:
int f() {
  (a)(0); // cast of constant 0 into long or call of function a?
}
Bitfields
struct {
  int : 4; // valid, the identifier is optional
  int t : 5; // ambiguity: 't' can be a typedef or an identifier.
};
Here there is no real choice: t must be an identifier, because a type specifier cannot be composed of int followed by a typedef name.
In practice, this ambiguity is handled by the same code as the typedef vs. variable problem.
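As a small check of this reading, the sketch below declares `t` as a typedef name at file scope and then reuses `t` inside a bit-field declaration. The struct tag `flags` and the helper `roundtrip` are made up for the example. A conforming compiler should accept it and treat the inner `t` as a member name.

```c
typedef int t;          /* 't' names a type at file scope */

struct flags {
    int   : 4;          /* unnamed bit-field: padding only, no member */
    int t : 5;          /* 'int' is already the type specifier, so 't' is the member name */
};

/* Stores a value in the bit-field member 't' and reads it back. */
int roundtrip(int value)
{
    struct flags f = { 0 };
    f.t = value;
    return f.t;
}
```

Outside the struct, `t` still denotes the typedef, so `sizeof(t)` remains valid there.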
Enumeration constant vs. variable
#ifdef FLAG
enum { a = 0 };
#else
int a = 1;
#endif
// the grammar makes a clear distinction between an
// identifier and an enumeration constant, thus the
// following code is clearly context-dependent:
int f() {
  int i = a; // variable or enumeration?
}
Sizeof
int f()
{
#ifdef FLAG
  int b;
#else
  typedef int b;
#endif
  return sizeof(b); // is b a type or a variable?
}
The sizeof operator can take as argument either a type name or a variable identifier, so the construct is context-dependent. This ambiguity is processed earlier in the parse tree, because the type table is already built to solve the other ambiguities.
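To make the two readings concrete, here is a short sketch with both variants of the example above written as separate functions (the function names are invented for illustration). The source text of the `sizeof` expression is identical in both, yet one operand is an expression and the other a type name.

```c
#include <stddef.h>

/* Here 'b' is a variable: sizeof is applied to an expression. */
size_t size_via_variable(void)
{
    int b = 0;
    return sizeof(b);
}

/* Here 'b' is a typedef name: sizeof is applied to a parenthesized type name. */
size_t size_via_type(void)
{
    typedef int b;
    return sizeof(b);
}
```

Both functions return the size of `int`, but only the first form could also be written without parentheses, as `sizeof b`.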
Parameter declaration list
The grammar makes a distinction between a list of identifiers and a list of parameters (type specifier + optional identifier) in the parameter declaration list.
So the following code is ambiguous:
typedef int a, b, c;
int f(a, b, c);
This is solved by always choosing the list of parameters when both readings are possible.
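The sketch below shows what that choice means in practice: the declaration is read as a prototype with three unnamed parameters, so a later definition that names them is compatible with it. The parameter names `x`, `y`, `z` and the body are only illustrative.

```c
typedef int a, b, c;

/* Read as three unnamed parameters of types a, b and c (all int). */
int f(a, b, c);

/* A definition compatible with that prototype. */
int f(a x, b y, c z)
{
    return x + y + z;
}
```

This matches how C compilers treat the declaration; as far as we can tell, the C99 standard (section 6.7.5.3) likewise says that an identifier that could be either a typedef name or a parameter name is taken as a typedef name.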
Variable argument lists problem
Variable argument lists (va_args) are troublesome, especially the va_arg function:
type va_arg(va_list ap, type);
Reading the man page reveals that va_arg is actually a macro. GCC expands it into __builtin_va_arg(ap, type), which is a compiler builtin. The problem is that the standard C grammar never allows a type as an actual parameter, so GCC has to special-case it.
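For reference, this is what an ordinary, portable use of va_arg looks like in user code. The helper `sum_ints` is an invented example. Note that the second operand of `va_arg` is a type name, which is exactly the construct a plain C99 grammar cannot parse as a function call argument once the macro has been expanded to a builtin.

```c
#include <stdarg.h>

/* Sums 'count' int arguments; illustrative helper, not part of C Transformers. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* second operand is a type name, not an expression */
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` adds up the three trailing arguments.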
An alternative method for handling va_arg, found in the Tiny C Compiler (tcc), is to use the following macro:
// ugly line to grab arguments on the stack at the right place.
#define va_arg(ap,type) (ap += (sizeof(type)+3)&~3, *(type *)(ap - ((sizeof(type)+3)&~3)))
This solution is architecture-dependent, so it is not a good one.
The first step towards a more correct solution is to throw away the GCC/GLIBC headers that we depend on. We should make our own set of headers, directly extracted from the C standard.
C transformations
To demonstrate the capabilities of C Transformers, we extended C into ContractC: C with "design by contract" support.