This document describes the Tiger project for EPITA students as of August 14, 2003. More information is available on the EPITA Tiger Compiler Project Home Page.
Tiger is a language introduced by Andrew Appel in his book Modern Compiler Implementation. This document is by no means sufficient to produce an actual Tiger compiler, nor to understand compilation. You are strongly encouraged to buy and read Appel's book: it is an excellent book.
There are several differences from the original book, the most important being that EPITA students have to implement this compiler in C++, using modern object-oriented programming techniques. You ought to buy the original book; nevertheless, pay close attention to implementing the version of the language specified below, not that of the book.
Keywords: array, if, then, else, while, for, to, do, let, in, end, of, break, nil, function, var, and type.
Symbols:  ,  :  ;  (  )  [  ]  {  }  .  +  -  *  /  =  <>  <  <=  >  >=  &  |  and  :=
End of lines are \n\r, \r\n, \r, and \n, freely intermixed.
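The four end-of-line forms above can be recognized with a single pass over the input. The following is a minimal sketch, not the reference scanner: the function name `count_eols` is hypothetical, and the greedy pairing of two different end-of-line characters into one end of line is an assumption about how intermixed sequences are to be split.

```cpp
#include <cstddef>
#include <string>

// Count the ends of line in `text`, accepting \n\r, \r\n, \r and \n.
// Assumption: a \n immediately followed by a \r (or the reverse) forms a
// single end of line, as a longest-match scanner would decide.
std::size_t count_eols(const std::string& text) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char c = text[i];
    if (c != '\n' && c != '\r')
      continue;
    ++count;
    // Swallow the second character of a two-character end of line.
    if (i + 1 < text.size()) {
      const char next = text[i + 1];
      if ((next == '\n' || next == '\r') && next != c)
        ++i;
    }
  }
  return count;
}
```

With this pairing rule, two identical characters in a row (such as \n\n) count as two ends of line, which is what location tracking needs.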
"
, with support for the following escapes: \a, \b, \f, \n, \r, \t, \v, \\, and \".
All the other characters are plain characters and are to be included in
the string. In particular, multi-line strings are allowed.
Code /* Comment /* Nested comment */ Comment */ Code
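Nested comments, as in the sample above, cannot be matched by a plain regular expression; a depth counter is enough. The sketch below is only an illustration under that assumption: `skip_comment` is a hypothetical helper, not part of the project's scanner.

```cpp
#include <cstddef>
#include <string>

// Return the index just past the comment that starts at `pos`.
// Return `pos` if no comment starts there, and std::string::npos if the
// comment is not terminated.
std::size_t skip_comment(const std::string& src, std::size_t pos) {
  if (pos + 2 > src.size() || src.compare(pos, 2, "/*") != 0)
    return pos;
  std::size_t depth = 0;
  std::size_t i = pos;
  while (i + 1 < src.size()) {
    if (src[i] == '/' && src[i + 1] == '*') {
      ++depth;  // Entering a (possibly nested) comment.
      i += 2;
    } else if (src[i] == '*' && src[i + 1] == '/') {
      --depth;  // Leaving the innermost open comment.
      i += 2;
      if (depth == 0)
        return i;
    } else {
      ++i;
    }
  }
  return std::string::npos;
}
```

Applied to the sample line at the index of its first /*, the function returns the position of the text that follows the outermost */.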
id ::= letter { letter | digit | _ }
letter ::= a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
         | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
integer ::= digit { digit }
op ::= + | - | * | / | = | <> | > | < | >= | <= | & | |
We use Extended BNF, with [ and ] for zero or once, ( and ) for grouping, and { and } for any number of repetitions, including zero.
program ::= exp

exp ::=
  # Literals.
    nil
  | integer
  | string
  # Array and record creations.
  | type-id [ exp ] of exp
  | type-id {[ id = exp { , id = exp } ] }
  # Variables, field, elements of an array.
  | lvalue
  # Function call.
  | id ( [ exp { , exp }] )
  # Operations
  | - exp
  | exp op exp
  | ( exps )
  # Assignment
  | lvalue := exp
  # Control structures
  | if exp then exp [else exp]
  | while exp do exp
  | for id := exp to exp do exp
  | break
  | let decs in exps end

lvalue ::= id
  | lvalue . id
  | lvalue [ exp ]
exps ::= [ exp { ; exp } ]

decs ::= { dec }
dec ::=
  # Type declaration
    type id = ty
  # Variable declaration
  | var id [ : type-id ] := exp
  # Function declaration
  | function id ( tyfields ) [ : type-id ] = exp

# Types
ty ::= type-id
  | { tyfields }
  | array of type-id
tyfields ::= [ id : type-id { , id : type-id } ]
type-id ::= id

op ::= + | - | * | / | = | <> | > | < | >= | <= | & | |
let
  type int_array = array of int
  var  table := int_array[100] of 0
in
  ...
end
The predefined types are int and string. They may be redefined.
Type aliases do not build new types, hence they are equivalent.
let
  type a = int
  type b = int
  var a := 1
  var b := 2
in
  a = b /* OK */
end
let
  type a = {a : int}
  var a := 0
  function a (a : a) : a = a{a = a.a}
in
  a (a{a = a})
end
But three name spaces are easier to implement than two.
&& and ||. Because they are implemented as syntactic sugar, one could easily make 123 || 456 return 1 or 123. For the time being, the "right" result is considered to be 123. Similarly, 123 && 456 is 456. This is unnatural, but it is what is most consistent with the suggested implementation. In the future (a different class), this might change. But anyway, no test will depend on this.
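The values quoted above can be modelled in a few lines. This is only a sketch of the observable results (123 for the "or", 456 for the "and"), not the suggested implementation itself: the names `tiger_or` and `tiger_and` are hypothetical, and evaluating the right operand only when needed is an assumption that follows from desugaring into if-then-else.

```cpp
#include <functional>

// An operand whose evaluation is delayed until it is needed.
using thunk = std::function<int()>;

// "lhs or rhs": yields the left value when it is non-zero, else the right.
int tiger_or(const thunk& lhs, const thunk& rhs) {
  const int left = lhs();
  return left != 0 ? left : rhs();
}

// "lhs and rhs": yields the right value when the left is non-zero, else 0.
int tiger_and(const thunk& lhs, const thunk& rhs) {
  return lhs() != 0 ? rhs() : 0;
}
```

Since no test depends on the exact non-zero value, a variant that returns 1 in the first case would be equally acceptable.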
Operator precedence, from highest to lowest:
  * /
  + -
  >= <= = <> < >
  &
  |
All the associative operators are associative to the left.
These entities are predefined, i.e., they are available when you start the Tiger compiler, but a Tiger program may redefine them.
There are two predefined types: int and string.
Some runtime functions may fail if some assertions are not fulfilled. In that case, the program must exit with a properly labelled error message, and with exit code 120. Please note that the error messages are standardized, and must be exactly observed. Any difference, for better or worse, is a failure to comply with this Tiger Reference Manual.
chr (code : int) : string
    Return the one-character string containing the character whose code is code. If code does not belong to the range [0..255], raise a runtime error: chr: character out of range.
concat (first : string, second : string) : string
exit (status : int) : void
    Exit the program with exit code status.
flush () : void
getchar () : string
not (boolean : int) : int
ord (string : string) : int
print (string : string) : void
print_int (int : int) : void
    Note: this is an EPITA extension. Output int in its decimal canonical form (equivalent to %d for printf).
size (string : string) : int
substring (string : string, first : int, length : int) : string
    Return a string composed of the characters of string starting at the character of index first (0 being the origin), and composed of length characters (i.e., up to and including the character of index first + length - 1).
    Let size be the size of the string; the following assertions must hold:
    otherwise a runtime failure is raised:
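As an illustration of the checks such runtime functions perform, here is a minimal C++ model of chr and substring. It is a sketch under stated assumptions, not the actual runtime: the exception type and function names are hypothetical, the preconditions tested for substring (first and length non-negative, first + length not exceeding size) are assumed because the list of assertions is missing from this text, and the substring error message is likewise a placeholder.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Exit code the manual prescribes for runtime errors.
constexpr int runtime_error_exit_code = 120;

// Hypothetical exception type; a real runtime would print the message
// and exit with runtime_error_exit_code instead.
struct tiger_runtime_error : std::runtime_error {
  using std::runtime_error::runtime_error;
};

std::string tiger_chr(int code) {
  if (code < 0 || 255 < code)
    throw tiger_runtime_error("chr: character out of range");
  return std::string(1, static_cast<char>(static_cast<unsigned char>(code)));
}

std::string tiger_substring(const std::string& s, int first, int length) {
  const int size = static_cast<int>(s.size());
  // Assumed preconditions: 0 <= first, 0 <= length, first + length <= size.
  if (first < 0 || length < 0 || first > size || length > size - first)
    throw tiger_runtime_error("substring: arguments out of range");  // placeholder text
  return s.substr(static_cast<std::size_t>(first),
                  static_cast<std::size_t>(length));
}
```

For instance, taking 3 characters of "tiger" from index 1 yields the characters of indexes 1, 2 and 3, that is, up to index first + length - 1.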
tc
Synopsis:
tc option... file
where file can be -, denoting the standard input.
Global options are:
-h, --help
--version
--task-list
--task-order
The options related to scanning are:
--scan-trace
The options related to parsing are:
--parse-trace
-A, --ast-display
The options related to type checking are:
-T, --types-check
The options related to escapes computation are:
-e, --escapes-compute
-E, --escapes-display
    --escapes-compute, so that it is possible to check that the defaults (everybody escapes) are properly implemented.
The options related to the high level intermediate representation are:
--hir-compute
    --types-check.
-H, --hir-display
    --hir-compute.
The options related to the low level intermediate representation are:
--canon-trace
--canon-compute
-C, --canon-display
    --lir-compute. It is convenient to determine whether a failure is due to canonicalization, or traces.
--traces-trace
--traces-compute
    --canon-compute.
--lir-compute
    --traces-compute. Actually, it is nothing but a nice looking alias for the latter.
-L, --lir-display
    --lir-compute.
The options related to the instruction selection are:
--inst-compute
    --lir-compute.
-I, --inst-display
    --inst-compute.
-R, --runtime-display
The options related to the liveness information are:
-F, --flowgraphs-dump
    --inst-compute.
-V, --liveness-dump
    --inst-compute.
-N, --interference-dump
    --inst-compute.
Compile errors must be reported on the standard error stream with precise error locations. Examples include:
$ echo "1 + + 2" | ./tc -
error-->standard input:1.4: syntax error, unexpected "+"
error-->Parsing Failed
and
$ echo "1 + () + 2" | ./tc -T -
error-->standard input:1.0-5: type mismatch
error-->  right operand type: void
error-->  expected type: int
Note that the symbol error--> is not part of the actual output. It is only used in this document to highlight that the message is produced on the standard error stream. Do not include it as part of the compiler's messages.
The compiler exit value should faithfully reflect the compilation status. The possible values are:
malloc or fopen failed, a file is missing, etc.
Note that an optional option (such as --hir-use-ix) must cause tc to exit 1 if tc does not support it. If you do not do so, rest assured that your compiler will be exercised on these optional features, and it will most probably get 0.
EX_USAGE
)
argp
.
When several errors have occurred, the least value should be issued, not the earliest. For instance:
(let error in end; %)
should exit 2, not 3, although the parse error was first detected.
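The "least value" rule amounts to taking a minimum over the statuses recorded during the run. A minimal sketch, assuming a hypothetical helper `exit_status` and assuming that an empty list means success with status 0:

```cpp
#include <algorithm>
#include <vector>

// Exit status to report, given the statuses of all the errors recorded
// in detection order. Assumption: no recorded error means success (0).
int exit_status(const std::vector<int>& statuses) {
  if (statuses.empty())
    return 0;
  // The least value wins, regardless of which error was detected first.
  return *std::min_element(statuses.begin(), statuses.end());
}
```

In the example above, the statuses are recorded as 3 then 2, and the reported status is 2.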
In addition to compiler errors, the compiled programs may have to raise a runtime error, for instance when runtime functions receive improper arguments. In that case, use the exit code 120, and issue a clear diagnostic. Note that because the basic MIPS model we target does not provide the standard error output, the message is to be output onto the standard output.
A strictly compliant compiler must behave exactly as specified in this document and in Andrew Appel's book, and as demonstrated by the samples exhibited in this document, and in the "Reports" document.
Nevertheless, you are entirely free to extend your compiler as you wish, as long as this extension is enabled by a non-standard option. Extensions include:
isatty, as the correction program will not appreciate.
In any case, if you don't implement an extension that was suggested (such as --hir-use-ix), then you must not accept the option. If the compiler accepts an option, then the effect of this option will be checked. For instance, if your compiler accepts --hir-use-ix but does not implement it, then be sure to get 0 on these tests.