Everything exposed in this document is expected to be known.
This document details the various tasks the “Compilation” students must complete. It was last edited on February 24, 2004.
--- The Detailed Node Listing ---
Introduction
History
Instructions
Coding Style
Evaluation
Tarballs
Project Layout
Compiler Stages
T0, Naive Scanner and Parser
T1, Scanner and Parser
T2, Building the Abstract Syntax Tree
T2 Samples
T3, Computing the Escaping Variables
T4, Type Checking
T5, Translating to the High Level Intermediate Representation
T5 Samples
T5 Options
T6, Translating to the Low Level Intermediate Representation
T6 Samples
T7, Instruction Selection
T8, Liveness Analysis
T9, Register Allocation
Tools
The GNU Build System
Appendices
This document presents the Tiger Project as part of the EPITA curriculum. It aims at the implementation of a Tiger compiler (see Modern Compiler Implementation) in C++.
If you are a newcomer, you might be intimidated by its sheer size. Don't worry, but in any case, do not give up: as stated at the very beginning of this document, everything exposed here is considered to be known. If it is written but you didn't know it, you are wrong. If it is not written and was not clearly reported in the news, I am wrong.
Basically this document contains three kinds of information:
There is additional material on the Internet:
This project is quite different from most other EPITA projects, and aims at several goals, in different areas:
This also means that you have to design a test suite, and maintain it throughout the project. The test suite is an integral part of the project.
Note, however, that implementing an industrial strength compiler in C++ makes a lot of sense.
Bjarne Stroustrup's list of C++ Applications mentions Metrowerks
(CodeWarrior), HP, Sun, Intel, M$ as examples.
Q: What is your opinion: is knowing assembly language useful for programmers nowadays?
BS: It is useful to understand how machines work, and knowing assembler is almost essential for that.
English has an important role as a common language for programmers, and I suspect that it would be unwise to abandon that without serious consideration.
Any attempt to downplay the importance of English is wrong. For instance, do not translate this document nor any other. Ask the Yakas, or the English team, for support. In the past, some oral and written examinations were made in English. It may well come back some day. Some books will help you improve your English, see The Elements of Style.
The Tiger project is not unique in these regards, see Cool: The Classroom Object-Oriented Compiler, for instance, with many strikingly similar goals, and some profound differences. See also Making Compiler Design Relevant for Students who will (Most Likely) Never Design a Compiler, for an explanation of why compilation techniques have a broader influence than they seem.
This section could have been named “What Akim did not say”, or “Common misinterpretations”.
The first and foremost misinterpretation would be “Akim says C sucks and is useless”. Wrong. C sucks, definitely, but today C is probably the first employer of programmers in the world, so let's face it: C is mandatory in your education. The fact that C++ is studied afterward does not mean that learning C is a waste of time. It means that, since C is basically a subset of C++, it makes sense to learn it first. It also means that (if only because it is a superset) C++ provides additional services, so it is often a better choice; but even more often, you don't have the choice.
C++ is becoming a common requirement for programmers, so you also have to learn it, although given its roots, it naturally suffers from many defects. But it's an industrial standard, so learn it, and learn it well: know its strengths and weaknesses.
And by the way, of course C++ sucks++.
Another common rumor in EPITA has it that “C/Unix programming does not deserve attention after the first period”. Wrong again. First of all, the very wording is wrong: it is a legacy belief that C and Unix require each other. You can implement advanced system features using languages other than C (starting with C++, of course), and of course C can be used for tasks other than system programming. Note for instance that Bjarne Stroustrup's list of C++ Applications mentions that the following ones are written in C++:
- Apple
- OS X is written in a mix of languages, but a few important parts are C++. The two most interesting are:
- Finder
- IOKit device drivers. (IOKit is the only place where we use C++ in the kernel, though.)[...]
- Ericsson
- TelORB - Distributed operating system with object oriented
- Microsoft
- Literally everything at Microsoft is built using various flavors of Visual C++ - mostly 6.0 and 7.0 but we do have a few holdouts still using 5.0 :-( and some products like Windows XP use more recent builds of the compiler. The list would include major products like:
- Windows XP
- Windows NT (NT4 and 2000)
- Windows 9x (95, 98, Me)
- Microsoft Office (Word, Excel, Access, PowerPoint, Outlook)[...]
- CDE
- The CDE desktop (the standard desktop on many UNIX systems) is written in C++.
Know C. Learn when it is adequate, and why you need it.
Know C++. Learn when it is adequate, and why you need it.
Know other languages. Learn when they are adequate, and why you need them.
And then, if you are asked to choose, make an educated choice. If there is no choice to be made, just deal with Real Life.
The Tiger Compiler Project evolves every year, so as to improve its infrastructure, to demonstrate more instructional material and so forth. This section tries to keep a list of these changes, together with the most constructive criticisms from students (or ourselves).
If you have information, including criticisms, that should be mentioned here, please send it to me.
The years correspond to the class, e.g., Tiger 2005 refers to EPITA class 2005, i.e., the project ran from January 2003 to September 2003.
Before diving into the history of the Tiger Compiler Project in EPITA, a whole project in itself for ourselves, with experimental tries and failures, it might be good to review some constraints that can explain why things are the way they are. Understanding these constraints will make it easier to criticize actual flaws, instead of focusing on issues that are mandated by other factors.
Bear in mind that Tiger is an instructional project, the purpose of which is detailed above, see Why the Tiger Project. Because the input is a stream of students with virtually no knowledge whatsoever in C++, and our target is a stream of students with good fluency in many constructs and understanding of complex matters, we have to gradually transform them via intermediate forms with increasing skills. In particular this means that by the end of the project, evolved techniques can and should be used, but at the beginning only introductory knowledge should be needed. As an example of a consequence, we cannot have a nice and high-tech AST.
Because the insight of compilers is not the primary goal, when a choice is to be made between (i) more interesting work on compiler internals with little C++ novelty, and (ii) providing most of this work and focusing on something else, then we are most likely to select the second option. This means that the Tiger Project is doomed to be a low-tech featureless compiler, with no call graph, no default optimization, no debugging support (outputting comments in the assembly showing the original code), no bells, no whistles, no etc. This also implies that sometimes interested students will feel we “stole” the pleasure to write nice pieces of code from them; understand that we actually provided code to the other students. However, you are free to rewrite everything if you wish.
The code used the std namespace unqualified, etc. In addition, we were using hash_map, which is an SGI extension that is not available in standard C++. It was therefore decided to upgrade the compiler in 2003, and to upgrade the programming style.
Some tarballs defined the all target as first running clean and then the actual build. As a result I grew tired of fixing the tarballs, and in order to have robust, efficient (albeit sometimes a pain in the neck) distributions, we moved to using Automake, and hence Autoconf.
There are reasons not to be happy with it, agreed. But there are many more reasons to be sad without it. So Autoconf and Automake are here to stay.
Note, however, that you are free to use another system if you wish.
Just obey the standard package interface (see Delivery).
SemantVisitor is a nightmare to maintain

SemantVisitor, which performs both the type checking and the translation to intermediate code, was nearly impossible to deliver in pieces to the students: because type checking and translation were so intertwined, it was not possible to deliver as a first step the type checking machinery template, and then the translation pieces. Students had to fight with non-applicable patches. This was fixed in Tiger 2003 by splitting the SemantVisitor into TypeVisitor and TranslationVisitor. The negative impact, of course, is a performance loss.
During this year, I was helped by:
Delivery dates were:
Stage | Delivery
---|---
T1 | Monday, December 18th 2000 at noon
T2 | Friday, February 23rd 2001 at noon
T3 | Friday, March 30th 2001 at noon
T4 | Tuesday, June 12th 2001 at noon
T5 | Monday, September 17th 2001 at noon
Some groups have reached T6.
Criticisms include:
main
). We had to revert to using the bad native
C++ compiler.
It is to be noted that some funny guy once replaced the g++ executable in my account with rm -rf ~. Some students and I were bitten. The funny thing is that this is when the system administration realized that the teacher accounts were not backed up.
Fortunately, since that time, we have decent compilers made available by
students, and the Tiger Compiler is now written in strictly standard
C++.
During this year, I was helped by:
Delivery dates were:
Stage | Delivery
---|---
T2 | Tuesday, March 4th 2002 at noon
T3 | Friday, March 15th 2002 at noon
T4 | Friday, April 12th 2002 at noon
T5 | Friday, June 14th 2002 at noon
T6 | Monday, July 15th 2002 at noon
Criticisms include:
- The Task model.
- The EscapeVisitor made “optional” (actually it became a rush).
A lot of the following material is the result of discussions with several people, including, but not limited to:
I hereby thank all the people who participated in this edition of the project. It has been a wonderful vintage, thanks to the students, the assistants, and the members of the LRDE.
Deliveries were:
Stage | Delivery
---|---
T0 | Friday, January 24th 2003 at noon
T1 | Friday, February 14th 2003 at noon
T2 | Friday, March 14th 2003 at noon
T4 | Friday, April 25th 2003 at noon
T3 | Rush from Saturday, May 24th at 18:00 to Sunday at noon
T56 | Friday, June 20th 2003 at noon
T7 | Friday, July 4th 2003 at noon
T78 | Friday, July 18th 2003 at noon
T9 | Monday, September 8th 2003 at noon
Criticisms about Tiger 2005 include:
The main factor that pushed toward weak memory management was a lack of coordination between developers: we should have written more things down. So don't do as we did: make sure you define the memory management policy for each module, and write it down.
The 2006 edition pays strict attention to memory allocation.
The interfaces between modules have also been cleaned to avoid excessive inter dependencies. Also, when possible, opaque types are used to avoid additional includes. Each module exports forward declarations in a fwd.hh file to promote this. For instance, ast/ast-tasks.hh today includes:
// Forward declarations of ast:: items.
#include "ast/fwd.hh"
// ...
/// Global root node of abstract syntax tree.
extern ast::Exp* the_program;
// ...
where it used to include all the ast headers to define exactly
the type ast::Exp
.
Void (in which case the translation must not issue an actual assignment), or whether a < b is about strings (in which case the translation will issue a hidden call to strcmp), or the type of a variable (needed when implementing object oriented Tiger), etc., etc. As you can see, the list is virtually infinite. So we would need an extensible system of annotation of the ast. As of September 2003 no solution has been chosen. But we must be cautious not to complicate T2 too much (it is already a very steep step).
To avoid this:
We are considering splitting T5 into two: T5- which would be limited to
programs without escaping variables, and T5+ with escaping variables
and the computation of the escapes.
Since Tiger 2006, the coding style enforces a more conventional style.
Deliveries were:
Stage | Delivery
---|---
T0 | Wednesday, February 4th 2004 at noon
T1 | Sunday, February 8th 2004 at noon
T2 | Sunday, March 7th 2004 at noon
T3 | Sunday, March 21st 2004
Bear in mind that if you are writing, it is to be read, so pay attention to your reader.
Post your questions to the newsgroup epita.cours.compile. You need to have a very good reason to send a message to the assistants or to Akim directly, as it usually annoys us, which is not in your interest.

The newsgroup epita.cours.compile is dedicated to the compilation lecture, the Tiger project, and attached matters (e.g., assignments in Tiger itself). Any other material is off topic.
Don't do that | Do this
---|---
Problem in T1 | Cannot generate location.hh
make check | make check fails on test-ref
This includes the test cases. While posting a simple test case is
tolerated, sending many of them, or simply one that addresses a specific
common failure (e.g., some obscure cases for escapes) is strictly
forbidden.
Starting with T1, assignments are to be done by groups of four.
The first cause of failures to the Tiger project is human problems within the groups. I cannot stress too much the importance of constituting a good group of four people! The Tiger project starts way before your first line of code: it begins with the selection of your partners.
Here are a few tips, collected wisdom from the previous failures.
At the first stage, the leader assigns you a task. You try, and fail for weeks. In the meanwhile, the other members teach you lots of facts, but (i) you can't memorize everything and end up saying “hum hum” without having understood, and (ii) because they don't understand why you don't understand, they are often poor teachers. The day before the delivery, the leader does your assignments, because saving the group is now what matters. You learned nothing, or almost nothing. Second stage: same beginning, you are left with your assignment, but the other members are now bothered by your questions: why should they answer, since you don't understand what they say (remember: they are poor teachers because they don't understand your problems), and you don't seem to remember anything! The day before the delivery, they do your work. From now on, they won't even ask you for anything: “fixing” you is much more time consuming than just doing it themselves. Oral examinations reveal that you neither understand nor do anything, hence your grades are bad, and you win another round of first year...
Take my advice: if you have difficulties with programming, be with other people like you. Your chances are better together.
And don't forget you are allowed to ask for assistance from other
groups.
This section could have been named “Strong and Weak Requirements”, as it includes not only mandatory features of your compiler (memory management), but also advice and tips. As Captain Barbossa would put it, “actually, it's more of a guideline than a rule.”
The code you deliver must be clean. In particular, when some code is provided, you have to fill in the blanks denoted by “FIXME: Some code has been deleted.”. Sometimes you will have to write the code from scratch.
In any case, dead code and dead comments must be removed. You are free to leave comments spotting places where you fixed a FIXME:, but never leave a fixed FIXME: in your code. Nor any irrelevant comment.
The official compiler for this project is the GNU C++ Compiler, 3.2 or higher (see GCC).
If, and only if, you already have enough fluency in C++ to be willing to try something wilder, then the following exception is made for you. Be warned: over the years the Tiger project was polished to best fit the typical epitean learning curve, and trying to escape this curve means taking a major risk. In the past, some students tried different approaches, and ended up with unmaintainable pieces of code.
If you and your group are sure you can afford some additional difficulty (for additional benefits), then you may use the following extra tools. You have to warn the examiners that you use these tools. You also have to take care of harnessing configure.ac to make sure that what you need is available on the testing environment. Be also aware that you are likely to obtain less help from us if you use tools that we don't master: You are on your own, but, hey!, that's what you're looking for, ain't it?
- loki. See Modern C++ Design, for more information about Loki.
- libboost-*. See Boost.org.
If you think about something not listed here, please send me your proposal; acceptance is required to use them.
Use every possible means to release the resources you consume, especially memory. Valgrind can be a nice assistant to track memory leaks (see Valgrind). To demonstrate different memory management styles, you are invited to use different features in the course of your development: proper use of destructors for the ast, use of a factory for Symbol, Temp etc., use of std::auto_ptr starting with the Translate module, and finally use of reference counting via smart pointers for the intermediate representation.
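For instance, here is a minimal sketch of the std::auto_ptr style (the Exp class and make_exp function are invented for the example; the actual classes of the project may differ):

#include <memory>

struct Exp { /* some AST or IR node */ };

// The caller receives the ownership of the freshly allocated node.
std::auto_ptr<Exp>
make_exp ()
{
  return std::auto_ptr<Exp> (new Exp);
}

void
use ()
{
  std::auto_ptr<Exp> e = make_exp (); // e owns the node
  std::auto_ptr<Exp> f = e;           // ownership moves to f, e is now null
}                                     // the node is deleted exactly once, here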
Code duplication is your enemy: the code is less exercised (if there are two routines instead of one, then each is run half of the time only), and whenever an update is required, you are likely to forget to update all the other places. You should strive to prevent code duplication from sneaking into your code. Every C++ feature is good to prevent code duplication: inheritance, templates etc.
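As a small, hypothetical illustration (the names below are invented for the example), a function template factors out a routine that would otherwise be duplicated for every container type:

#include <ostream>

// One routine instead of two near-identical ones for, say, std::list and
// std::vector; the element type is assumed to be printable with operator<<.
template <typename Container>
void
print_all (std::ostream& o, const Container& c)
{
  typedef typename Container::const_iterator iter_type;
  for (iter_type i = c.begin (); i != c.end (); ++i)
    o << *i << '\n';
}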
dynamic_cast of references

Of the following two snippets, the first is preferred:

const IntExp &ie = dynamic_cast <const IntExp &> (exp);
int val = ie.value_get ();

const IntExp *iep = dynamic_cast <const IntExp *> (&exp);
assert (iep);
int val = iep->value_get ();

While upon type mismatch the second aborts, the first throws a std::bad_cast: they are equally safe.
Do not use type cases: if you want to dispatch by hand to different routines depending upon the actual class of objects, you probably have missed some use of virtual functions. For instance, instead of

bool comparable_to (const Type &lhs, const Type &rhs)
{
  if (&lhs == &rhs)
    return true;
  if (dynamic_cast <const Record *> (&lhs))
    if (dynamic_cast <const Nil *> (&rhs))
      return true;
  if (dynamic_cast <const Record *> (&rhs))
    if (dynamic_cast <const Nil *> (&lhs))
      return true;
  return false;
}

write

bool
Record::comparable_to (const Type &rhs) const
{
  return &rhs == this || dynamic_cast <const Nil *> (&rhs);
}

bool
Nil::comparable_to (const Type &rhs) const
{
  return &rhs == this || dynamic_cast <const Record *> (&rhs);
}

bool
comparable_to (const Type &lhs, const Type &rhs)
{
  return lhs.comparable_to (rhs);
}
dynamic_cast for type cases

Did you read the previous item, “Use virtual methods, not type cases”? If not, do it now.

If you really need to write type dispatching, carefully choose between typeid and dynamic_cast. In the case of tc, where we sometimes need to downcast an object or to check its membership in a specific subclass, we don't need typeid, so use dynamic_cast only. They address different needs:
- dynamic_cast for (sub-)membership, typeid for exact type
- The semantics of testing a dynamic_cast vs. a comparison of typeids are not the same. For instance, think of a class A with subclass B with subclass C; then compare the meaning of the following two snippets:

// Is `a' containing an object of exactly the type B?
bool test1 = typeid (a) == typeid (B);
// Is `a' containing an object of type B, or a subclass of B?
bool test2 = dynamic_cast <B*> (&a);

- Non polymorphic entities
- typeid works on hierarchies without a vtable, or even on builtin types (int etc.). dynamic_cast requires a polymorphic hierarchy. Note that the ability to use typeid on static hierarchies can be a pitfall; for instance consider the following code, courtesy of Alexandre Duret-Lutz:

#include <iostream>
#include <typeinfo>

struct A
{
  // virtual ~A () {}
};

struct B : A
{
};

int
main ()
{
  A* a = new B;
  std::cout << typeid (*a).name () << std::endl;
}

it will “answer” that the typeid of *a is A (!). Using dynamic_cast here would simply not compile. Note that if you provide A with a virtual method table (e.g., uncomment the destructor), then the typeid of *a is B.
- Compromising the future for the sake of speed
- Because the job performed by dynamic_cast is more complex, it is also significantly slower than typeid, but hey! better slow and safe than fast and furious. You might consider that today a strict equality test of the object's class is enough and faster, but can you guarantee there will never be new subclasses in the future? If there are, code based on dynamic_cast will probably behave as expected, while code based on typeid will probably not.

More material can be found in chapter 9, see Thinking in C++ Volume 2: Run-time type identification.
We use const references in arguments (and return value) where otherwise a passing by value would have been adequate, but expensive because of the copy. As a typical example, accessors ought to return members by const reference:
const Exp&
OpExp::lhs_get () const
{
  return lhs_;
}

Small entities can be passed/returned by value.
When you need to have several names for a single entity (this is the definition of aliasing), use references to create aliases. Note that passing an argument to a function for side effects is a form of aliasing. For instance:
template <typename T>
void
swap (T &a, T &b)
{
  T c = a;
  a = b;
  b = c;
}
When an object is created, or when an object is given (i.e., when its owner leaves the management of the object's memory to another entity), use pointers. Note that new creates an object and returns it together with the responsibility to call delete: it uses pointers. For instance, note the three pointers below, one for the return value, and two for the arguments:

OpExp*
opexp_builder (OpExp::Oper oper, Exp *lhs, Exp *rhs)
{
  return new OpExp (oper, lhs, rhs);
}
LikeThis
Classes should be named in mixed case; for instance Exp, StringExp, TempMap, InterferenceGraph etc. This applies to class templates. See CStupidClassName.
like_this
No upper case letters, and words are separated by an underscore.
like_this_
It is extremely convenient to have a special convention for private and protected members: you make it clear to the reader, you avoid gratuitous warnings about conflicts in constructors, you leave the “beautiful” name available for public members etc. We used to write _like_this, but such words are likely to be used by your compiler or standard library. For instance, write:

class IntPair
{
public:
  IntPair (int first, int second)
    : first_ (first), second_ (second)
  {
  }
protected:
  int first_, second_;
};

See CStupidClassName.
typedef foo_type
When declaring a typedef, name the type foo_type (where foo is obviously the part that changes). For instance:

typedef std::map<const Symbol, Entry_T> map_type;
typedef std::list<map_type> symtab_type;

We used to use foo_t; unfortunately this pseudo name space is reserved by POSIX.
super_type
It is often handy to define the type of “the” super class (when there is a single one); use the name super_type in that case. For instance most Visitors of the ast start with:

class TypeVisitor : public ast::DefaultVisitor<ast::non_const_kind>
{
  typedef ast::DefaultVisitor<ast::non_const_kind> super_type;
  using super_type::visit;
  // ...
For instance, instead of declaring

typedef std::set<const Temp*> temp_set_t;

declare

/** Object function to compare two Temp*. */
struct temp_compare
  : public std::binary_function<const Temp*, const Temp*, bool>
{
  bool operator() (const Temp *s1, const Temp *s2) const
  {
    return *s1 < *s2;
  }
};
typedef std::set<const Temp*, temp_compare> temp_set_t;

Scott Meyers mentions several good reasons, but leaves implicit a very important one: if you don't, the outputs will be based on the order of the pointers in memory, and since (i) this order may change if your allocation pattern changes and (ii) this order depends on the environment you run in, you cannot compare outputs (including traces). Needless to say, at least during development, this is a serious misfeature.
When you write unary or binary predicates to use in interaction with the stl, make sure to derive from std::unary_function or std::binary_function. For instance:

/// Object function to compare two Temp*.
struct temp_ptr_less
  : public std::binary_function<const Temp*, const Temp*, bool>
{
  bool operator() (const Temp *s1, const Temp *s2) const;
};
Using for_each, find, find_if, transform etc. is preferred over explicit loops. This is for (i) efficiency, (ii) correctness, and (iii) maintainability. Knowing these algorithms is mandatory for anyone who claims to be a C++ programmer.

For instance, prefer my_set.find (my_item) to std::find (my_set.begin (), my_set.end (), my_item). This is for efficiency: the former has logarithmic complexity, versus... linear for the latter! You may find Item 44 of Effective STL on the Internet.
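As a hypothetical illustration of replacing a hand-written loop by an algorithm (the names here are invented for the example), collecting the length of each identifier is a single call to std::transform:

#include <algorithm>
#include <functional>
#include <list>
#include <string>

// Collect the length of each identifier, without writing the loop by hand.
std::list<std::string::size_type>
lengths (const std::list<std::string>& names)
{
  std::list<std::string::size_type> res (names.size ());
  std::transform (names.begin (), names.end (), res.begin (),
                  std::mem_fun_ref (&std::string::size));
  return res;
}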
There are some strict conventions to obey wrt the files and their contents.
The *.hh should contain only declarations, i.e., prototypes, extern for variables etc. Inlined short methods are accepted when there are few of them; otherwise, create an *.hxx file, and include it at the end of this header file. The documentation should be here too.

There is no good reason for huge objects to be defined here.
As much as possible, avoid including useless headers (GotW007, GotW034):
- when detailed knowledge of a class is not needed, instead of

#include "foo.hh"

write

// Fwd decl.
class Foo;

- if you need output streams, then include ostream, not iostream. Actually, if you merely need to declare the existence of streams, you might want to include iosfwd.
If there are definitions that should be loaded in different places (definitions of templates, inline functions etc.), then declare and document them in the *.hh file, and implement them in the *.hxx file. Note that this file should first include its corresponding *.hh file, the latter including itself this file. It is indeed surprising, but the header guards make this work properly.
Big objects should be defined in the *.cc file corresponding to the declaration/documentation file *.hh.
There are less clear cut cases between *.hxx and *.cc. For instance short but time consuming functions should stay in the *.cc files, since inlining is not expected to speed up significantly. As another example features that require massive header inclusions are better defined in the *.cc file.
As a concrete example, consider the accept methods of the AST classes. They are short enough to be eligible for an *.hxx file:

void
LetExp::accept (Visitor& v)
{
  v (*this);
}

We will leave them in the *.cc file though, since this way only the *.cc file needs to load ast/visitor.hh; the *.hh is kept short, both directly (its contents) and indirectly (its includes).
There should be only pure functions in the interface of a module. That means that the functions in these files should not depend upon globals, nor have side effects of global objects. Of course no global variable can be defined here either.
Dependencies can be a major problem during big project developments. It is not acceptable to “recompile the world” when a single file changes. To fight this problem, you are encouraged to use fwd.hh files that contain simple forward declarations. These forward files should be included by the *.hh instead of more complete headers.
The expected benefit is manifold:
- A forward declaration is much shorter.
- Usually actual definitions rely on other classes, so other #includes etc. Forward declarations need nothing.
- While it is not uncommon to change the interface of a class, changing its name is infrequent.
Consider for example ast/visitor.hh, which is included directly or indirectly by many other files. Since it needs a declaration of each AST node, one could be tempted to use ast/all.hh, which includes virtually all the headers of the ast module. Hence all the files including ast/visitor.hh would bring in the whole ast module, where the much shorter and much simpler ast/fwd.hh suffices.

Of course, the *.cc files usually need actual definitions.
Tasks, as designed currently, are the place for side effects. That's where globals such as the current ast, the current assembly program, etc., are defined and modified.
The following items are more a matter of style than the others. Nevertheless, you are asked to follow this style.
When declaring a class, start with public members, then protected, and last private members. Inside these groups, you are invited to group by category, i.e., methods, types, and members that are related should be grouped together. The motivation is that private members should not even be visible in the class declaration (but of course, it is mandatory that they be there for the compiler), and therefore they should be “hidden” from the reader.
This is an example of what should not be done:

class Foo
{
public:
  Foo (std::string, int);
  virtual ~Foo ();
private:
  typedef std::string string_type;
public:
  std::string bar_get () const;
  void bar_set (std::string);
private:
  string_type bar_;
public:
  int baz_get () const;
  void baz_set (int);
private:
  int baz_;
};

rather, write:

class Foo
{
public:
  Foo (std::string, int);
  virtual ~Foo ();

  std::string bar_get () const;
  void bar_set (std::string);
  int baz_get () const;
  void baz_set (int);

private:
  typedef std::string string_type;
  string_type bar_;
  int baz_;
};

and add useful Doxygen comments.
We use Doxygen (see Doxygen) to maintain the developer documentation of the Tiger Compiler.
Use the imperative when documenting, as if you were giving orders to the function or entity you are describing. When describing a function, there is no need to repeat “function” in the documentation; the same applies obviously to any syntactic category. For instance, instead of:

/// \brief Swap the reference with another.
/// The method swaps the two references and returns the first.
ref& swap (ref& other);

write:

/// \brief Swap the reference with another.
/// Swap the two references and return the first.
ref& swap (ref& other);

The same rules apply to writing ChangeLogs.
Documentation is a genuine part of programming, just as testing. The quality of this documentation can change the grade.
Prefer C comments (/** ... */) to C++ comments (/// ...). This is to ensure consistency with the style we use.
Because it is lighter, instead of

/** \brief Name of this program. */
extern const char *program_name;

prefer

/// Name of this program.
extern const char *program_name;

For instance, instead of

/* Construct an InterferenceGraph. */
InterferenceGraph (const std::string &name,
                   const assem::instrs_t& instrs,
                   bool trace = false);

or

/** @brief Construct an InterferenceGraph.
 ** @param name its name, hopefully based on the function name
 ** @param instrs the code snippet to study
 ** @param trace trace flag
 **/
InterferenceGraph (const std::string &name,
                   const assem::instrs_t& instrs,
                   bool trace = false);

or

/// \brief Construct an InterferenceGraph.
/// \param name its name, hopefully based on the function name
/// \param instrs the code snippet to study
/// \param trace trace flag
InterferenceGraph (const std::string &name,
                   const assem::instrs_t& instrs,
                   bool trace = false);

write

/** \brief Construct an InterferenceGraph.
    \param name   its name, hopefully based on the function name
    \param instrs the code snippet to study
    \param trace  trace flag */
InterferenceGraph (const std::string &name,
                   const assem::instrs_t& instrs,
                   bool trace = false);

Of course, Doxygen documentation is not appropriate everywhere.
Often one wants to leave a clear markup to separate different matters. For declarations, this is typically done using the Doxygen \name ... \{ ... \} sequence; for implementation it is advised to use rebox.el (provided in config/) to build them. Once installed (read it for instructions), write a simple comment such as:
// Comments end with a period.

then move your cursor into this comment and press C-u 2 2 3 M-q to get:

/*-----------------------------.
| Comments end with a period.  |
`-----------------------------*/
print as a member function returning a stream

You should always have a means to print a class instance, at least to ease debugging. Use the regular operator<< for standalone printing functions, but use

std::ostream& Class::print (std::ostream& ostr [, ...]) const

as a member function, where the ellipsis denotes optional additional arguments. Note that print returns the stream.
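As a sketch of the idea (Frame is just a placeholder class here), the standalone operator<< then simply delegates to the member print:

#include <ostream>

class Frame
{
public:
  /// Print \a this on \a ostr, and return the stream.
  std::ostream& print (std::ostream& ostr) const;
  // ...
};

/// Standalone adapter, so that `ostr << frame' works as usual.
inline std::ostream&
operator<< (std::ostream& ostr, const Frame& f)
{
  return f.print (ostr);
}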
Each group must provide a tarball, made via make distcheck. All the information about the delivery per se is given on the Yaka's Delivery Page.
If bardec_f is the head of your group, the tarball must be bardec_f-tc-n.tar.bz2, where n is the number of the “release” (see Package Name and Version). The following commands must work properly:

$ bunzip2 -cd bardec_f-tc-n.tar.bz2 | tar xvf -
$ cd bardec_f-tc-n
$ export CC=gcc-3.2
$ export CXX=g++-3.2
$ ./configure
$ make
$ cd src
$ ./tc /tmp/test.tig
$ cd ..
$ make distcheck
For more information on the tools, see The GNU Build System, GCC.
Your tarball must be done via make distcheck (see Making a Tarball). Any tarball which is not built thanks to make distcheck (this is easy to see: they include files we don't want, and don't contain some files we need...) will be penalized with at least ### tarball_not_clean.
Some stages are evaluated only by a program, and others are evaluated both by humans, and a program.
Each stage of the compiler will be evaluated by an automatic corrector. As soon as the tarballs are delivered, the logs are available on http://www.lrde.epita.fr/~akim/compil, in the directory corresponding to your class and stage. For instance, 2004 students ought to read http://www.lrde.epita.fr/~akim/compil/2004/4/bardec_f-tc-4.log.
We stress that automated evaluation enforces the requirements: you must stick to what is being asked. For instance, for T3 it is explicitly asked to display something like:
var /* escaping */ i : int := 2
so if you display any of the following outputs

var i : int /* escaping */ := 2
var i /* escaping */ : int := 2
var /* Escapes */ i : int := 2

you are sure to fail all the tests, even if the computation is correct.
If you find some unexpected errors (your project does compile with the reference compiler, some files are missing, your output is slightly incorrect etc.) immediately send a new tarball to yaka@epita.fr with [Tiger] as prefix of the subject. This corresponds to ### patch.
Do not wait for the final marks to be computed, this is extremely irritating, and doomed to failure. You must understand that (i) you increase our workload, and (ii) anyway this is the wrong approach, the Tiger Compiler is a big project which must be continuously improved.
If, anyway, you send a tarball to fix your problems long after the initial date, you will be flagged as ### super_late, whose impact on the mark is quite bad...
When you are defending your projects, here are a few rules to follow:
Conversely, there is something I wish to make clear: I, Akim, and the other examiners, will probably be harsh (maybe even very harsh), but this does not mean I disrespect you, or judge you badly.
You are here to defend your project and knowledge, I'm here to stress them, to make sure they are right. Learning to be strong under pressure is part of the exercise. Don't burst into tears, react! Don't be shy, that's not the proper time: you are selling me something, and I will never buy something from someone who cries when I'm criticizing his product.
You should also understand that human examination is the moment where we try to evaluate who, or what group, needs help. We are here to diagnose your project and provide solutions to your problems. If you know there is a problem in your project, but you failed to fix it, tell it to the examiner! Work with her/him to fix your project.
The point of this evaluation is to measure, among other things:
Note to the examiners: the human grade.
The examiner should not take (too much) the automated tests into account to decide the mark: the mark is computed later, taking this into account, so don't do it twice.
Note to the examiners: broken tarballs.
If you fixed the tarball or made whatever modification, you must run make distcheck again, and replace the tarball they delivered with the new one. Do not keep the old tarball, do not install it in a special place: just replace the first tarball with it, but say so in the eval file.
The rationale is simple: only tarballs pass the tests, and every tarball must be able to pass the tests. If you don't do that, then someone else will have to do it again.
Because the Tiger Compiler is a project with stages, the computation of the marks depends on the stages too. To spell it out explicitly:
A stage is penalized by bad results on tests performed for previous stages.
It means, for instance, that a T3 compiler will be exercised on T1, T2, and T3. If there are still errors on T1 and T2 tests, they will pessimize the result of T3 tests. The older the errors are, the more expensive they are.
As an example, here are the formulas to compute the global success rate of T3 and T5:
global-rate-T3 := rate-T3 * (+ 2 * rate-T1
                             + 1 * rate-T2) / 3

global-rate-T5 := rate-T5 * (+ 4 * rate-T1
                             + 3 * rate-T2
                             + 2 * rate-T3
                             + 1 * rate-T4) / 10
Because a project which fails half of the time does not deserve half of 20, the global rate is raised to the power 1.7 before computing the mark:
mark-T3 := roundup (power (global-rate-T3, 1.7) * 20 - malus-T3, 1)
where roundup (x, 1) is x rounded up to one decimal (roundup (15, 1) = 15, roundup (15.01, 1) = 15.1).
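For instance, with the hypothetical rates rate-T1 = 0.9, rate-T2 = 0.8, rate-T3 = 0.7, and no malus:

global-rate-T3 = 0.7 * (2 * 0.9 + 1 * 0.8) / 3 ≈ 0.6067
mark-T3 = roundup (power (0.6067, 1.7) * 20, 1) ≈ roundup (8.55, 1) = 8.6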
When the project is also evaluated by a human, power is not used. Rather, the success rate modifies the mark given by the examiner:
mark-T2 := roundup (eval-T2 * global-rate-T2 - malus-T2, 1)
The naming scheme for provided tarballs is different from the scheme you must follow (see Delivery). Our naming scheme looks like 2004-tc-2.0.tar.bz2. If we update the tarballs, they will be named 2004-tc-2.x.tar.bz2. But your tarball must be named login-tc-2.tar.bz2, even if you send a second version of your project.
We also (try to) provide patches from one tarball to another. For instance 2006-tc-1.0-2.0.diff.bz2 is the difference from 2006-tc-1.0.tar.bz2 to 2006-tc-2.0.tar.bz2. You are encouraged to read this file, as understanding a patch is expected from any Unix programmer. Just run bzless 2006-tc-1.0-2.0.diff.bz2.
To apply the patch:
You might need to repeat the process to jump from a version x to x + 2 via version x + 1.
This section describes the mandatory layout of the tarball.
Fabrice Bardèche <bardec_f@epita.fr>
Jean-Paul Sartre <sartre_j@epita.fr>
Jean-Paul Deux <deux_j@epita.fr>
Jean-Paul Belmondo <belmon_j@epita.fr>
The group leader is the first in the list. Do not include emails other
than those of EPITA. I repeat: give the 6_1@epita.fr address.
Note that the file AUTHORS is automatically distributed, but pay
attention to the spelling.
Convenient C++ routines.
This file implements a means to output strings while escaping non printable characters. An example:

cout << "escape (\"\111\") = " << escape ("\"\111\"") << endl;

Understanding how escape works is required starting from T2.
A wrapper around std::set that introduces convenient operators (operator+ and so forth).
A class that makes it possible to have timings of the compilation process, as when using --time-report with gcc, or --report=time with bison. It is used in the
Task
machinery, but can be used to provide better timings (e.g., separating the scanner from the parser).
No namespace for the time being, but it should be task.
Delivered for T1. A generic scheme to handle the components of our
compiler, and their dependencies.
Namespace symbol, delivered for T1.
The handling of the symbols. In a program, an identifier is typically used many times: at least once for its definition, and once for each use. Just think about the number of occurrences of size_t in a C program for instance.

To save space one keeps a single copy of each identifier. This provides additional benefits: the address of this single copy can be used as a key, and comparisons (equality or order) are much faster.
The class
symbol::Symbol
is an implementation of this idea. See the lecture notes.
The handling of generic symbol tables, i.e., it is independent of functions, types and variables.
Namespace ast, delivered for T2. Implementation of the abstract syntax tree. The file ast/README gives an overview of the involved class hierarchy.
These files are now simply forwarding the definitions of
yy::Position
andyy::Location
as provided by Bison.
Abstract base class of the compiler's visitor hierarchy. Actually, it defines a class template GenVisitor, which expects an argument which can be either non_const_kind or const_kind. This allows defining two parallel hierarchies, ConstVisitor and Visitor, similar to iterator and const_iterator.

The understanding of the template programming used here is not required at this stage, as it is quite delicate and goes far beyond your (average) current understanding of templates.
Implementation of the DefaultVisitor class, which walks the abstract syntax tree, doing nothing. It is mainly used as a basis for deriving other visitors. Actually, just as above, there is a template, so that we have two different default visitors: DefaultVisitor<const_kind> and DefaultVisitor<non_const_kind>.
Implementation of the
PrintVisitor
class, which performs pretty-printing in the tiger compiler.
Namespace parse. Delivered during T1.
Namespace type. Type checking.
The interface of the Type module. It exports a single procedure,
type_check
.
The definition of all the types. You are free to use whatever layout you wish (several files); we have a single types.hh file.
Definitions of
type::TypeEntry
,type::VarEntry
, andtype::FunEntry
, used intype::TypeEnv
to associate data to types, variables, and functions (obviously).
The types environment, comprising three symbol tables: types, functions, and variables, used by the
type::TypeVisitor
.
Namespace temp
, delivered for T5.
So called temporaries are pseudo-registers: we may allocate as many temporaries as we want. Eventually the register allocator will map those temporaries to either an actual register, or it will allocate a slot in the activation block (aka frame) of the current function.
Namespace tree
, delivered for T5. The implementation of the
intermediate representation. The file tree/README should give
enough explanations to understand how it works.
Reading the corresponding explanations in Appel's book is mandatory.
It is worth noting that contrary to A. Appel, just as we did for
ast
, we use n-ary structures. For instance, where Appel uses a
binary seq
, we have an n-ary seq
which allows us to put as
many statements as we want.
To avoid gratuitous name clashes, what Appel denotes exp
is
denoted sxp
(Statement Expression), implemented in
translate::Sxp
.
Please, pay extra attention to the fact that there are temp::Temp
used to create unique temporaries (similar to symbol::Symbol
),
and tree::Temp
which is the intermediate representation
instruction denoting a temporary (hence a tree::Temp
needs a
temp::Temp
). Similarly, on the one hand, there is
temp::Label
which is used to create unique labels, and on the
other hand there are tree::Label, which is the IR statement defining a label, and tree::Name, used to refer to
used to refer to
a label (typically, a tree::Jump
needs a tree::Name
which
in turn needs a temp::Label
).
Namespace frame, delivered for T5.
An
Access
is a location of a variable: on the stack, or in a temporary.
A Frame knows only which “variables” it contains.
Namespace translate. Translation to intermediate code. It includes:
It implements translate::Fragment, an abstract class, translate::DataFrag to store the literal strings, and translate::ProcFrag to store the routines.
Static link aware versions of
level::Access
.
translate::Levels are wrappers around frame::Frame that support the static links, so that we can find an access to the variables of the “parent function”.
Implementation of the translate::Ex (expressions), Nx (instructions), Cx (conditions), and Ix (if) shells. They wrap tree::Node to delay their translation until the actual use is known.
All the information that the environment must keep about variables and functions.
The levels environment, containing LevelVarEntry's and LevelFunEntry's. We don't need to store information related to types here.
Functions used by the translate::TranslateVisitor to translate the AST into HIR. For instance, it contains Exp *simpleVar (const Access &access, const Level &level), Exp *callExp (const temp::Label &label, std::list<Exp *> args) etc., which are routines that produce some tree::Exp. They handle all the unCx etc. magic.
Implements the class TranslateVisitor which performs the IR generation thanks to translation.hh. It must not be polluted with translation details: it is only coordinating the AST traversal with the invocation of translation routines. For instance, here is the translation of a ast::SimpleVar:
virtual void
operator() (const SimpleVar& e)
{
  const Access &access = env_.var_access_get (e.name_get ());
  exp_ = simpleVar (access, *level_);
}
Namespace tree
.
Namespace assem
, delivered for T7.
This directory contains the implementation of the Assem language: yet another intermediate representation that aims at encoding an assembly language, plus a few needed features so that register allocation can be performed afterward. Given in full.
Implementation of the basic types of assembly instructions.
Implementation of assem::Fragment, assem::ProcFrag, and assem::DataFrag. They are comparable to translate::Fragment: they aggregate information that must remain together, such as a frame::Frame and the instructions (a list of assem::Instr).
The interface of the module, and its implementation.
Namespace target
, delivered for T7. Some data on the back end.
Given in full.
Description of a CPU: everything about its registers, and its word size.
Description of a target (language): its CPU, its assembly (codegen::Assembly), and its translator (codegen::Codegen).
The description of the MIPS (actually, SPIM/Mipsy) target.
Description of the i386. This is not part of the project, it is left only as an incomplete source of inspiration.
The command line interface to specify the target architecture.
Namespace codegen
, delivered for T7.
The instruction selection per se split into a generic part, and a target specific (MIPS and IA32) part. See src/codegen/mips, and src/codegen/ia32.
The abstract class codegen::Assembly, which is the interface for the generation of elementary assembly instructions.
The abstract class
codegen::Codegen
which is the interface for all our back ends.
Converting translate::Fragments into assem::Fragments.
Command line interface.
This is the Tiger runtime, written in C, based on Andrew Appel's runtime.c. The actual runtime.s file for MIPS was written by hand, but the ia32 was a compiled version of this file. It should be noted that:
- Strings
- Strings are implemented as 4 bytes encoding the length, and then a 0-terminated à la C string (see the sketch after this list). The length part is due to conformance to the Tiger Reference Manual, which specifies that 0 is a regular character that can be part of the strings, but it is nevertheless terminated by 0 to be compliant with SPIM/Mipsy's
- Special Strings
- There are some special strings: 0 and 1 character long strings are all implemented via singletons. That is to say there is only one allocated string "", a single "1" etc. These singletons are allocated by main. It is essential to preserve this invariant/convention in the whole runtime.
- strcmp vs. stringEqual
- I don't know how Appel wants to support "bar" < "foo" since he doesn't provide strcmp. We do. But note that anyway his implementation of "foo" != "fooo" is more efficient than ours, since he can decide just by looking at the lengths. That could be improved in the future...
- main
- The runtime has some initializations to make, such as the string singletons, and then calls the compiled program. This is why the runtime provides main, and calls t_main, which is the “main” that your compiler should provide.
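The string layout can be pictured with the following sketch (an illustration only: the actual runtime is written in C and assembly, and this struct is invented for the example):

// A Tiger string as laid out by the runtime (sketch).
struct tiger_string
{
  int  length;    // number of characters; 0 is a valid character in the string
  char chars[1];  // `length' characters, followed by a terminating '\0'
                  // kept only for SPIM/Mipsy compatibility
};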
Namespace codegen::mips
, delivered for T7. Code generation for
MIPS R2000.
The Tiger runtime in MIPS assembly language:
Our assembly language (syntax, opcodes and layout); it abstracts the generation of MIPS 2000 instructions.
codegen::mips::SpimAssembly derives from codegen::Assembly.
Our real and only back end: a translator from LIR to ASSEM using the MIPS 2000 instruction set defined by codegen::mips::SpimAssembly. It is implemented as a maximal munch. codegen::mips::Codegen derives from codegen::Codegen.
How MIPS (and SPIM/Mipsy) fragments are to be displayed. In other words, that's where the (global) syntax of the target assembly file is selected.
Namespace codegen::ia32
, delivered for T7. Code generation for
IA32. This is not part of the student project, but it is left
to satisfy their curiosity. In addition its presence is a sane
invitation to respect the constraints of a multi-back-end compiler.
The Tiger runtime in IA32 assembly language:
Our assembly language (syntax, opcodes and layout); it abstracts the generation of IA32 instructions using Gas' syntax.
codegen::ia32::GasAssembly derives from codegen::Assembly.
The IA32 back-end: a translator from LIR to ASSEM using the IA32 instruction set defined by codegen::ia32::GasAssembly. It is implemented as a maximal munch. codegen::ia32::Codegen derives from codegen::Codegen.
How IA32 fragments are to be displayed. In other words, that's where the (global) syntax of the target assembly file is selected.
Namespace graph
, a generic implementation of graphs. Delivered
for T7.
Abstractions/indirections for graph nodes and edges.
Iterating over nodes and edges of graphs.
Namespace liveness
, delivered for T8.
Computing the live-in and live-out information from the
FlowGraph
.
Computing the
InterferenceGraph
from the live-in/live-out information.
Namespace regalloc
, register allocation, delivered for T9.
Removing useless moves once the register allocation is performed, and allocating the registers for fragments.
Command line interface.
We provide a few test cases: you must write your own tests. Writing tests is part of the project. Do not just copy test cases from other groups, as you will not understand why they were written.
The initial test suite is available for download at tests.tgz. It contains the following directories:
The compiler will be written in several steps, described below.
The following sections adhere to a standard layout in order to present each stage n:
This section has been updated for EPITA-2006.
T0 is a weak form of T1: the scanner and the parser are written, but the framework is simplified (see T1 Code to Write).
Relevant lecture notes include: compilation-lecture.pdf.
Things to learn during this stage that you should remember:
- yylval, used to pass token values to the parser.
- The std::string class.
Running T0 basically consists in looking at exit values:
$ tc simple.tig
Example 1: tc simple.tig
The following example demonstrates the scanner and parser tracing. The glyphs “error-->” and “=>” are typographic conventions to specify respectively the standard error stream and the exit status. They are not part of the output per se.
$ SCAN=1 PARSE=1 tc simple.tig
error-->Starting parse
error-->Entering state 0
error-->Reading a token: --(end of buffer or a NUL)
error-->--accepting rule at line 99 ("print")
error-->Next token is 259 ("identifier" simple.tig:1.0-4: print)
error-->Shifting token 259 ("identifier"), Entering state 2
error-->Reading a token: --accepting rule at line 52 (" ")
error-->--accepting rule at line 58 ("(")
error-->Next token is 264 ("(" simple.tig:1.6)
error-->Reducing via rule 81 (line 403), "identifier" -> funid
error-->state stack now 0
error-->Entering state 18
error-->Next token is 264 ("(" simple.tig:1.6)
error-->Shifting token 264 ("("), Entering state 59
error-->Reading a token: --accepting rule at line 103 (""")
error-->--accepting rule at line 172 ("Hello, World!")
error-->--accepting rule at line 159 ("\n")
error-->--accepting rule at line 134 (""")
error-->Next token is 258 ("string" simple.tig:1.7-23: Hello, World!
error-->)
error-->Shifting token 258 ("string"), Entering state 1
error-->Reducing via rule 19 (line 213), "string" -> exp
error-->state stack now 59 18 0
error-->Entering state 102
error-->Reading a token: --accepting rule at line 59 (")")
error-->Next token is 265 (")" simple.tig:1.24)
error-->Reducing via rule 46 (line 284), exp -> args.1
error-->state stack now 59 18 0
error-->Entering state 104
error-->Next token is 265 (")" simple.tig:1.24)
error-->Reducing via rule 45 (line 279), args.1 -> args
error-->state stack now 59 18 0
error-->Entering state 103
error-->Next token is 265 (")" simple.tig:1.24)
error-->Shifting token 265 (")"), Entering state 123
error-->Reducing via rule 20 (line 216), funid "(" args ")" -> exp
error-->state stack now 0
error-->Entering state 13
error-->Reading a token: --(end of buffer or a NUL)
error-->--accepting rule at line 53 ("
error-->")
error-->--(end of buffer or a NUL)
error-->--EOF (start condition 0)
error-->Now at end of input.
error-->Reducing via rule 1 (line 163), exp -> program
error-->state stack now 0
error-->Entering state 12
error-->Now at end of input.
Example 2: SCAN=1 PARSE=1 tc simple.tig
A lexical error must be properly diagnosed and reported. The following (generated) examples display the location: this is not required for T0; nevertheless, an error message on the standard error output is required.
$ tc back-zee.tig
error-->back-zee.tig:1.0-2: unrecognized escape: \z
=>2
Example 3: tc back-zee.tig
Similarly for syntactical errors.
$ tc postinc.tig
error-->postinc.tig:1.2: syntax error, unexpected "+"
error-->Parsing Failed
=>3
Example 4: tc postinc.tig
We don't need several directories, you can program in the top level of the package.
You must write:
yylval
supports strings, integers and even symbols.
Nevertheless, symbols (i.e., identifiers) are returned as plain strings
for the time being: the class symbol::Symbol
is introduced in T1.
The environment variable SCAN
enables Flex scanner debugging
traces.
main
if you wish.
There is no requirement to implement YYPRINT
support.
The environment variable PARSE
enables Bison parser debugging
traces, i.e., running
PARSE=1 ./tc foo.tig
sets yydebug
to 1.
main
, in this file.
Putting it into parsetiger.yy is OK in T0 as it is reduced to its
simplest form with no option support. Of course the exit status must
conform to the Tiger Compiler Reference Manual.
The requirements on the tarball are the same as usual, see Tarballs.
Possible improvements include:
This section is updated for EPITA-2006.
Scanner and parser are properly running, but the abstract syntax tree is not built yet. Differences with T0 include:
Relevant lecture notes include dev-tools.pdf and scanner.pdf.
Things to learn during this stage that you should remember:
- The classes Location and Position provide a good start for studying foreign C++ classes. Your understanding of them will be checked, including the operators.
- The class symbol::Symbol is incomplete.
- std::set: the symbol::Symbol class relies on std::set.
- The Symbol class is an implementation of the Flyweight design pattern.
The only information the compiler provides is about lexical and syntax errors. If there are no errors, the compiler shuts up, and exits successfully:
/* An array type and an array variable. */
let
  type arrtype = array of int
  var arr1 : arrtype := arrtype [10] of 0
in
  arr1[2]
end

File 4.4: test01.tig

$ tc test01.tig
Example 5: tc test01.tig
If there are lexical errors, the exit status is 2, and an error message is output on the standard error output. Note that its format is standard and mandatory: file, (precise) location, and then the message (see Errors).
$ tc unterminated-comment.tig
error-->unterminated-comment.tig:2.1-3.0: unexpected end of file in a comment
=>2
Example 6: tc unterminated-comment.tig
If there are syntax errors, the exit status is set to 3:
$ tc type-nil.tig
error-->type-nil.tig:1.12-14: syntax error, unexpected "nil", expecting "identifier"
error-->Parsing Failed
=>3
Example 7: tc type-nil.tig
If there are errors that are neither lexical nor syntactic (Windows shall not pass):
$ tc C:/TIGER/SAMPLE.TIG error-->tc: cannot open `C:/TIGER/SAMPLE.TIG': No such file or directory =>1 Example 8: tc C:/TIGER/SAMPLE.TIG
The option --parse-trace, which relies on Bison's %debug directive and on YYPRINT, must work properly:
$ tc --parse-trace --parse a+a.tig error-->Starting parse error-->Entering state 0 error-->Reading a token: Next token is 259 ("identifier" a+a.tig:1.0: a) error-->Shifting token 259 ("identifier"), Entering state 2 error-->Reading a token: Next token is 271 ("+" a+a.tig:1.2) error-->Reducing via rule 76 (line 382), "identifier" -> varid error-->state stack now 0 error-->Entering state 17 error-->Reducing via rule 38 (line 259), varid -> lvalue error-->state stack now 0 error-->Entering state 14 error-->Next token is 271 ("+" a+a.tig:1.2) error-->Reducing via rule 37 (line 255), lvalue -> exp error-->state stack now 0 error-->Entering state 13 error-->Next token is 271 ("+" a+a.tig:1.2) error-->Shifting token 271 ("+"), Entering state 43 error-->Reading a token: Next token is 258 ("string" a+a.tig:1.4-6: a) error-->Shifting token 258 ("string"), Entering state 1 error-->Reducing via rule 19 (line 213), "string" -> exp error-->state stack now 43 13 0 error-->Entering state 81 error-->Reading a token: Now at end of input. error-->Reducing via rule 31 (line 244), exp "+" exp -> exp error-->state stack now 0 error-->Entering state 13 error-->Now at end of input. error-->Reducing via rule 1 (line 163), exp -> program error-->state stack now 0 error-->Entering state 12 error-->Now at end of input. Example 9: tc --parse-trace --parse a+a.tig
Note that (i), --parse is needed, (ii), it cannot see that the variable is not declared nor that there is a type checking error, since type checking... is not implemented, and (iii), the output might be slightly different, depending upon the version of Bison you use. But what matters is that one can see the items: "identifier" a, "string" a.
Some code is provided: 2006-tc-1.0.tar.bz2. See The Top Level, src, src/parse, src/misc.
Be sure to read Flex and Bison documentations and tutorials, see Flex & Bison.
String values are to be stored in a std::string. See the following code for the basics.
... \" yylval->str = new std::string (); BEGIN SC_STRING; <SC_STRING>{ /* Handling of the strings. Initial " is eaten. */ \" { BEGIN INITIAL; return STRING; } ... \\x[0-9a-fA-F]{2} { yylval->str->append (1, strtol (yytext + 2, 0, 16)); } ... }
Identifiers must be represented by symbol::Symbol objects, not strings.
The Location to use is produced by Bison: src/parse/location.hh.
To keep track of locations, adjust your scanner: use YY_USER_ACTION and the yylex prologue (a sketch of one possible approach follows the skeleton below).

...
%%
%{
  // Everything here is run each time yylex is invoked.
%}
"if"    return IF;
...
%%
...
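For instance, one possible approach is sketched below. It assumes the Location object is named loc and is shared with the parser, and that the generated Location class provides columns, lines, and step, as Bison's C++ locations do; adapt the names to your own skeleton.

%{
  // Sketch: extend the current location by the length of each
  // matched token.
  #define YY_USER_ACTION  loc.columns (yyleng);
%}
%%
%{
  // Run each time yylex is invoked: the end of the previous token
  // becomes the beginning of the next one.
  loc.step ();
%}
"if"    return IF;
\n+     loc.lines (yyleng); loc.step ();
...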
See the lecture notes, and have a look at the scanner and parser chapters of this draft.
Implement yy::Parser::print_ to provide --parse-trace support (see T1 Samples). yy::Parser::print_ is the C++ equivalent of the yyprint feature for C parsers; see the Bison documentation. Pay special attention to the display of strings and identifiers.
symbol::Symbol keeps a single copy of identifiers, see src/symbol. Its implementation in src/symbol/symbol.hxx is incomplete. The most delicate part is the constructor symbol::Symbol::Symbol (const std::string& s): just bear in mind that (i) you must make sure that the string s is inserted in the set, and (ii) you must save in this new symbol::Symbol object a reference to this inserted string. Carefully read the documentation of std::set::insert.
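A minimal sketch of the idea follows; the member and helper names are hypothetical, and the provided skeleton may differ.

  #include <set>
  #include <string>

  class Symbol
  {
  public:
    explicit Symbol (const std::string& s)
      // std::set::insert returns a pair <iterator, bool>: the iterator
      // points either to the freshly inserted string or to the copy
      // that was already there, so all equal Symbols share one string.
      : name_ (&*instance_set ().insert (s).first)
    {}

    const std::string& name_get () const { return *name_; }

  private:
    // The single set of unique strings (the Flyweight store).
    static std::set<std::string>& instance_set ()
    {
      static std::set<std::string> strings;
      return strings;
    }

    // Points into the set: set elements have stable addresses.
    const std::string* name_;
  };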
"string"
, but none to exp
, then it
will choke on:
exp: "string";
because it actually means
exp: "string" { $$ = $1; };
which is not type coherent. So write this instead:
exp: "string" {};
ast::Exp?
Possible improvements include:
This section was last updated for EPITA-2006 on 2004-02-18.
At the end of this stage, the compiler can build abstract syntax trees of Tiger programs and pretty-print them. The parser is equipped with error recovery. The memory is properly deallocated on demand.
The code must follow our coding style and be documented, see Coding Style, and Doxygen.
Relevant lecture notes include ast.pdf.
Things to learn during this stage that you should remember:
- The error token, and building usable ASTs.
- std::list; symbol::Symbol uses std::set.
- accept.
- virtual.
- PrintVisitor is an implementation of the Visitor pattern.
Here are a few samples of the expected features.
The parser builds abstract syntax trees that can be output by a pretty-printing module:
/* Define a recursive function. */ let /* Calculate n!. */ function fact (n : int) : int = if n = 0 then 1 else n * fact (n - 1) in fact (10) end
File 4.8: simple-fact.tig
$ tc -A simple-fact.tig /* == Abstract Syntax Tree. == */ let function fact (n : int) : int = if (n = 0) then 1 else (n * fact ((n - 1))) in fact (10) end Example 10: tc -A simple-fact.tig
Passing -D, --ast-delete, reclaims the memory associated with the AST. Valgrind will be used to check that there are no memory leaks, see Valgrind.
No heroic effort is expected for silly option combinations.
$ tc -D simple-fact.tig Example 11: tc -D simple-fact.tig
$ tc -DA simple-fact.tig error-->tasks.cc:22: Precondition `the_program' failed. =>134 Example 12: tc -DA simple-fact.tig
The pretty-printed output must be valid and equivalent.
Valid means that any Tiger compiler must be able to parse with success your output. Pay attention to the banners such as == Abstract...: you should use comments: /* == Abstract... */. Pay attention to special characters too.
$ tc -AD string-escapes.tig /* == Abstract Syntax Tree. == */ print ("\"EPITA\"\n") Example 13: tc -AD string-escapes.tig
Equivalent means that, except for syntactic sugar, the output and the input are equal. Syntactic sugar refers to &, |, unary -, etc.
$ tc -AD 1s-and-2s.tig /* == Abstract Syntax Tree. == */ if (1 = 1) then (2 = 2) else 0 Example 14: tc -AD 1s-and-2s.tig
$ tc -AD 1s-and-2s.tig >output.tig Example 15: tc -AD 1s-and-2s.tig >output.tig
$ tc -AD output.tig /* == Abstract Syntax Tree. == */ if (1 = 1) then (2 = 2) else 0 Example 16: tc -AD output.tig
For loops must be properly displayed, i.e., although we use an ast::VarDec for the loop index, you must not display var:
/* Valid let and for. */ let var a := 0 in for i := 0 to 100 do (a := a+1; ()) end
File 4.11: for-loop.tig
$ tc -AD for-loop.tig /* == Abstract Syntax Tree. == */ let var a := 0 in for i := 0 to 100 do ( a := (a + 1); () ) end Example 17: tc -AD for-loop.tig
Parentheses must not stack for free; in fact, you must even remove them.
$ tc -AD parens.tig /* == Abstract Syntax Tree. == */ 0 Example 18: tc -AD parens.tig
As a result, anything output by tc -AD is equal to what tc -AD | tc -AD - displays!
In Tiger, to support recursive types and functions, consecutive declarations of functions and consecutive declarations of types are considered "simultaneously". For instance, in the following program, foo and bar are visible in each other's scope, and therefore the program is correct with respect to type checking.
$ tc -T foo-bar.tig Example 19: tc -T foo-bar.tig
In the following sample, because bar is not declared in the same bunch of declarations, it is not visible during the declaration of foo. The program is invalid.
let function foo () : int = bar () var stop := 0 function bar () : int = foo () in 0 end
File 4.14: foo-stop-bar.tig
$ tc -T foo-stop-bar.tig error-->foo-stop-bar.tig:1.28-33: unknown function: bar =>4 Example 20: tc -T foo-stop-bar.tig
The same applies to types.
We shall call a chunk a consecutive series of type (or function) declarations.
Within a chunk, duplicate names are invalid, while they are valid across separate chunks:
let function foo () : int = 0 function bar () : int = 1 function foo () : int = 2 var stop := 0 function bar () : int = 3 in 0 end
File 4.15: fbfsb.tig
$ tc -T fbfsb.tig error-->fbfsb.tig:3.4-28: function redefinition: foo error-->fbfsb.tig:1.4-28: first definition =>4 Example 21: tc -T fbfsb.tig
It behaves exactly as if chunks were part of nested let ... in ... end constructs. This is why our Tiger compilers will treat chunks as syntactic sugar: an internal let (i.e., a LetExp) may only have a single chunk of declarations. If the input has several chunks, they must be split into several lets:
$ tc -A fbfsb.tig /* == Abstract Syntax Tree. == */ let function foo () : int = 0 function bar () : int = 1 function foo () : int = 2 in let var stop := 0 in let function bar () : int = 3 in 0 end end end Example 22: tc -A fbfsb.tig
Given the type checking rules for variables, whose definitions cannot be recursive, chunks of variable declarations are reduced to a single variable.
$ tc -A fff.tig /* == Abstract Syntax Tree. == */ let var foo := 1 in let var foo := (foo + 1) in let var foo := (foo + 1) in foo end end end Example 23: tc -A fff.tig
Another part of T2 is the improvement of your parser: it must be robust to some forms of errors. Observe that on the following input, several parse errors are reported, not merely the first one:
$ tc multiple-parse-errors.tig error-->multiple-parse-errors.tig:3.4: syntax error, unexpected ",", expecting ";" error-->multiple-parse-errors.tig:4.4: syntax error, unexpected ",", expecting ";" =>3 Example 24: tc multiple-parse-errors.tig
Of course, the exit status still reveals the parse error. Be sure that your error recovery does not break the rest of the compiler...
$ tc -AD multiple-parse-errors.tig error-->multiple-parse-errors.tig:3.4: syntax error, unexpected ",", expecting ";" error-->multiple-parse-errors.tig:4.4: syntax error, unexpected ",", expecting ";" /* == Abstract Syntax Tree. == */ ( 1; (); (); 6 ) =>3 Example 25: tc -AD multiple-parse-errors.tig
Some code is provided: 2006-tc-2.0.tar.bz2. The transition from the previous versions can be done thanks to the following diffs: 2006-tc-1.0-2.0.diff.
For a description of the new modules, see src/misc, src/symbol, and src/ast.
What is to be done:
- Error recovery using the error token. Read the Bison documentation about it.
- ast::FunctionDecs, ast::VarDecs, and ast::TypeDecs (they are implemented thanks to ast::AnyDecs).
- The DefaultVisitor class, the neutral traversal of ASTs. The DefaultVisitor must be a sound basis for your further work on the Tiger compiler.
- The PrintVisitor class must be written entirely.
- NameTy, or a Symbol.
Possible improvements include:
This section was updated for Tiger 2004. The project will be collected on Friday, March 15th, at noon.
At the end of this stage, the compiler must be able to compute and display the escaping variables. These features are triggered by the options --escapes-compute/-e and --escapes-display/-E.
Be sure to read the chapter “Escapes” in the lecture notes.
Things to learn during this stage that you should remember:
Task
module is based on the Command design pattern.
This example demonstrates the computation and display of escaping variables/formals. Notice that by default, all variables must be considered escaping, since it is safe to put a non-escaping variable on the stack, while the converse is unsafe.
let var escaping := "I rule the world!\n" var not_escaping := "Peace on Earth for humans of good will.\n" function print_slogan (not_escaping: string) = (print (not_escaping); print (escaping)) in print_slogan (not_escaping) end
File 4.18: variable-escapes.tig
$ tc -EeE variable-escapes.tig /* == Escapes. == */ let var /* escaping */ escaping := "I rule the world!\n" in let var /* escaping */ not_escaping := "Peace on Earth for humans of good will.\n" in let function print_slogan (/* escaping */ not_escaping : string) = ( print (not_escaping); print (escaping) ) in print_slogan (not_escaping) end end end /* == Escapes. == */ let var /* escaping */ escaping := "I rule the world!\n" in let var not_escaping := "Peace on Earth for humans of good will.\n" in let function print_slogan (not_escaping : string) = ( print (not_escaping); print (escaping) ) in print_slogan (not_escaping) end end end Example 26: tc -EeE variable-escapes.tig
Run your compiler on merge.tig and study its output. There are a number of silly mistakes that people usually make on T3: they are all easy to defeat once you have a reasonable test suite, and once you have understood that torturing your project is a good thing to do.
- ast::PrintVisitor
- escapes::EscapesVisitor, in src/escapes/escapes-visitor.hh.
You are suggested to implement three additional classes:
- Definition, so that symbol::Table can be instantiated into Table<Definition>.
- VariableDefinition, derived from Definition. It has one additional attribute, a VarDec&. The method escape_set is implemented, and when invoked, sets the escapes flag of the corresponding VarDec.
- FormalDefinition, derived from Definition. To be designed by yourself. Do not forget that the ast class used to register formals is used elsewhere, and it would be a pity if your implementation made no difference... Be sure to write a test that verifies that your implementation is not abused. I have one such test...
Also complete the escape_get and escape_set methods of the ast classes. Most probably the code was already given and uses const_casts; try to use mutable instead (see the sketch after this list).
Modify the code so that each definition of an escaping variable/formal is preceded by the comment /* escaping */ if the flag display_escapes_p is true. See the item "Driver" for an example.
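Here is a minimal sketch of the mutable idea on a hypothetical VarDec-like node; the real ast::VarDec differs.

  class VarDec
  {
  public:
    VarDec () : escapes_ (true) {}   // Safe default: assume escaping.

    bool escape_get () const { return escapes_; }

    // escape_set may stay const: the node is logically unchanged, only
    // the annotation computed by the escapes visitor is updated.
    void escape_set (bool b) const { escapes_ = b; }

  private:
    // mutable allows the update even through a const reference,
    // without resorting to const_cast.
    mutable bool escapes_;
  };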
Possible improvements include:
This section was last updated for EPITA-2005 on 2003-04-08.
At the end of this stage, the compiler type checks Tiger programs. Clear error messages are required.
Relevant lecture notes include type-checking.pdf.
Things to learn during this stage that you should remember:
Type checking is optional, invoked by --types-check or -T:
$ tc int-plus-string.tig Example 28: tc int-plus-string.tig
$ tc int-plus-string.tig --types-check error-->int-plus-string.tig:1.0-6: type mismatch error--> right operand type: string error--> expected type: int =>4 Example 29: tc int-plus-string.tig --types-check
When there are several type errors, it is acceptable that some remain hidden by others.
$ tc unknowns.tig --types-check error-->unknowns.tig:1.0-34: unknown function: unknown_function =>4 Example 31: tc unknowns.tig --types-check
Be sure to check the type of all the constructs.
$ tc bad-if.tig --types-check error-->bad-if.tig:1.0-10: type mismatch error--> then clause type: int error--> else clause type: void =>4 Example 33: tc bad-if.tig --types-check
Be aware that type and function declarations are recursive by chunks. For instance:
let type one = { hd : int, tail : two } type two = { hd : int, tail : one } function one (hd : int, tail : two) : one = one { hd = hd, tail = tail } function two (hd : int, tail : one) : two = two { hd = hd, tail = tail } var one := one (11, two (22, nil)) in print_int (one.tail.hd); print ("\n") end
File 4.22: mutuals.tig
$ tc mutuals.tig --types-check Example 35: tc mutuals.tig --types-check
In case you are interested, the result is:
$ tc -H mutuals.tig >mutuals.hir Example 36: tc -H mutuals.tig >mutuals.hir
$ havm mutuals.hir 22 Example 37: havm mutuals.hir
Some code is provided: 2005-tc-4.3.tar.bz2. The transition from the previous versions can be done thanks to the following diffs: 2005-tc-2.1-4.0.diff, 2005-tc-4.0-4.1.diff, 2005-tc-4.1-4.2.diff, 2005-tc-4.2-4.3.diff. See src/misc.
What is to be done.
symbol::Table<class Entry_T>. Implement symbol::Table in src/symbol/table.hh, which is a table of symbols dedicated to storing some data whose type is Entry_T*. In short, it maps a symbol::Symbol to an Entry_T* (that should ring a bell...). You are encouraged to implement something simple, based on stacks (see std::stack or std::list) and maps (see std::map).
symbol::Table is a class template, as it is used by virtually all the AST visitors (e.g., escapes::EscapesVisitor, type::TypeVisitor, translate::TranslateVisitor, etc.).
symbol::Table must provide this interface (a sketch of a possible implementation follows):
- If key was associated to some Entry_T in the open scopes, return the most recent insertion. Otherwise return the empty pointer.
- Send the content of this table to ostr in a readable manner, the top of the stack being displayed last.
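As an illustration, here is a minimal sketch of such a scoped table built on std::list and std::map. The names (scope_begin, scope_end, put, get) and the use of std::string keys are assumptions; the real table keys on symbol::Symbol and must match the required interface, including printing.

  #include <list>
  #include <map>
  #include <string>

  template <class Entry_T>
  class Table
  {
  public:
    Table () { scope_begin (); }

    // Open and close a scope.
    void scope_begin () { scopes_.push_front (scope_type ()); }
    void scope_end ()   { scopes_.pop_front (); }

    // Bind key to value in the innermost scope.
    void put (const std::string& key, Entry_T* value)
    { scopes_.front ()[key] = value; }

    // Return the most recent binding of key, or 0 if there is none.
    Entry_T* get (const std::string& key) const
    {
      for (typename scopes_type::const_iterator s = scopes_.begin ();
           s != scopes_.end (); ++s)
        {
          typename scope_type::const_iterator i = s->find (key);
          if (i != s->end ())
            return i->second;
        }
      return 0;
    }

  private:
    typedef std::map<std::string, Entry_T*> scope_type;
    typedef std::list<scope_type> scopes_type;
    scopes_type scopes_;
  };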
type::String, type::Int, and type::Void are to be implemented. Using templates would be particularly appreciated to factor the code between the four singleton classes.
type::Named is almost entirely given.
type::Array is even simpler than the four singletons.
type::Record is somewhat incomplete.
Pay extra attention to the implementation of type::operator== (const Type& a, const Type& b), type::Type::assignable_to, and type::Type::comparable_to.
type::TypeEnv must be completed: it must fill the environment with the definitions of the builtin types and functions. See the Tiger Reference Manual. The handling of types is provided as an example; you still have to implement support for variables and functions.
Complete type::TypeVisitor. Update the Tasks, and make sure that your configure.ac no longer includes foo/libfoo.hh in src/modules.hh.
These are features that you might want to implement in addition to the core features.
type::Error. When an expression is ill typed, the checker still assigns it some regular type such as Int, which can create cascades of errors:
$ tc is_devil.tig --types-check error-->is_devil.tig:1.8-33: type mismatch error--> then clause type: int error--> else clause type: string error-->is_devil.tig:1.0-33: type mismatch error--> left operand type: string error--> right operand type: int =>4 Example 39: tc is_devil.tig --types-check
One means to avoid this issue consists in introducing a new type, type::Error, that the type checker would never complain about. This can be a nice complement to ast::Error.
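A minimal sketch of the idea follows; the class names and the compatibility test are hypothetical, the actual type:: hierarchy differs.

  class Type
  {
  public:
    virtual ~Type () {}
    // Default: a type is only compatible with itself.
    virtual bool compatible_with (const Type& other) const
    { return this == &other; }
  };

  class Error : public Type
  {
  public:
    // The error type is compatible with everything: once an expression
    // has been diagnosed, further checks involving it stay silent.
    virtual bool compatible_with (const Type&) const { return true; }

    static const Error& instance ()
    {
      static Error e;
      return e;
    }
  };

  // The helper actually used by the checker should accept the pair as
  // soon as either side is the error type.
  inline bool compatible (const Type& a, const Type& b)
  {
    return a.compatible_with (b) || b.compatible_with (a);
  }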
let type weirdo = array of weirdo in print ("I'm a creep.\n") end
the answer is "yes", as nothing prevents this in the Tiger
specifications. Note that this type is not usable though.
kind_, kind_get, and so forth: these are to be used only in T5, you don't have to complete them now.
TypeVisitor is not a ConstVisitor. Since < is overloaded (for integers and strings), the translation needs to know the types of its arguments. In a traditional compiler, type checking and translation would be performed simultaneously, but our Tiger Compiler, in order to simplify its architecture, has a separate pass for each. Hence, the TypeVisitor will have to leave notes on the AST for the TranslateVisitor; therefore it cannot be a const visitor once T5 is implemented. It can perfectly well be const during T4.
Possible improvements include:
Runtime class. The runtime provides builtins such as print, getchar, and so forth. This stage requires knowing the signatures of these builtins to type check their uses; stage T5 needs them to implement the call protocol correctly, and some other parts of the compiler might need them too. It is bad design that the knowledge about these builtins is scattered in various places, but to avoid departing too much from Appel's modeling, we kept it this way. You might want to make a Runtime class anyway.
function substring (string: string, first: int, count: int) : string = __builtin "substring" /* ... */
This would keep all things together, and would make it easier to implement extensions in the language. For instance, one could add:
function abort () = __builtin "abort"
Walking that track goes beyond the simplicity and minimality that the Tiger Project aims at for (generalist) first-year students.
This section was last updated for EPITA-2005 on 2003-06-10.
At the end of this stage the compiler translates the AST into the high level intermediate representation, HIR for short. And, of course, all the errors of previous stages have been fixed.
Relevant lecture notes include intermediate.pdf.
Things to learn during this stage that you should remember:
Using std::binary_function to sort Temp*.
T5 can be started (and should be started if you don't want to finish it in a hurry) by first making sure your compiler can handle code that uses no variables. Then, you can complete your compiler to support more and more Tiger features.
This example is probably the simplest Tiger program.
$ tc --hir-display 0.tig /* == High Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body sxp const 0 # Epilogue label end Example 41: tc --hir-display 0.tig
You should then probably try to make more difficult programs with literals only. Arithmetic is one of the easiest tasks.
$ tc -H arith.tig /* == High Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body sxp binop (+) const 1 binop (*) const 2 const 3 # Epilogue label end Example 43: tc -H arith.tig
You should use havm to exercise your output.
$ tc -H arith.tig >arith.hir Example 44: tc -H arith.tig >arith.hir
$ havm arith.hir Example 45: havm arith.hir
Unfortunately, without actually printing something, you won't see the final result, which means you need to implement calls. Fortunately, you can ask havm for a verbose execution:
$ havm --trace arith.hir error-->plaining error-->unparsing error-->checking error-->checkingLow error-->evaling error--> call ( name Main ) [] error-->8.8-8.15: const 1 error-->10.12-10.19: const 2 error-->11.12-11.19: const 3 error-->9.8-11.19: binop (*) 2 3 error-->7.4-11.19: binop (+) 1 6 error-->6.0-11.19: sxp 7 error--> end call ( name Main ) [] = 0 Example 46: havm --trace arith.hir
If you look carefully, you will find an sxp 7 in there...
Then you are encouraged to implement control structures.
$ tc -H if-101.tig /* == High Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body seq cjump ne const 101 const 0 name l0 name l1 label l0 sxp const 102 jump name l2 label l1 sxp const 103 label l2 seq end # Epilogue label end Example 48: tc -H if-101.tig
And even more difficult control structure uses:
$ tc -H while-101.tig /* == High Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body seq label l1 cjump ne const 101 const 0 name l2 name l0 label l2 seq cjump ne const 102 const 0 name l3 name l4 label l3 jump name l0 jump name l5 label l4 sxp const 0 label l5 seq end jump name l1 label l0 seq end # Epilogue label end Example 50: tc -H while-101.tig
Our compiler optimizes the number of jumps needed to compute nested ifs, using translate::Ix where a plain use of translate::Cx, Nx, and Ex is possible but less efficient.
Consider the following sample: a naive implementation will probably produce too many successive cjump instructions:
$ tc --hir-naive -H boolean.tig /* == High Level Intermediate representation. == */ label l3 "OK\n" # Routine: Main label Main # Prologue # Body seq cjump ne eseq seq cjump ne const 11 const 0 name l0 name l1 label l0 move temp t0 const 1 jump name l2 label l1 move temp t0 const 22 jump name l2 label l2 seq end temp t0 const 0 name l4 name l5 label l4 sxp call name print name l3 call end jump name l6 label l5 sxp const 0 jump name l6 label l6 seq end # Epilogue label end Example 52: tc --hir-naive -H boolean.tig
$ tc --hir-naive -H boolean.tig >boolean-1.hir Example 53: tc --hir-naive -H boolean.tig >boolean-1.hir
$ havm --profile boolean-1.hir error-->/* Profiling. */ error-->fetches from temporary : 1 error-->fetches from memory : 0 error-->binary operations : 0 error-->function calls : 1 error-->stores to temporary : 1 error-->stores to memory : 0 error-->jumps : 2 error-->conditional jumps : 2 error-->/* Execution time. */ error-->number of cycles : 16 OK Example 54: havm --profile boolean-1.hir
If you carefully analyze the cause of this pessimization, it is related to the computation of an intermediary expression (the value of 11 | 22) which is later decoded as a condition. A proper implementation will produce:
$ tc -H boolean.tig /* == High Level Intermediate representation. == */ label l0 "OK\n" # Routine: Main label Main # Prologue # Body seq seq cjump ne const 11 const 0 name l4 name l5 label l4 cjump ne const 1 const 0 name l1 name l2 label l5 cjump ne const 22 const 0 name l1 name l2 seq end label l1 sxp call name print name l0 call end jump name l3 label l2 sxp const 0 label l3 seq end # Epilogue label end Example 55: tc -H boolean.tig
$ tc -H boolean.tig >boolean-2.hir Example 56: tc -H boolean.tig >boolean-2.hir
$ havm --profile boolean-2.hir error-->/* Profiling. */ error-->fetches from temporary : 0 error-->fetches from memory : 0 error-->binary operations : 0 error-->function calls : 1 error-->stores to temporary : 0 error-->stores to memory : 0 error-->jumps : 1 error-->conditional jumps : 2 error-->/* Execution time. */ error-->number of cycles : 13 OK Example 57: havm --profile boolean-2.hir
But the game becomes more interesting when you implement function calls (which is easier than compiling functions). print_int is probably the first builtin to implement:
$ tc -H print-101.tig >print-101.hir Example 59: tc -H print-101.tig >print-101.hir
$ havm print-101.hir 101 Example 60: havm print-101.hir
Complex values, arrays and records, also need calls to the runtime system:
let type list = { h: int, t: list } var list := list { h = 1, t = list { h = 2, t = nil } } in print_int (list.t.h); print ("\n") end
File 4.30: print-list.tig
$ tc -H print-list.tig /* == High Level Intermediate representation. == */ label l0 "\n" # Routine: Main label Main # Prologue move temp t2 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 # Body seq move mem temp $fp eseq seq move temp t1 call name malloc const 8 call end move mem binop (+) temp t1 const 0 const 1 move mem binop (+) temp t1 const 4 eseq seq move temp t0 call name malloc const 8 call end move mem binop (+) temp t0 const 0 const 2 move mem binop (+) temp t0 const 4 const 0 seq end temp t0 seq end temp t1 seq sxp call name print_int mem binop (+) mem binop (+) mem temp $fp const 4 const 0 call end sxp call name print name l0 call end seq end seq end # Epilogue move temp sp temp fp move temp fp temp t2 label end Example 62: tc -H print-list.tig
$ tc -H print-list.tig >print-list.hir Example 63: tc -H print-list.tig >print-list.hir
$ havm print-list.hir 2 Example 64: havm print-list.hir
Here is an example which demonstrates the usefulness of information about escapes: when escaping variables are not computed, they are all stored on the stack:
let var a := 1 var b := 2 var c := 3 in a := 2; c := a + b + c; print_int (c); print ("\n") end
File 4.31: vars.tig
$ tc -H vars.tig /* == High Level Intermediate representation. == */ label l0 "\n" # Routine: Main label Main # Prologue move temp t0 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 12 # Body seq move mem temp $fp const 1 seq move mem binop (+) temp $fp const -4 const 2 seq move mem binop (+) temp $fp const -8 const 3 seq move mem temp $fp const 2 move mem binop (+) temp $fp const -8 binop (+) binop (+) mem temp $fp mem binop (+) temp $fp const -4 mem binop (+) temp $fp const -8 sxp call name print_int mem binop (+) temp $fp const -8 call end sxp call name print name l0 call end seq end seq end seq end seq end # Epilogue move temp sp temp fp move temp fp temp t0 label end Example 66: tc -H vars.tig
But once the computation of escaping variables is implemented, we know that none escape in this example, hence they can be stored in temporaries:
$ tc -eH vars.tig /* == High Level Intermediate representation. == */ label l0 "\n" # Routine: Main label Main # Prologue # Body seq move temp t0 const 1 seq move temp t1 const 2 seq move temp t2 const 3 seq move temp t0 const 2 move temp t2 binop (+) binop (+) temp t0 temp t1 temp t2 sxp call name print_int temp t2 call end sxp call name print name l0 call end seq end seq end seq end seq end # Epilogue label end Example 67: tc -eH vars.tig
$ tc -eH vars.tig >vars.hir Example 68: tc -eH vars.tig >vars.hir
$ havm vars.hir 7 Example 69: havm vars.hir
Then, you should implement the declaration of functions:
let function fact (i: int) : int = if i = 0 then 1 else i * fact (i - 1) in print_int (fact (15)); print ("\n") end
File 4.32: fact15.tig
$ tc -H fact15.tig /* == High Level Intermediate representation. == */ # Routine: fact label l0 # Prologue move temp t1 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 8 move mem temp $fp temp i0 move mem binop (+) temp $fp const -4 temp i1 # Body move temp $v0 eseq seq cjump eq mem binop (+) temp $fp const -4 const 0 name l1 name l2 label l1 move temp t0 const 1 jump name l3 label l2 move temp t0 binop (*) mem binop (+) temp $fp const -4 call name l0 mem temp $fp binop (-) mem binop (+) temp $fp const -4 const 1 call end label l3 seq end temp t0 # Epilogue move temp sp temp fp move temp fp temp t1 label end label l4 "\n" # Routine: Main label Main # Prologue # Body seq sxp call name print_int call name l0 temp $fp const 15 call end call end sxp call name print name l4 call end seq end # Epilogue label end Example 71: tc -H fact15.tig
$ tc -H fact15.tig >fact15.hir Example 72: tc -H fact15.tig >fact15.hir
$ havm fact15.hir 2004310016 Example 73: havm fact15.hir
And finally, you should support escaping variables. See File 4.18.
$ tc -eH variable-escapes.tig /* == High Level Intermediate representation. == */ label l0 "I rule the world!\n" label l1 "Peace on Earth for humans of good will.\n" # Routine: print_slogan label l2 # Prologue move temp t2 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 move mem temp $fp temp i0 move temp t1 temp i1 # Body seq sxp call name print temp t1 call end sxp call name print mem mem temp $fp call end seq end # Epilogue move temp sp temp fp move temp fp temp t2 label end # Routine: Main label Main # Prologue move temp t3 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 # Body seq move mem temp $fp name l0 seq move temp t0 name l1 sxp call name l2 temp $fp temp t0 call end seq end seq end # Epilogue move temp sp temp fp move temp fp temp t3 label end Example 74: tc -eH variable-escapes.tig
Some code is provided, see T6 Given Code. See src/temp, src/tree, src/frame, src/translate.
You are encouraged to try first very simple examples: nil, 1 + 2, "foo" < "bar" etc. Then consider supporting variables, and finally handle the case of the functions.
- TypeVisitor: the TranslateVisitor often needs additional type information to proceed, especially expression versus instruction. Hence, you'll have to update the TypeVisitor to leave notes on the AST using kind_set and so forth.
- translate::ProcFrag::print, which outputs the routines themselves plus the glue code (allocating the frame, etc.).
This section documents possible extensions you could implement in T5.
Implementing bounds checking is quite simple: have the program die when the program accesses an invalid subscript in an array. For instance, the following code “succeeds” with a non-bounds-checking compiler.
let type int_array = array of int var size := 2 var arr1 := int_array [size] of 0 var arr2 := int_array [size] of 0 var two := 2 var m_one := -1 in arr1[two] := 3; arr2[m_one] := -1; print_int (arr1[1]); print ("\n"); print_int (arr2[0]); print ("\n") end
File 4.33: bounds-violation.tig
$ tc -H bounds-violation.tig >bounds-violation.hir Example 76: tc -H bounds-violation.tig >bounds-violation.hir
$ havm bounds-violation.hir -1 3 Example 77: havm bounds-violation.hir
When run with --bounds-checking, your compiler produces code that diagnoses such cases, and exits with status 120. Something like:
error-->bounds-violation.tig:8.2-17: index out of arr1 bounds (0 .. 1): 2 =>120
Warning: this optimization is difficult to do perfectly, and therefore, expect a big bonus.
In a first and conservative approach, the compiler considers that all the functions (but the builtins!) need a static link. This is correct, but inefficient: for instance, the traditional fact function will spend almost as much time handling the static link as handling its real argument.
Some functions need a static link, but don't need to save it on the stack. For instance, in the following example:
let var foo := 1 function foo () : int = foo in foo () end
the function foo does need a static link to access the variable foo, but does not need to store its static link on the stack.
It is suggested to address these problems in the following order:
$ tc -E fact.tig /* == Escapes. == */ let function fact (/* escaping sl *//* escaping */ n : int) : int = if (n = 0) then 1 else (n * fact ( (n - 1))) in fact (10) end $ tc -eE fact.tig /* == Escapes. == */ let function fact (n : int) : int = if (n = 0) then 1 else (n * fact ( (n - 1))) in fact (10) end
the calls and the ProcFrag prologues.
$ tc -eE escaping-sl.tig /* == Escapes. == */ let var toto := 1 function outer (/* escaping sl */) : int = let function inner (/* sl */) : int = toto in inner () end in outer () end
Watch out: it is not trivial to find the minimum. What do you think about the static link of the function sister below?
let var toto := 1 function outer () : int = let function inner () : int = toto in inner () end function sister () : int = outer () in sister () end
Possible improvements include:
Using boost::variant to implement Temp and Label. Temp and Label clearly implement a union in the sense of C. But C++ virtually forbids objects in unions: only PODs are allowed; this is why our design does not use one.
Some people have worked hard to implement unions à la C++, i.e., with type safety, polymorphism, etc. These unions are called "discriminated unions" or "variants", to follow the vocabulary introduced by Caml. See the papers from Andrei Alexandrescu: Discriminated Unions (i), Discriminated Unions (ii), Discriminated Unions (iii) for an introduction to the techniques. We would use boost::variant (see Boost.org) if this material were not too advanced for first year students.
I strongly encourage you to read these enlightening articles.
The translation to Tree creates new nodes for equal expressions; for instance, two uses of the variable foo lead to two equal instantiations of tree::Temp. The same applies to more complex constructs, such as the same translation if foo is actually a frame-resident variable, etc. Because memory consumption may have a negative impact on performance, it is desirable to implement maximal sharing: whenever a Tree is needed, we first check whether it already exists, and then reuse it. This must be done recursively: the translation of (x + x) * (x + x) should have a single instantiation of x + x instead of two, but also a single instantiation of x instead of four.
Node sharing makes some algorithms, such as rewriting, more complex, especially wrt memory management. Garbage collection is almost required, but fortunately the nodes of Tree are reference counted! Therefore, almost everything is ready to implement maximal node sharing.
See spot for an explanation of how this approach was successfully implemented. See the ATermLibrary for a general implementation of maximally shared trees.
This section was last updated for EPITA-2005 on 2003-05-15.
There will be no additional code: there are no "holes" to fill, you have to write the whole thing. Consequently, you may start T6 as soon as you want.
At the end of this stage, the compiler produces the low level intermediate representation: LIR. LIR is a subset of the HIR: some patterns are forbidden. This is why this stage is also named canonicalization.
Relevant lecture notes include intermediate.pdf.
Things to learn during this stage that you should remember:
std::list::splice, std::find_if, std::unary_function, std::not1, etc.
There are several stages in T6.
The first task in T6 is getting rid of all the eseqs. To do this, you have to move the statement part of each eseq out of the expression, so that it is executed just before the statement that contains it, and keep the expression part in place. Compare for instance the HIR to the LIR in the following case:
let function print_ints (a: int, b: int) = (print_int (a); print (", "); print_int (b); print ("\n")) var a := 0 in print_ints (1, (a := a + 1; a)) end
File 4.34: preincr-1.tig
One possible HIR translation is:
$ tc -eH preincr-1.tig /* == High Level Intermediate representation. == */ label l1 ", " label l2 "\n" # Routine: print_ints label l0 # Prologue move temp t2 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 move mem temp $fp temp i0 move temp t0 temp i1 move temp t1 temp i2 # Body seq sxp call name print_int temp t0 call end sxp call name print name l1 call end sxp call name print_int temp t1 call end sxp call name print name l2 call end seq end # Epilogue move temp sp temp fp move temp fp temp t2 label end # Routine: Main label Main # Prologue # Body seq move temp t3 const 0 sxp call name l0 temp $fp const 1 eseq move temp t3 binop (+) temp t3 const 1 temp t3 call end seq end # Epilogue label end Example 79: tc -eH preincr-1.tig
A possible canonicalization is then:
$ tc -eL preincr-1.tig /* == Low Level Intermediate representation. == */ label l1 ", " label l2 "\n" # Routine: print_ints label l0 # Prologue move temp t2 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 move mem temp $fp temp i0 move temp t0 temp i1 move temp t1 temp i2 # Body seq label l3 sxp call name print_int temp t0 call end sxp call name print name l1 call end sxp call name print_int temp t1 call end sxp call name print name l2 call end label l4 seq end # Epilogue move temp sp temp fp move temp fp temp t2 label end # Routine: Main label Main # Prologue # Body seq label l5 move temp t3 const 0 move temp t5 temp $fp move temp t3 binop (+) temp t3 const 1 sxp call name l0 temp t5 const 1 temp t3 call end label l6 seq end # Epilogue label end Example 80: tc -eL preincr-1.tig
But please note that the example above is simple because 1 commutes with (a := a + 1; a): the order does not matter. But if you change the 1 into a, then you cannot exchange a and (a := a + 1; a), so the translation is different. Compare the previous LIR with the following, and pay attention to the differences:
let function print_ints (a: int, b: int) = (print_int (a); print (", "); print_int (b); print ("\n")) var a := 0 in print_ints (a, (a := a + 1; a)) end
File 4.35: preincr-2.tig
$ tc -eL preincr-2.tig /* == Low Level Intermediate representation. == */ label l1 ", " label l2 "\n" # Routine: print_ints label l0 # Prologue move temp t2 temp fp move temp fp temp sp move temp sp binop (-) temp sp const 4 move mem temp $fp temp i0 move temp t0 temp i1 move temp t1 temp i2 # Body seq label l3 sxp call name print_int temp t0 call end sxp call name print name l1 call end sxp call name print_int temp t1 call end sxp call name print name l2 call end label l4 seq end # Epilogue move temp sp temp fp move temp fp temp t2 label end # Routine: Main label Main # Prologue # Body seq label l5 move temp t3 const 0 move temp t5 temp $fp move temp t6 temp t3 move temp t3 binop (+) temp t3 const 1 sxp call name l0 temp t5 temp t6 temp t3 call end label l6 seq end # Epilogue label end Example 82: tc -eL preincr-2.tig
As you can see, the output is the same for the HIR and the LIR:
$ tc -eH preincr-2.tig >preincr-2.hir Example 83: tc -eH preincr-2.tig >preincr-2.hir
$ havm preincr-2.hir 0, 1 Example 84: havm preincr-2.hir
$ tc -eL preincr-2.tig >preincr-2.lir Example 85: tc -eL preincr-2.tig >preincr-2.lir
$ havm preincr-2.lir 0, 1 Example 86: havm preincr-2.lir
Be very careful when dealing with mem. For instance, rewriting something like:
call (foo, eseq (move (temp t, const 51), temp t))
into
move temp t1, temp t
move temp t, const 51
call (foo, temp t)
is dead wrong: temp t is not a mere subexpression, it is being defined here. You should produce:
move temp t, const 51
call (foo, temp t)
Another danger is the handling of move (mem, ). For instance:
move (mem foo, x)
must be rewritten into:
move (temp t, foo)
move (mem (temp t), x)
not as:
move (temp t, mem (foo))
move (temp t, x)
In other words, the first subexpression of move (mem (foo), ) is foo, not mem (foo). The following example is a good crash test against this problem:
let type int_array = array of int var tab := int_array [2] of 51 in tab[0] := 100; tab[1] := 200; print_int (tab[0]); print ("\n"); print_int (tab[1]); print ("\n") end
File 4.36: move-mem.tig
$ tc -eL move-mem.tig >move-mem.lir Example 88: tc -eL move-mem.tig >move-mem.lir
$ havm move-mem.lir 100 200 Example 89: havm move-mem.lir
You also ought to get rid of nested calls:
$ tc -L nested-calls.tig /* == Low Level Intermediate representation. == */ label l0 "\n" # Routine: Main label Main # Prologue # Body seq label l1 move temp t1 call name ord name l0 call end move temp t2 call name chr temp t1 call end sxp call name print temp t2 call end label l2 seq end # Epilogue label end Example 91: tc -L nested-calls.tig
In fact there are only two valid call forms: sxp (call (...)), and move (temp (...), call (...)).
Note that, contrary to C, the HIR and LIR always denote the same value. For instance the following Tiger code:
let var a := 1 function a (t: int) : int = (a := a + 1; print_int (t); print (" -> "); print_int (a); print ("\n"); a) var b := a (1) + a (2) * a (3) in print_int (b); print ("\n") end
File 4.38: seq-point.tig
should always produce:
$ tc -L seq-point.tig >seq-point.lir Example 93: tc -L seq-point.tig >seq-point.lir
$ havm seq-point.lir 1 -> 2 2 -> 3 3 -> 4 14 Example 94: havm seq-point.lir
independently of which IR you ran. Note that it has nothing to do with the precedence of the operators!
In C, you have no such guarantee: the following program can give different results with different compilers and/or on different architectures.
#include <stdio.h>

int a_ = 1;

int a (int t)
{
  ++a_;
  printf ("%d -> %d\n", t, a_);
  return a_;
}

int main (void)
{
  int b = a (1) + a (2) * a (3);
  printf ("%d\n", b);
  return 0;
}
Once your eseqs and calls are canonicalized, normalize the cjumps: they must be followed by their "false" label. This goes in two steps:
A basic block is a sequence of code starting with a label, ending with a jump (conditional or not), and with no jumps, no labels inside.
Now put all the basic blocks into a single sequence.
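As an illustration of the first step, here is a minimal sketch of the partitioning into basic blocks; the Stm structure and the helpers are hypothetical stand-ins for the real tree:: statements.

  #include <list>
  #include <sstream>
  #include <string>
  #include <vector>

  struct Stm                      // Stand-in for a LIR statement.
  {
    enum Kind { label, jump, cjump, other } kind;
    std::string name;             // Label name when kind == label.
  };

  // Invent a fresh label name.
  static std::string fresh_label ()
  {
    static int n = 0;
    std::ostringstream o;
    o << "bb" << n++;
    return o.str ();
  }

  typedef std::vector<Stm> block_type;

  std::list<block_type>
  basic_blocks (const std::vector<Stm>& stms)
  {
    std::list<block_type> blocks;
    block_type current;
    for (std::vector<Stm>::const_iterator i = stms.begin ();
         i != stms.end (); ++i)
      {
        // A label starts a new block: close the current one first.
        if (i->kind == Stm::label && !current.empty ())
          {
            blocks.push_back (current);
            current.clear ();
          }
        // Every block must start with a label: invent one if needed.
        if (current.empty () && i->kind != Stm::label)
          {
            Stm l;
            l.kind = Stm::label;
            l.name = fresh_label ();
            current.push_back (l);
          }
        current.push_back (*i);
        // A jump or cjump ends the block.
        if (i->kind == Stm::jump || i->kind == Stm::cjump)
          {
            blocks.push_back (current);
            current.clear ();
          }
      }
    // A real implementation also appends, to each block that does not
    // end with a jump, a jump to the next block's label (and a final
    // jump to the routine's exit label).
    if (!current.empty ())
      blocks.push_back (current);
    return blocks;
  }

The second step then reorders these blocks so that each cjump is immediately followed by its "false" label, adjusting the cjump when that is not possible.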
The following example highlights the need for new labels: at least one for the entry point, and one for the exit point:
$ tc -L 1-and-2.tig /* == Low Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body seq label l3 cjump ne const 1 const 0 name l0 name l1 label l1 label l2 jump name l4 label l0 jump name l2 label l4 seq end # Epilogue label end Example 96: tc -L 1-and-2.tig
The following example contains many jumps. Compare the HIR to the LIR:
$ tc -H broken-while.tig /* == High Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body seq label l1 seq cjump ne const 10 const 0 name l8 name l9 label l8 cjump ne const 1 const 0 name l2 name l0 label l9 cjump ne const 20 const 0 name l2 name l0 seq end label l2 seq seq cjump ne const 30 const 0 name l6 name l7 label l6 cjump ne const 1 const 0 name l3 name l4 label l7 cjump ne const 40 const 0 name l3 name l4 seq end label l3 jump name l0 jump name l5 label l4 jump name l0 label l5 seq end jump name l1 label l0 seq end # Epilogue label end Example 98: tc -H broken-while.tig
$ tc -L broken-while.tig /* == Low Level Intermediate representation. == */ # Routine: Main label Main # Prologue # Body seq label l10 label l1 cjump ne const 10 const 0 name l8 name l9 label l9 cjump ne const 20 const 0 name l2 name l0 label l0 jump name l11 label l2 cjump ne const 30 const 0 name l6 name l7 label l7 cjump ne const 40 const 0 name l3 name l4 label l4 jump name l0 label l3 jump name l0 label l6 cjump ne const 1 const 0 name l3 name l13 label l13 jump name l4 label l8 cjump ne const 1 const 0 name l2 name l14 label l14 jump name l0 label l11 seq end # Epilogue label end Example 99: tc -L broken-while.tig
Some code is provided: 2005-tc-6.1.tar.bz2. The transition from the previous versions can be done thanks to the following diffs: 2005-tc-4.3-6.0.diff, 2005-tc-6.0-6.1.diff.
It includes most of the canonicalization: everything you need.
Possible improvements include:
This section was last updated for EPITA-2004 and EPITA-2005 on 2003-07-02.
Please note that the 2005-T7 delivery is optional: there will be no grade, and a single upload will be accepted. The tests from T0 to T7 will be run on the tarball. The goal is to help you see your mistakes and how your T7 is doing, so that you can proceed to T8 in peace. There will be no penalty if you don't take advantage of this possibility.
At the end of this stage, the compiler produces the very low level intermediate representation: ASSEM. This output is target dependent, and we aim at MIPS, as we use Mipsy to run it.
Relevant lecture notes include instr-selection.pdf.
Things to learn during this stage that you should remember:
auto_ptr: we use an auto_ptr to manipulate without effort a pointer to the current target, and to guarantee it is released (delete) at the end of the run.
The goal of T7 is straightforward: starting from the LIR, generate the MIPS instructions, except that you don't have actual registers: we still heavily use Temps. Register allocation will be done in a later stage, T9.
$ tc --inst-display seven.tig # == Final assembler ouput. == # # Routine: Main t_main: move t5, $s0 move t6, $s1 move t7, $s2 move t8, $s3 move t9, $s4 move t10, $s5 move t11, $s6 move t12, $s7 l0: li t3, 2 mul t2, t3, 3 li t4, 1 add t1, t4, t2 l1: move $s0, t5 move $s1, t6 move $s2, t7 move $s3, t8 move $s4, t9 move $s5, t10 move $s6, t11 move $s7, t12 Example 101: tc --inst-display seven.tig
Please note that at this stage, the control flow analysis and the liveness analysis are not performed yet, therefore the compiler cannot know which registers really have to be saved. That's why, in the previous output, it "uselessly" saves all the callee-save registers on entry to main. The next stage, which combines control flow analysis, liveness analysis, and register allocation, will get rid of these useless saves. For your information, it results in:
$ tc -sI seven.tig # == Final assembler ouput. == # # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) l0: li $t0, 2 mul $t1, $t0, 3 li $t0, 1 add $t0, $t0, $t1 l1: lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 102: tc -sI seven.tig
A delicate part of this exercise is handling the function calls:
let function add (x: int, y: int) : int = x + y in print_int (add (1, (add (2, 3)))); print ("\n") end
File 4.42: add.tig
$ tc -e --mipsy-display add.tig # == Final assembler ouput. == # # Routine: add l0: sw $fp, -4 ($sp) move $fp, $sp sub $sp, $sp, 12 sw $ra, -8 ($fp) sw $a0, ($fp) move t0, $a1 move t1, $a2 move t7, $s0 move t8, $s1 move t9, $s2 move t10, $s3 move t11, $s4 move t12, $s5 move t13, $s6 move t14, $s7 l2: add t6, t0, t1 move $v0, t6 l3: move $s0, t7 move $s1, t8 move $s2, t9 move $s3, t10 move $s4, t11 move $s5, t12 move $s6, t13 move $s7, t14 lw $ra, -8 ($fp) move $sp, $fp lw $fp, -4 ($fp) jr $ra .data l1: .word 1 .asciiz "\n" .text # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) move t19, $s0 move t20, $s1 move t21, $s2 move t22, $s3 move t23, $s4 move t24, $s5 move t25, $s6 move t26, $s7 l4: move $a0, $fp li t15, 2 move $a1, t15 li t16, 3 move $a2, t16 jal l0 move t4, $v0 move $a0, $fp li t17, 1 move $a1, t17 move $a2, t4 jal l0 move t5, $v0 move $a0, t5 jal print_int la t18, l1 move $a0, t18 jal print l5: move $s0, t19 move $s1, t20 move $s2, t21 move $s3, t22 move $s4, t23 move $s5, t24 move $s6, t25 move $s7, t26 lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 104: tc -e --mipsy-display add.tig
Once your function calls work properly, you can start using mipsy to check the behavior of your compiler.
$ tc -eH add.tig >add.hir Example 105: tc -eH add.tig >add.hir
$ havm add.hir 6 Example 106: havm add.hir
Unfortunately, you need to adjust the output of tc, which uses temporaries named t123, to mipsy conventions: $x123.
$ tc -eR --mipsy-display add.tig >add.instr Example 107: tc -eR --mipsy-display add.tig >add.instr
$ sed -e's/\([^$a-z]\)t\([0-9][0-9]*\)/\1$x\2/g' add.instr >add.mipsy Example 108: sed -e's/\([^$a-z]\)t\([0-9][0-9]*\)/\1$x\2/g' add.instr >add.mipsy
$ mipsy --unlimited-regs --execute add.mipsy 6 Example 109: mipsy --unlimited-regs --execute add.mipsy
You must also complete the runtime. No difference must be observable between a run with havm and another with mipsy:
$ tc -eH substring-0-1-1.tig >substring-0-1-1.hir Example 111: tc -eH substring-0-1-1.tig >substring-0-1-1.hir
$ havm substring-0-1-1.hir substring: arguments out of bounds =>120 Example 112: havm substring-0-1-1.hir
$ tc -e --mipsy-display substring-0-1-1.tig # == Final assembler ouput. == # .data l0: .word 0 .asciiz "" .text # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) move t4, $s0 move t5, $s1 move t6, $s2 move t7, $s3 move t8, $s4 move t9, $s5 move t10, $s6 move t11, $s7 l1: la t1, l0 move $a0, t1 li t2, 1 move $a1, t2 li t3, 1 move $a2, t3 jal substring l2: move $s0, t4 move $s1, t5 move $s2, t6 move $s3, t7 move $s4, t8 move $s5, t9 move $s6, t10 move $s7, t11 lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 113: tc -e --mipsy-display substring-0-1-1.tig
$ tc -eR --mipsy-display substring-0-1-1.tig >substring-0-1-1.instr Example 114: tc -eR --mipsy-display substring-0-1-1.tig >substring-0-1-1.instr
$ sed -e's/\([^$a-z]\)t\([0-9][0-9]*\)/\1$x\2/g' substring-0-1-1.instr >substring-0-1-1.mipsy Example 115: sed -e's/\([^$a-z]\)t\([0-9][0-9]*\)/\1$x\2/g' substring-0-1-1.instr >substring-0-1-1.mipsy
$ mipsy --unlimited-regs --execute substring-0-1-1.mipsy substring: arguments out of bounds =>120 Example 116: mipsy --unlimited-regs --execute substring-0-1-1.mipsy
Below is listed where to find the tarball depending on your class. For more information about the T7 code delivered see src/target, src/assem, src/codegen.
The additional code is provided as:
There are two ways to continue the project:
# Be in the new tarball before running this.
for i in $(find .)
do
  if test ! -f ../my-old-working-directory/$i; then
    cp $i ../my-old-working-directory/$i
  fi
done
And then, build it step by step.
There is not much code to write:
- Codegen::munchMove (src/codegen/mips/codegen.cc).
- SpimAssembly::move_build (src/codegen/mips/spim-assembly.cc): build a move instruction using the MIPS 2000 standard instruction set.
- SpimAssembly::binop_build (src/codegen/mips/spim-assembly.cc): build arithmetic binary operations (addition, multiplication, etc.) using the MIPS 2000 standard instruction set.
- SpimAssembly::load_build and SpimAssembly::store_build (src/codegen/mips/spim-assembly.cc): build a load (respectively a store) instruction using the MIPS 2000 standard instruction set. Here, the indirect addressing mode is used.
- SpimAssembly::cjump_build (src/codegen/mips/spim-assembly.cc): translate conditional branch instructions (branch if equal, if lower than, etc.) into MIPS 2000 assembly.
- In the runtime: strcmp, print_int, substring, and concat.
Information on MIPS 2000 assembly instructions may be found in SPIM manual.
Completing the following routines will be needed during register allocation only (see T9):
- Codegen::allocate_frame (src/codegen/mips/codegen.cc)
- Codegen::rewrite_program (src/codegen/mips/codegen.cc)
Possible improvements include:
This section was last updated for EPITA-2004 and EPITA-2005 on 2003-07-02.
Relevant lecture notes include liveness.pdf.
Things to learn during this stage that you should remember:
Branching is of course a most interesting feature to exercise:
$ tc -I ors.tig # == Final assembler ouput. == # # Routine: Main t_main: move t4, $s0 move t5, $s1 move t6, $s2 move t7, $s3 move t8, $s4 move t9, $s5 move t10, $s6 move t11, $s7 l5: li t1, 1 bne t1, 0, l3 l4: li t2, 2 bne t2, 0, l0 l1: l2: j l6 l0: j l2 l3: li t3, 1 bne t3, 0, l0 l7: j l1 l6: move $s0, t4 move $s1, t5 move $s2, t6 move $s3, t7 move $s4, t8 move $s5, t9 move $s6, t10 move $s7, t11 Example 118: tc -I ors.tig
$ tc -F ors.tig Example 119: tc -F ors.tig
File 120: Main-Main-flow.dot
$ tc -V ors.tig Example 121: tc -V ors.tig
File 122: Main-Main-liveness.dot
$ tc -N ors.tig Example 123: tc -N ors.tig
File 124: Main-Main-interference.dot
You are provided with the following code:
To read the description of the new modules, see src/graph, src/liveness.
- FlowGraph is actually constructed from the assembly fragments.
- Liveness (a decorated FlowGraph) is built from the assembly instructions.
- In InterferenceGraph::compute_liveness, build the graph (a sketch of the classic dataflow iteration follows this list).
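As a reminder of the underlying computation, here is a minimal sketch of the classic backward dataflow iteration, in[n] = use[n] U (out[n] - def[n]) and out[n] = union of in[s] over the successors s of n. The Node structure and the string temporaries are hypothetical stand-ins for the real FlowGraph.

  #include <set>
  #include <string>
  #include <vector>

  typedef std::set<std::string> temp_set;

  struct Node
  {
    temp_set use, def;          // Temporaries read / written here.
    std::vector<int> succ;      // Indices of the successor nodes.
  };

  void
  liveness (const std::vector<Node>& flow,
            std::vector<temp_set>& in, std::vector<temp_set>& out)
  {
    in.assign (flow.size (), temp_set ());
    out.assign (flow.size (), temp_set ());
    bool changed = true;
    while (changed)             // Iterate until the fixed point.
      {
        changed = false;
        // Going backwards converges faster, but any order is correct.
        for (int n = int (flow.size ()) - 1; n >= 0; --n)
          {
            temp_set new_out;
            for (size_t s = 0; s < flow[n].succ.size (); ++s)
              {
                const temp_set& si = in[flow[n].succ[s]];
                new_out.insert (si.begin (), si.end ());
              }
            temp_set new_in = flow[n].use;
            for (temp_set::const_iterator t = new_out.begin ();
                 t != new_out.end (); ++t)
              if (!flow[n].def.count (*t))
                new_in.insert (*t);
            if (new_in != in[n] || new_out != out[n])
              {
                in[n] = new_in;
                out[n] = new_out;
                changed = true;
              }
          }
      }
  }

Two temporaries then interfere when one is defined at a node where the other is live-out (with the usual exception for move instructions).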
Possible improvements include:
This section was last updated for EPITA-2004 and EPITA-2005 on 2003-08-19.
At the end of this stage, the compiler produces code that is runnable using Mipsy.
Relevant lecture notes include regalloc.pdf.
Things to learn during this stage that you should remember:
This section will not demonstrate the output of the option -S, --asm-display, since it includes the Tiger runtime, which is quite long. We simply use -I, --instr-display, which has the same effect once the registers are allocated, i.e., once -s, --asm-compute has been executed. In short: we use -sI instead of -S to save space.
Allocating registers in the main function when there is no register pressure is easy, as, in particular, there are no spills. A direct consequence is that many moves are now useless and have disappeared. See File 4.41, for instance:
$ tc -sI seven.tig # == Final assembler ouput. == # # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) l0: li $t0, 2 mul $t1, $t0, 3 li $t0, 1 add $t0, $t0, $t1 l1: lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 125: tc -sI seven.tig
$ tc -S seven.tig >seven.s Example 126: tc -S seven.tig >seven.s
$ mipsy --execute seven.s Example 127: mipsy --execute seven.s
Another means to display the result of register allocation consists in reporting the mapping from temps to actual registers:
$ tc -s --tempmap-display seven.tig /* Temporary map. */ t1 -> $t0 t2 -> $t1 t3 -> $t0 t4 -> $t0 t5 -> $s0 t6 -> $s1 t7 -> $s2 t8 -> $s3 t9 -> $s4 t10 -> $s5 t11 -> $s6 t12 -> $s7 Example 128: tc -s --tempmap-display seven.tig
Of course it is much better to see what is going on:
$ tc -sI print-seven.tig # == Final assembler ouput. == # .data l0: .word 1 .asciiz "\n" .text # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) l1: li $t0, 2 mul $t1, $t0, 3 li $t0, 1 add $a0, $t0, $t1 jal print_int la $a0, l0 jal print l2: lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 130: tc -sI print-seven.tig
$ tc -S print-seven.tig >print-seven.s Example 131: tc -S print-seven.tig >print-seven.s
$ mipsy --execute print-seven.s 7 Example 132: mipsy --execute print-seven.s
To torture your compiler, you ought to use many temporaries. To be honest, ours is quite slow: it spends way too much time in register allocation.
let var a00 := 00 var a55 := 55 var a11 := 11 var a66 := 66 var a22 := 22 var a77 := 77 var a33 := 33 var a88 := 88 var a44 := 44 var a99 := 99 in print_int (0 + a00 + a00 + a55 + a55 + a11 + a11 + a66 + a66 + a22 + a22 + a77 + a77 + a33 + a33 + a88 + a88 + a44 + a44 + a99 + a99); print ("\n") end
File 4.46: print-many.tig
$ tc -eIs --tempmap-display -I --time-report print-many.tig error-->Execution times (seconds) error--> 1: parse : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 8: liveness analysis : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 8: liveness edges : 0 ( 0%) 0.01 ( 100%) 0.01 ( 20%) error--> 9: coalesce : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 9: register allocation : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error-->Cumulated times (seconds) error--> 1: parse : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 7: inst-display : 0.03 ( 75%) 0.01 ( 100%) 0.04 ( 80%) error--> 8: liveness analysis : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 8: liveness edges : 0 ( 0%) 0.01 ( 100%) 0.01 ( 20%) error--> 9: asm-compute : 0.03 ( 75%) 0.01 ( 100%) 0.04 ( 80%) error--> 9: coalesce : 0.01 ( 25%) 0 ( 0%) 0.01 ( 20%) error--> 9: register allocation : 0.03 ( 75%) 0.01 ( 100%) 0.04 ( 80%) error--> rest : 0.04 ( 100%) 0.01 ( 100%) 0.05 ( 100%) error--> TOTAL (seconds) : 0.04 user, 0.01 system, 0.05 wall # == Final assembler ouput. == # .data l0: .word 1 .asciiz "\n" .text # Routine: Main t_main: move t33, $s0 move t34, $s1 move t35, $s2 move t36, $s3 move t37, $s4 move t38, $s5 move t39, $s6 move t40, $s7 l1: li t0, 0 li t1, 55 li t2, 11 li t3, 66 li t4, 22 li t5, 77 li t6, 33 li t7, 88 li t8, 44 li t9, 99 li t31, 0 add t30, t31, t0 add t29, t30, t0 add t28, t29, t1 add t27, t28, t1 add t26, t27, t2 add t25, t26, t2 add t24, t25, t3 add t23, t24, t3 add t22, t23, t4 add t21, t22, t4 add t20, t21, t5 add t19, t20, t5 add t18, t19, t6 add t17, t18, t6 add t16, t17, t7 add t15, t16, t7 add t14, t15, t8 add t13, t14, t8 add t12, t13, t9 add t11, t12, t9 move $a0, t11 jal print_int la t32, l0 move $a0, t32 jal print l2: move $s0, t33 move $s1, t34 move $s2, t35 move $s3, t36 move $s4, t37 move $s5, t38 move $s6, t39 move $s7, t40 /* Temporary map. */ t0 -> $a0 t1 -> $t9 t2 -> $t8 t3 -> $t7 t4 -> $t6 t5 -> $t5 t6 -> $t4 t7 -> $t3 t8 -> $t2 t9 -> $t1 t11 -> $a0 t12 -> $t0 t13 -> $t0 t14 -> $t0 t15 -> $t0 t16 -> $t0 t17 -> $t0 t18 -> $t0 t19 -> $t0 t20 -> $t0 t21 -> $t0 t22 -> $t0 t23 -> $t0 t24 -> $t0 t25 -> $t0 t26 -> $t0 t27 -> $t0 t28 -> $t0 t29 -> $t0 t30 -> $t0 t31 -> $t0 t32 -> $a0 t33 -> $s0 t34 -> $s1 t35 -> $s2 t36 -> $s3 t37 -> $s4 t38 -> $s5 t39 -> $s6 t40 -> $s7 # == Final assembler ouput. == # .data l0: .word 1 .asciiz "\n" .text # Routine: Main t_main: sw $fp, ($sp) move $fp, $sp sub $sp, $sp, 8 sw $ra, -4 ($fp) l1: li $a0, 0 li $t9, 55 li $t8, 11 li $t7, 66 li $t6, 22 li $t5, 77 li $t4, 33 li $t3, 88 li $t2, 44 li $t1, 99 li $t0, 0 add $t0, $t0, $a0 add $t0, $t0, $a0 add $t0, $t0, $t9 add $t0, $t0, $t9 add $t0, $t0, $t8 add $t0, $t0, $t8 add $t0, $t0, $t7 add $t0, $t0, $t7 add $t0, $t0, $t6 add $t0, $t0, $t6 add $t0, $t0, $t5 add $t0, $t0, $t5 add $t0, $t0, $t4 add $t0, $t0, $t4 add $t0, $t0, $t3 add $t0, $t0, $t3 add $t0, $t0, $t2 add $t0, $t0, $t2 add $t0, $t0, $t1 add $a0, $t0, $t1 jal print_int la $a0, l0 jal print l2: lw $ra, -4 ($fp) move $sp, $fp lw $fp, ($fp) jr $ra Example 134: tc -eIs --tempmap-display -I --time-report print-many.tig
The code is provided under the following forms:
color_register attribute for Cpu, that the runtime properly sets the exit status, and that its error messages are standardized.
To read the description of the new module, see src/regalloc.
InterferenceGraph was upgraded, which will require some modifications in your existing code. Rest assured that little work will actually be needed: the main modification is that moves are now encoded as a list of pairs, whereas before we had a map from each node to the set of nodes it is move-related to.
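To picture the change, here is a minimal sketch contrasting the two encodings; the type names are hypothetical and do not reflect the actual InterferenceGraph interface.

#include <list>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a node of the interference graph.
typedef std::string node_type;

// Old encoding: each node mapped to the set of nodes it is move-related to.
typedef std::map<node_type, std::set<node_type> > old_moves_type;

// New encoding: a plain list of (source, destination) pairs, one per move.
typedef std::list<std::pair<node_type, node_type> > new_moves_type;

int main()
{
  new_moves_type moves;
  moves.push_back(std::make_pair("t12", "t13")); // move t13 <- t12
  moves.push_back(std::make_pair("t13", "t14")); // move t14 <- t13
  return 0;
}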
Pay attention to misc::set: a lot of syntactic sugar is provided to implement set operations. The code of Color can range from ugly and obfuscated to readable and very close to its specification.
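As an illustration of what such sugar can look like, here is a small sketch of set union and difference written as operators; this is only a plausible approximation, not the actual misc::set interface.

#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helpers in the spirit of misc::set: union and difference as
// operators, so that "adjacent = neighbors - coalesced" reads like the
// pseudo-code of the book.
template <typename T>
std::set<T> operator+(const std::set<T>& lhs, const std::set<T>& rhs)
{
  std::set<T> res;
  std::set_union(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                 std::inserter(res, res.begin()));
  return res;
}

template <typename T>
std::set<T> operator-(const std::set<T>& lhs, const std::set<T>& rhs)
{
  std::set<T> res;
  std::set_difference(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                      std::inserter(res, res.begin()));
  return res;
}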
Codegen::rewrite_program.
rv vs. $v0
Since rv and $v0 designate a single register, we decided to change the implementation of rv and fp in the frame module to use those of the current target: $v0 and $fp for MIPS. This has a strong influence on havm, of course; it was modified to support these changes, so make sure to use version 0.18 or higher.
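Roughly, the idea is that the frame module no longer hard-codes its own rv and fp temporaries but asks the current target's Cpu for them. Below is a minimal sketch of that delegation; the class and method names are made up for illustration and are not the actual tc interface.

#include <string>

// Hypothetical Cpu description: the target knows its own special registers.
struct Cpu
{
  virtual ~Cpu() {}
  virtual std::string result_reg() const = 0;        // e.g., $v0
  virtual std::string frame_pointer_reg() const = 0; // e.g., $fp
};

struct MipsCpu : Cpu
{
  virtual std::string result_reg() const { return "$v0"; }
  virtual std::string frame_pointer_reg() const { return "$fp"; }
};

// The frame module would then use cpu.result_reg() where it used to use its
// own private rv temporary, and likewise for the frame pointer.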
Possible improvements include:
This chapter aims at providing some helpful information about the various tools that you are likely to use to implement tc. It does not replace reading the genuine documentation; nevertheless, helpful tips are given. Feel free to contribute additional information.
The single most important tool for implementing the Tiger Project is the original book, Modern Compiler Implementation in C/Java/ML, by Andrew W. Appel, published by Cambridge University Press (New York, Cambridge); ISBN 0-521-58388-8.
It is not possible to finish this project without having at least one copy per group. We provide a convenient mini Tiger Compiler Reference Manual that contains some information about the language, but it does not cover all the details, and sometimes digging into the original book is required. This is on purpose, out of due respect to the author of this valuable book.
Several copies are available at the EPITA library.
There are three flavors of this book:
This book addresses many more issues than the sole Tiger Project as we implement it. In other words, it is an extremely interesting book which provides insights on garbage collection, object-oriented and functional languages, etc.
There are a dozen copies at the EPITA library, but buying it is a good idea.
Pay extra attention: there are several errors in the books, some of which are reported on Andrew Appel's pages (C, Java, and ML), and some of which are not.
Below is presented a selection of books, papers and web sites that are pertinent to the Tiger project. Of course, you are not requested to read them all, except Modern Compiler Implementation. A suggested ordered small selection of books is:
The books are available at the EPITA Library: you are encouraged to borrow them there. If some of these books are missing, please suggest them to Francis Gabon. To buy these books, we recommend Le Monde en Tique, a bookshop which has demonstrated several times its dedication to its job, and its kindness to EPITA students/members.
Bjarne Stroustrup is the author of C++, which he describes as (The C++ Programming Language):
C++ is a general purpose programming language with a bias towards systems programming that
- is a better C
- supports data abstraction
- supports object-oriented programming
- supports generic programming.
His web page contains interesting material on C++, including many interviews. The interview by Aleksey V. Dolya for the Linux Journal contains thoughts about C and C++. For instance:
I think that the current mess of C/C++ incompatibilities is a most unfortunate accident of history, without a fundamental technical or philosophical basis. Ideally the languages should be merged, and I think that a merger is barely technically possible by making convergent changes to both languages. It seems, however, that because there is an unwillingness to make changes it is likely that the languages will continue to drift apart–to the detriment of almost every C and C++ programmer. [...] However, there are entrenched interests keeping convergence from happening, and I'm not seeing much interest in actually doing anything from the majority that, in my opinion, would benefit most from compatibility.
His list of C++ Applications is worth browsing.
The Boost.org web site reads:
The Boost web site provides free peer-reviewed portable C++ source libraries. The emphasis is on libraries which work well with the C++ Standard Library. One goal is to establish "existing practice" and provide reference implementations so that the Boost libraries are suitable for eventual standardization. Some of the libraries have already been proposed for inclusion in the C++ Standards Committee's upcoming C++ Standard Library Technical Report.
In addition to actual code, a lot of good documentation is available. Amongst the libraries, you ought to have a look at the Spirit object-oriented recursive-descent parser generator framework, the Boost Smart Pointer Library, the Boost Variant Library, etc.
Published by Addison-Wesley; ISBN 0-201-82470-1.
This book teaches C++ for programmers. It is quite extensive and easy to read. Unfortunately, one should note that it is not 100% standard compliant; in particular, many std:: qualifiers are missing. Weirdly enough, the authors seem to promote using declarations instead of explicit qualifiers; page 441 reads:
In this book, to keep the code examples short, and because many of the examples were compiled with implementations not supporting namespace, we have not explicitly listed the using declarations needed to properly compile the examples. It is assumed that using declarations are provided for the members of namespace std used in the code examples.
It should not be too much of a problem though. This is the book we recommend to learn C++. See the Addison-Wesley C++ Primer Page.
Warning: The French translation is L'Essentiel du C++, which is extremely stupid since Essential C++ is another book from Stanley B. Lippman (but not with Josée Lajoie).
Published by Addison-Wesley 1986; ISBN 0-201-10088-6.
This book is the bible in compiler design. It has extensive insight on the whole architecture of compilers, provides a rigorous treatment of theoretical material, etc. Nevertheless, I would not recommend this book to EPITA students, because
- it is getting old;
- it doesn't mention RISC, object orientation, functional languages, or modern optimization techniques such as ssa and register allocation by graph coloring[7];
- it is fairly technical;
- it can be hard to read for the beginner, contrary to Modern Compiler Implementation.
Nevertheless, curious readers will find valuable information about historically important compilers, people, papers, etc. Reading the last section of each chapter (Bibliographical Notes) is a real pleasure for whoever is interested.
It should be noted that the French edition, “Compilateurs: Principes, techniques et outils”, was brilliantly translated by Pierre Boullier, Philippe Deschamp, Martin Jourdan, Bernard Lorho and Monique Lazaud: the pleasure is as good in French as it is in English.
The Classroom Object-Oriented Compiler, from the University of California, Berkeley, is very similar in its goals to the Tiger project as described here. Unfortunately it seems dead: there have been no updates since 1996. Nevertheless, if you enjoy the Tiger project, you might want to have a look at this older sibling.
This short paper, CStupidClassName, explains why naming classes CLikeThis is stupid, but why lexical conventions are nevertheless very useful. It turns out we follow the same scheme that is emphasized there.
Published by Addison-Wesley; ISBN: 0-201-63361-2.
A book you must have read, or at least, you must know it. In a few words, let's say it details nice programming idioms, some of which you should know: the Visitor, the FlyWeight, the Singleton, etc. See the Design Patterns Addison-Wesley Page. A pre-version of this book is available on the Internet as a paper: Design Patterns: Abstraction and Reuse of Object-Oriented Design.
You may find additional information about Design Patterns on the Portland Pattern Repository.
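As a reminder of what the Visitor buys, namely double dispatch without type cases, here is a minimal sketch with made-up class names, much simpler than the visitors used in tc.

#include <iostream>

class IntExp;
class AddExp;

// The visitor declares one overload per concrete node class.
class Visitor
{
public:
  virtual ~Visitor() {}
  virtual void visit(IntExp& e) = 0;
  virtual void visit(AddExp& e) = 0;
};

// Each node class dispatches back to the visitor with its exact type.
class Exp
{
public:
  virtual ~Exp() {}
  virtual void accept(Visitor& v) = 0;
};

class IntExp : public Exp
{
public:
  virtual void accept(Visitor& v) { v.visit(*this); }
};

class AddExp : public Exp
{
public:
  virtual void accept(Visitor& v) { v.visit(*this); }
};

// A concrete visitor adds behavior without touching the node hierarchy.
class PrintVisitor : public Visitor
{
public:
  virtual void visit(IntExp&) { std::cout << "int\n"; }
  virtual void visit(AddExp&) { std::cout << "add\n"; }
};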
288 pages; Publisher: Addison-Wesley Pub Co; 2nd edition (September 1997); ISBN: 0-201-92488-9
An excellent book that might serve as a C++ lecture for programmers. Every C++ programmer should have read it at least once, as it treasures recommended C++ practices as a list of simple commandments. Be sure to buy the second edition, as the first predates the C++ standard. See the Effective C++ Addison-Wesley Page.
Published by Addison-Wesley; ISBN: 0-201-74962-9
A remarkable book that provides deep insight on the best practice with STL. Not only does it teach what's to be done, but it clearly shows why. A book that any C++ programmer should have read. See the Effective STL Addison-Wesley Page.
This report is available on line from Visitors Page: Generic Visitors in C++. Its abstract reads:
The Visitor design pattern is a well-known software engineering technique that solves the double dispatch problem and allows decoupling of two inter-dependent hierarchies. Unfortunately, when used on hierarchies of Composites, such as abstract syntax trees, it presents two major drawbacks: target hierarchy dependence and mixing of traversal and behavioral code.
CWI's visitor combinators are a seducing solution to these problems. However, their use is limited to specific "combinators aware" hierarchies.
We present here Visitors, our attempt to build a generic, efficient C++ visitor combinators library that can be used on any standard “visitable” target hierarchies, without being intrusive on their codes.
This report is in the spirit of Modern C++ Design, and should probably be read afterward.
Written by various authors, compiled by Herb Sutter
Guru of the Week (GotW) is a regular series of C++ programming problems created and written by Herb Sutter. Since 1997, it has been a regular feature of the Internet newsgroup comp.lang.c++.moderated, where you can find each issue's questions and answers (and a lot of interesting discussion).
The Guru of the Week Archive (the famous GotW) is freely available. In this document, GotWn refers to the item number n.
Published by O'Reilly & Associates; 2nd edition (October 1992); ISBN: 1-565-92000-7.
Because the book aims at a complete treatment of Lex and Yacc on a wide range of platforms, it provides too many details on material of little interest to us (e.g., we don't care about portability to other Lexes and Yacces), and too few details on material of big interest to us (more about exclusive start conditions (Flex only), more about Bison-only features, interaction with C++, etc.).
This paper about teaching compilers justifies this lecture. It should be noted that the paper is addressing compilation lectures, not compilation projects, and therefore it misses quite a few motivations we have for the Tiger project.
Published by Addison-Wesley in 2001; ISBN: 0-201-70431-5
A wonderful book on very advanced C++ programming with a heavy use of templates to achieve beautiful and useful designs (including the classical design patterns, see Design Patterns: Elements of Reusable Object-Oriented Software). The code is available in the form of the Loki Library. The Modern C++ Design Web Site includes pointers to excerpts such as the Smart Pointers chapter.
Read this book only once you have gained good understanding of the C++ core language, and after having read the “Effective C++/STL” books.
Published by Cambridge University Press; ISBN: 0-521-58390-X
See Modern Compiler Implementation. In my humble opinion, most books put way too much emphasis on scanning and parsing, leaving little room for the rest of the compiler, or even nothing for advanced material. This book does not suffer from this flaw.
Published by the authors; ISBN: 0-13-651431-6
A remarkable review of all the parsing techniques. Because the book is out of print, its authors made it freely available: Parsing Techniques – A Practical Guide.
This report presents spot, a model checking library written in C++ and Python. Parts were inspired by the Tiger project, and reciprocally, parts inspired modifications in the Tiger project. For instance, you are encouraged to read the sections about the visitor hierarchy and its implementation. Another useful source of inspiration is the use of Python and Swig to write the command line interface.
Published by Addison-Wesley, ISBN 0-201-54330-3.
No comment, since I still have not read it. It is quite famous though.
Published by Pearson Allyn & Bacon; 4th edition (January 15, 2000); ISBN: 020530902X.
This little book (105 pages) is perfect for people who want to improve their English prose. It is quite famous and, in addition to providing useful rules of thumb for writing, it features rules that are interesting as pieces of writing themselves! For instance: "The writer must, however, be certain that the emphasis is warranted, lest a clipped sentence seem merely a blunder in syntax or in punctuation".
You may find the much shorter (43 pages) First Edition of The Elements of Style on line.
Published by Prentice Hall; ISBN: 0-13-979809-9
Available on the Internet: Thinking in C++ Volume 1
Available on the Internet: Thinking in C++ Volume 2.
The first presentation of the traits technique is from this paper, Traits: a new and useful template technique. It is now a common C++ programming idiom, which is even used in the C++ standard.
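As an illustration of the idiom, here is a tiny, made-up example (not taken from the paper): a traits class attaches compile-time information to a type without modifying the type itself.

#include <iostream>

// Primary template: default properties for a type.
template <typename T>
struct numeric_traits
{
  static const bool is_signed = true;
};

// Specialization: unsigned int carries different compile-time information.
template <>
struct numeric_traits<unsigned int>
{
  static const bool is_signed = false;
};

// Generic code reads the information through the traits class.
template <typename T>
void report()
{
  std::cout << (numeric_traits<T>::is_signed ? "signed" : "unsigned") << '\n';
}

int main()
{
  report<int>();          // prints "signed"
  report<unsigned int>(); // prints "unsigned"
  return 0;
}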
Published by Wiley; Second Edition, ISBN: 0-471-11353-0
This book is not very interesting for us: the compiler material is not very advanced (no real ast, not a single line on optimization, register allocation is naive since the translation is stack based, etc.), and the C++ material is not convincing (for a start, it is not standard C++, as it still uses #include <iostream.h> and the like, and there is no use of the STL, etc.).
Automake is used to facilitate the writing of powerful Makefiles. Autoconf is required by Automake: we do not address portability issues for this project.
You may read the Autoconf documentation, and the Automake documentation. Using info is pleasant: info autoconf on any properly set up system. The Goat Book covers the whole GNU Build System: Autoconf, Automake and Libtool.
To set the name and version of your package, change the AC_INIT invocation. For instance, T4 for the bardec_f group gives:
AC_INIT([Bardeche Group Tiger Compiler], 4, [bardec_f@epita.fr], [bardec_f-tc])
If something goes wrong, or if it is simply the first time you create configure.ac or a Makefile.am, you need to set up the GNU Build System. That's the goal of the simple script bootstrap, whose most important action is invoking:
$ autoreconf -fvi
The various files (configure, Makefile.in, etc.) are created. There is no need to run make distclean, or aclocal or whatever, before running autoreconf: it knows what to do.
Then invoke configure and make (see GCC):
$ ./configure CC=gcc-3.2 CXX=g++-3.2
$ make
Alternatively you may set CC and CXX in your environment:
$ export CC=gcc-3.2
$ export CXX=g++-3.2
$ ./configure && make
This solution is preferred as in that case the value of CC etc. will be used by the ./configure invocation from make distcheck (see Making a Tarball).
Once the package is autotool'ed (see Bootstrapping the Package) and you can run a simple make, you should be able to run make distcheck to build and check the distribution tarball.
The mission of make distcheck is to make sure everything will work properly. In particular it:
Arguments passed to the top level configure (e.g., ./configure CC=gcc-3.2 CXX=g++-3.2) will not be taken into account here.
Running export CC=gcc-3.2; export CXX=g++-3.2 is a better way to make sure that these compilers will be used. Alternatively, use DISTCHECK_CONFIGURE_FLAGS to specify the arguments of the embedded ./configure:
$ make distcheck DISTCHECK_CONFIGURE_FLAGS='--without-swig CXX=g++-4.0'
If you just run make dist instead of make distcheck, then you might not notice some files are missing in the distribution. If you don't even run make dist, the tarball might not compile elsewhere (not to mention that we don't care about object files etc.).
Running make distcheck is the only means for you to check that the project will properly compile on our side. Not running distcheck is like turning off the type checking of your compiler: you hide the errors, you avoid them, instead of actually getting rid of them.
At this stage, if running make distcheck does not create bardec_f-tc-4.tar.bz2, then something is wrong in your package. Do not rename it, do not create the tarball by hand: something is rotten and be sure it will break on the examiner's machine.
We use GCC 3.2, which includes both gcc-3.2 and g++-3.2: the C and C++ compilers. Do not use older versions as they have poor compliance with the C++ standard. You are welcome to use a more recent version of GCC if you have access to one, but the tests will be done with 3.2. Using a more recent version is often a good means to get better error messages when you can't understand what 3.2 is trying to say.
There are good patches floating around to improve GCC. In particular, you might want to use the bounds checking extension available on Herman ten Brugge Home Page.
Valgrind is an open-source memory debugger for x86-GNU/Linux written by Julian Seward, already known for having committed Bzip2. It is the best news for programmers in years. Unfortunately, due to EPITA's choice of NetBSD, using Valgrind will not be convenient for you... Nevertheless, Valgrind is so powerful, so beautifully designed, that you definitely should wander on the Valgrind Home Page to learn what you are missing.
In the case of the Tiger Compiler Project, correct memory management is a primary goal. To this end, Valgrind is a precious tool, as is dmalloc, but because STL implementations often keep some memory for efficiency, you might see "leaks" from your C++ library. See its documentation on how to reclaim this memory. For instance, reading GCC's C++ Library FAQ, especially the item about "memory leaks" in containers, is enlightening.
I personally use the following shell script to track memory leaks:
#! /bin/sh
exec 3>&1
export GLIBCPP_FORCE_NEW=1
export GLIBCXX_FORCE_NEW=1
exec valgrind --num-callers=20 \
              --leak-check=yes \
              --leak-resolution=high \
              --show-reachable=yes \
              "$@" 2>&1 1>&3 3>&- | sed 's/^==[0-9]*==/==/' >&2 1>&2 3>&-
File 5.1: v
For instance on File 4.24,
$ v tc -A 0.tig error-->== Memcheck, a memory error detector for x86-linux. error-->== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward. error-->== Using valgrind-2.1.0, a program supervision framework for x86-linux. error-->== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward. error-->== Estimated CPU clock rate is 1667 MHz error-->== For more details, rerun with: -v error-->== error-->== error-->== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) error-->== malloc/free: in use at exit: 50 bytes in 2 blocks. error-->== malloc/free: 656 allocs, 654 frees, 37979 bytes allocated. error-->== For counts of detected errors, rerun with: -v error-->== searching for pointers to 2 not-freed blocks. error-->== checked 5674152 bytes. error-->== error-->== 18 bytes in 1 blocks are possibly lost in loss record 1 of 2 error-->== at 0x4002F202: operator new(unsigned) (vg_replace_malloc.c:162) error-->== by 0x402C6C78: std::__default_alloc_template<true, 0>::allocate(unsigned) (in /usr/lib/libstdc++.so.5.0.5) error-->== by 0x402CC567: std::string::_Rep::_S_create(unsigned, std::allocator<char> const&) (in /usr/lib/libstdc++.so.5.0.5) error-->== by 0x402CD2BF: (within /usr/lib/libstdc++.so.5.0.5) error-->== by 0x402C9AB8: std::string::string(char const*, std::allocator<char> const&) (in /usr/lib/libstdc++.so.5.0.5) error-->== by 0x805FD59: parse::tasks::parse() (tasks.cc:30) error-->== by 0x80F8BAF: FunctionTask::execute() const (function-task.cc:18) error-->== by 0x80FA370: TaskRegister::execute() (task-register.cc:274) error-->== by 0x804B307: main (tc.cc:26) error-->== error-->== error-->== 32 bytes in 1 blocks are still reachable in loss record 2 of 2 error-->== at 0x4002F202: operator new(unsigned) (vg_replace_malloc.c:162) error-->== by 0x806508E: yy::Parser::parse() (parsetiger.yy:212) error-->== by 0x805FF47: parse::parse(std::string const&, bool, bool) (libparse.cc:18) error-->== by 0x805FD65: parse::tasks::parse() (tasks.cc:30) error-->== by 0x80F8BAF: FunctionTask::execute() const (function-task.cc:18) error-->== by 0x80FA370: TaskRegister::execute() (task-register.cc:274) error-->== by 0x804B307: main (tc.cc:26) error-->== error-->== LEAK SUMMARY: error-->== definitely lost: 0 bytes in 0 blocks. error-->== possibly lost: 18 bytes in 1 blocks. error-->== still reachable: 32 bytes in 1 blocks. error-->== suppressed: 0 bytes in 0 blocks. /* == Abstract Syntax Tree. == */ 0 Example 136: v tc -A 0.tig
$ v tc -AD 0.tig error-->== Memcheck, a memory error detector for x86-linux. error-->== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward. error-->== Using valgrind-2.1.0, a program supervision framework for x86-linux. error-->== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward. error-->== Estimated CPU clock rate is 1669 MHz error-->== For more details, rerun with: -v error-->== error-->== error-->== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) error-->== malloc/free: in use at exit: 0 bytes in 0 blocks. error-->== malloc/free: 665 allocs, 665 frees, 38209 bytes allocated. error-->== For counts of detected errors, rerun with: -v error-->== No malloc'd blocks -- no leaks are possible. /* == Abstract Syntax Tree. == */ 0 Example 137: v tc -AD 0.tig
Starting with GCC 3.4, GLIBCPP_FORCE_NEW is spelled GLIBCXX_FORCE_NEW.
We use Bison 1.875c which is able to produce a C++ parser. This Bison is unpublished, as the maintainers still have issues to fix. Nevertheless, it is usable, and perfectly functional for Tiger. It is installed in ~akim/bin, under the name bison. Be aware that Bison 1.875 produces buggy C++ parsers.
If you don't use this Bison, you will be in trouble. If you are willing to work at home, use bison-1.875a.tar.bz2.
The original papers on Lex and Yacc are:
These introductory guides can help beginners:
An introduction to Lex and Yacc.
Contains information about Autoconf, Automake, Gperf, Flex, Bison, and GCC.
The Bison documentation, and the Flex documentation are available for browsing.
HAVM is an interpreter for Tree (hir or lir) programs. It was written by Robert Anisko so that EPITA students could exercise their compiler projects before the final jump to assembly code. It is implemented in Haskell, a pure non-strict functional language very well suited for this kind of symbolic processing. The name HAVM was coined from Haskell and VM, standing for Virtual Machine.
Information about HAVM can be found on HAVM Home Page, and feedback can be sent to LRDE's Projects Address.
MIPSY is a MIPS simulator designed to execute simple register-based MIPS assembly code. It is a minimalist MIPS virtual machine that, contrary to other simulators (see SPIM), supports unlimited registers. The lack of a simulator featuring this prompted the development of MIPSY.
Its features are:
It was written by Benoît Perrot as an LRDE member, so that EPITA students could exercise their compiler projects after the instruction selection but before the register allocation. It is implemented in C++ and Python.
Information about MIPSY can be found on the MIPSY Home Page, and feedback can be sent to LRDE's Projects Address.
SPIM S20 is a simulator that runs programs for the MIPS R2000/R3000 RISC computers. SPIM can read and immediately execute files containing assembly language. SPIM is a self-contained system for running these programs and contains a debugger and interface to a few operating system services.
The architecture of the MIPS computers is simple and regular, which makes it easy to learn and understand. The processor contains 32 general-purpose 32-bit registers and a well-designed instruction set that make it a propitious target for generating code in a compiler.
However, the obvious question is: why use a simulator when many people have workstations that contain a hardware, and hence significantly faster, implementation of this computer? One reason is that these workstations are not generally available. Another reason is that these machines will not persist for many years because of the rapid progress leading to new and faster computers. Unfortunately, the trend is to make computers faster by executing several instructions concurrently, which makes their architecture more difficult to understand and program. The MIPS architecture may be the epitome of a simple, clean RISC machine.
In addition, simulators can provide a better environment for low-level programming than an actual machine because they can detect more errors and provide more features than an actual computer. For example, SPIM has an X-window interface that is better than most debuggers for the actual machines.
Finally, simulators are a useful tool for studying computers and the programs that run on them. Because they are implemented in software, not silicon, they can be easily modified to add new instructions, build new systems such as multiprocessors, or simply to collect data.
SPIM is written and maintained by James R. Larus.
Our compiler provides two different user interfaces: one is a command line interface fully written in C++, using the "Task" system; the other is a binding of the primary functions into the Python scripting language (see Python). This binding is automatically extracted from our modules using SWIG.
The SWIG home page reads:
SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. SWIG is primarily used with common scripting languages such as Perl, Python, Tcl/Tk, and Ruby, however the list of supported languages also includes non-scripting languages such as Java, OCaml and C#. Also several interpreted and compiled Scheme implementations (Guile, MzScheme, Chicken) are supported. SWIG is most commonly used to create high-level interpreted or compiled programming environments, user interfaces, and as a tool for testing and prototyping C/C++ software. SWIG can also export its parse tree in the form of XML and Lisp s-expressions. SWIG may be freely used, distributed, and modified for commercial and non-commercial use.
We promote, but do not require, Python as a scripting language over Perl because in our opinion it is a cleaner language. A nice alternative to Python is Ruby.
The Python Home Page reads:
Python is an interpreted, interactive, object-oriented programming language. It is often compared to Tcl, Perl, Scheme or Java.
Python combines remarkable power with very clear syntax. It has modules, classes, exceptions, very high level dynamic data types, and dynamic typing. There are interfaces to many system calls and libraries, as well as to various windowing systems (X11, Motif, Tk, Mac, MFC). New built-in modules are easily written in C or C++. Python is also usable as an extension language for applications that need a programmable interface.
The Python implementation is portable: it runs on many brands of UNIX, on Windows, OS/2, Mac, Amiga, and many other platforms. If your favorite system isn't listed here, it may still be supported, if there's a C compiler for it. Ask around on news:comp.lang.python – or just try compiling Python yourself.
The Python implementation is copyrighted but freely usable and distributable, even for commercial use.
We use Doxygen as the standard tool for producing the developer's documentation of the project. Its features must be used to produce good documentation, with an explanation of the role of the arguments, etc. The quality of the documentation will be part of the grade. Details on how to write proper comments are given in the Doxygen Manual.
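For instance, a function can be documented with a \brief line, one \param entry per argument, and a \return entry; the snippet below is only a hypothetical illustration, not code from the tc sources.

class Symbol;
class VarDec;

/// \brief Look up a variable in the current scope.
///
/// \param name   the symbol to look for.
/// \param depth  how many enclosing scopes to search; 0 means all of them.
/// \return the declaration bound to \a name, or 0 if it is unbound.
const VarDec* var_lookup(const Symbol& name, unsigned depth = 0);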
The documentation produced by Doxygen must not be included, but the target html must produce the HTML documentation in the doc/html directory.
Contributions to this section (as for the rest of this documentation) will be greatly appreciated.
frame::Frame (see T5).
An interpreter for Tree (hir or lir) programs. See HAVM.
From WordNet:
See “schooling” and “curriculum”.
struct A { };
struct B: A { };
vtable
Copyright © 2000 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The purpose of this License is to make a manual, textbook, or other written document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
This License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”.
A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (For example, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License.
A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup has been designed to thwart or discourage subsequent modification by readers is not Transparent. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ascii without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
If you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.
You may add a section entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections entitled “History” in the various original documents, forming one section entitled “History”; likewise combine any sections entitled “Acknowledgments”, and any sections entitled “Dedications”. You must delete all sections entitled “Endorsements.”
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an “aggregate”, and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate.
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, the original English version will prevail.
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
Copyright (C) year your name. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being list their titles, with the Front-Cover Texts being list, and with the Back-Cover Texts being list. A copy of the license is included in the section entitled ``GNU Free Documentation License''.
If you have no Invariant Sections, write “with no Invariant Sections” instead of saying which ones are invariant. If you have no Front-Cover Texts, write “no Front-Cover Texts” instead of “Front-Cover Texts being list”; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
This is version 0.241 of assignments.texi, last edited on February 24, 2004, and compiled 24 February 2004, using:
$ tc --version tc (LRDE Tiger Compiler 0.66a) Revision 0.1228 Tue, 17 Feb 2004 18:58:49 +0100 This package was written by and with the assistance of * Akim Demaille akim@freefriends.org - Maintenance. * Alexandre Duret-Lutz duret_g@epita.fr * Cedric Bail bail_c@epita.fr - Initial escaping static link computation framework. * Alexis Brouard brouar_a@epita.fr - Portability of tc-check to NetBSD. * Benoît Perrot benoit@lrde.epita.fr - Extensive documentation. - Redesign of the Task system. - Design and implementation of target handling. - Deep clean up of every single module. - Third redesign of the AST, and actually, automatic generation of the AST. * Daniel Gazard gazard_d@epita.fr - Initial framework from LIR to MIPS. * Francis Maes francis@lrde.epita.fr - Generation of static C++ Tree As Types. * Julien Roussel spip@lrde.epita.fr - "let" desugaring. * Nicolas Burrus - Generation of a Swig bindings of the tc libraries to Python. - Implementation of a tc shell. * Pierre-Yves Strub strub_p@epita.fr - Second redesign of the AST. - Second redesign of Symbol. * Quôc Peyrot chojin@lrde.epita.fr - Initial Task framework. * Raphaël Poss r.poss@online.fr - Conversion of AST to using pointers instead of references. - Breakup between interfaces and implementations (.hh only -> .hxx, .cc) - Miscellaneous former TODO items. - Implementation of reference counting for Tree. * Robert Anisko anisko_r@epita.fr * Sébastien Broussaud brouss_s@epita.fr - Escapes torture tests. * Stéphane Molina molina_s@epita.fr - Configuration files in tc-check. * Thierry Géraud theo@epita.fr - Initial idea for visitors. - Initial idea for tasks. - Initial implementation of AST. - Initial implementation of Tree. * Valentin David david_v@epita.fr - Some additional tests. * Yann Popo popo_y@epita.fr - Implementation of the Timer class. * Yann Régis-Gianas yann@lrde.epita.fr - Reimplementation of graphs. Copyright (C) 2004 LRDE. Example 138: tc --version
$ havm --version HAVM 0.21 Written by Robert Anisko. Copyright (C) 2003 Laboratoire de Recherche et Développement de l'EPITA. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Example 139: havm --version
$ mipsy --version mipsy (Mipsy) 0.5 Written by Benoit Perrot. Copyright (C) 2003 Benoit Perrot. mipsy comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute and modify it under certain conditions; see source for details. Example 140: mipsy --version
[1] The fact that the compiler compiles C++ is virtually irrelevant.
[2] See the shift of language? From tarball to distribution.
[3] Please, let me know who I forgot!
[4] For instance, g++ reports an error: cannot dynamic_cast `a' (of type `struct A*') to type `struct B*' (source type is not polymorphic).
[5] Actually, it is _[A-Z] which is reserved.
[6] This statement is unfair to Andrew Appel: since in his model type checking and translation are performed in a single step, the information about the builtins remains in a single place.
[7] To be fair, the Dragon Book leaves a single page (not sheet) to graph coloring.