Deadline: 6:45pm, Thursday, December 3, 2009
TL09 and Compiler Extensions
In the previous phase of the project you completed a
non-optimizing compiler for the core TL09 language that can
produce semantically equivalent MIPS assembly code. In this
phase, you have the opportunity to earn up to 200 additional
points towards an 'A' by extending your compiler to support
additional language features and/or add additional optimizations.
This phase is intentionally less tightly specified and
more open ended than Phase I and Core. Below is
a list of possible extensions and the number of points that
can be earned by successfully completing each extension.
Please include, in the extensions/README file,
the list of extensions attempted, their status, information
on how to enable/disable the extension (if possible), and
a list of relevant items included in the extensions/docs
directory.
Extensions
Language Extensions
Create a revised language description (including lexical categories,
context-free grammar, informal type rules, and informal semantics---as
appropriate) containing one or more of the following extensions.
Implement the features so that the compiler in the "extensions" folder
can correctly compile the revised language. Create a test plan describing
how the language extensions will be tested, what test cases will be used,
and what the results of the testing are.
- Permit operators to occur adjacent
to identifiers/numbers/keywords without intervening spaces
(so that, for example, both "x := x + 1" and "x:=x+1;" would be
legal---and equivalent---statements) (40 points)
- Support for comments
- single-line comments whose start is marked by some (sequence
of) character(s) (10 points)
- multi-line comments whose beginning is marked by some (sequence of)
character(s) and whose ending is marked by
some (sequence of) character(s) (10 points)
- Support additional base types:
- floating point numbers (40 points)
- characters (20 points)
- strings (40 points)
- Support for arrays:
- of integers (40 points)
- of other base types (20 points each)
- multi-dimensional arrays (40 additional points)
- Support for procedures:
- basic non-recursive implementation (40 points)
- support for recursive procedures (40 additional points)
- support for nesting with static scoping (40 additional points)
- support for procedure variables and parameters (40 additional points)
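As an illustration of the first two items above (operators adjacent to their operands, and comments), here is a minimal lexer sketch in Python. It is not the TL09 lexical specification: the token names, the operator set, and the use of "%" as a single-line comment marker are all assumptions made for the example, and keywords are not distinguished from identifiers. Your own compiler may be written in a different language and should follow your revised language description.

```python
import re

# Hypothetical token set. TL09's real lexical categories come from the
# language description, and the comment marker "%" is only an assumption.
TOKEN_SPEC = [
    ("COMMENT", r"%[^\n]*"),                # assumed single-line comment
    ("NUM",     r"[0-9]+"),
    ("IDENT",   r"[A-Za-z][A-Za-z0-9]*"),
    ("ASGN",    r":="),
    ("OP",      r"[-+*/()=<>]"),
    ("SC",      r";"),
    ("WS",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Split text into (kind, lexeme) pairs without relying on whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        # Whitespace and comments separate tokens but are not tokens themselves.
        if match.lastgroup not in ("WS", "COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Because the scanner matches at the current position instead of splitting on blanks, tokenize("x := x + 1;") and tokenize("x:=x+1;") return the same token list, which is one way to document and test the equivalence required by the first item.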
Optimizations and Compiler Extensions
Below is a list of several transformations and optimizations that can
be implemented in your compiler project for additional credit.
In addition to implementing the code, you should
provide a description of the algorithm used (with citations if appropriate),
and a description of the testing done (including test inputs and outputs
checked).
- translation into (and out of) SSA form (40 points)
Translation into SSA should use one of the following algorithms:
- The simplest method is probably to follow the
approach of Aycock and Horspool. This is similar to what is
described in Cooper 9.3.1. (But read the paper.)
- Alternatively, you may use the algorithm in Cooper 9.3.3/9.3.4.
- The seminal work on SSA construction is that of Cytron.
- There's also a
newer algorithm by Das and Ramakrishna that may be worth
considering.
Translation out of SSA form should pay careful attention to the semantics
of the φ-functions, especially their "simultaneous" execution, which
leads to the swap problem (Cooper 9.3.5). Morgan's "Building an
Optimizing Compiler" has a detailed algorithm for handling this.
Contact the instructor if you're interested. Some of these issues
only show up when optimizations are performed in SSA form.
- dominator-based value numbering (40 points)
- Implement constant propagation/folding using the algorithm in Cooper Section
9.2 or Cooper 10.3.3 (40 points)
- unreachable and useless code elimination in Cooper Section 10.3.1 (40 points)
- implement Chaitin-Briggs register allocation with live range analysis
and interference graph (Cooper Section 13.5.1, 13.5.3, 13.5.5) (40 points)
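To make the swap problem mentioned under SSA translation concrete, here is a small Python sketch that turns a set of simultaneous copies (as produced by the φ-functions at the end of a predecessor block) into an ordered list of moves. It is a simplified illustration, not the algorithm from Morgan or Cooper: the function names are made up for this example, and it assumes the scratch name "tmp" does not collide with any real variable.

```python
def sequentialize(parallel_copies, temp="tmp"):
    """Order simultaneous copies {dst: src} into a list of (dst, src) moves.

    'temp' is an assumed scratch name that no real variable uses.
    """
    # Self-copies (x <- x) need no code at all.
    pending = {d: s for d, s in parallel_copies.items() if d != s}
    moves = []
    while pending:
        sources = set(pending.values())
        # A destination is safe to overwrite once no pending copy reads it.
        ready = [d for d in pending if d not in sources]
        if ready:
            d = ready[0]
            moves.append((d, pending.pop(d)))
        else:
            # Only cycles remain: save one value, then redirect its readers.
            d = next(iter(pending))
            moves.append((temp, d))
            for k in pending:
                if pending[k] == d:
                    pending[k] = temp
    return moves


def run(moves, env):
    """Execute moves one after another on a copy of env (for testing)."""
    env = dict(env)
    for dst, src in moves:
        env[dst] = env[src]
    return env
```

For the swap {a: b, b: a}, emitting "a := b; b := a" in order would lose the original value of a. The sketch instead saves a in the scratch name first, so run(sequentialize({"a": "b", "b": "a"}), {"a": 1, "b": 2}) leaves a = 2 and b = 1.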
Testing Combinations of Extensions
If you implement more than one extension, develop a plan, test cases,
and documentation of test results (as appropriate) that provide
confidence that the compiler still works correctly when extensions are combined.
(20 points per extension, beyond the first one, covered by the test plan)
Input/Output Specification
The compiler should produce the same outputs as required for the Core
project. In addition, for each machine-independent optimization pass,
there should be additional output showing the state of the relevant
IRs before and after each optimization (e.g., the ILOC/CFG output from
Core, an ILOC dump after translating into SSA, perhaps using the same
output routine, and then another after translating back out
of SSA).
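One possible way to organize these before/after dumps is a small pass driver that calls the same output routine around every pass. The Python sketch below is only an illustration: run_passes, the dump callback, and the label format are hypothetical names, and the IR here is whatever object your compiler already uses.

```python
def run_passes(ir, passes, dump):
    """Apply each (name, transform) pair in order, dumping the IR around it.

    'dump' is any callable taking (label, ir), for example the routine
    that already writes the ILOC/CFG output. All names are hypothetical.
    """
    for name, transform in passes:
        dump(f"{name}.before", ir)
        ir = transform(ir)
        dump(f"{name}.after", ir)
    return ir
```

With this shape, adding a new optimization does not require new output code, and a pass that is switched off simply does not appear in the list of passes.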
Grading Criteria
For each extension, students will earn the points listed above
according to the following rubric:
- 100%: code has no significant flaws. Students are responsible
for demonstrating evidence of this by having adequate testing
that the code works correctly or nearly correctly for nearly
all test cases. But TA/instructor test cases and code inspection
may also be used in assessment.
- 80%: code has minor flaws, but correctly handles the "common case".
Students are responsible for demonstrating evidence of this by
having adequate testing and showing that the code implements
the documented extensions.
But TA/instructor test cases and code inspection may also be used
in assessment.
- 40%: code has major flaws, but the extension has been documented (as
described above) and the submission includes at least two non-trivial
test cases that show the extension working.
The points listed above add up to more than 200 points. Students
may attempt up to 300 points worth of extensions but only up
to 200 points may be earned.
No points will be awarded for any extensions unless the core
compiler works correctly.
The core compiler will be considered
to work correctly if it...
- builds according to the build instructions,
- produces executing SPIM code for
all provided test programs, and
- produces correctly executing SPIM code for most of the provided
test programs in each category, especially
the "Real" Programs
Submission Mechanics and Packaging
Please refer to the
submission instructions for information on how to prepare the
subversion repository containing your source code for grading.
The files for this phase of the project should be placed in the
extensions directory. This means that the version of the compiler
in the core directory must still faithfully implement
the unextended core TL09 language. The
extensions/README file should contain
a list of what extensions are implemented. Detailed documentation
related to extensions should be included as separate text files,
OpenOffice documents, or PDFs in an extensions/docs directory.
Testing
You are also required to adequately test your compiler optimizations
and language extensions. Please submit
your test cases along with your source code and document the
current state of your compiler based on your own testing.
In your documentation for each extension, please include
a description of which test cases you developed to specifically
test those extensions.
If you implement multiple optimizations,
it is advisable to include a command-line switch that
activates/deactivates individual optimizations when feasible.
This will allow, for example, full points to be assigned for
"optimization 1" even if "optimization 2" crashes for some
test cases. Otherwise, you may lose points for "optimization 1"
because "optimization 2" breaks a test case. Additional points
are given for having tests and a test plan that cover
combinations of optimization switches as well as individual
switches.
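As a sketch of what such switches could look like, the Python example below uses the standard argparse module with a repeatable --disable option. The program name "tl09c", the pass names, and the option name are all hypothetical. Use whatever matches the extensions you actually implement and the language your compiler is written in.

```python
import argparse

# Hypothetical pass names, listed in the order the compiler would run them.
PASSES = ["ssa", "value-numbering", "const-prop", "dead-code", "regalloc"]


def build_parser():
    # The program name "tl09c" is an assumption made for this example.
    parser = argparse.ArgumentParser(prog="tl09c")
    parser.add_argument("source", help="TL09 source file")
    parser.add_argument(
        "--disable",
        action="append",
        choices=PASSES,
        default=[],
        help="turn off one optimization (may be given more than once)",
    )
    return parser


def enabled_passes(argv):
    """Return the passes left on after applying the --disable switches."""
    args = build_parser().parse_args(argv)
    return [name for name in PASSES if name not in args.disable]
```

For example, enabled_passes(["prog.tl", "--disable", "const-prop"]) returns every pass except "const-prop", so a grader (or your own test script) can isolate a failing optimization without losing the others.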
Errata/Clarifications
There may need to be corrections, clarifications, or other modifications
to these instructions. You are responsible for monitoring the class web
site and listening during lecture for announcements related to this
assignment.
- 2009/11/19: Added extension for adding comments to TL09 programs.
- 2009/11/12: For reference, there are 600 non-extension
points available in the course and 600 points are needed
to guarantee an 'A'. (60 for homeworks, 40 for phase 1,
100 for first midterm, 200 for core, and 200 for the final.)
Attempting two 40 point extensions will allow you to earn an
'A' with an average of 88.24% of 680 points. (You will also
earn an 'A' with 100% of 600 points, 97% of 620 points,
94% of 640 points, 91% of 660 points, 86% of 700 points,
83.3% of 720 points, 81.1% of 740 points, 79% of 760
points, 77% of 780 points, and 75% of 800 points.)
- 2009/10/31: Put in link to the required core test cases.
- 2009/10/12: Fixed up some of the references to possible optimizations.
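The percentages in the 2009/11/12 note follow from dividing the 600 points needed for an 'A' by the total points available once attempted extension points are added to the 600 non-extension points. A small Python sketch that reproduces them (the constant and function names are illustrative only):

```python
REQUIRED_FOR_A = 600   # points that guarantee an 'A', per the 2009/11/12 note
BASE_POINTS = 600      # non-extension points available in the course


def percent_needed(extension_points_attempted):
    """Average percentage needed for an 'A' given attempted extension points."""
    total = BASE_POINTS + extension_points_attempted
    return 100.0 * REQUIRED_FOR_A / total


if __name__ == "__main__":
    # Step through 0, 20, ..., 200 attempted extension points.
    for ext in range(0, 201, 20):
        print(f"{BASE_POINTS + ext} points: {percent_needed(ext):.2f}%")
```

For instance, two 40 point extensions give 600 / 680, which rounds to the 88.24% quoted in the note.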