Initial Examples of Semantic Actions:
Consider the recursive descent parser for the following grammar giving arithmetic expressions.
P ---> E '#'
E ---> T {('+'|'-') T}
T ---> S {('*'|'/') S}
S ---> F '^' S | F
F ---> '(' E ')' | char
It is easy to add extra code to this parser so that it translates
arithmetic expressions into reverse Polish notation (RPN). So little
additional code is needed that this example illustrates the power of
the approach.
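As a rough illustration, here is a minimal Python sketch of such a translator. The choice of Python and the names `RpnTranslator` and `to_rpn` are assumptions made for this example, not the course's own code. Each method mirrors one grammar rule, and the only semantic actions are the `self.out.append(...)` calls.

```python
class RpnTranslator:
    """Recursive descent parser for the grammar above that emits RPN."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.out = []          # RPN symbols produced so far

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def advance(self):
        ch = self.peek()
        self.pos += 1
        return ch

    def expect(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.advance()

    def parse_p(self):                      # P ---> E '#'
        self.parse_e()
        self.expect("#")
        return "".join(self.out)

    def parse_e(self):                      # E ---> T {('+'|'-') T}
        self.parse_t()
        while self.peek() in ("+", "-"):
            op = self.advance()
            self.parse_t()
            self.out.append(op)             # action: operator after operands

    def parse_t(self):                      # T ---> S {('*'|'/') S}
        self.parse_s()
        while self.peek() in ("*", "/"):
            op = self.advance()
            self.parse_s()
            self.out.append(op)

    def parse_s(self):                      # S ---> F '^' S | F
        self.parse_f()
        if self.peek() == "^":
            self.advance()
            self.parse_s()                  # right recursion: right-associative
            self.out.append("^")

    def parse_f(self):                      # F ---> '(' E ')' | char
        if self.peek() == "(":
            self.advance()
            self.parse_e()
            self.expect(")")
        elif self.peek().isalnum():
            self.out.append(self.advance())  # action: emit the operand
        else:
            raise SyntaxError(f"unexpected {self.peek()!r} at position {self.pos}")


def to_rpn(text):
    return RpnTranslator(text).parse_p()
```

For instance, `to_rpn("a+b*c#")` returns `"abc*+"` and `to_rpn("a^b^c#")` returns `"abc^^"`.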
Here is another example of semantic actions: an evaluator of arithmetic expressions:
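A minimal sketch of such an evaluator, written in Python for illustration (the function name `evaluate` and the restriction of operands to single digits are assumptions of this sketch). Instead of emitting symbols, each parsing function returns the value of the subexpression it has just recognized.

```python
def evaluate(text):
    """Evaluate an expression such as '2+3*4#' and return a float."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def advance():
        nonlocal pos
        ch = peek()
        pos += 1
        return ch

    def expr():                         # E ---> T {('+'|'-') T}
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():                         # T ---> S {('*'|'/') S}
        value = power()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= power()
            else:
                value /= power()
        return value

    def power():                        # S ---> F '^' S | F
        base = factor()
        if peek() == "^":
            advance()
            return base ** power()      # right-associative
        return base

    def factor():                       # F ---> '(' E ')' | char
        ch = advance()
        if ch == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if ch.isdigit():
            return float(ch)
        raise SyntaxError(f"unexpected {ch!r}")

    result = expr()                     # P ---> E '#'
    if advance() != "#":
        raise SyntaxError("expected '#'")
    return result
```

For instance, `evaluate("2+3*4#")` returns `14.0`, and `evaluate("2^3^2#")` returns `512.0` because `^` associates to the right.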
Reference Material About MIPS:
Initial Work on Assignment: For this assignment and the next two, you are to translate portions of the language described in the previous assignment (the recursive descent parser) into MIPS assembly code.
For this recitation you should ignore all statements except assignments and output statements. In particular, ignore function definitions, ignore the exponentiation operator ^, and ignore all 6 relational operators and both logical operators. This subset also uses only single-character identifiers and single-digit integers.
The output statements are of the form:
< expression ; (print an expression)
< B ; (print a blank)
< N ; (print a newline)
< T ; (print a tab)
Portions of the Grammar to Implement (highlighted in red bold below):
M ---> { ( S | D ) } '#'
S ---> I | W | A | P | C | G
D ---> '(' id '(' [ id { ',' id } ] ')' { S } ')'
I ---> '[' E '?' { S } ':' { S } ']' | '[' E '?' { S } ']'
W ---> '{' E '?' { S } '}'
A ---> id '=' E ';'
P ---> '<' E ';'
C ---> '<' ( 'B' | 'T' | 'N' ) ';'
G ---> '>' id ';'
E ---> Q [ ('&' | '|') Q ]
Q ---> R [ ('<' | '>' | '<=' | '>=' | '==' | '!=' ) R ]
R ---> T { ('+' | '-') T }
T ---> U { ('*' | '/' | '%') U }
U ---> F '^' U | F
F ---> ['+' | '-' | '!'] ( '(' E ')' | id | num |
id '(' [ E { ',' E } ] ')' )
id ---> letter { letter | digit }
num ---> digit { digit }
Form of assembler output: There are many different forms the output could take. I am suggesting one form here but please do not consider yourself limited to this form.
la $s1, M
.data
M: .word 0,1,2,3,4,5,6,7,8,9 # constants
.space 104 # 26 variables a to z
.space 500 # 125 temporaries
lw $t1, 8($s1) # load constant 2 into $t1
sw $t1, 64($s1) # store into g (which is M[16])

lw $t1, 60($s1) # load f (= M[15]) into $t1
lw $t2, 64($s1) # load g (= M[16]) into $t2
add $t3, $t1, $t2 # form sum in $t3
sw $t3, 176($s1) # store $t3 into temporary M[44]
lw $t1, 176($s1) # load temporary M[44] into $t1
sw $t1, 68($s1) # store $t1 into h (= M[17])

li $v0, 1 # 1 to print an int
lw $a0, 64($s1) # load g's value (M[16]) into $a0
syscall # now print

li $v0, 4 # 4 to print a string
la $a0, Blank # load address of string
syscall # now print
where elsewhere you need:
.data
Blank: .asciiz " "
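The two print sequences can be produced by small helper functions in the compiler. The following Python sketch is only an illustration: the helper names are hypothetical, and the labels `NewL` and `Tab` are assumed to be declared in the `.data` section in the same way as `Blank`.

```python
# Map the letter in a '<' ( 'B' | 'T' | 'N' ) ';' statement to a string label.
# Blank is declared above; NewL and Tab are assumed to be declared likewise.
STRING_LABELS = {"B": "Blank", "N": "NewL", "T": "Tab"}


def emit_print_int(offset):
    """Return MIPS lines that print the integer stored at offset($s1)."""
    return [
        "li $v0, 1",                # syscall code 1: print an int
        f"lw $a0, {offset}($s1)",   # value to print
        "syscall",
    ]


def emit_print_control(letter):
    """Return MIPS lines that print the blank, newline, or tab string."""
    label = STRING_LABELS[letter]
    return [
        "li $v0, 4",                # syscall code 4: print a string
        f"la $a0, {label}",         # address of the string
        "syscall",
    ]
```

Calling `emit_print_int(64)` gives the three lines shown above for printing g, and `emit_print_control("B")` gives the three lines for printing a blank.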
How to do the actual translation: You will be adding code to the affected portions of the recursive descent parser. As a hint for completing the assignment, you can study the code above that evaluates an arithmetic expression as a double, since the code you need is actually quite similar.
In this case, instead of returning a double giving the value of an expression, each of the various functions of the parser should return an integer giving the location in memory where the given operand or expression is to be found.
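To make the idea concrete, here is a hedged Python sketch of the code-emitting step for a sum or difference. The class `CodeGen` and its methods are hypothetical names; the sketch assumes memory locations are word indexes into M (so the byte offset is 4 times the location) and that temporaries start at location 36, as in the examples above. A parser function for `T + T` would call `binary` with the two locations returned for its operands and return the result.

```python
class CodeGen:
    """Collects MIPS lines; every operand is a word index into M."""

    def __init__(self, first_temp=36):
        self.lines = []
        self.next_temp = first_temp     # next unused temporary location

    def binary(self, op, left, right):
        """Emit M[temp] = M[left] op M[right] and return temp's location."""
        mnemonic = {"+": "add", "-": "sub"}[op]
        temp = self.next_temp
        self.next_temp += 1
        self.lines += [
            f"lw $t1, {4 * left}($s1)",
            f"lw $t2, {4 * right}($s1)",
            f"{mnemonic} $t3, $t1, $t2",
            f"sw $t3, {4 * temp}($s1)",
        ]
        return temp

    def assign(self, target, source):
        """Emit M[target] = M[source]."""
        self.lines += [
            f"lw $t1, {4 * source}($s1)",
            f"sw $t1, {4 * target}($s1)",
        ]
```

For the statement `f = 5 + 8;`, the calls `t = gen.binary("+", 5, 8)` followed by `gen.assign(15, t)` put the result in temporary 36 and then copy it to f at location 15.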
One very simple strategy, the one I used above, is for memory locations 0 to 9 to hold the constants 0 to 9, memory locations 10 to 35 to hold the values of variables a to z, and memory locations 36 to 160 to hold the values of 125 temporaries (the 500 bytes reserved by the last .space directive).
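The mapping from operands to memory locations, and from locations to byte offsets, can be written down directly. This is a small Python sketch with hypothetical function names; it assumes 4-byte words, lowercase single-letter variables, and the first temporary at location 36.

```python
WORD = 4  # bytes per MIPS word


def constant_location(digit):
    """Location of a single-digit constant: '0'..'9' -> 0..9."""
    return int(digit)


def variable_location(letter):
    """Location of a single-letter variable: 'a'..'z' -> 10..35."""
    return 10 + ord(letter) - ord("a")


def temporary_location(n):
    """Location of the n-th temporary, counting from 0."""
    return 36 + n


def byte_offset(location):
    """Offset from the base address in $s1, as used in lw and sw."""
    return WORD * location
```

This reproduces the offsets used in the listings above: the constant 2 is at offset 8, g is M[16] at offset 64, and the first temporary is M[36] at offset 144.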
What to Turn In for the Assignment: You should turn in a source listing of your compiler, a listing of the MIPS code you generate, and a listing of the output when you run the MIPS code. It is essential that you actually run your MIPS code in order to check for errors. Just using the single red source below will be sufficient for your run, though you should use other source if you have trouble with this one.
f = 5 + 8;
g = f + 8;
h = f + g;
< h; < N;
#
The output spim code might be the following, but there are lots of other possible forms.
### Compiled on: Fri Sep 27 11:39:34 CDT 2002
main:   addu    $s7, $ra, $zero
        la      $s1, M
### Start of compiled code
# M[36] = M[5] + M[8]
        lw      $t1, 20($s1)
        lw      $t2, 32($s1)
        add     $t3, $t1, $t2
        sw      $t3, 144($s1)
# M[15] = M[36]
        lw      $t1, 144($s1)
        sw      $t1, 60($s1)
# M[37] = M[15] + M[8]
        lw      $t1, 60($s1)
        lw      $t2, 32($s1)
        add     $t3, $t1, $t2
        sw      $t3, 148($s1)
# M[16] = M[37]
        lw      $t1, 148($s1)
        sw      $t1, 64($s1)
# M[38] = M[15] + M[16]
        lw      $t1, 60($s1)
        lw      $t2, 64($s1)
        add     $t3, $t1, $t2
        sw      $t3, 152($s1)
# M[17] = M[38]
        lw      $t1, 152($s1)
        sw      $t1, 68($s1)
# Print M[17]
        li      $v0, 1
        lw      $a0, 68($s1)
        syscall
# Print NewL as ASCII char
        li      $v0, 4
        la      $a0, NewL
        syscall
### End of complied code
        addu    $ra, $s7, $zero
        jr      $ra
        .data
M:      .word   0,1,2,3,4,5,6,7,8,9 # constants
        .space  104 # variables a to z
        .space  500 # temporaries
Blank:  .asciiz " "
NewL:   .asciiz "\n"
Tab:    .asciiz "\t"
With output (when the spim code is executed):
34
Here is a second, longer source:
f = 5 + 8;
g = f + 8;
h = f + g;
a = f*f + g*g;
b = g*h + f*g;
c = g*g + h*h;
f = a*a + b*b;
g = b*c + a*b;
h = c*c + b*b;
< h; < N;
#
With output equal to the 33rd Fibonacci number (when the spim code is executed):
3524578
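The expected value can be checked without spim by carrying out the same assignments with ordinary integers. This Python sketch is only a cross-check; the function names are made up for the example, and `fib` numbers the sequence so that F(1) = F(2) = 1.

```python
def fib(n):
    """Return the n-th Fibonacci number, with F(1) = F(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def run_program():
    """Carry out the assignments of the source above and return h."""
    f = 5 + 8            # F(7)  = 13
    g = f + 8            # F(8)  = 21
    h = f + g            # F(9)  = 34
    a = f*f + g*g        # F(15) = 610
    b = g*h + f*g        # F(16) = 987
    c = g*g + h*h        # F(17) = 1597
    f = a*a + b*b        # F(31)
    g = b*c + a*b        # F(32)
    h = c*c + b*b        # F(33)
    return h
```

Both `run_program()` and `fib(33)` return 3524578, which fits comfortably in a 32-bit MIPS word.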
Here is a third source:
d = 5*2;
c = d*d*d*d*d*d*d;
b = 2*c;
a = 4*c;
k = 0;
x = 8*2;
t = a/(8*k+1) - b/(8*k+4) - c/(8*k+5) - c/(8*k+6);
s = t; < s; < N;
k = 1;
t = a/(8*k+1) - b/(8*k+4) - c/(8*k+5) - c/(8*k+6);
s = s + t/x; < s; < N;
k = 2;
t = a/(8*k+1) - b/(8*k+4) - c/(8*k+5) - c/(8*k+6);
s = s + t/(x*x); < s; < N;
k = 3;
t = a/(8*k+1) - b/(8*k+4) - c/(8*k+5) - c/(8*k+6);
s = s + t/(x*x*x); < s; < N;
#
Here is the output that should be produced by this:
31333334
31414225
31415874
31415924
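These four values can be reproduced with plain integer arithmetic, which is a useful check on how your generated code handles division. The Python sketch below is an illustration with a hypothetical function name. It relies on every operand being positive here, so Python's floor division `//` agrees with truncating integer division. The sums are the first four partial sums of the Bailey-Borwein-Plouffe series for pi, scaled by 10^7.

```python
def run_program():
    """Carry out the source above and return the four printed values of s."""
    d = 5 * 2
    c = d ** 7           # d*d*d*d*d*d*d = 10,000,000
    b = 2 * c
    a = 4 * c
    x = 8 * 2
    s = 0
    printed = []
    for k in range(4):
        t = a // (8*k + 1) - b // (8*k + 4) - c // (8*k + 5) - c // (8*k + 6)
        s = s + t // x**k    # x**k is 1, x, x*x, x*x*x for k = 0..3
        printed.append(s)
    return printed
```

`run_program()` returns `[31333334, 31414225, 31415874, 31415924]`.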
Extensions to Handle Arbitrary Identifiers and Integers: If you want to use arbitrary identifiers, then you will need a symbol table for them and will need to keep track of the memory location of each identifier. Arbitrary integers could be conveniently handled by creating each integer as it occurs using an addi instruction. (Better would be to have a table of integer constants, either combined with the first table or separate.) Of course, even with single-character identifiers, you could still use a symbol table.
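A symbol table for this purpose can be very small. The Python sketch below is one possible shape, not a required design: the class name `SymbolTable` and its `lookup` method are hypothetical, and it assumes identifiers are given locations starting at 10, just after the ten constants. A real compiler would also have to make sure the identifier area does not run into the temporaries.

```python
class SymbolTable:
    """Assigns each identifier a memory location the first time it is seen."""

    def __init__(self, first_location=10):
        self.locations = {}                 # identifier -> word index into M
        self.next_location = first_location

    def lookup(self, name):
        """Return the location of name, allocating one if it is new."""
        if name not in self.locations:
            self.locations[name] = self.next_location
            self.next_location += 1
        return self.locations[name]
```

With `table = SymbolTable()`, the first call `table.lookup("total")` returns 10, `table.lookup("count")` returns 11, and a later `table.lookup("total")` returns 10 again.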
Suggestions for Debugging: It is important to issue error messages for parser errors, because you may very well have trouble with your parser. You should also keep the parser code as it is and create the compiler as a separate program. Then if you get a new parser error, you can debug it with the simpler parser before working on the compiler.
I also recommend outputting assembler comments, perhaps similar to the ones shown above. That may help you track down errors.