CD Unit3
[Figure: parse tree for 1 + 2 and the corresponding annotated parse tree, with E.VAL=1 and E.VAL=2 at the operands and E.VAL=3 at the root]
7.2 Implementation of Syntax directed translators
A syntax directed translation scheme provides a method for describing an input-output mapping.
To calculate the translation at node A associated with the production A → XYZ, we need to know the values of the translations at X, Y and Z.
These nodes X, Y and Z become the children of node A after the reduction.
TOP is a pointer to the top of the stack. In this layout the stack grows toward lower indices, so the entries below the top are at TOP+1, TOP+2 and so on.
Before XYZ is reduced to A, the value of the translation of Z is in VAL[TOP].
The value of Y is in VAL[TOP+1].
The value of X is in VAL[TOP+2].
After the reduction, TOP is incremented by 2 and the value of A.VAL appears in VAL[TOP].
Stack before reduction:
          STATE    VAL
TOP  ->   Z        Z.VAL
          Y        Y.VAL
          X        X.VAL
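The stack handling described above can be sketched as follows. This is a minimal illustration, not a full parser: the class name `ValStack`, the fixed array size and the method names are assumptions made for the example. The stack grows toward lower indices so that the entries below the top sit at TOP+1 and TOP+2, as in the text.

```python
class ValStack:
    """Value stack kept in a fixed array; grows toward lower indices (assumed layout)."""

    def __init__(self, size=16):
        self.val = [None] * size
        self.top = size          # empty stack: TOP is one past the last slot

    def push(self, value):
        self.top -= 1
        self.val[self.top] = value

    def reduce_xyz(self, action):
        """Reduce by A -> X Y Z and leave A.VAL in VAL[TOP]."""
        z = self.val[self.top]        # Z.VAL is in VAL[TOP]
        y = self.val[self.top + 1]    # Y.VAL is in VAL[TOP+1]
        x = self.val[self.top + 2]    # X.VAL is in VAL[TOP+2]
        self.top += 2                 # three entries are replaced by one
        self.val[self.top] = action(x, y, z)
        return self.val[self.top]


# E -> E(1) + E(2) with E(1).VAL = 1 and E(2).VAL = 2
stack = ValStack()
stack.push(1)
stack.push("+")
stack.push(2)
e_val = stack.reduce_xyz(lambda x, op, z: x + z)
```

After the reduction, `e_val` holds the value of the new E node and `stack.top` points at the single entry that replaced the three symbols of the handle.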
E → ( E(1) )     E.CODE := E(1).CODE
E → id           E.CODE := id
For E → E(1) op E(2), the translation E.CODE is the concatenation of the two translations E(1).CODE and E(2).CODE, followed by the operator op.
The translation of a parenthesized expression is the same as that of the unparenthesized expression.
The translation of an identifier is the identifier itself.
The following example shows the implementation of infix to postfix translation.
Production            Program fragment
E → E(1) op E(2)      { print op }
E → ( E(1) )          { }
E → id                { print id }
Evaluate a+b*c using syntax directed infix to postfix translation:
1. Shift a
2. Reduce by E → id, print a
3. Shift +
4. Shift b
5. Reduce by E → id, print b
6. Shift *
7. Shift c
8. Reduce by E → id, print c
9. Reduce by E → E op E, print *
10. Reduce by E → E op E, print +
The printed output is the postfix form a b c * +.
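The shift and reduce steps above can be imitated with a small precedence-driven sketch. It is not a table-driven LR parser: the precedence table `PREC` and the function name are assumptions made for the example, and the input is assumed to be well formed. Output is produced only at reductions, as the program fragments require.

```python
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}   # assumed operator precedences


def infix_to_postfix(tokens):
    out = []      # what the "print" actions emit
    stack = []    # parser stack holding 'E', operators and '('

    def reduce_binary():
        # Handle E op E is on top of the stack: E -> E(1) op E(2) { print op }
        stack.pop()
        op = stack.pop()
        stack.pop()
        stack.append("E")
        out.append(op)

    for tok in tokens:
        if tok in PREC:
            # Reduce while the operator already on the stack binds at least as tightly.
            while len(stack) >= 3 and stack[-2] in PREC and PREC[stack[-2]] >= PREC[tok]:
                reduce_binary()
            stack.append(tok)                 # shift the operator
        elif tok == "(":
            stack.append(tok)                 # shift '('
        elif tok == ")":
            while stack[-2] != "(":
                reduce_binary()
            stack.pop()                       # E
            stack.pop()                       # '('
            stack.append("E")                 # E -> ( E(1) ) { }
        else:
            out.append(tok)                   # shift id, then E -> id { print id }
            stack.append("E")

    while len(stack) > 1:
        reduce_binary()
    return " ".join(out)


postfix = infix_to_postfix(list("a+b*c"))
```

For a+b*c the reductions happen in the same order as steps 1 to 10, so `postfix` is the string "a b c * +".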
7.5 Parse trees and syntax trees
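[Figure: syntax tree for a*(b+c)/d, with / at the root, * and d as its children, a and + under *, and b and c under +]
[Figure: syntax tree for if a=b then a:=c+d else b:=c-d, with if-then-else at the root and three children: a=b, a:=c+d and b:=c-d]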
Quadruples:
A quadruple represents a three-address statement as a record with four fields: op, arg1, arg2 and result.
Quadruple representation:
        op       arg1    arg2    result
(0)     uminus   B       --      T1
(1)     +        C       D       T2
(2)     *        T1      T2      T3
(3)     :=       T3      --      A
Here the contents of arg1, arg2 and result are pointers to symbol table entries.
Temporary names must therefore also be entered in the symbol table.
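The quadruple table can be written down directly as data. The sketch below is an illustration only: a Python dictionary stands in for the symbol table, the names `Quad` and `run_quads` are made up for the example, and the small interpreter is there just to show that the four quadruples compute A := -B * (C + D), which is the statement the table encodes.

```python
from collections import namedtuple
import operator

Quad = namedtuple("Quad", "op arg1 arg2 result")

quads = [
    Quad("uminus", "B", None, "T1"),
    Quad("+", "C", "D", "T2"),
    Quad("*", "T1", "T2", "T3"),
    Quad(":=", "T3", None, "A"),
]

BINARY = {"+": operator.add, "*": operator.mul}


def run_quads(quads, symbols):
    """Execute quadruples in order; names index a dict that stands in for the symbol table."""
    env = dict(symbols)
    for q in quads:
        if q.op == "uminus":
            env[q.result] = -env[q.arg1]
        elif q.op == ":=":
            env[q.result] = env[q.arg1]
        else:
            env[q.result] = BINARY[q.op](env[q.arg1], env[q.arg2])
    return env


final = run_quads(quads, {"B": 2, "C": 3, "D": 4})
```

Note that the temporaries T1, T2 and T3 end up as entries in `final` alongside A, B, C and D, which mirrors the requirement that temporary names be entered in the symbol table.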
Triples:
Types:
Direct triples
Indirect triples
Triples avoid entering temporary names in the symbol table: a temporary value is referred to by the position of the triple that computes it.
The contents of arg1 and arg2 are either pointers to the symbol table or pointers into the triple structure itself.
Since only three fields (op, arg1, arg2) are used, this intermediate representation is called triples.
Three-address code implemented in triple form
Direct triples
        op       arg1    arg2
(0)     uminus   B       --
(1)     +        C       D
(2)     *        (0)     (1)
(3)     :=       A       (2)
Indirect triples
Indirect triples have a separate listing of pointers to triples.
An array, STATEMENT, lists pointers to the triples in the desired order of execution.
        STATEMENT
(0)     (14)
(1)     (15)
(2)     (16)
(3)     (17)

        op       arg1    arg2
(14)    uminus   B       --
(15)    +        C       D
(16)    *        (14)    (15)
(17)    :=       A       (16)
In triples notation, if a statement that computes a temporary value is moved, we have to change all the pointers to that statement in the arg1 and arg2 arrays.
In indirect triples, if a statement is moved, we simply reorder the statement list.
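A minimal sketch of indirect triples, assuming a dictionary keyed by triple number and a small `Ref` class (both invented for the example) to mark an argument that points at another triple rather than at the symbol table. Reordering touches only the statement list; the triples themselves stay unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Pointer to the triple with the given number (assumed representation)."""
    index: int


triples = {
    14: ("uminus", "B", None),
    15: ("+", "C", "D"),
    16: ("*", Ref(14), Ref(15)),
    17: (":=", "A", Ref(16)),
}
statement = [14, 15, 16, 17]


def run_indirect(statement, triples, symbols):
    env = dict(symbols)
    results = {}                      # value computed by each triple

    def value(arg):
        return results[arg.index] if isinstance(arg, Ref) else env[arg]

    for idx in statement:             # execution order comes from the statement list
        op, a1, a2 = triples[idx]
        if op == "uminus":
            results[idx] = -value(a1)
        elif op == "+":
            results[idx] = value(a1) + value(a2)
        elif op == "*":
            results[idx] = value(a1) * value(a2)
        elif op == ":=":
            env[a1] = value(a2)
    return env


original = run_indirect(statement, triples, {"B": 2, "C": 3, "D": 4})
# Moving triple 15 ahead of triple 14 only changes the statement list.
reordered = run_indirect([15, 14, 16, 17], triples, {"B": 2, "C": 3, "D": 4})
```

Both runs assign the same value to A, and no `Ref` inside `triples` had to be rewritten for the second ordering.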
7.7 Translation of Assignment statements
Assignment statements with integer types:
Consider simple assignment statements involving integer quantities.
The following grammar generates such assignment statements.
A → id := E
E → E + E | E * E | - E | ( E ) | id
Translation of E
It is a structure with 2 fields:
E.PLACE    the name that holds the value of the expression
E.CODE     the sequence of three-address statements that evaluates the expression
Translation of A
A.CODE     the three-address code that executes the assignment statement
Translation of id
id.PLACE   the name corresponding to this instance of the token id
NEWTEMP() is a function that creates new temporary names T1, T2, etc.
Abstract translation scheme:
Production            Semantic action
A → id := E           { A.CODE := E.CODE || id.PLACE || ':=' || E.PLACE }
E → E(1) + E(2)       { T := NEWTEMP();
                        E.PLACE := T;
                        E.CODE := E(1).CODE || E(2).CODE || E.PLACE ||
                                  ':=' || E(1).PLACE || '+' || E(2).PLACE }
E → E(1) * E(2)       { T := NEWTEMP();
                        E.PLACE := T;
                        E.CODE := E(1).CODE || E(2).CODE || E.PLACE ||
                                  ':=' || E(1).PLACE || '*' || E(2).PLACE }
E → - E(1)            { T := NEWTEMP();
                        E.PLACE := T;
                        E.CODE := E(1).CODE || E.PLACE ||
                                  ':= -' || E(1).PLACE }
E → ( E(1) )          { E.PLACE := E(1).PLACE;
                        E.CODE := E(1).CODE }
E → id                { E.PLACE := id.PLACE;
                        E.CODE := null }
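The scheme can be tried out on a small expression tree. The sketch below is one possible reading, not a definitive implementation: expressions are given as nested tuples instead of being parsed, the class name `Translator` and the tag `"neg"` for unary minus are invented for the example, and each node returns the pair (PLACE, CODE) described above.

```python
import itertools


class Translator:
    def __init__(self):
        self._counter = itertools.count(1)

    def newtemp(self):
        """Create the next temporary name: T1, T2, ..."""
        return f"T{next(self._counter)}"

    def expr(self, node):
        """Return (E.PLACE, E.CODE) for an expression node."""
        if isinstance(node, str):                 # E -> id
            return node, []
        if node[0] == "neg":                      # E -> - E(1)
            place1, code1 = self.expr(node[1])
            t = self.newtemp()
            return t, code1 + [f"{t} := - {place1}"]
        op, left, right = node                    # E -> E(1) op E(2)
        place1, code1 = self.expr(left)
        place2, code2 = self.expr(right)
        t = self.newtemp()
        return t, code1 + code2 + [f"{t} := {place1} {op} {place2}"]

    def assign(self, name, node):                 # A -> id := E
        place, code = self.expr(node)
        return code + [f"{name} := {place}"]


# A := -B * (C + D); parentheses are already reflected in the tree shape.
a_code = Translator().assign("A", ("*", ("neg", "B"), ("+", "C", "D")))
```

The resulting list `a_code` contains four statements that correspond line by line to the quadruples (0) to (3) shown earlier for the same assignment.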
Semantic action for E → E(1) op E(2) when the operands may be of mixed integer and real type:
T := NEWTEMP();
if E(1).MODE = INTEGER and E(2).MODE = INTEGER then
    begin
        GEN(T := E(1).PLACE int op E(2).PLACE);
        E.MODE := INTEGER
    end
else if E(1).MODE = REAL and E(2).MODE = REAL then
    begin
        GEN(T := E(1).PLACE real op E(2).PLACE);
        E.MODE := REAL
    end
else if E(1).MODE = INTEGER /* and E(2).MODE = REAL */ then
    begin
        U := NEWTEMP();
        GEN(U := inttoreal E(1).PLACE);
        GEN(T := U real op E(2).PLACE);
        E.MODE := REAL
    end
else /* E(1).MODE = REAL and E(2).MODE = INTEGER */
    begin
        U := NEWTEMP();
        GEN(U := inttoreal E(2).PLACE);
        GEN(T := E(1).PLACE real op U);
        E.MODE := REAL
    end;
E.PLACE := T
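The same case analysis can be sketched in Python. The helper names `newtemp`, `gen` and `binary`, the module-level `code` list, and the spelling of the typed operators as `int+` and `real+` are assumptions made for the example. Each operand is passed as a (PLACE, MODE) pair.

```python
import itertools

_counter = itertools.count(1)
code = []                                  # generated three-address statements


def newtemp():
    return f"T{next(_counter)}"


def gen(statement):
    code.append(statement)


def binary(op, e1, e2):
    """Action for E -> E(1) op E(2); returns (E.PLACE, E.MODE)."""
    (place1, mode1), (place2, mode2) = e1, e2
    t = newtemp()
    if mode1 == "INTEGER" and mode2 == "INTEGER":
        gen(f"{t} := {place1} int{op} {place2}")
        return t, "INTEGER"
    # At most one operand is an integer from here on; convert it to real.
    if mode1 == "INTEGER":
        u = newtemp()
        gen(f"{u} := inttoreal {place1}")
        place1 = u
    elif mode2 == "INTEGER":
        u = newtemp()
        gen(f"{u} := inttoreal {place2}")
        place2 = u
    gen(f"{t} := {place1} real{op} {place2}")
    return t, "REAL"


# X is REAL and I is INTEGER, so I is converted before the real addition.
result = binary("+", ("X", "REAL"), ("I", "INTEGER"))
```

As in the pseudocode, T is allocated before U, so the conversion uses T2 and the result of the whole expression lands in T1.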
Numerical representation:
The translation of A or B and C is the following three-address sequence:
T1 := B and C
T2 := A or T1
A relational expression A<B is equivalent to the conditional statement if A<B then 1 else 0.
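One common textbook way to realize "if A<B then 1 else 0" in three-address code uses a conditional jump around two assignments. The sketch below follows that pattern; it is not taken from these notes, and the starting quadruple number 100, the temporary name T1 and the function name are assumptions.

```python
def relop_numeric(a, op, b, start=100, temp="T1"):
    """Emit code that leaves 1 in temp if `a op b` holds and 0 otherwise."""
    return {
        start:     f"if {a} {op} {b} goto {start + 3}",  # condition true: skip to temp := 1
        start + 1: f"{temp} := 0",
        start + 2: f"goto {start + 4}",                  # jump over the true branch
        start + 3: f"{temp} := 1",
    }


numeric_code = relop_numeric("A", "<", "B")
```

Control leaves the sequence at quadruple 104 in both cases, with T1 holding the numeric value of the relation.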
The semantic rules for the two productions E → E or E and E → id relop id are as follows.
[Figure: code layouts for control-flow statements. Surviving labels: "Code for E", "FALSE: code for S(2)", "while E do S", "Code for E", "FALSE:"]
Consider a syntax directed translation scheme that generates quadruples for Boolean expressions.
Since bottom-up parsing is used, the quadruple to which a jump must be made may not yet have been generated at the time the jump statement is generated. The targets of such branching statements are therefore temporarily left unspecified. Each such quadruple is filled in when the proper location is found. This subsequent filling in of quadruples is called backpatching.
Three functions are used to manipulate lists of quadruples:
MAKELIST(i): creates a new list containing only i, an index of a quadruple, and returns a pointer to the list it has created.
MERGE(p1,p2): concatenates the lists pointed to by p1 and p2 and returns a pointer to the concatenated list.
BACKPATCH(p,i): makes i the target of each of the quadruples on the list pointed to by p.
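The three functions are small enough to sketch directly. In this illustration a quadruple is a two-element list of statement text and jump target, and Python lists stand in for the pointer-linked lists; both choices, and the helper `gen`, are assumptions made for the example.

```python
quads = []                       # each entry: [text, target]; target None means unfilled


def gen(text):
    """Append a jump quadruple whose target is left unspecified."""
    quads.append([text, None])


def makelist(i):
    """Create a new list containing only quadruple index i."""
    return [i]


def merge(p1, p2):
    """Concatenate two lists of quadruple indices."""
    return p1 + p2


def backpatch(p, i):
    """Make i the target of every quadruple on list p."""
    for index in p:
        quads[index][1] = i


gen("if a goto")                 # quadruple 0
gen("goto")                      # quadruple 1
gen("if b goto")                 # quadruple 2
pending = merge(makelist(0), makelist(2))
backpatch(pending, 7)            # quadruples 0 and 2 now jump to 7
```

Quadruple 1 is on no list that was backpatched, so its target stays unfilled.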
3) E → not E(1)
   { E.TRUE := E(1).FALSE;
     E.FALSE := E(1).TRUE }
4) E → ( E(1) )
   { E.TRUE := E(1).TRUE;
     E.FALSE := E(1).FALSE }
5) E → id
   { E.TRUE := MAKELIST(NEXTQUAD);
     E.FALSE := MAKELIST(NEXTQUAD+1);
     GEN(if id.PLACE goto -);
     GEN(goto -) }
7) M → ε
   { M.QUAD := NEXTQUAD }
Semantic actions 5 and 6 each generate two quadruples: a conditional goto and an unconditional goto.
Their targets are not yet filled in.
The index of the first generated quadruple is made into a list, and E.TRUE is given a pointer to that list.
The index of the second generated quadruple, goto -, is also made into a list, and E.FALSE is given a pointer to that list.
Fig: Parse tree for P<Q or R<S and T<U
Example:
Consider the expression P<Q or R<S and T<U.
Consider the semantic actions at the numbered nodes in the order in which bottom-up parsing visits them.
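The example can be traced with a short sketch. The actions for or, and and relop used here follow the usual textbook formulation (for or, backpatch E(1).FALSE with M.QUAD and merge the TRUE lists; for and, backpatch E(1).TRUE with M.QUAD and merge the FALSE lists), which is an assumption because those rules are not listed in these notes. The class name `Emitter` and the starting quadruple number 100 are also assumptions.

```python
class Emitter:
    def __init__(self, start=100):
        self.start = start
        self.quads = []                        # each entry: [text, target]

    @property
    def nextquad(self):
        return self.start + len(self.quads)

    def gen(self, text):
        self.quads.append([text, None])

    def backpatch(self, indices, target):
        for i in indices:
            self.quads[i - self.start][1] = target

    def relop(self, a, op, b):
        """E -> id relop id: returns (E.TRUE, E.FALSE)."""
        true, false = [self.nextquad], [self.nextquad + 1]
        self.gen(f"if {a} {op} {b} goto")
        self.gen("goto")
        return true, false

    def or_(self, e1, m_quad, e2):
        self.backpatch(e1[1], m_quad)          # E(1) false: go evaluate E(2)
        return e1[0] + e2[0], e2[1]

    def and_(self, e1, m_quad, e2):
        self.backpatch(e1[0], m_quad)          # E(1) true: go evaluate E(2)
        return e2[0], e1[1] + e2[1]


em = Emitter()
e1 = em.relop("P", "<", "Q")                   # quadruples 100, 101
m1 = em.nextquad                               # M.QUAD = 102
e2 = em.relop("R", "<", "S")                   # quadruples 102, 103
m2 = em.nextquad                               # M.QUAD = 104
e3 = em.relop("T", "<", "U")                   # quadruples 104, 105
e_and = em.and_(e2, m2, e3)                    # R<S and T<U
e_true, e_false = em.or_(e1, m1, e_and)        # P<Q or (R<S and T<U)
```

After the trace, quadruple 101 jumps to 102 and quadruple 102 jumps to 104. The jumps at 100 and 104 remain on the TRUE list and those at 103 and 105 on the FALSE list, to be backpatched once the surrounding statement is known.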
Postfix translations can be implemented by emitting the tail as each production is recognized.