Chapter 7
Prof Chung.
1 113/04/12
Outlines
7.1 Syntax-directed Translation
7.1.1 Using a Syntax Tree Representation of a Parse
7.1.2 Compiler Organization Alternatives
7.1.3 Parsing, Checking, and Translation in a Single Pass
Outlines
7.3 Intermediate Representations and Code Generation
7.3.1 Intermediate Representations vs. Direct Code Generation
7.3.2 Forms of Intermediate Representations
7.3.3 A Tuple Language
7.1
Syntax-directed Translation
7.1.1 Using a Syntax Tree Representation of a Parse
7.1.2 Compiler Organization Alternatives
7.1.3 Parsing, Checking, and Translation in a Single Pass
Syntax-Directed Translation
Almost all modern compilers are syntax-directed
The compilation process is driven by the syntactic structure of a
source program, as recognized by the parser
Semantic Routines
Parts of the compiler that interpret the meaning (semantics) of a
program
Perform analysis tasks: static semantic checking, such as checking variable
declarations and type errors
Perform synthesis tasks: IR or actual code generation
Semantic actions are attached to the productions
(or to subtrees of a syntax tree).
7.1.1
Using a Syntax Tree Representation of a Parse (1)
[Figure: parse tree produced by parsing an assignment statement. Visible nodes: <assign>, <term>, <factor>, id, const, and the operators + and *.]
7.1.1
Using a Syntax Tree Representation of a Parse (2)
Semantic routines traverse (post-order) the AST,
computing attributes of the nodes of AST.
[Figure: AST fragment with nodes *, id(I), const(3), and id(X).]
7.1.1
Using a Syntax Tree Representation of a Parse (3)
The attributes are then propagated to other nodes using
some functions, e.g.
build symbol table
attach attributes of nodes
check types, etc.
[Figure: a <stmt> subtree with a declaration and an assignment (:=) whose nodes include id, +, *, const, and exp., linked to a symbol table holding each symbol's type.]
Check types: integer * or floating *?
Need to consult the symbol table for the types of the ids.
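The post-order decoration described on these slides can be sketched in C roughly as follows. This is only an illustration under stated assumptions: a two-type language (integer and floating), integer constants only, and a tiny fixed symbol table. The names `Node`, `lookup_type`, and `decorate` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { T_INT, T_FLOAT, T_ERROR } Type;
typedef enum { N_CONST, N_ID, N_ADD, N_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    const char *name;            /* identifier name, used by N_ID only */
    struct Node *left, *right;   /* operands, used by N_ADD and N_MUL */
    Type type;                   /* attribute filled in by decorate() */
} Node;

/* Hypothetical symbol table: X is floating, I is integer. */
static const struct { const char *name; Type type; } symtab[] = {
    { "X", T_FLOAT }, { "I", T_INT },
};

Type lookup_type(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].type;
    return T_ERROR;              /* undeclared identifier */
}

/* Post-order: decorate both children first, then the node itself. */
Type decorate(Node *n) {
    switch (n->kind) {
    case N_CONST:
        n->type = T_INT;         /* assumption: integer constants only */
        break;
    case N_ID:
        n->type = lookup_type(n->name);   /* consult the symbol table */
        break;
    case N_ADD:
    case N_MUL: {
        Type l = decorate(n->left);
        Type r = decorate(n->right);
        if (l == T_ERROR || r == T_ERROR)
            n->type = T_ERROR;
        else if (l == T_FLOAT || r == T_FLOAT)
            n->type = T_FLOAT;   /* a floating operation is needed */
        else
            n->type = T_INT;     /* an integer operation suffices */
        break;
    }
    }
    return n->type;
}
```

Calling `decorate` on the root of the tree for 3*X + I would mark the * node as floating under these assumptions, which is the "integer * or floating *" decision the slide refers to.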
7.1.1
Using a Syntax Tree Representation of a Parse (4)
After attribute propagation is done,
the tree is decorated and ready for code generation.
Another pass over the decorated AST generates the code.
7.1.2
Compiler Organization Alternatives (1)
A single pass for analysis and synthesis
Interleaved in a single pass
Scanning
Parsing
Checking
Translation
No explicit IR is generated
Ex. Micro Compiler (chap. 2)
Since code generation is limited to looking at one tuple at a time,
few optimizations are possible
Ex. Consider register allocation,
which requires a more global view of the AST.
7.1.2
Compiler Organization Alternatives (2)
We would like the code generator
to hide machine details completely, so that semantic routines
become independent of the machine.
However,
this is sometimes violated in order to produce better code.
Suppose there are several classes of registers,
each for a different purpose.
Then register allocation is better done
by the semantic routines than by the code generator,
since the semantic routines have a broader view of the AST.
7.1.2
Compiler Organization Alternatives (3)
One-pass compiler + peephole optimization
One pass for code generation
one pass for peephole optimization
7.1.2
Compiler Organization Alternatives (4)
One pass analysis and IR synthesis + code gen pass
1st pass : Analysis and IR
2nd pass : Code generation
7.1.2
Compiler Organization Alternatives (5)
Multipass analysis
With a limited address space, each of these four is a pass:
Scanner
Parser
Declaration
Static checking
7.1.2
Compiler Organization Alternatives (6)
Multipass synthesis
IR
Machine-independent optimization passes
Machine-dependent optimization passes
Code gen passes
Peephole
…….
7.1.2
Compiler Organization Alternatives (7)
Multi-language and multi-target compilers
Components may be shared and parameterized.
Ex: Ada uses Diana (language-dependent IR)
Ex: GCC uses two IRs.
One is high-level and tree-oriented;
the other (RTL) is more machine-oriented.
[Figure: front ends for FORTRAN, PASCAL, ADA, C, ... produce language- and machine-independent IRs; machine-independent optimization follows; back ends target SUN, PC, mainframe, ...]
7.1.3
Single Pass (1)
In Micro of chap 2, scanning, parsing and semantic
processing are interleaved in a single pass.
(+) simple front-end
(+) less storage if no explicit trees
(-) immediately available information is limited since no complete
tree is built.
Relationships
[Figure: the parser calls the scanner, which returns tokens; the parser also calls semantic routines 1 to k, which work on semantic records.]
7.1.3
Single Pass (2)
Each terminal and non-terminal has a semantic record.
Semantic records may be considered
as the attributes of the terminals and non-terminals.
Terminals
the semantic records are created by the scanner.
Non-terminals
the semantic records are created by a semantic routine when a production
is recognized.
[Figure: a production with right-hand side B C D #SR.]
7.1.3
Single Pass (3)
1 pass = 1 post-order traversal of the parse tree
parsing actions -- build parse trees
semantic actions -- post-order traversal
<assign>
ID (A) := <exp>
7.2
Semantic Processing Techniques
7.2.1 LL Parsers and Action Symbols
7.2.2 LR Parsers and Action Symbols
7.2.3 Semantic Record Representations
7.2.4 Implementing Action-controlled Semantic Stacks
7.2.5 Parser-controlled Semantic Stacks
7.2
Semantic Processing
Semantic routines may be invoked in two ways:
<1> By parsing procedures
as in the recursive descent parser in chap 2
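A minimal C sketch of the first way, in which the parsing procedures themselves call the semantic routines. The grammar (<exp> → <term> { + <term> } with single-digit terms) and the names `parse_exp`, `parse_term`, `sem_const`, and `sem_add` are assumptions made for illustration; they are not the Micro compiler's own routines.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *input;        /* remaining source text */
static int error_count;          /* number of syntax errors seen */

/* Semantic routines: in this sketch they simply compute integer values. */
int sem_const(char digit) { return digit - '0'; }
int sem_add(int left, int right) { return left + right; }

/* <term> -> digit */
int parse_term(void) {
    if (isdigit((unsigned char)*input))
        return sem_const(*input++);
    error_count++;               /* syntax error: a digit was expected */
    return 0;
}

/* <exp> -> <term> { + <term> } */
int parse_exp(void) {
    int value = parse_term();
    while (*input == '+') {
        input++;                               /* match '+' */
        value = sem_add(value, parse_term());  /* parsing procedure calls the semantic routine */
    }
    return value;
}

int parse(const char *text) {
    assert(text != NULL);
    input = text;
    error_count = 0;
    return parse_exp();
}
```

Here the semantic routine is invoked at a fixed point inside the parsing procedure, so no separate action symbols are needed.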
7.2.1
LL(1)
Some productions
have no action symbols; others may have several.
Semantic routines
are called when action symbols appear on stack top.
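<exp> → <term> + <exp> #add
[Figure: successive snapshots of the parse stack and the semantic stack while this production is processed. The parse stack holds <term>, +, <exp>, and #add; by the time #add reaches the top of the parse stack, the semantic stack holds the records for <term>, +, and <exp>.]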
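The idea of calling a semantic routine when an action symbol reaches the top of the parse stack can be sketched in C as below. Two assumptions are made loudly here: the grammar is a hypothetical LL(1) variant (<exp> → <term> <tail>, <tail> → + <term> #add <tail> | λ, <term> → digit) rather than the production on the slide, and semantic records are plain integers. All identifiers are illustrative.

```c
#include <assert.h>
#include <ctype.h>

enum symbol { S_EXP, S_TAIL, S_TERM, S_PLUS, S_DIGIT, A_ADD };

static int parse_stack[64], parse_top;   /* grammar symbols and action symbols */
static int sem_stack[64], sem_top;       /* semantic records (plain ints here) */

void push_sym(int s) {
    assert(parse_top < 64);
    parse_stack[parse_top++] = s;
}

/* Returns the value of the expression, or -1 on a syntax error. */
int ll_parse(const char *in) {
    parse_top = sem_top = 0;
    push_sym(S_EXP);
    while (parse_top > 0) {
        int x = parse_stack[--parse_top];
        switch (x) {
        case S_EXP:                       /* <exp> -> <term> <tail> */
            push_sym(S_TAIL); push_sym(S_TERM);
            break;
        case S_TAIL:
            if (*in == '+') {             /* <tail> -> + <term> #add <tail> */
                push_sym(S_TAIL); push_sym(A_ADD);
                push_sym(S_TERM); push_sym(S_PLUS);
            }                             /* otherwise <tail> -> lambda */
            break;
        case S_TERM:                      /* <term> -> digit */
            push_sym(S_DIGIT);
            break;
        case S_PLUS:                      /* match the terminal '+' */
            if (*in != '+') return -1;
            in++;
            break;
        case S_DIGIT:                     /* matching a digit creates its record */
            if (!isdigit((unsigned char)*in)) return -1;
            assert(sem_top < 64);
            sem_stack[sem_top++] = *in++ - '0';
            break;
        case A_ADD: {                     /* action symbol on top: run the routine */
            int right = sem_stack[--sem_top];
            int left = sem_stack[--sem_top];
            sem_stack[sem_top++] = left + right;
            break;
        }
        }
    }
    return (*in == '\0' && sem_top == 1) ? sem_stack[0] : -1;
}
```

Because #add is pushed between <term> and <tail>, the routine runs exactly when both operand records are already on the semantic stack.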
7.2.2
LR(1) - (1)
Semantic routines
are invoked only when a structure is recognized.
LR parsing
a structure is recognized when the RHS is reduced to LHS.
cf. In LL parsing,
the structure is recognized when a non-terminal is expanded.
7.2.2
LR(1) - (3)
However, sometimes we do need to perform semantic
actions in the middle of a production.
Ex:
<stmt> → if <exp> then <stmt> end
7.2.2
LR(1) - (4)
Another problem
What if the action is not at the end?
Ex:
<prog> → #start begin <stmt> end
We need to call #start.
Ex:
enum kind {OP, EXP, STMT, ERROR};
typedef struct {
    enum kind tag;              /* tells which union member is valid */
    union {
        op_rec_type OP_REC;
        exp_rec_type EXP_REC;
        stmt_rec_type STMT_REC;
        /* ...... */
    };
} sem_rec_type;
7.2.3
Semantic Record Representation - (2)
How to handle errors?
Ex.
A semantic routine
needs to create a record for each identifier in an expression.
What if the identifier is not declared?
7.2.3
Semantic Record Representation - (3)
Solution 1: make a bogus record
This method may create a chain of
meaningless error messages due to this bogus record.
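One possible way to avoid that chain of messages, suggested by the ERROR tag in the enum shown earlier, is to create a record tagged ERROR and let later routines pass it along silently. The C sketch below illustrates this under assumptions: the record is simplified to a tag plus an integer value, the symbol table holds a single declared variable `x`, and `process_id` and `process_add` are hypothetical routine names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum kind { OP, EXP, STMT, ERROR };

typedef struct {
    enum kind tag;
    int value;                   /* simplified stand-in for exp_rec_type */
} sem_rec_type;

static int messages_reported;    /* counts the diagnostics actually issued */

/* Hypothetical symbol table holding a single declared variable, x = 5. */
sem_rec_type process_id(const char *name) {
    sem_rec_type rec = { EXP, 0 };
    assert(name != NULL);
    if (strcmp(name, "x") == 0) {
        rec.value = 5;
    } else {
        fprintf(stderr, "error: '%s' is not declared\n", name);
        messages_reported++;
        rec.tag = ERROR;         /* mark the record instead of faking a value */
    }
    return rec;
}

/* An ERROR operand is propagated without issuing a second message. */
sem_rec_type process_add(sem_rec_type left, sem_rec_type right) {
    sem_rec_type result = { EXP, 0 };
    if (left.tag == ERROR || right.tag == ERROR) {
        result.tag = ERROR;
        return result;
    }
    result.value = left.value + right.value;
    return result;
}
```

With this scheme an expression such as y + x + x, where y is undeclared, produces one message for y and none for the additions that depend on it.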
Implementing stacks:
1. array
2. linked list
7.2.4
Action-controlled semantic stack - (2)
Two other disadvantages:
(-) Action routines
need to manage the stack.
7.2.4
Action-controlled semantic stack - (3)
Solution 1: Let parser control the stack
Solution 2: Introduce additional stack routines
Ex:
Parser → Stack routines → Parameter-driven action routines
If action routines
do not control the stack, we can use an opaque (or abstract)
stack: only push() and pop() are provided.
(+) clean interface
(-) less efficient
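A small C sketch of Solution 2, assuming an array-based stack and a record simplified to one integer. Only `push()` and `pop()` are named on the slide; `add_action` and `call_add` are hypothetical names for a parameter-driven action routine and the stack routine that feeds it.

```c
#include <assert.h>

#define STACK_MAX 100

typedef struct { int value; } sem_rec_type;    /* simplified record */

/* The array and its index are private to the stack routines. */
static sem_rec_type stack[STACK_MAX];
static int top;                                /* number of records on the stack */

void push(sem_rec_type rec) {
    assert(top < STACK_MAX);                   /* overflow check */
    stack[top++] = rec;
}

sem_rec_type pop(void) {
    assert(top > 0);                           /* underflow check */
    return stack[--top];
}

/* Parameter-driven action routine: it never touches the stack itself. */
sem_rec_type add_action(sem_rec_type left, sem_rec_type right) {
    sem_rec_type result = { left.value + right.value };
    return result;
}

/* Stack routine called by the parser: it pops the operands, passes
   them as parameters, and pushes the result. */
void call_add(void) {
    sem_rec_type right = pop();
    sem_rec_type left = pop();
    push(add_action(left, right));
}
```

The extra call through `call_add` is the efficiency cost noted above; in exchange, the action routine sees only its parameters.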
7.2.5
parser-controlled stack - (1)
LR
Semantic stack and parse stack operate in parallel [shifts and
reduces in the same way].
[Figure: semantic stack with records A (index 7), B (9), C (10), and D (11), first free slot 12, annotated with the left, right, current, and top pointers.]
Need four pointers for the semantic stack (left, right, current, top).
7.2.5
parser-controlled stack - (3)
However, when a new production B → E F G is predicted,
the four pointers will be overwritten.
[Figure: parse stack and semantic stack as A → B C D and then B → E F G are predicted. An EOP marker on the parse stack, e.g. EOP(7,9,9,12), saves the four pointer values. Semantic stack records: A (7), B (9), C (10), D (11), E (12), F (13), G (14); top moves from 12 to 15.]
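A C sketch of how the four pointers might be saved in an EOP and restored later. The update rule used here (new left = old current, new right = old top, current = right, top advanced by the length of the right-hand side, and current advanced by one after restoring) is an assumption chosen to be consistent with EOP(7,9,9,12) in the figure. The names `Pointers`, `predict`, and `end_of_production` are hypothetical, and a separate array stands in for EOP symbols on the parse stack.

```c
#include <assert.h>

typedef struct { int left, right, current, top; } Pointers;

static Pointers eop_stack[32];   /* stands in for EOP symbols on the parse stack */
static int eop_top;

/* Predicting a production with rhs_len right-hand-side symbols:
   save the old pointers in an EOP, then set up the new frame. */
Pointers predict(Pointers p, int rhs_len) {
    Pointers q;
    assert(eop_top < 32);
    eop_stack[eop_top++] = p;    /* EOP(left, right, current, top) */
    q.left = p.current;          /* the expanded non-terminal becomes the LHS */
    q.right = p.top;             /* RHS records start at the old top */
    q.current = q.right;
    q.top = p.top + rhs_len;
    return q;
}

/* Reaching the EOP: restore the saved pointers, then advance
   current to the next right-hand-side symbol. */
Pointers end_of_production(void) {
    Pointers p;
    assert(eop_top > 0);
    p = eop_stack[--eop_top];
    p.current++;
    return p;
}
```

Starting from (left, right, current, top) = (7, 9, 9, 12), predicting B → E F G gives (9, 12, 12, 15) under this rule, and reaching the EOP brings back (7, 9, 10, 12), with current now at C.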
7.2.5
parser-controlled stack - (5)
Note
All push() and pop() are done by the parser
Not by the action routines.
Semantic records
Are passed to the action routines by parameters.
Example
<primary> → ( <exp> ) #copy($2, $$)
7.2.5
parser-controlled stack - (6)
Initial information
is stored in the semantic record of LHS.
[Figure: information flow (attributes). Initially the semantic stack holds the record for A; records for B, C, and D are then stacked above it; finally only A's record remains.]
7.2.5
parser-controlled stack - (7)
(-) Semantic stack may grow very big.
Fix:
Certain non-terminals never use semantic records,
e.g. <stmt list> and <id list>.
Example
<stmt list> → <stmt> #reuse <stmt tail>
<stmt tail> → <stmt> #reuse <stmt tail>
<stmt tail> → λ
Evaluation
Parser-controlled semantic stack is easy with LR, but not so with LL.
7.3
Intermediate Representations
and Code Generation
7.3.1 Intermediate Representations vs. Direct Code Generation
7.3.2 Forms of Intermediate Representations
7.3.3 A Tuple Language
7.3
Intermediate representation and code generation
Two possibilities:
1. semantic routines → machine code (direct code generation)
2. semantic routines → IR → code generation → machine code
Advantages of using an IR:
(+) allows higher-level operations e.g. open block, call procedures.
(+) better optimization because IR is at a higher level.
(+) machine dependence is isolated in code generation.
7.3
Intermediate representation and code generation
IR
good for optimization and portability
Machine Code
simple
7.3.2 - (1)
1. postfix form
Example
Infix            Postfix
a+b              ab+
(a+b)*c          ab+c*
a+b*c            abc*+
a:=b*c+b*d       abc*bd*+:=
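Postfix form is what a post-order walk of the expression tree produces: left operand, right operand, then the operator. A minimal C sketch, assuming the tree has already been built and the caller supplies a large enough output buffer; `Node` and `to_postfix` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Node {
    const char *text;            /* operator or operand, e.g. "+" or "a" */
    struct Node *left, *right;   /* both NULL for an operand */
} Node;

/* Post-order walk: left operand, right operand, then the operator.
   The caller must make sure out is an empty string with enough room. */
void to_postfix(const Node *n, char *out) {
    assert(out != NULL);
    if (n == NULL)
        return;
    to_postfix(n->left, out);
    to_postfix(n->right, out);
    strcat(out, n->text);
}
```

Parentheses never appear in the output because the tree shape already encodes the evaluation order.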
7.3.2 - (2)
2. 3-addr code
Ex: a := b*c + b*d
Triple: op arg1 arg2
intermediate results are referenced by the instruction #
(1) (* b c)
(2) (* b d)
(3) (+ (1) (2))
(4) (:= (3) a)
Quadruple: op arg1 arg2 arg3
intermediate results use temporary names
(1) ( * b c t1 )
(2) ( * b d t2 )
(3) ( + t1 t2 t3 )
(4) ( := t3 a _ )
Example
(I* b c t1)
(FLOAT b t2)
(F* t2 d t3)
(FLOAT t1 t4)
(F+ t4 t3 t5)
( := t5 a)
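Quadruples with temporary names can be generated by a post-order walk that allocates a fresh temporary for every operator node. The C sketch below produces the untyped form, as in ( * b c t1 ), and does not insert the FLOAT conversions of the typed example. The names `Quad`, `emit`, `gen_exp`, and `gen_assign` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct Node {
    const char *text;            /* operator or operand name */
    struct Node *left, *right;   /* both NULL for an operand */
} Node;

typedef struct { char op[4], arg1[16], arg2[16], arg3[16]; } Quad;

static Quad quads[32];           /* generated quadruples, in order */
static int quad_count;
static int temp_count;           /* number of temporaries allocated so far */

void emit(const char *op, const char *a1, const char *a2, const char *a3) {
    Quad *q;
    assert(quad_count < 32);
    q = &quads[quad_count++];
    snprintf(q->op, sizeof q->op, "%s", op);
    snprintf(q->arg1, sizeof q->arg1, "%s", a1);
    snprintf(q->arg2, sizeof q->arg2, "%s", a2);
    snprintf(q->arg3, sizeof q->arg3, "%s", a3);
}

/* Generates code for an expression and writes the name that holds
   its value (a variable or a temporary) into result. */
void gen_exp(const Node *n, char *result, size_t size) {
    char l[16], r[16];
    if (n->left == NULL) {                    /* operand: its own name */
        snprintf(result, size, "%s", n->text);
        return;
    }
    gen_exp(n->left, l, sizeof l);            /* post-order: operands first */
    gen_exp(n->right, r, sizeof r);
    snprintf(result, size, "t%d", ++temp_count);
    emit(n->text, l, r, result);
}

/* target := exp */
void gen_assign(const char *target, const Node *exp) {
    char r[16];
    gen_exp(exp, r, sizeof r);
    emit(":=", r, target, "_");
}
```

For a := b*c + b*d this sketch emits the same four quadruples as the untyped listing for that statement, ending with ( := t3 a _ ).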
7.3.2 - (5)
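3. Sometimes, we use trees or DAG
Ex: a := b*c + b*d
[Figure: tree and DAG for this assignment. In the tree, := has children a and +, and + has two * children with leaves b, c and b, d. In the DAG, the two * nodes share a single b leaf.]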
More generally, we may use AST as IR.
Machine-independent optimization
is implemented as tree transformations.
Ex. Ada uses Diana.
.....