
1

Unit - IV
Chapter 9
Code Generation
(C) 2014, Prepared by Partha Sarathi Chakraborty

Partha Sarathi Chakraborty


Assistant Professor
Department of Computer Science and Engineering
SRM University, Delhi – NCR Campus
2

Outline
• Issues in the Design of Code Generator
• The Target Machine
• Runtime Storage Management
• Basic Blocks and Flow Graphs
• Next-use Information

• DAG representation of Basic Blocks


• Peephole Optimization
• Cross Compiler – T diagrams
• A simple Code Generator
3

Code Generation

The output code must be correct and of high quality, meaning that it should
make effective use of the resources of the target machine.
4

Issues in Code Generation


• Input to the Code Generator
• Target Programs
• Memory Management
• Instruction Selection

• Register Allocation
• Choice of Evaluation Order
• Approaches to Code Generation
5

Target Program Code


• The back-end code generator of a compiler
may generate different forms of code,
depending on the requirements:
– Absolute machine code (executable code)
– Relocatable machine code (object files for
linker)
– Assembly language (facilitates debugging)
– Byte code forms for interpreters (e.g. JVM)
6

The Target Machine


• Implementing code generation requires thorough
understanding of the target machine architecture
and its instruction set
• Our (hypothetical) machine:
– Byte-addressable (word = 4 bytes)
– Has n general purpose registers R0, R1, …, Rn-1
– Two-address instructions of the form

op source, destination
7

The Target Machine: Op-codes and Address Modes

• Op-codes (op), for example
MOV (move content of source to destination)
ADD (add content of source to destination)
SUB (subtract content of source from dest.)
• Address modes
Mode Form Address Added Cost
Absolute M M 1
Register R R 0
Indexed c(R) c+contents(R) 1
Indirect register *R contents(R) 0
Indirect indexed *c(R) contents(c+contents(R)) 1
Literal #c N/A 1
8

Instruction Costs
• Machine is a simple, non-super-scalar processor
with fixed instruction costs
• Realistic machines have deep pipelines, I-cache,
D-cache, etc.
• Define the cost of an instruction as
1 + cost(source-mode) + cost(destination-mode)
(a small sketch of this cost function follows below)
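As an illustration only, here is a minimal Python sketch of this cost function over the addressing modes in the table above; classifying the mode by pattern-matching the operand text is an assumption of the sketch, not part of the model.

def mode_cost(operand):
    """Added cost of one operand, per the addressing-mode table above."""
    if operand.startswith('#'):                    # literal #c
        return 1
    if operand.startswith('*'):                    # indirect register / indirect indexed
        return 1 if '(' in operand else 0
    if '(' in operand:                             # indexed c(R)
        return 1
    if operand.startswith('R') and operand[1:].isdigit():
        return 0                                   # register R
    return 1                                       # absolute memory address M

def instruction_cost(source, destination):
    """cost = 1 + cost(source-mode) + cost(destination-mode)"""
    return 1 + mode_cost(source) + mode_cost(destination)

# Example: instruction_cost('4(R0)', 'M') == 3, matching MOV 4(R0),M on the next slide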
9

Examples
Instruction Operation Cost
MOV R0,R1 Store content(R0) into register R1 1
MOV R0,M Store content(R0) into memory location M 2
MOV M,R0 Store content(M) into register R0 2
MOV 4(R0),M Store contents(4+contents(R0)) into M 3
MOV *4(R0),M Store contents(contents(4+contents(R0))) into M 3
MOV #1,R0 Store 1 into R0 2
ADD 4(R0),*12(R1) Add contents(4+contents(R0)) to contents(12+contents(R1)) 3
10

Instruction Selection
• Instruction selection is important to obtain
efficient code
• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
• Applying the same pattern to a:=a+1 gives
MOV a,R0
ADD #1,R0
MOV R0,a
Cost = 6

Better: ADD #1,a    Cost = 3
Better: INC a       Cost = 2
11

Instruction Selection: Utilizing Addressing Modes

• Suppose we translate a:=b+c into
MOV b,R0
ADD c,R0
MOV R0,a
• Assuming addresses of a, b, and c are stored in
R0, R1, and R2
MOV *R1,*R0
ADD *R2,*R0
• Assuming R1 and R2 contain values of b and c
ADD R2,R1
MOV R1,a
12

Need for Global Machine-Specific Code Optimizations

• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
• Then, we translate
a:=b+c
d:=a+e
to: MOV a,R0
ADD b,R0
MOV R0,a
MOV a,R0      Redundant: a was just stored from R0
ADD e,R0
MOV R0,d
13

Register Allocation and Assignment

• Efficient utilization of the limited set of registers
is important to generate good code
• Registers are assigned by
– Register allocation to select the set of variables that
will reside in registers at a point in the code
– Register assignment to pick the specific register that a
variable will reside in
• Finding an optimal register assignment in general
is NP-complete
14

Example

First sequence:
t:=a+b
t:=t*c
t:=t/d
{ R1=t }
MOV a,R1
ADD b,R1
MUL c,R1
DIV d,R1
MOV R1,t

Second sequence:
t:=a*b
t:=t+a
t:=t/d
{ R0=a, R1=t }
MOV a,R0
MOV R0,R1
MUL b,R1
ADD R0,R1
DIV d,R1
MOV R1,t
15

Choice of Evaluation Order

• When instructions are independent, their
evaluation order can be changed
• Example expression: a+b-(c+d)*e

Original order:
t1:=a+b
t2:=c+d
t3:=e*t2
t4:=t1-t3

Code generated:
MOV a,R0
ADD b,R0
MOV R0,t1
MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4

After reordering:
t2:=c+d
t3:=e*t2
t1:=a+b
t4:=t1-t3

Code generated:
MOV c,R0
ADD d,R0
MOV e,R1
MUL R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
16

Generating Code for Stack Allocation of Activation Records

Caller's three-address code and target code:
t1 := a + b
param t1
param c
t2 := call foo,2
…
100: ADD #16,SP       Push frame
108: MOV a,R0
116: ADD b,R0
124: MOV R0,4(SP)     Store a+b
132: MOV c,8(SP)      Store c
140: MOV #156,*SP     Store return address
148: GOTO 500         Jump to foo
156: MOV 12(SP),R0    Get return value
164: SUB #16,SP       Remove frame
172: …

Callee's three-address code and target code:
func foo
…
return t1
500: …
564: MOV R0,12(SP)    Store return value
572: GOTO *SP         Return to caller

Note: Language and machine dependent
Here we assume a C-like implementation with SP and no FP
17

Basic Block
• A basic block is a sequence of consecutive statements
in which flow of control enters at the beginning and
leaves at the end without halt or possibility of
branching except at the end.
• The following sequence of three-address code forms a
basic block:
t1 = a * a
t2 = a * b
t3 = t1 + t2
18

Basic Blocks

MOV 1,R0
MOV n,R1
JMP L2
L1: MUL 2,R0
SUB 1,R1
L2: JMPNZ R1,L1

This code partitions into three basic blocks: {MOV 1,R0; MOV n,R1; JMP L2},
{L1: MUL 2,R0; SUB 1,R1}, and {L2: JMPNZ R1,L1}
19

Basic Blocks and Control Flow Graphs

• A control flow graph (CFG) is a directed
graph with basic blocks Bi as vertices and
with edges Bi→Bj iff Bj can be executed
immediately after Bi

B1: MOV 1,R0
    MOV n,R1
    JMP L2
B2: L1: MUL 2,R0
    SUB 1,R1
B3: L2: JMPNZ R1,L1

Edges: B1→B3, B2→B3, B3→B2
20

Successor and Predecessor Blocks

• Suppose the CFG has an edge B1→B2
– Basic block B1 is a predecessor of B2
– Basic block B2 is a successor of B1
MOV 1,R0
MOV n,R1
JMP L2

L1: MUL 2,R0


SUB 1,R1

L2: JMPNZ R1,L1


21

Equivalence of Basic Blocks

• Two basic blocks are (semantically)
equivalent if they compute the same set of
expressions

Block 1:            Block 2:
b := 0              a := c * a
t1 := a + b         b := 0
t2 := c * t1
a := t2

Both blocks compute: a := c*a and b := 0
Blocks are equivalent, assuming t1 and t2 are dead: no longer used (no longer live)
22

Partition Algorithm for Basic Blocks

Input: A sequence of three-address statements
Output: A list of basic blocks with each three-address statement
in exactly one block

1. Determine the set of leaders, the first statements of basic blocks
a) The first statement is a leader
b) Any statement that is the target of a goto is a leader
c) Any statement that immediately follows a goto is a leader
2. For each leader, its basic block consists of the leader and all
statements up to but not including the next leader or the end
of the program
(a Python sketch of this partitioning follows below)
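As an illustration only, a minimal Python sketch of this partitioning, assuming each three-address statement is a dict and that a jump (conditional or unconditional) records the index of its target statement in stmt['goto']:

def partition_into_basic_blocks(stmts):
    """Split a list of three-address statements into basic blocks."""
    leaders = {0}                                  # rule (a): first statement
    for i, s in enumerate(stmts):
        if 'goto' in s:
            leaders.add(s['goto'])                 # rule (b): jump target
            if i + 1 < len(stmts):
                leaders.add(i + 1)                 # rule (c): statement after a jump
    order = sorted(leaders)
    blocks = []
    for j, start in enumerate(order):              # rule 2: leader up to next leader
        end = order[j + 1] if j + 1 < len(order) else len(stmts)
        blocks.append(stmts[start:end])
    return blocks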
23
Dot Product of Two Vectors
and Three-Address Code
24

Basic Block
• Transformations on Basic Blocks
• Structure-Preserving Transformations
• Algebraic Transformations
• Representations of Basic Blocks
25

Transformations on Basic Blocks


• A code-improving transformation is a code
optimization to improve speed or reduce code size
• Global transformations are performed across basic
blocks
• Local transformations are only performed on
single basic blocks
• Transformations must be safe and preserve the
meaning of the code
– A local transformation is safe if the transformed basic
block is guaranteed to be equivalent to its original form
26

Common-Subexpression Elimination
• Remove redundant computations

Before:              After:
a := b + c           a := b + c
b := a - d           b := a - d
c := b + c           c := b + c
d := a - d           d := b

Before:              After:
t1 := b * c          t1 := b * c
t2 := a - t1         t2 := a - t1
t3 := b * c
t4 := t2 + t3        t4 := t2 + t1
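A minimal sketch of this transformation as a local pass over one block, assuming statements are (result, op, arg1, arg2) tuples, that temporaries (names starting with 't') are local to the block, and ignoring aliasing:

def local_cse(block):
    """Minimal local common-subexpression elimination over one basic block."""
    available = {}              # (op, arg1, arg2) -> variable holding that value
    replace = {}                # dropped temporary -> equivalent earlier variable
    out = []
    for result, op, a1, a2 in block:
        a1, a2 = replace.get(a1, a1), replace.get(a2, a2)   # use earlier copies
        key = (op, a1, a2)
        prior = available.get(key)
        # redefining result kills expressions that used it or are held in it
        available = {k: v for k, v in available.items()
                     if result not in (k[1], k[2]) and v != result}
        replace.pop(result, None)
        if prior is not None and prior != result:
            if result.startswith('t'):           # a temporary: drop the statement
                replace[result] = prior
            else:                                # a program variable: emit a copy
                out.append((result, 'copy', prior, None))
        else:
            out.append((result, op, a1, a2))
            available[key] = result
    return out

On the two blocks above this reproduces the right-hand versions: d := a - d becomes the copy d := b, and t3 := b * c is dropped with t4 rewritten to use t1.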
27

Dead Code Elimination

• Remove unused statements

Before:              After:
b := a + 1           b := a + 1
a := b + c           …

Assuming a is dead (not used)

• Remove unreachable code

if true goto L2
b := x + y           (unreachable)
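A minimal sketch of removing dead assignments with a backward scan, assuming statements are (result, op, arg1, arg2) tuples with no side effects and that live_out lists the variables still needed after the block:

def eliminate_dead_code(block, live_out):
    """Backward scan that drops assignments whose target is not needed later."""
    live = set(live_out)
    kept = []
    for result, op, a1, a2 in reversed(block):
        if result in live:
            kept.append((result, op, a1, a2))
            live.discard(result)
            live.update(a for a in (a1, a2) if a is not None)
        # otherwise the assignment is dead and is dropped
    return list(reversed(kept))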
28

Renaming Temporary Variables

• Temporary variables that are dead at the end
of a block can be safely renamed

Before:              After (normal-form block):
t1 := b + c          t1 := b + c
t2 := a - t1         t2 := a - t1
t1 := t1 * d         t3 := t1 * d
d := t2 + t1         d := t2 + t3
29

Interchange of Statements
• Independent statements can be reordered

Before:              After:
t1 := b + c          t1 := b + c
t2 := a - t1         t3 := t1 * d
t3 := t1 * d         t2 := a - t1
d := t2 + t3         d := t2 + t3

Note that normal-form blocks permit all
statement interchanges that are possible
30

Algebraic Transformations
• Change arithmetic operations to transform
blocks into algebraically equivalent forms

Before:              After:
t1 := a - a          t1 := 0
t2 := b + t1         t2 := b
t3 := 2 * t2         t3 := t2 << 1
31

Flow Graph
• Flow-of-control information is added to the set of basic blocks
making up a program by constructing a directed graph called a
flow graph.
• The nodes of the flow graph are the basic blocks.
• One node is distinguished as initial; it is the block whose
leader is the first statement. There is a directed edge from
block B1 to block B2 if B2 can immediately follow B1 in some
execution sequence; that is, if
– there is a conditional or unconditional jump from the last statement of
B1 to the first statement of B2, or
– B2 immediately follows B1 in the order of the program, and B1 does not
end in an unconditional jump.
• B1 is a predecessor of B2, and B2 is a successor of B1 (a sketch of
edge construction follows below).
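As an illustration only, a minimal Python sketch of these two edge rules, assuming blocks is the list produced by the partition sketch earlier and that a jump statement records its target statement index in stmt['goto'] and an 'unconditional' flag:

def build_flow_graph(blocks):
    """Connect basic blocks following the two edge rules above."""
    starts, pos = [], 0
    for b in blocks:                       # statement index of each leader
        starts.append(pos)
        pos += len(b)
    block_of_leader = {s: i for i, s in enumerate(starts)}
    edges = set()
    for i, b in enumerate(blocks):
        last = b[-1]
        if 'goto' in last:                                 # jump edge
            edges.add((i, block_of_leader[last['goto']]))
        if not last.get('unconditional') and i + 1 < len(blocks):
            edges.add((i, i + 1))                          # fall-through edge
    return edges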
Representation of Basic Blocks in a Flow Graph
32
33

Loop
• A Loop is a collection of nodes in a flow graph such
that
– All nodes in the collection are strongly connected; that is,
from any node in the loop to any other, there is a path of
length one or more, wholly within the loop, and

– The collection of nodes has a unique entry, that is, a node


in the loop such that the only way to reach a node of the
loop from a node outside the loop is to first go through the
entry.

• A loop that contains no other loops is called an inner


loop.
34

Loops (Example)

B1: MOV 1,R0
    MOV n,R1
    JMP L2

B2: L1: MUL 2,R0
    SUB 1,R1

B3: L2: JMPNZ R1,L1

B4: L3: ADD 2,R2
    SUB 1,R0
    JMPNZ R0,L3

Strongly connected components: SCC = { {B2,B3}, {B4} }
Entries: B3, B4
35

Next-Use
• Next-use information is needed for dead-code
elimination and register assignment
• Next-use is computed by a backward scan of a
basic block, performing the following actions on
each statement
i: x := y op z
– Add liveness/next-use info on x, y, and z to statement i
– Set x to “not live” and “no next use”
– Set y and z to “live” and the next uses of y and z to i
(a Python sketch of this scan follows below)
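A minimal Python sketch of this backward scan, assuming each statement is a dict such as {'i': 'j', 'x': 't', 'y': 'a', 'z': 'b'} and that unseen variables are conservatively assumed live, as noted on the following slides:

def compute_next_use(block):
    """Backward scan attaching liveness/next-use information to each statement."""
    table = {}                              # name -> (is_live, next_use_index)
    for stmt in reversed(block):
        x, y, z = stmt['x'], stmt['y'], stmt['z']
        # attach the information currently known at this statement
        stmt['info'] = {v: table.get(v, (True, None)) for v in (x, y, z)}
        table[x] = (False, None)            # x is overwritten here
        table[y] = (True, stmt['i'])        # y and z are used here
        table[z] = (True, stmt['i'])
    return block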
36

Next-Use (Step 1)

i: a := b + c

j: t := a + b [ live(a) = true, live(b) = true, live(t) = true,


nextuse(a) = none, nextuse(b) = none, nextuse(t) = none ]

Attach current live/next-use information


Because info is empty, assume variables are live
(Data flow analysis Ch.10 can provide accurate information)
37

Next-Use (Step 2)

i: a := b + c live(a) = true nextuse(a) = j


live(b) = true nextuse(b) = j
live(t) = false nextuse(t) = none
j: t := a + b [ live(a) = true, live(b) = true, live(t) = true,
nextuse(a) = none, nextuse(b) = none, nextuse(t) = none ]

Compute live/next-use information at j


38

Next-Use (Step 3)

i: a := b + c [ live(a) = true, live(b) = true, live(c) = false,


nextuse(a) = j, nextuse(b) = j, nextuse(c) = none ]

j: t := a + b [ live(a) = true, live(b) = true, live(t) = true,


nextuse(a) = none, nextuse(b) = none, nextuse(t) = none ]

Attach current live/next-use information to i


39

Next-Use (Step 4)

live(a) = false nextuse(a) = none


live(b) = true nextuse(b) = i
live(c) = true nextuse(c) = i
live(t) = false nextuse(t) = none
i: a := b + c [ live(a) = true, live(b) = true, live(c) = false,
nextuse(a) = j, nextuse(b) = j, nextuse(c) = none ]

j: t := a + b [ live(a) = false, live(b) = false, live(t) = false,


nextuse(a) = none, nextuse(b) = none, nextuse(t) = none ]

Compute live/next-use information at i


40

A Code Generator
• Generates target code for a sequence of three-
address statements using next-use information
• Uses new function getreg to assign registers to
variables
• Computed results are kept in registers as long as
possible, which means:
– Result is needed in another computation
– Register is kept up to a procedure call or end of block
• Checks if operands to three-address code are
available in registers
41

The Code Generation Algorithm


• For each statement x := y op z
1. Set location L = getreg(y, z)
2. If y  L then generate
MOV y’,L
where y’ denotes one of the locations where the value
of y is available (choose register if possible)
3. Generate
OP z’,L
where z’ is one of the locations of z;
Update register/address descriptor of x to include L
4. If y and/or z has no next use and is stored in a register,
update the register descriptors to remove y and/or z
(a Python sketch of this loop follows below)
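As an illustration only, a minimal Python sketch of one step of this loop; the dict-based register/address descriptors, the emit callback, and the best_location helper are assumptions of the sketch, and getreg is assumed to be a function of (y, z) such as the one sketched on the getreg slide below:

def gen_statement(x, op, y, z, reg_desc, addr_desc, next_use, getreg, emit):
    """One step of the loop above for x := y op z. reg_desc: register -> set of
    names it holds; addr_desc: name -> set of its locations (assumed non-empty);
    next_use: name -> next-use index or None; emit: outputs one instruction."""
    y_locs = set(addr_desc[y])          # snapshot: getreg may update y's descriptor
    L = getreg(y, z)                                    # step 1
    if L not in y_locs:                                 # step 2: load y if needed
        emit(f"MOV {best_location(y_locs)},{L}")
    emit(f"{op} {best_location(addr_desc[z])},{L}")     # step 3
    addr_desc[x] = {L}                                  # x now lives (only) in L
    if L.startswith("R"):
        reg_desc[L] = {x}
    for v in (y, z):                                    # step 4: free dead operands
        if next_use.get(v) is None:
            for r in reg_desc:
                reg_desc[r].discard(v)

def best_location(locations):
    """Pick a register location if one is recorded, otherwise any location."""
    regs = [l for l in locations if l.startswith("R") and l[1:].isdigit()]
    return regs[0] if regs else next(iter(locations))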
42

Register and Address Descriptors


• A register descriptor keeps track of what is
currently stored in a register at a particular point in
the code, e.g. a local variable, argument, global
variable, etc.
MOV a,R0 “R0 contains a”
• An address descriptor keeps track of the location
where the current value of the name can be found
at run time, e.g. a register, stack location, memory
address, etc.
MOV a,R0
MOV R0,R1 “a in R0 and R1”
43

The getreg Algorithm


• To compute getreg(y,z)
1. If y is stored in a register R and R only holds the
value y, and y has no next use, then return R;
Update address descriptor: value y no longer in R
2. Else, return a new empty register if available
3. Else, find an occupied register R;
Store contents (register spill) by generating
MOV R,M
for every M in address descriptor of y;
Return register R
4. Return a memory location
(a sketch of getreg follows below)
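A minimal Python sketch of rules 1-3, under the same dict-based descriptors assumed in the code-generation sketch above; rule 4 (returning a memory location) is omitted, and a variable's memory location is taken to be its own name:

def make_getreg(reg_desc, addr_desc, next_use, registers, emit):
    """Build a getreg(y, z) closure over the shared descriptors."""
    def getreg(y, z):
        # rule 1: reuse y's register if it holds only y and y has no next use
        for R in registers:
            if reg_desc.get(R) == {y} and next_use.get(y) is None:
                addr_desc[y].discard(R)      # y's value will be overwritten in R
                return R
        # rule 2: any empty register
        for R in registers:
            if not reg_desc.get(R):
                return R
        # rule 3: spill an occupied register (here simply the first one)
        R = registers[0]
        for v in reg_desc[R]:
            emit(f"MOV {R},{v}")             # store v back to its memory location
            addr_desc[v].discard(R)
            addr_desc[v].add(v)              # a variable's memory slot is its name
        reg_desc[R] = set()
        return R
    return getreg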
44

Code Generation Example


Statements      Code Generated     Register Descriptor     Address Descriptor

                                   Registers empty

t := a - b      MOV a,R0           R0 contains t           t in R0
                SUB b,R0

u := a - c      MOV a,R1           R0 contains t           t in R0
                SUB c,R1           R1 contains u           u in R1

v := t + u      ADD R1,R0          R0 contains v           u in R1
                                   R1 contains u           v in R0

d := v + u      ADD R1,R0          R0 contains d           d in R0
                MOV R0,d                                   d in R0 and memory
45

DAG representation of Basic Blocks


• A useful data structure for implementing transformations on
basic blocks
• Gives a picture of how the value computed by a statement is used
in subsequent statements
• A good way of determining common sub-expressions
• A DAG for a basic block has the following labels on the nodes
– leaves are labeled by unique identifiers, either variable names or
constants
– interior nodes are labeled by an operator symbol
– nodes are also optionally given a sequence of identifiers as labels
(a construction sketch follows below)
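As an illustration only, a minimal Python sketch of DAG construction, assuming statements are (result, op, arg1, arg2) tuples and using a table keyed on (operator, children) to detect common sub-expressions:

def build_dag(block):
    """Minimal DAG construction for one basic block. Nodes are dicts holding an
    operator (or leaf name), child node indices, and attached identifiers."""
    nodes, leaf, current, interned = [], {}, {}, {}

    def node_for(name):
        # current value of a variable if already computed, else a leaf node
        if name in current:
            return current[name]
        if name not in leaf:
            nodes.append({'op': name, 'children': (), 'labels': []})
            leaf[name] = len(nodes) - 1
        return leaf[name]

    for result, op, a1, a2 in block:
        kids = tuple(node_for(a) for a in (a1, a2) if a is not None)
        key = (op, kids)
        if key not in interned:                 # create a new interior node
            nodes.append({'op': op, 'children': kids, 'labels': []})
            interned[key] = len(nodes) - 1
        n = interned[key]                       # an existing node is reused for a CSE
        nodes[n]['labels'].append(result)       # attach the result identifier
        current[result] = n
    return nodes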
46

DAG for Basic Block: Example

[Figure: a three-address code fragment and its DAG]
47

DAG Representation: Example

Three-address code:
1. t1 := 4 * i
2. t2 := a[t1]
3. t3 := 4 * i
4. t4 := b[t3]
5. t5 := t2 * t4
6. t6 := prod + t5
7. prod := t6
8. t7 := i + 1
9. i := t7
10. if i <= 20 goto (1)

[DAG figure: * over leaves 4 and i0, labeled t1, t3; [ ] nodes over a and b
indexed by that *, labeled t2 and t4; * over t2 and t4, labeled t5; + over
prod0 and t5, labeled t6, prod; + over i0 and 1, labeled t7, i; <= over t7, i
and 20, attached to goto (1)]
48

Peephole Optimization
• A simple but effective technique for locally improving the target
code is peephole optimization, which is done by examining a sliding
window of target instructions (called the peephole) and replacing
instruction sequences within the peephole by a shorter or faster
sequence, whenever possible.
• Peephole optimization can also be applied directly after intermediate
code generation to improve the intermediate representation.


• The peephole is a small, sliding window on a program. The code in
the peephole need not be contiguous, although some
implementations do require this.
• It is characteristic of peephole optimization that each improvement
may spawn opportunities for additional improvements.
• In general, repeated passes over the target code are necessary to get
the maximum benefit.
49
Peephole Optimization
• Redundant-instruction elimination
– Redundant loads and stores, e.g. MOV R0,a immediately followed by MOV a,R0
– Unreachable code
• Flow-of-control optimizations
– Elimination of unnecessary jumps, e.g.
goto L1                   goto L2
L1: goto L2     becomes   L1: goto L2
• Algebraic simplifications
e.g. eliminate x = x + 0 and x = x * 1
• Reduction in strength
x = x ^ 2  becomes  x = x * x        x = 2 * x  becomes  x = x + x
• Use of machine idioms
auto-increment, auto-decrement, right-shift, left-shift
(a sketch of a simple peephole rule follows below)
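As an illustration only, a minimal Python sketch of a single peephole rule, the redundant load after a store, assuming instructions are plain strings like 'MOV R0,a' and that the dropped instruction is not a jump target:

import re

def remove_redundant_loads(instructions):
    """Drop a load that immediately re-reads the value just stored,
    e.g. MOV R0,a followed by MOV a,R0."""
    out = []
    for ins in instructions:
        if out:
            prev = re.match(r"MOV (\S+),(\S+)$", out[-1])
            cur = re.match(r"MOV (\S+),(\S+)$", ins)
            if (prev and cur and prev.group(1) == cur.group(2)
                    and prev.group(2) == cur.group(1)):
                continue                          # the reload is redundant
        out.append(ins)
    return out

# Example: remove_redundant_loads(["MOV R0,a", "MOV a,R0"]) == ["MOV R0,a"]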
50

Cross Compiler – T diagrams

• Cross Compiler: A compiler may run on one
machine and produce target code for another
machine.

Third Language for Compiler Construction
• Machine language
– the compiler can execute immediately
• Another language with an existing compiler on the
same target machine (First Scenario)
– Compile the new compiler with the existing compiler
• Another language with an existing compiler on a
different machine (Second Scenario)
– Compilation produces a cross compiler
T-Diagram Describing a Complex Situation
• A compiler written in language H that
translates language S into language T is drawn
as the T-diagram [S → T, written in H].
• T-diagrams can be combined in two basic
ways.
The First T-diagram Combination
[A → B, written in H] followed by [B → C, written in H] gives [A → C, written in H]

• Two compilers run on the same machine H
– First from A to B
– Second from B to C
– Result from A to C on H
The Second T-diagram Combination
[A → B, written in H] translated by [H → K, written in M] gives [A → B, written in K]

• Translate the implementation language of a
compiler from H to K
• Use another compiler from H to K
The First Scenario
[A → H, written in B] compiled by [B → H, written in H] gives [A → H, written in H]

• Translate a compiler from A to H written in B
– Use an existing compiler for language B on
machine H
The Second Scenario
[A → H, written in B] compiled by [B → K, written in K] gives [A → H, written in K]

• Use an existing compiler for language B on a
different machine K
– The result is a cross compiler
Process of Bootstrapping
• Write a compiler in the language it compiles: [S → T, written in S]
• No compiler for the source language exists yet
• Porting to a new host machine
The First Step in Bootstrapping
[A → H, written in A] compiled by [A → H, written in H] gives [A → H, written in H]

• A “quick and dirty” compiler written in
machine language H
• The compiler written in its own language A
• The result is a running but inefficient compiler
The Second Step in Bootstrapping
[A → H, written in A] compiled by the running but inefficient [A → H, written in H] gives the final [A → H, written in H]

• The running but inefficient compiler
• The compiler written in its own language A
• The result is the final version of the compiler
