Chapter 2 - Simple Syntax Directed Translator

Uploaded by

Abdullah
Copyright © All Rights Reserved

1

A Simple Syntax-Directed Translator

Chapter 2
2

Building a Simple Compiler


• Building our compiler involves:
– Defining the syntax of a programming language
– Developing a source code parser: for our compiler we will use predictive parsing
– Implementing syntax-directed translation to generate intermediate code
– Generating the intermediate code
– Optimizing the intermediate code
– Generating target code
– Optimizing the code
3

The Structure of our Compiler


4

Syntax Definition
• A context-free grammar is a 4-tuple with
– A set of tokens (terminal symbols)
– A set of nonterminals
– A set of productions
– A designated start symbol
5

Example Grammar

Context-free grammar for simple expressions:

G = <{list,digit}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, list>

with productions P =
list → list + digit

list → list - digit

list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
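
The 4-tuple can be written down directly as a data structure. The following is a minimal sketch in Python; the names `Grammar`, `G` and `is_well_formed` are illustrative assumptions and do not come from the slides.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple      # pairs (head, body); body is a tuple of symbols
    start: str

digits = [str(d) for d in range(10)]

# The example grammar G = <{list, digit}, {+, -, 0..9}, P, list>.
G = Grammar(
    nonterminals=frozenset({"list", "digit"}),
    terminals=frozenset({"+", "-", *digits}),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
        *(("digit", (d,)) for d in digits),
    ),
    start="list",
)

def is_well_formed(g):
    """Check that heads are nonterminals and every body symbol is a known symbol."""
    symbols = g.nonterminals | g.terminals
    return (g.start in g.nonterminals
            and not (g.nonterminals & g.terminals)
            and all(head in g.nonterminals and set(body) <= symbols
                    for head, body in g.productions))

print(len(G.productions), is_well_formed(G))
```

The check only confirms that the four components fit together; it says nothing about which strings the grammar generates.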
6

Derivation
• Given a CF grammar we can determine the
set of all strings (sequences of tokens)
generated by the grammar using derivation
– We begin with the start symbol
– In each step, we replace one nonterminal in the
current sentential form with one of the right-
hand sides of a production for that nonterminal
7

Derivation for the Example Grammar

list
⇒ list + digit
⇒ list - digit + digit
⇒ digit - digit + digit
⇒ 9 - digit + digit
⇒ 9 - 5 + digit
⇒ 9 - 5 + 2

This is an example of a leftmost derivation, because we replace the
leftmost nonterminal in each step.
Likewise, a rightmost derivation replaces the rightmost
nonterminal in each step.
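
The leftmost derivation of 9-5+2 can be replayed mechanically. This is a small sketch, assuming sentential forms are stored as Python lists of symbols; `PRODUCTIONS`, `leftmost_step` and `derive` are hypothetical helper names.

```python
# Productions of the example grammar G; the dict keys are the nonterminals.
PRODUCTIONS = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[str(d)] for d in range(10)],
}

def leftmost_step(form, body):
    """Replace the leftmost nonterminal in `form` with the production body `body`."""
    for i, symbol in enumerate(form):
        if symbol in PRODUCTIONS:
            if body not in PRODUCTIONS[symbol]:
                raise ValueError("not a production of " + symbol)
            return form[:i] + body + form[i + 1:]
    raise ValueError("no nonterminal left to replace")

def derive(start, bodies):
    """Apply the chosen bodies in order and record every sentential form."""
    forms = [[start]]
    for body in bodies:
        forms.append(leftmost_step(forms[-1], body))
    return forms

steps = derive("list", [
    ["list", "+", "digit"],
    ["list", "-", "digit"],
    ["digit"],
    ["9"], ["5"], ["2"],
])
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form, ending with the token string 9 - 5 + 2.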
8

Parse Trees
• The root of the tree is labeled by the start symbol
• Each leaf of the tree is labeled by a terminal
(=token) or ε
• Each interior node is labeled by a nonterminal
• If A → X1 X2 … Xn is a production, then node A has
immediate children X1, X2, …, Xn where Xi is a
(non)terminal or ε (ε denotes the empty string)
9

Parse Tree for the Example Grammar

Parse tree of the string 9-5+2 using grammar G:

list
├─ list
│  ├─ list
│  │  └─ digit
│  │     └─ 9
│  ├─ -
│  └─ digit
│     └─ 5
├─ +
└─ digit
   └─ 2

The sequence of leaves, read from left to right, is called the
yield of the parse tree.
10

Ambiguity

Consider the following context-free grammar:

G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>

with production P =
string → string + string | string - string | 0 | 1 | … | 9
This grammar is ambiguous, because more than one parse tree
represents the string 9-5+2
11

Ambiguity (cont’d)

[Figure: two parse trees for 9-5+2. One groups the string as (9-5)+2, the other as 9-(5+2).]
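
The ambiguity can be checked by counting parse trees. The sketch below assumes single-character tokens and counts the trees that the grammar string → string + string | string - string | 0 | … | 9 allows for a given input; `count_parse_trees` is a hypothetical helper, not part of the slides.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees under string -> string + string | string - string | digit."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways to derive tokens[lo:hi] from the nonterminal `string`.
        if hi - lo == 1:
            return 1 if tokens[lo].isdigit() else 0
        total = 0
        for i in range(lo + 1, hi - 1):
            if tokens[i] in ("+", "-"):
                # Use the operator at position i as the root of the tree.
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("9-5+2"))
```

For 9-5+2 the count is 2, matching the two trees on the slide. Any count above 1 means the string has more than one parse tree, which is what makes the grammar ambiguous.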
12

Associativity of Operators

Left-associative operators have left-recursive productions

left → left + term | term

String a+b+c has the same meaning as (a+b)+c

Right-associative operators have right-recursive productions

right → term = right | term

String a=b=c has the same meaning as a=(b=c)
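
The difference between the two groupings matters whenever the operator is not associative. As a rough illustration (subtraction is used here only because it makes the difference visible; `eval_left` and `eval_right` are hypothetical names):

```python
import operator
from functools import reduce

def eval_left(operands, op):
    """Group as ((a op b) op c), the way a left-recursive production does."""
    return reduce(op, operands)

def eval_right(operands, op):
    """Group as a op (b op c), the way a right-recursive production does."""
    result = operands[-1]
    for value in reversed(operands[:-1]):
        result = op(value, result)
    return result

print(eval_left([9, 5, 2], operator.sub))   # (9-5)-2
print(eval_right([9, 5, 2], operator.sub))  # 9-(5-2)
```

The first call yields 2 and the second yields 6, so the choice of recursion direction in the grammar changes the meaning of the string.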


13

Precedence of Operators
Operators with higher precedence “bind more tightly”
expr → expr + term | term
term → term * factor | factor
factor → number | ( expr )
String 2+3*5 has the same meaning as 2+(3*5)
expr
├─ expr
│  └─ term
│     └─ factor
│        └─ number
│           └─ 2
├─ +
└─ term
   ├─ term
   │  └─ factor
   │     └─ number
   │        └─ 3
   ├─ *
   └─ factor
      └─ number
         └─ 5
14

Syntax of Statements

stmt → id := expr
     | if expr then stmt
     | if expr then stmt else stmt
     | while expr do stmt
     | begin opt_stmts end
opt_stmts → stmt ; opt_stmts
          | ε
15

Syntax-Directed Translation
• Uses a CF grammar to specify the syntactic
structure of the language
• AND associates a set of attributes with the
terminals and nonterminals of the grammar
• AND associates with each production a set of
semantic rules to compute values of attributes
• A parse tree is traversed and semantic rules
applied: after the tree traversal(s) are completed,
the attribute values on the nonterminals contain
the translated form of the input
16

Synthesized and Inherited Attributes
• An attribute is said to be …
– synthesized if its value at a parse-tree node is
determined from the attribute values at the
children of the node
– inherited if its value at a parse-tree node is
determined by the parent (by enforcing the
parent’s semantic rules)
17

Example Attribute Grammar

(“//” is the string concatenation operator)

Production              Semantic Rule
expr → expr1 + term     expr.t := expr1.t // term.t // “+”
expr → expr1 - term     expr.t := expr1.t // term.t // “-”
expr → term             expr.t := term.t
term → 0                term.t := “0”
term → 1                term.t := “1”
…                       …
term → 9                term.t := “9”
18

Example Annotated Parse Tree


19

Depth-First Traversals

procedure visit(n : node);
begin
    for each child m of n, from left to right do
        visit(m);
    evaluate semantic rules at node n
end
20

Depth-First Traversals (Example)

expr.t = “95-2+”
├─ expr.t = “95-”
│  ├─ expr.t = “9”
│  │  └─ term.t = “9”
│  │     └─ 9
│  ├─ -
│  └─ term.t = “5”
│     └─ 5
├─ +
└─ term.t = “2”
   └─ 2

Note: all attributes are of the synthesized type.
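
The traversal and the semantic rules can be combined in a few lines. The following is a sketch only: the `Node` class and the way the tree for 9-5+2 is built by hand are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                   # nonterminal name or token text
    children: list = field(default_factory=list)
    t: str = ""                                  # synthesized attribute

def visit(n):
    """Visit the children left to right, then evaluate the semantic rule at n."""
    for m in n.children:
        visit(m)
    if not n.children:                  # leaf: a token carries its own text
        n.t = n.label
    elif len(n.children) == 3:          # expr -> expr1 op term
        expr1, op, term = n.children
        n.t = expr1.t + term.t + op.label
    else:                               # expr -> term, term -> digit
        n.t = n.children[0].t

def term(d):
    return Node("term", [Node(d)])

# Parse tree of 9-5+2 under the left-recursive expr grammar.
inner = Node("expr", [Node("expr", [term("9")]), Node("-"), term("5")])
root = Node("expr", [inner, Node("+"), term("2")])
visit(root)
print(root.t)
```

Because every attribute is synthesized, a single post-order pass is enough: each node's value depends only on values already computed at its children.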
21

Translation Schemes
• A translation scheme is a CF grammar
embedded with semantic actions

rest → + term { print(“+”) } rest

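The action { print(“+”) } is an embedded semantic action. In the parse
tree it appears as an extra child of rest, between term and the second rest:

rest
├─ +
├─ term
├─ { print(“+”) }
└─ rest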


22

Example Translation Scheme

expr → expr + term { print(“+”) }


expr → expr - term { print(“-”) }
expr → term
term → 0 { print(“0”) }
term → 1 { print(“1”) }
… …
term → 9 { print(“9”) }
23

Example Translation Scheme (cont’d)

expr
├─ expr
│  ├─ expr
│  │  └─ term
│  │     ├─ 9
│  │     └─ { print(“9”) }
│  ├─ -
│  ├─ term
│  │  ├─ 5
│  │  └─ { print(“5”) }
│  └─ { print(“-”) }
├─ +
├─ term
│  ├─ 2
│  └─ { print(“2”) }
└─ { print(“+”) }

Translates 9-5+2 into postfix 95-2+
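
One way to see why the actions fire in postfix order is to treat them as extra leaves and execute them during a left-to-right depth-first walk. This is a sketch under that assumption; the tuple-based tree encoding and the helpers `run_actions`, `emit` and `term` are hypothetical.

```python
def run_actions(node, out):
    """Depth-first, left-to-right walk that runs action leaves as they are reached."""
    if callable(node):                 # an embedded semantic action
        node(out)
    elif isinstance(node, tuple):      # interior node: (label, child, child, ...)
        for child in node[1:]:
            run_actions(child, out)
    # plain strings are tokens and produce no output by themselves

def emit(text):
    """Build an action that appends `text` to the output, standing in for print."""
    return lambda out: out.append(text)

def term(d):
    return ("term", d, emit(d))        # term -> d { print(d) }

# Parse tree of 9-5+2 with the actions attached as rightmost children.
tree = ("expr",
        ("expr", ("expr", term("9")), "-", term("5"), emit("-")),
        "+", term("2"), emit("+"))

output = []
run_actions(tree, output)
print("".join(output))
```

Collecting the output in a list instead of printing directly keeps the example easy to check; the order of the appended strings is the order in which the print actions would run.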
24

Parsing Problem
The parsing problem: given a string of symbols (tokens) in a language
and a grammar for that language, construct the parse tree or report that
the sentence is syntactically incorrect.
For correct strings:
Sentence + grammar → parse tree
For a compiler, a sentence is a program:
Program + grammar → parse tree
Types of parsers:
• Top-down parsing (for example, predictive recursive-descent parsing)
• Bottom-up parsing
We will focus on top-down parsing for now.
25

Top Down Parsing


Recursive-descent parsing uses recursive procedures to model the parse
tree to be constructed. The parse tree is built from the top down, following
a leftmost derivation.
Beginning with the start symbol, we write one procedure for each nonterminal
(syntactic class) in the grammar; that procedure parses the syntactic class.
Consider the expression grammar:
E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id
The following procedures can parse strings in this language top-down:
26

Recursive Descent
Procedure E
begin { E }
    call T
    call E’
    print (“ E found ”)
end { E }

Procedure E’
begin { E’ }
    if token = “+” then
    begin { IF }
        print (“ + found ”)
        Get next token
        call T
        call E’
    end { IF }
    print (“ E’ found ”)
end { E’ }

Procedure T
begin { T }
    call F
    call T’
    print (“ T found ”)
end { T }

Procedure T’
begin { T’ }
    if token = “*” then
    begin { IF }
        print (“ * found ”)
        Get next token
        call F
        call T’
    end { IF }
    print (“ T’ found ”)
end { T’ }

Procedure F
begin { F }
    case token is
        “(”:
            print (“ ( found ”)
            Get next token
            call E
            if token = “)” then
            begin { IF }
                print (“ ) found ”)
                Get next token
                print (“ F found ”)
            end { IF }
            else
                call ERROR
        “id”:
            print (“ id found ”)
            Get next token
            print (“ F found ”)
        otherwise:
            call ERROR
end { F }
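
The five procedures translate almost line for line into a modern language. Below is a minimal Python sketch of a recognizer for the same grammar. The `Parser` class, the `accepts` helper and the `"$"` end-of-input marker are assumptions made for this example, and the tracing print statements are left out.

```python
class ParseError(Exception):
    pass

class Parser:
    """Recognizer for E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def E(self):
        self.T()
        self.E_prime()

    def E_prime(self):
        if self.token == "+":
            self.advance()
            self.T()
            self.E_prime()
        # otherwise take the ε-production and consume nothing

    def T(self):
        self.F()
        self.T_prime()

    def T_prime(self):
        if self.token == "*":
            self.advance()
            self.F()
            self.T_prime()

    def F(self):
        if self.token == "(":
            self.advance()
            self.E()
            if self.token != ")":
                raise ParseError("expected )")
            self.advance()
        elif self.token == "id":
            self.advance()
        else:
            raise ParseError("unexpected token " + self.token)

def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
    except ParseError:
        return False
    return parser.token == "$"               # all input must be consumed

print(accepts(["id", "+", "id", "*", "id"]))
print(accepts(["id", "+"]))
```

Each method decides which production to use by looking only at the current token, which is what makes the parser predictive.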
27

Left Recursion & Top-Down


Ambiguity is not the only problem for recursive-descent parsing.
Two other issues to be aware of are left recursion and the need for left factoring.

Left recursion: a grammar is left recursive if it has a nonterminal A such that
there is a derivation A ⇒+ A α (one or more steps) for some string α.

A is left-recursive if the leftmost symbol of one of its alternatives rewrites to a
string beginning with A, either immediately (direct left recursion) or through other
nonterminal definitions (indirect or hidden left recursion).

Top-down parsing methods cannot handle left-recursive grammars,
so a transformation is needed to eliminate left recursion.
28

Prediction and Left Recursion


Immediate left recursion: A → A α
E.g., Expr → Expr + Term
Top-down parser implementation:
function expr() {
    expr(); match(‘+’); term();
}

Do you see the problem? expr() calls itself before consuming any input,
so the recursion never terminates.

Indirect left recursion: A → B a | C
                         B → A b | D
A ⇒ B a ⇒ A b a
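
The non-termination is easy to reproduce. The sketch below is a deliberately broken Python version of the procedure for Expr → Expr + Term; the token list and the `outcome` variable are assumptions for the demonstration, and Python's recursion limit stands in for "runs forever".

```python
def expr(tokens, pos=0):
    """Naive procedure for Expr -> Expr + Term: it recurses before consuming anything."""
    pos = expr(tokens, pos)      # same tokens, same position: no progress is made
    if tokens[pos] != "+":
        raise SyntaxError("expected +")
    return pos + 1               # Term is omitted; this line is never reached

try:
    expr(["1", "+", "2"])
    outcome = "parsed"
except RecursionError:
    outcome = "infinite recursion"
print(outcome)
```

The interpreter stops the call chain with a RecursionError because every call to `expr` starts by calling `expr` again at the same input position.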
29

Removing Left Recursion


• Left recursion is eliminated by converting the grammar into a right-recursive
grammar.

• Suppose we have the left-recursive pair of productions
    A → A α | β    (left-recursive grammar)
  where β does not begin with A.

• Then we can eliminate the left recursion by replacing the pair of productions
  with
    A → β A’
    A’ → α A’ | ε    (right-recursive grammar)

• The right-recursive grammar generates the same language as the left-recursive grammar.
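
The rewriting rule is mechanical enough to automate. The following sketch handles immediate left recursion only and assumes each alternative is a Python list of symbols, with the empty list standing for ε; the function name and the convention of appending an apostrophe for the new nonterminal are illustrative choices.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε (immediate left recursion only)."""
    # Bodies that start with A contribute their tail α; all others are the β alternatives.
    alphas = [alt[1:] for alt in alternatives if alt[:1] == [nonterminal]]
    betas = [alt for alt in alternatives if alt[:1] != [nonterminal]]
    if not alphas:
        return {nonterminal: alternatives}    # nothing to do
    new = nonterminal + "'"                   # name of the fresh nonterminal
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # [] stands for ε
    }

print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

Applied to E → E + T | T this produces E → T E' and E' → + T E' | ε. Indirect left recursion is not covered by this sketch.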
30

Right Recursive Expressions


• Consider the following grammar and eliminate left recursion:
    E → E + T | T
    T → T * F | F
    F → id

• Solution: the grammar after eliminating left recursion is
    E → T E’
    E’ → + T E’ | ε
    T → F T’
    T’ → * F T’ | ε
    F → id
31

Syntax Directed Left Rec


Syntax-directed translation adds semantic rules that are carried
out when syntactic rules are applied. As an example, we convert
infix to postfix.
Expr → Expr + Term {out(“ + ”);}
     | Expr - Term {out(“ - ”);}
     | Term
Term → Term * Factor {out(“ * ”);}
     | Term / Factor {out(“ / ”);}
     | Factor
Factor → ( Expr )
       | int {out(“ ”, int.val, “ ”);}
32

How It Works
Examples of applying previous syntax
directed translation

Input: 15 - 20 + 7 * 3 / 2
Output: 15 20 - 7 3 * 2 / +

Input: 15 - 20 - 7 + 3 * 2
Output: 15 20 - 7 - 3 2 * +
33

Direct Placement of Actions


Expr → Term ExprRest
ExprRest → + Term ExprRest {out(“ + ”);}
         | - Term ExprRest {out(“ - ”);}
         | ε
Term → Factor TermRest
TermRest → * Factor TermRest {out(“ * ”);}
         | / Factor TermRest {out(“ / ”);}
         | ε
Factor → ( Expr )
       | int {out(“ ”, int.val, “ ”);}
34

Problems Galore
Examples of applying previous syntax
directed translation

Input: 15 - 20 + 7 * 3 / 2
Output: 15 20 7 3 2 / * + - (In error)

Input: 15 - 20 - 7 + 3 * 2
Output: 15 20 7 3 2 * + - - (In error)
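
The erroneous output can be reproduced by running a parser whose actions sit at the end of each production. This is a sketch only: it assumes space-separated tokens, uses `"$"` as an end marker, skips error handling, and the name `translate_actions_last` is made up for the example.

```python
def translate_actions_last(text):
    """Infix to postfix with each action placed at the END of its production (the faulty scheme)."""
    tokens = text.split() + ["$"]       # assumes space-separated tokens
    pos = 0
    out = []

    def factor():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            expr()
            pos += 1                    # skip ")" (no error handling in this sketch)
        else:
            out.append(tokens[pos])     # int {out(int.val)}
            pos += 1

    def rest(operators, operand):
        # XRest -> op X XRest {out(op)} | ε
        nonlocal pos
        if tokens[pos] in operators:
            op = tokens[pos]
            pos += 1
            operand()
            rest(operators, operand)
            out.append(op)              # runs only after the recursive call returns

    def term():
        factor()
        rest(("*", "/"), factor)

    def expr():
        term()
        rest(("+", "-"), term)

    expr()
    return " ".join(out)

print(translate_actions_last("15 - 20 + 7 * 3 / 2"))
```

The operator is emitted only after the rest of the expression has been processed, so operators pile up at the end in reverse order, which is the behaviour shown in the two examples.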
35

Treat Actions as Terminals


Expr → Term ExprRest
ExprRest → + Term {out(“ + ”);} ExprRest
         | - Term {out(“ - ”);} ExprRest
         | ε
Term → Factor TermRest
TermRest → * Factor {out(“ * ”);} TermRest
         | / Factor {out(“ / ”);} TermRest
         | ε
Factor → ( Expr )
       | int {out(“ ”, int.val, “ ”);}
36

Top Down Parsing


Recursive-descent parsing uses recursive procedures to model the parse
tree to be constructed. The parse tree is built from the top down, following
a leftmost derivation.
Beginning with the start symbol, we write one procedure for each nonterminal
(syntactic class) in the grammar; that procedure parses the syntactic class.
Consider the expression grammar
G = ({E, E’, T, T’, F}, {+, -, *, /, (, ), id}, P, E) with productions P:
E → T E’
E’ → + T E’ | - T E’ | ε
T → F T’
T’ → * F T’ | / F T’ | ε
F → ( E ) | id
The following procedures have to be written:
37

Recursive Descent
Procedure E
begin { E }
    call T
    call E’
end { E }

Procedure E’
begin { E’ }
    if token = “+” then
    begin { addition }
        nextsy()
        call T
        out(“ + ”)
        call E’
    end { addition }
    if token = “-” then
    begin { subtraction }
        nextsy()
        call T
        out(“ - ”)
        call E’
    end { subtraction }
end { E’ }

Procedure T
begin { T }
    call F
    call T’
end { T }

Procedure T’
begin { T’ }
    if token = “*” then
    begin { multiply }
        nextsy()
        call F
        out(“ * ”)
        call T’
    end { multiply }
    if token = “/” then
    begin { divide }
        nextsy()
        call F
        out(“ / ”)
        call T’
    end { divide }
end { T’ }

Procedure F
begin { F }
    case token is
        “(”:
            nextsy()
            call E
            if token = “)” then
                nextsy()
            else
                ERROR()
        “id”:
            out( id.val )
            nextsy()
        otherwise:
            ERROR()
end { F }
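
For comparison with the pseudocode, here is a compact Python sketch of the same translator. It assumes space-separated input, uses integer literals where the grammar says id, and adds a `"$"` end marker; `infix_to_postfix` and `E_prime`/`T_prime` are names chosen for this example.

```python
def infix_to_postfix(text):
    """Recursive-descent infix-to-postfix translator with actions between operand and rest."""
    tokens = text.split() + ["$"]       # assumes space-separated tokens
    pos = 0
    out = []

    def nextsy():
        nonlocal pos
        pos += 1

    def F():
        if tokens[pos] == "(":
            nextsy()
            E()
            if tokens[pos] != ")":
                raise SyntaxError("expected )")
            nextsy()
        elif tokens[pos].isdigit():
            out.append(tokens[pos])     # out(id.val)
            nextsy()
        else:
            raise SyntaxError("unexpected token " + tokens[pos])

    def T_prime():
        if tokens[pos] in ("*", "/"):
            op = tokens[pos]
            nextsy()
            F()
            out.append(op)              # action sits between F and T'
            T_prime()

    def T():
        F()
        T_prime()

    def E_prime():
        if tokens[pos] in ("+", "-"):
            op = tokens[pos]
            nextsy()
            T()
            out.append(op)              # action sits between T and E'
            E_prime()

    def E():
        T()
        E_prime()

    E()
    if tokens[pos] != "$":
        raise SyntaxError("trailing input")
    return " ".join(out)

print(infix_to_postfix("15 - 20 + 7 * 3 / 2"))
```

Emitting the operator right after its right operand, and before the recursive call for the rest, is what restores the left-to-right grouping of the original left-recursive grammar.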
38

Process
• Write a left-recursive grammar with semantic
actions.
• Rewrite it as a right-recursive grammar, treating the actions
as terminals of the original rules.
• Develop a recursive-descent parser.
39

Left Factoring
When we have rules like
A → α β | α γ
a predictive parser cannot tell which rule to choose, because both
alternatives begin with α.
Factor as
A → α X
X → β | γ
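
Left factoring of two alternatives amounts to pulling out their longest common prefix. The sketch below shows this for a single pair; the function `left_factor` and the `_rest` suffix for the new nonterminal are hypothetical, and the test input reuses the two if-statement productions from the statement grammar.

```python
def left_factor(nonterminal, alt1, alt2):
    """Factor A -> αβ | αγ into A -> α X, X -> β | γ, with α the longest common prefix."""
    n = 0
    while n < min(len(alt1), len(alt2)) and alt1[n] == alt2[n]:
        n += 1
    if n == 0:
        return {nonterminal: [alt1, alt2]}      # no common prefix, nothing to factor
    new = nonterminal + "_rest"                 # hypothetical name for the new nonterminal X
    return {
        nonterminal: [alt1[:n] + [new]],
        new: [alt1[n:], alt2[n:]],              # [] stands for ε
    }

factored = left_factor(
    "stmt",
    ["if", "expr", "then", "stmt"],
    ["if", "expr", "then", "stmt", "else", "stmt"],
)
print(factored)
```

After factoring, the parser reads the shared prefix first and only then decides between the remaining alternatives by looking at the next token.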
