
UNIT 2 PARSING

• Context Free Grammar


• Derivation
• CFG vs R.E.
• Types of Parser
• Bottom up: Shift Reduce Parsing, Operator Precedence
Parsing, SLR parser
• Top Down: Recursive Descent Parser - Non-Recursive
Descent Parser.

May 10, 2024 SCS1303-Compiler Design 1


Syntax Analysis
• Every programming language has rules that prescribe the
syntactic structure of well-formed programs.
• For example, in Pascal,
– a program is made out of blocks,
– a block out of statements,
– a statement out of expressions,
– an expression out of tokens, and so on.
• The syntactic or the structural correctness of a program is
checked during the syntax analysis phase of compilation.
• The structural properties of language constructs can be specified
in different ways.



Different Formalisms
• Syntax diagram (SD),
• Backus-Naur form (BNF), or
• Context-free grammar (CFG).

Different styles of specification serve different purposes.

• SD is good for human understanding and visualization.
• BNF is very compact. It is used for theoretical analysis and also in automatic parser-generating software.

The syntactic specification of a programming language, written as a context-free grammar, can be used to construct its parser by synthesizing a push-down automaton (PDA).
e.g.
Declaration statement
int a,b,c;
Grammars offer significant advantages to both language
designers and compiler writers.

• A grammar gives a precise, easy-to-understand syntactic specification of a programming language.

• From certain classes of grammars, an efficient parser can be constructed automatically that determines whether a source program is syntactically well formed.

• A properly designed grammar imparts a structure to a programming language that is useful for translating source code into correct object code and for detecting errors.


Role of Parser
• The parser obtains a string of tokens from the lexical analyzer
and verifies that the string can be generated by the grammar.
• The parser should report any syntax errors.
• It should recover from errors so as to continue processing the
remaining inputs.

[Figure: source program → Lexical Analyzer → (token) → Parser → (parse tree) → Rest of front end → intermediate representation. The parser requests each token with "get next token"; both the lexical analyzer and the parser use the Symbol Table.]


Role of Parser (contd..)
Three types of parsers for grammars:

1. Universal parsing methods
i. Cocke-Younger-Kasami algorithm
ii. Earley’s algorithm
2. Top-down parsing
• Builds parse trees from the top (root) to the bottom (leaves).
3. Bottom-up parsing
• Builds parse trees from the bottom (leaves) and works up to the root.


Role of Parser (contd..)
• Assume that the output of the parser is some representation of
the parse tree for the stream of tokens produced by lexical
analyzer.

• Other major tasks:
– Collecting information about various tokens into the symbol table.
– Performing type checking and other semantic analysis.
– Generating intermediate code.


Syntax Error Handling
• Programs can contain errors at different levels. For example, errors can be
– Lexical, such as misspelling a keyword, identifier, or operator.
– Syntactic, such as an arithmetic expression with unbalanced parentheses.
– Semantic, such as an operator applied to an incompatible operand.
– Logical, such as an infinitely recursive call.

• Error detection and recovery in a compiler is centered around the syntax analysis phase.


Syntax Error Handling
• The error handler in a parser has simple-to-state goals:
– It should report the presence of errors clearly and accurately.
– It should recover from each error quickly enough to be able to detect subsequent errors.
– It should not significantly slow down the processing of correct programs.


Error-recovery strategies
• Panic mode
– Discard input symbols one at a time until one of a designated set of synchronizing tokens is found.

• Phrase level
– Replace a prefix of the remaining input by some string that allows the parser to continue.

• Error productions
– Augment the grammar with productions that generate the erroneous constructs.

• Global correction
– Choose a minimal sequence of changes to obtain a globally least-cost correction.




Context-Free Grammar(CFG)
Many programming language constructs have a recursive structure that can be defined by a CFG,
e.g. a conditional statement defined by the rule:
If S1 and S2 are statements and E is an expression, then
“if E then S1 else S2” is a statement (1)
This form of conditional statement cannot be specified using regular expressions.
Grammar specifications:
Using the syntactic variables
stmt to denote the class of statements, and
expr to denote the class of expressions,
we can express (1) as the grammar rule
stmt → if expr then stmt else stmt
Context-Free Grammar(CFG)
• A CFG consists of terminals, non-terminals, a start symbol, and productions.
Terminals:
– Basic symbols from which strings are formed, i.e. tokens are terminals.
Non-terminals:
– Syntactic variables that denote sets of strings.
– Help to define the language generated by the grammar.
Start symbol:
– One non-terminal whose set of strings is the language defined by the grammar.
Productions:
– Specify the manner in which the terminals and non-terminals can be combined to form strings.
– Each production consists of a non-terminal, followed by an arrow (→), followed by a string of non-terminals and terminals.
Example
Consider the grammar for simple arithmetic expressions:
expr → expr op expr
expr → ( expr )
expr → - expr
expr → id
op → +
op → -
op → *
op → /
op → ^
Terminals : id + - * / ^ ( )
Non-terminals : expr , op
Start symbol : expr
Notational Conventions
1. These symbols are terminals:
i. Lower-case letters early in the alphabet such as a, b, c.
ii. Operator symbols such as +, -, etc.
iii. Punctuation symbols such as parentheses, comma etc.
iv. Digits 0,1,…,9.
v. Boldface strings such as id or if (keywords)

2. These symbols are non-terminals:
i. Upper-case letters early in the alphabet such as A, B, C.
ii. The letter S, which, when it appears, is usually the start symbol.
iii. Lower-case italic names such as expr or stmt.


Notational Conventions(contd..)
3. Upper-case letters late in the alphabet, such as X, Y, Z, represent grammar symbols, that is, either terminals or non-terminals.

4. Greek letters α, β, γ represent strings of grammar symbols, e.g. a generic production could be written as A → α.

5. If A → α1, A → α2, ..., A → αn are all the productions with A on the left, then we can write A → α1 | α2 | ... | αn (the alternatives for A).

6. Unless otherwise stated, the left side of the first production is the start symbol.


Using the shorthand, the grammar can be written as:

E → E A E | ( E ) | - E | id
A → + | - | * | / | ^


CFL vs RL
• Any language that can be generated by a context free grammar
is a context free language (CFL).
• RL ⊂ CFL: every regular language can be generated by a CFG, but not every context-free language is regular.
Derivations
• A derivation of a string for a grammar is a sequence of grammar
rule applications that transform the start symbol into the string.
A derivation proves that the string belongs to the grammar's
language.



• To create a string from a context-free grammar:
– Begin the string with the start symbol.
– Apply one of the production rules that has the start symbol on its left-hand side, replacing the start symbol with the right-hand side of the production.
– Repeat the process of selecting a non-terminal in the string and replacing it with the right-hand side of some corresponding production, until all non-terminals have been replaced by terminal symbols.


Example



CFG: Parsing
Parser:
– A program that determines whether a string w ∈ L(G) by constructing a derivation, where G is the grammar.

Top-down parsers:
– Construct the derivation tree from the root to the leaves.
– Trace out a leftmost derivation.
Bottom-up parsers:
– Construct the derivation tree from the leaves to the root.
– Trace out a rightmost derivation in reverse.


Next Topic:
Derivations and Parse tree



Derivations
• A production is treated as a rewriting rule in which the non-terminal on the left is replaced by the string on the right side of the production.
• e.g. Consider the grammar for arithmetic expressions
E → E + E
E → E * E
E → ( E )
E → - E
E → id
or, using the shorthand,
E → E + E | E * E | ( E ) | - E | id


Consider the grammar G,
E → E + E | E * E | ( E ) | - E | id

Derive the string id + id * id

E => E + E
  => id + E
  => id + E * E
  => id + id * E
  => id + id * id

Each step of the derivation is a sentential form. The final string, id + id * id, contains only terminals and is a sentence.
Context Free Language
Given a grammar G with start symbol S, L(G) denotes the language generated by G.

Notation: =>+ means "derives in one or more steps" and =>* means "derives in zero or more steps".

The strings in L(G) contain only terminals.

A string of terminals w is in L(G) if and only if S =>+ w, where w is called a sentence of G.

If S =>* α, where α may contain non-terminals, then we say that α is a sentential form of G.
A sentence is a sentential form with no non-terminals.


Derivations
A derivation is a sequence of production-rule applications that produces the input string. During parsing, two decisions are taken for each sentential form:
– Which non-terminal is to be replaced.
– Which production rule is used to replace that non-terminal.

For the choice of the non-terminal to replace, there are two standard options.


Types of Derivation
Leftmost Derivation (LMD):
• If the leftmost non-terminal of the sentential form is replaced at every step, the derivation is called a leftmost derivation.
• A sentential form derived by a leftmost derivation is called a left-sentential form.

Rightmost Derivation (RMD):
• If the rightmost non-terminal of the sentential form is replaced at every step, the derivation is called a rightmost derivation.
• A sentential form derived by a rightmost derivation is called a right-sentential form.
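
The difference between the two derivation orders can be sketched in a few lines. This is an illustrative helper only: the function name `derive`, the list-of-lists grammar encoding, and the convention of choosing a production by its index are all assumptions introduced here.

```python
# Hypothetical encoding of E -> E + E | E * E | ( E ) | - E | id
GRAMMAR = {"E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["-", "E"], ["id"]]}


def derive(grammar, start, choices, leftmost=True):
    """Apply one production per step to the leftmost (or rightmost) non-terminal.

    `choices` lists, for each step, the index of the production body to use.
    Returns every sentential form, from the start symbol to the final string.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in grammar]
        if not positions:
            raise ValueError("no non-terminal left to replace")
        pos = positions[0] if leftmost else positions[-1]
        body = grammar[form[pos]][choice]
        form = form[:pos] + body + form[pos + 1:]
        steps.append(" ".join(form))
    return steps


lmd = derive(GRAMMAR, "E", [0, 4, 1, 4, 4], leftmost=True)
rmd = derive(GRAMMAR, "E", [0, 1, 4, 4, 4], leftmost=False)
print(" => ".join(lmd))
print(" => ".join(rmd))
```

Both calls end in the same sentence, id + id * id, but pass through different sentential forms on the way.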



Parse Tree or Derivation Tree
A parse tree is a graphical representation of a derivation of a sentential form.
In a parse tree for a sentence:
– All leaf nodes are terminals.
– All interior nodes are non-terminals.
– Reading the leaves from left to right gives the original input string.

• A parse tree depicts the associativity and precedence of operators.
• The deepest sub-tree is evaluated first, so the operator in that sub-tree takes precedence over the operators in its parent nodes.


Example 1
Consider the grammar G,
E → E + E | E * E | ( E ) | - E | id
Derive the string id + id * id using a leftmost derivation.

E => E + E
  => id + E
  => id + E * E
  => id + id * E
  => id + id * id

Parse tree:
        E
      / | \
     E  +  E
     |   / | \
    id  E  *  E
        |     |
       id    id


Example 1 (contd.)
Consider the grammar G,
E → E + E | E * E | ( E ) | - E | id
Derive the string id + id * id using a rightmost derivation.

E => E + E
  => E + E * E
  => E + E * id
  => E + id * id
  => id + id * id

Parse tree:
        E
      / | \
     E  +  E
     |   / | \
    id  E  *  E
        |     |
       id    id


Example 2
Derive the string - ( id + id ) * id using leftmost and rightmost derivations.
E → E + E | E * E | ( E ) | - E | id

Leftmost derivation:
E => E * E
  => - E * E
  => - ( E ) * E
  => - ( E + E ) * E
  => - ( id + E ) * E
  => - ( id + id ) * E
  => - ( id + id ) * id

Rightmost derivation:
E => E * E
  => E * id
  => - E * id
  => - ( E ) * id
  => - ( E + E ) * id
  => - ( E + id ) * id
  => - ( id + id ) * id


Example 3
Derive the string - ( id + id )
E → E + E | E * E | ( E ) | - E | id

E => - E
  => - ( E )
  => - ( E + E )
  => - ( id + E )
  => - ( id + id )


Practice



Next Topic:
Ambiguity

Ambiguity
• A grammar that produces more than one parse tree for some sentence is said to be ambiguous.
– i.e. an ambiguous grammar is one that produces more than one leftmost derivation or more than one rightmost derivation for the same sentence.

• For certain types of parsers, it is desirable that the grammar be made unambiguous.
• We need to design unambiguous grammars for compiling applications.
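
Ambiguity can be checked for a concrete sentence by counting its parse trees. The sketch below does this for the small grammar E → E + E | E * E | id only (a restriction assumed here to keep the example short; `count_parse_trees` is a hypothetical helper, not a general ambiguity test). Each tree is identified by the operator chosen at the root.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways tokens[i:j] can be derived from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # Operator k is the one introduced at the root of the tree.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one for any sentence shows that the grammar is ambiguous; here the two trees correspond to grouping the sum first or the product first.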



Ambiguous vs. Unambiguous



Eliminating Ambiguity
• An ambiguous grammar can be rewritten to eliminate the
ambiguity.
e.g. Eliminate the ambiguity from “dangling-else” grammar,
stmt → if expr then stmt
| if expr then stmt else stmt
| other
According to this grammar define a compound conditional
statement,



For the same grammar, consider the string,

The grammar is ambiguous since the string has two parse trees,


Match each else with the closest previous unmatched then.
This disambiguating rule can be incorporated into the grammar.

This grammar generates the same set of strings, but allows only one parse for each string.


Methods To Remove Ambiguity
• The ambiguity from the grammar may be removed using the
following methods:

• The solution to the problem of enforcing precedence is to introduce several different variables, one at each level of precedence.
Removing Ambiguity by Precedence & Associativity Rules
• An ambiguous grammar may be converted into an unambiguous grammar by
implementing:
– Precedence Constraints
– Associativity Constraints
These constraints are implemented using the following rules:

Rule-1:
• The level at which the production is present defines the priority of the
operator contained in it.
– The higher the level of the production, the lower the priority of operator.
– The lower the level of the production, the higher the priority of operator.
Rule-2:
• If the operator is left associative, induce left recursion in its production.
• If the operator is right associative, induce right recursion in its production.



Example
Consider the ambiguous grammar:
E → E + E | E - E | E * E | E / E | ( E ) | id

Introduce new variables (non-terminals) at each level of precedence:
– an expression E for our example is a sum of one or more terms. (+, -)
– a term T is a product of one or more factors. (*, /)
– a factor F is an identifier or a parenthesised expression.
Unambiguous Grammar
E → E + T | E - T | T
T → T * F | T / F | F
F → ( E ) | id


Unambiguous Grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
Try deriving the string id + id * id using the above grammar.
E => E + T
  => T + T
  => F + T
  => id + T
  => id + T * F
  => id + F * F
  => id + id * F
  => id + id * id


RE vs. CFG
• Every construct that can be described by a regular expression
can be described by a grammar.
• NFA can be converted to a grammar that generates the same
language as recognized by the NFA.
• Rules:
– For each state i of the NFA, create a non-terminal symbol Ai .
– If state i has a transition to state j on symbol a, introduce the
production Ai → a Aj
– If state i goes to state j on symbol ε, introduce the
production Ai → Aj
– If i is an accepting state, introduce Ai → ε
– If i is the start state make Ai the start symbol of the grammar.
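
The conversion rules above are mechanical, so they can be sketched directly. In this illustration the function name `nfa_to_grammar`, the triple format for transitions, and the use of an empty list to stand for ε are assumptions; the sample NFA is the usual one for (a|b)*abb with states 0 to 3.

```python
def nfa_to_grammar(transitions, start, accepting):
    """Convert an NFA to a right-linear grammar using the rules above.

    `transitions` is a list of (state, symbol, next_state); symbol "" stands for ε.
    Returns (start_symbol, productions); productions maps "Ai" to a list of bodies.
    """
    states = {start, *accepting}
    for src, _, dst in transitions:
        states.update((src, dst))
    productions = {f"A{s}": [] for s in sorted(states)}
    for src, symbol, dst in transitions:
        # Ai -> a Aj for a labelled move, Ai -> Aj for an ε-move.
        body = [f"A{dst}"] if symbol == "" else [symbol, f"A{dst}"]
        productions[f"A{src}"].append(body)
    for s in accepting:
        productions[f"A{s}"].append([])  # empty body encodes Ai -> ε
    return f"A{start}", productions


NFA = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2), (2, "b", 3)]
start_symbol, rules = nfa_to_grammar(NFA, start=0, accepting={3})
for head, bodies in rules.items():
    print(head, "→", " | ".join(" ".join(b) or "ε" for b in bodies))
```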



RE vs. CFG(contd..)
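e.g. For the regular expression (a|b)*abb, consider the NFA

[Figure: NFA with states 0, 1, 2, 3. State 0 is the start state and loops to itself on a and on b; 0 goes to 1 on a, 1 goes to 2 on b, and 2 goes to 3 on b. State 3 is the accepting state.]

Equivalent grammar,
A0 → a A0 | b A0 | a A1
A1 → b A2
A2 → b A3
A3 → ε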


Types of Parser

– Top-Down Parser
  • Backtracking
  • Predictive Parser
    1. Recursive Descent Parsing
    2. Non-Recursive Descent Parsing
– Bottom-Up Parser
  • LR Parser: LR, SLR, LALR
  • Shift-Reduce Parser
Elimination of Left Recursion
A grammar is left recursive if it has a non-terminal A such that there is a derivation A =>+ Aα for some string α.

Top-down parsing techniques cannot handle left-recursive grammars.
So, convert the left-recursive grammar into an equivalent grammar which is not left-recursive.

The left recursion may appear in a single step of the derivation (immediate left recursion), or may appear in more than one step of the derivation.

Note: The grammar is assumed to have no cycles and no ε-productions (i.e. no derivation A =>+ A and no production A → ε).
Immediate Left-Recursion

A → A α | β     where β does not start with A

⇓ eliminate immediate left recursion

A → β A'        an equivalent grammar
A' → α A' | ε
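
The transformation can be sketched as a small function over the productions of one non-terminal. The function name, the list encoding of right-hand sides (with [] standing for ε), and the choice of naming the new non-terminal by appending a prime are assumptions made for this illustration.

```python
def eliminate_immediate_left_recursion(nt, bodies):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.

    `bodies` is a list of right-hand sides (lists of symbols); [] encodes ε.
    Returns a dict holding the new productions for A and A'.
    """
    alphas = [body[1:] for body in bodies if body and body[0] == nt]
    betas = [body for body in bodies if not body or body[0] != nt]
    if not alphas:
        return {nt: bodies}  # no immediate left recursion, nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

For E → E + T | T this yields E → T E' and E' → + T E' | ε. Indirect left recursion has to be turned into immediate left recursion by substitution first; the sketch does not do that step.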



Example Grammar

E → E + T | T
T → T * F | F
F → ( E ) | id

Applying the rule A → A α | β ⇒ A → β A', A' → α A' | ε to E → E + T | T (α = + T, β = T) gives
E → T E’
E’ → + T E’ | ε

Applying the same rule to the T-productions gives the grammar:
E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E )
F → id
Example
Consider the Grammar:
S → Aa | b
A → Ac | Sd | ε
The non-terminal S is left recursive because S => Aa => Sda, but it is not immediately left recursive.

On substituting the S-productions in A → Sd, we get
A → Ac | Aad | bd | ε
Hence, the new grammar is:
S → Aa | b
A → Ac | Aad | bd | ε

Eliminating the immediate left recursion among the A-productions yields the following grammar:
S → Aa | b
A → bdA’ | A’
A’ → cA’ | adA’ | ε
Model of Non-Recursive Predictive Parser

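[Figure: model of a non-recursive predictive parser. The INPUT buffer holds a + b $. The STACK holds X Y Z $, with X on top and $ at the bottom. The Predictive Parsing Program reads the input and the stack, consults the Parsing Table M, and produces the OUTPUT.]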


Predictive Parser(contd..)
• A table driven predictive parser has:
– an input buffer, $ is used to mark end of string
– stack, $ at bottom of the stack.
– a parsing table and
– an output stream.

• The construction of a predictive parser is aided by two functions associated with the grammar G:
– FIRST()
– FOLLOW()

• These functions are used to fill in the entries of a predictive parsing table for G, whenever possible.
E→TE’
E’ → +TE’|ε
T → FT’
T’ → *FT’|ε
F →(E)
F → id

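FOLLOW Example
Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E)
F → id

FOLLOW(E):
E is the start symbol and appears in F → (E), so
FOLLOW(E) = { $, ) }

FOLLOW(E’):
E’ appears at the end of E → TE’ and E’ → +TE’, so
FOLLOW(E’) = FOLLOW(E) = { $, ) }

FOLLOW(T):
T is followed by E’ in E → TE’ and E’ → +TE’, and FIRST(E’) = { +, ε }.
The terminal + goes into FOLLOW(T); because E’ can derive ε, FOLLOW(E) is added as well:
FOLLOW(T) = { + } ∪ FOLLOW(E) = { +, $, ) }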
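
The FIRST and FOLLOW sets can also be computed by iterating the rules until nothing changes. The following is a minimal sketch under a few assumptions: the grammar is stored as a dictionary, ε is written as the string "ε", and the function names are invented for this example. Primes are written with a plain apostrophe in the code.

```python
EPS = "ε"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of(seq, first):
    """FIRST of a sequence of symbols, given the FIRST sets of the non-terminals."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in first else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol in seq can derive ε (or seq is empty)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # what follows sym can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, "E", FIRST)
print(FOLLOW["T"])
```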

Predictive Parsing Table M[X,a]

M[X,a]   id          +             *             (           )          $
E        E → TE’                                 E → TE’
E’                   E’ → +TE’                               E’ → ε     E’ → ε
T        T → FT’                                 T → FT’
T’                   T’ → ε        T’ → *FT’                 T’ → ε     T’ → ε
F        F → id                                  F → (E)

X – non-terminal (row)
a – terminal (column)
Blank entries are error entries.
Predictive parsing program
/* initially the stack holds $S, with the start symbol S on top */
set ip to point to the first symbol of w$;
repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else /* X is a non-terminal */
        if M[X,a] = X → Y1 Y2 ... Yk then begin
            pop X from the stack;
            push Yk, Yk-1, ..., Y1 onto the stack, with Y1 on top;
            output the production X → Y1 Y2 ... Yk
        end
        else error()
until X = $ /* stack is empty */
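
The predictive parsing program can be tried out with a short table-driven sketch. The table below is the parsing table M for the expression grammar typed in by hand (with ε-entries stored as empty lists); the function name and the decision to raise SyntaxError for blank entries are assumptions of this illustration.

```python
# Parsing table M for E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NON_TERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    """Return the productions used, or raise SyntaxError on a blank table entry."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]
    ip = 0
    output = []
    while stack[-1] != "$":
        top, a = stack[-1], tokens[ip]
        if top not in NON_TERMINALS:
            # Terminal on top of the stack: it must match the input symbol.
            if top != a:
                raise SyntaxError(f"expected {top!r}, found {a!r}")
            stack.pop()
            ip += 1
        elif (top, a) in TABLE:
            body = TABLE[(top, a)]
            stack.pop()
            stack.extend(reversed(body))  # leftmost body symbol ends up on top
            output.append(f"{top} → {' '.join(body) or 'ε'}")
        else:
            raise SyntaxError(f"no entry M[{top}, {a}]")
    if tokens[ip] != "$":
        raise SyntaxError(f"unexpected input {tokens[ip]!r}")
    return output


for production in predictive_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

The productions are printed in the order of a leftmost derivation of id + id * id.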
Parsing the string id+id*id with the parsing table M:

STACK        INPUT        OUTPUT
$E           id+id*id$    E → TE’
$E’T         id+id*id$    T → FT’
$E’T’F       id+id*id$    F → id
$E’T’id      id+id*id$    pop
$E’T’        +id*id$      T’ → ε
$E’          +id*id$      E’ → +TE’
$E’T+        +id*id$      pop
$E’T         id*id$       T → FT’
$E’T’F       id*id$       F → id
$E’T’id      id*id$       pop
$E’T’        *id$         T’ → *FT’
$E’T’F*      *id$         pop
$E’T’F       id$          F → id
$E’T’id      id$          pop
$E’T’        $            T’ → ε
$E’          $            E’ → ε
$            $            Accept
Exercise
1. Show that the following grammar is LL(1).
S → AaAb | BbBa
A → ε
B → ε

2. Consider the following grammar:
S → ( L ) | a
L → L , S | S
Implement the predictive parser and check the validity of the string (a,(a,a)).

3. Check whether the following grammar is LL(1):
S → L = R
S → R
L → * R
L → id
R → L
Problem 1:

NEXT TOPIC
BOTTOM-UP PARSING



Types of Parser

– Top-Down Parser
  • Backtracking
  • Predictive Parser
– Bottom-Up Parser
  • LR Parser: LR, SLR, LALR
  • Shift-Reduce Parser
    – Operator Precedence Parser
S => aABe
  => aAde
  => aAbcde
  => abbcde

Read from top to bottom, this is a derivation of abbcde; read from bottom to top, it is the sequence of reductions performed by a bottom-up parser.

[Figure: parse tree for abbcde. The root S has children a, A, B, e; A expands to A b c, whose inner A expands to b; B expands to d.]
Example for Choosing Handle
Consider the grammar
E → E + E
E → E * E
E → ( E )
E → id
The string id + id * id has two rightmost derivations, so the grammar is ambiguous:

First rightmost derivation:
E => E + E
  => E + E * E
  => E + E * id
  => E + id * id
  => id + id * id

Second rightmost derivation:
E => E * E
  => E * id
  => E + E * id
  => E + id * id
  => id + id * id


Stack implementation of Shift Reduce Parser

In operator-precedence parsing, we define three disjoint precedence relations between pairs of terminals: a < b (a yields precedence to b), a = b (a has the same precedence as b), and a > b (a takes precedence over b).
Defining Precedence Relations
The precedence relations are defined using the following rules:

Rule-01:
– If the precedence of b is higher than the precedence of a, then we define a < b.
– If the precedence of b is the same as the precedence of a, then we define a = b.
– If the precedence of b is lower than the precedence of a, then we define a > b.

Rule-02:
– An identifier is always given higher precedence than any other symbol.
– The $ symbol is always given the lowest precedence.

Rule-03:
– If two operators have the same precedence, then we decide by checking their associativity.


Operator-Precedence Relations from
Associativity and Precedence



Operator-Precedence Relations

Example



Operator-Precedence Parser
• An operator-precedence parser is a simple shift-reduce parser that
is capable of parsing a subset of LR(1) grammars.
• More precisely, the operator-precedence parser can parse all LR(1)
grammars where two consecutive non-terminals and epsilon never
appear in the right-hand side of any rule.

Steps involved in parsing:
1. Ensure the grammar satisfies the prerequisite (no ε and no two adjacent non-terminals on any right-hand side).
2. Compute the function LEADING().
3. Compute the function TRAILING().
4. Using the computed LEADING and TRAILING sets, construct the operator-precedence table.
5. Parse the given input string using the parsing algorithm.
6. Compute the precedence functions and the precedence graph.
Computation of LEADING
• LEADING is defined for every non-terminal.
• LEADING(A) is the set of terminals that can be the first terminal in a string derived from A.
• LEADING(A) = { a | A =>+ γaδ }, where γ is ε or a single non-terminal, =>+ indicates derivation in one or more steps, and A is a non-terminal.
Algorithm for LEADING(A):
{
1. ‘a’ is in LEADING(A) if there is a production A → γaδ, where γ is ε or a single non-terminal.
2. If ‘a’ is in LEADING(B) and there is a production A → Bα, then ‘a’ is in LEADING(A).
}


• Consider the unambiguous grammar E → E + T | T, T → T * F | F, F → ( E ) | id as an example:

LEADING(E) = { +, LEADING(T) } = { +, *, (, id }
LEADING(T) = { *, LEADING(F) } = { *, (, id }
LEADING(F) = { (, id }


Computation of TRAILING
• TRAILING is defined for every non-terminal.
• TRAILING(A) is the set of terminals that can be the last terminal in a string derived from A.
• TRAILING(A) = { a | A =>+ γaδ }, where δ is ε or a single non-terminal, =>+ indicates derivation in one or more steps, and A is a non-terminal.
Algorithm for TRAILING(A):
{
1. ‘a’ is in TRAILING(A) if there is a production A → γaδ, where δ is ε or a single non-terminal.
2. If ‘a’ is in TRAILING(B) and there is a production A → αB, then ‘a’ is in TRAILING(A).
}


• Consider the unambiguous grammar E → E + T | T, T → T * F | F, F → ( E ) | id as an example:

TRAILING(E) = { +, TRAILING(T) } = { +, *, ), id }
TRAILING(T) = { *, TRAILING(F) } = { *, ), id }
TRAILING(F) = { ), id }
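
LEADING and TRAILING are mirror images of each other, so one routine can compute both by reading each right-hand side either forwards or backwards. The sketch below assumes an operator grammar (no ε and no two adjacent non-terminals on a right-hand side); the function name `edge_terminals` and the dictionary encoding are invented for this example.

```python
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}


def edge_terminals(grammar, from_left=True):
    """LEADING sets if from_left is True, TRAILING sets otherwise."""
    result = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                seq = body if from_left else body[::-1]
                new = set()
                if seq[0] in grammar:
                    # The body starts (or ends) with a non-terminal B: inherit
                    # B's set and pick up the terminal right next to B, if any.
                    new |= result[seq[0]]
                    if len(seq) > 1 and seq[1] not in grammar:
                        new.add(seq[1])
                else:
                    new.add(seq[0])
                if not new <= result[nt]:
                    result[nt] |= new
                    changed = True
    return result


LEADING = edge_terminals(GRAMMAR, from_left=True)
TRAILING = edge_terminals(GRAMMAR, from_left=False)
print(LEADING)
print(TRAILING)
```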



Construction of Precedence Table
• After computing LEADING and TRAILING, the table is constructed over all the terminals in the grammar, including the ‘$’ symbol.
ALGORITHM:
{
for each production A → X1 X2 X3 ... Xn
  for i = 1 to n-1
    1. if Xi and Xi+1 are terminals,
       set Xi = Xi+1
    2. if i ≤ n-2 and Xi and Xi+2 are terminals and Xi+1 is a non-terminal,
       set Xi = Xi+2
    3. if Xi is a terminal and Xi+1 is a non-terminal, then for all ‘a’ in LEADING(Xi+1)
       set Xi < a
    4. if Xi is a non-terminal and Xi+1 is a terminal, then for all ‘a’ in TRAILING(Xi)
       set a > Xi+1
5. Set $ < a for all ‘a’ in LEADING(S) and a > $ for all ‘a’ in TRAILING(S), where S is the start symbol.
}
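
The table-construction algorithm translates almost line for line into code. In this sketch the LEADING and TRAILING sets of the expression grammar are typed in by hand, relations are stored in a dictionary keyed by terminal pairs, and a missing key plays the role of an error entry; these representation choices and the function name are assumptions.

```python
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
LEADING = {"E": {"+", "*", "(", "id"}, "T": {"*", "(", "id"}, "F": {"(", "id"}}
TRAILING = {"E": {"+", "*", ")", "id"}, "T": {"*", ")", "id"}, "F": {")", "id"}}


def build_precedence_table(grammar, start, leading, trailing):
    """Return {(a, b): '<' | '=' | '>'}; pairs that are absent are error entries."""
    table = {}

    def set_rel(a, rel, b):
        # Two different relations for the same pair mean the grammar is not
        # an operator-precedence grammar.
        if table.setdefault((a, b), rel) != rel:
            raise ValueError(f"conflict between {a!r} and {b!r}")

    for bodies in grammar.values():
        for body in bodies:
            for i in range(len(body) - 1):
                x, y = body[i], body[i + 1]
                if x not in grammar and y not in grammar:
                    set_rel(x, "=", y)                      # step 1
                if (i + 2 < len(body) and x not in grammar
                        and y in grammar and body[i + 2] not in grammar):
                    set_rel(x, "=", body[i + 2])            # step 2
                if x not in grammar and y in grammar:
                    for a in leading[y]:
                        set_rel(x, "<", a)                  # step 3
                if x in grammar and y not in grammar:
                    for a in trailing[x]:
                        set_rel(a, ">", y)                  # step 4
    for a in leading[start]:
        set_rel("$", "<", a)                                # step 5
    for a in trailing[start]:
        set_rel(a, ">", "$")
    return table


TABLE = build_precedence_table(GRAMMAR, "E", LEADING, TRAILING)
print(TABLE[("+", "*")], TABLE[("*", "+")], TABLE[("(", ")")])
```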
Precedence table

       +    *    id   (    )    $
+      >    <    <    <    >    >
*      >    >    <    <    >    >
id     >    >    e    e    >    >
(      <    <    <    <    =    e
)      >    >    e    e    >    >
$      <    <    <    <    e    Accept

(e marks an error entry.)

LEADING(E) = { +, *, (, id }      TRAILING(E) = { +, *, ), id }
LEADING(T) = { *, (, id }         TRAILING(T) = { *, ), id }
LEADING(F) = { (, id }            TRAILING(F) = { ), id }

Terminal followed by non-terminal:
– E → E + T: + < LEADING(T)
– T → T * F: * < LEADING(F)
– F → ( E ): ( < LEADING(E)

Non-terminal followed by terminal:
– E → E + T: TRAILING(E) > +
– T → T * F: TRAILING(T) > *
– F → ( E ): TRAILING(E) > )
Parse the string (id+id)*id

After each pop, the popped terminal is compared with the new top of the stack; popping stops when top < popped.

STACK    RELATION   INPUT          ACTION    CHECK AFTER POP
$        $ < (      (id+id)*id$    Shift (
$(       ( < id     id+id)*id$     Shift id
$(id     id > +     +id)*id$       Pop id    ( < id
$(       ( < +      +id)*id$       Shift +
$(+      + < id     id)*id$        Shift id
$(+id    id > )     )*id$          Pop id    + < id
$(+      + > )      )*id$          Pop +     ( < +
$(       ( = )      )*id$          Shift )
$()      ) > *      *id$           Pop )     ( = )
$(                  *id$           Pop (     $ < (
$        $ < *      *id$           Shift *
$*       * < id     id$            Shift id
$*id     id > $     $              Pop id    * < id
$*       * > $      $              Pop *     $ < *
$                   $              Accept
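
The shift and pop decisions in this trace can be reproduced with a short loop. The sketch below hard-codes the precedence table for the expression grammar (with "e" for error and "A" for accept, both encoding assumptions) and keeps only terminals on the stack, so it checks the precedence relations rather than full membership in the grammar.

```python
COLS = ["+", "*", "id", "(", ")", "$"]
ROWS = {
    "+":  "> < < < > >",
    "*":  "> > < < > >",
    "id": "> > e e > >",
    "(":  "< < < < = e",
    ")":  "> > e e > >",
    "$":  "< < < < e A",
}
TABLE = {(a, b): rel for a, row in ROWS.items() for b, rel in zip(COLS, row.split())}


def op_precedence_parse(tokens):
    """Return the list of shift/pop actions, or raise SyntaxError on an error entry."""
    tokens = list(tokens) + ["$"]
    stack = ["$"]
    ip = 0
    actions = []
    while True:
        top, a = stack[-1], tokens[ip]
        rel = TABLE[(top, a)]
        if rel == "A":
            return actions
        if rel in ("<", "="):
            stack.append(a)
            ip += 1
            actions.append(f"shift {a}")
        elif rel == ">":
            # Pop the handle: keep popping until the new top < the symbol just popped.
            while True:
                popped = stack.pop()
                actions.append(f"pop {popped}")
                if TABLE[(stack[-1], popped)] == "<":
                    break
        else:
            raise SyntaxError(f"no relation between {top!r} and {a!r}")


print(op_precedence_parse(["(", "id", "+", "id", ")", "*", "id"]))
```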

Precedence Functions
Algorithm for Constructing Precedence Functions
1. Create symbols fa and ga for each grammar terminal a and for the end-of-string symbol $.
2. Partition the symbols into groups so that fa and gb are in the same group if a = b (symbols can end up in the same group even if they are not directly connected by this relation).
3. Create a directed graph whose nodes are the groups. For each pair of symbols a and b: if a < b, place an edge from the group of gb to the group of fa; if a > b, place an edge from the group of fa to the group of gb.
4. If the constructed graph has a cycle, then no precedence functions exist. If there are no cycles, let f(a) be the length of the longest path beginning at the group of fa, and g(a) the length of the longest path beginning at the group of ga.
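
The four steps can be sketched as a graph computation. Everything about the representation here is an assumption made for illustration: relations are given as a dictionary keyed by terminal pairs, groups are merged with a simple parent-pointer scheme, and longest paths are found by a memoised depth-first search that also reports cycles.

```python
def precedence_functions(terminals, table):
    """Compute (f, g) from a relation table {(a, b): '<' | '=' | '>'}.

    Raises ValueError if the graph has a cycle (no precedence functions exist).
    """
    # Steps 1-2: one node per f_a and g_a; merge the nodes related by '='.
    group = {("f", a): ("f", a) for a in terminals}
    group.update({("g", a): ("g", a) for a in terminals})

    def find(node):
        while group[node] != node:
            node = group[node]
        return node

    for (a, b), rel in table.items():
        if rel == "=":
            group[find(("f", a))] = find(("g", b))

    # Step 3: edges run from the node with the larger value to the smaller one.
    edges = {find(n): set() for n in group}
    for (a, b), rel in table.items():
        if rel == "<":
            edges[find(("g", b))].add(find(("f", a)))
        elif rel == ">":
            edges[find(("f", a))].add(find(("g", b)))

    # Step 4: longest path from each group, detecting cycles on the way.
    state, length = {}, {}

    def longest(node):
        if state.get(node) == "active":
            raise ValueError("cycle: no precedence functions exist")
        if node not in length:
            state[node] = "active"
            length[node] = max((1 + longest(n) for n in edges[node]), default=0)
            state[node] = "done"
        return length[node]

    f = {a: longest(find(("f", a))) for a in terminals}
    g = {a: longest(find(("g", a))) for a in terminals}
    return f, g


# Relations among id, +, * and $ taken from the expression-grammar table.
RELATIONS = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "+"): ">", ("+", "*"): "<", ("+", "id"): "<", ("+", "$"): ">",
    ("*", "+"): ">", ("*", "*"): ">", ("*", "id"): "<", ("*", "$"): ">",
    ("$", "+"): "<", ("$", "*"): "<", ("$", "id"): "<",
}
f, g = precedence_functions(["id", "+", "*", "$"], RELATIONS)
print(f, g)
```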
[Figure: precedence graph with nodes fid, gid, f*, g*, f+, g+, f$, g$.]

The resulting precedence functions are:

     id   +    *    $
f    4    2    4    0
g    5    1    3    0

Longest paths:
fid → g* → f+ → g+ → f$ (length 4)
gid → f* → g* → f+ → g+ → f$ (length 5)
Use:
Most calculators use operator precedence parsers to convert
from the human-readable infix notation relying on order of
operations to a format that is optimized for evaluation such
as Reverse Polish notation (RPN).

Advantages :
• It can easily be constructed by hand.
• It is simple to implement this type of parsing.

Disadvantages :
• It is hard to handle tokens like the minus sign (-), which has two different precedences (depending on whether it is unary or binary).
• It is applicable only to a small class of grammars.
Error Recovery in Operator Precedence Parsing
Error cases:
– No relation holds between the terminal on the top of the stack and the next input symbol.
– A handle is found (reduction step), but there is no production with this handle as a right side.

Error recovery:
– Each empty entry is filled with a pointer to an error routine.
– The error routine decides which right-hand side the popped handle “looks like” and tries to recover from that situation.

Handling Shift/Reduce Errors
– To recover, we must modify (insert/change):
• the stack,
• the input, or
• both.
– We must be careful not to get into an infinite loop.
Problems

3)



Problem 3
Grammar:
S → L = R
S → R
L → * R
L → id
R → L

LEADING(S) = { =, LEADING(R) } = { =, LEADING(L) } = { =, *, id }
LEADING(L) = { *, id }
LEADING(R) = { *, id }

TRAILING(S) = { =, TRAILING(R) } = { =, TRAILING(L) } = { =, *, id }
TRAILING(L) = { *, id }
TRAILING(R) = { *, id }

Relations:
– S → L = R: = < LEADING(R)
– L → * R: * < LEADING(R)
– S → L = R: TRAILING(L) > =
– $ < LEADING(S) and TRAILING(S) > $

Precedence table:

      =    *    id   $
=     e    <    <    >
*     >    <    <    >
id    >    e    e    >
$     <    <    <    accept
Input string: id=*id

STACK    RELATION   INPUT      ACTION    CHECK AFTER POP
$        $ < id     id=*id$    Shift id
$id      id > =     =*id$      Pop id    $ < id
$        $ < =      =*id$      Shift =
$=       = < *      *id$       Shift *
$=*      * < id     id$        Shift id
$=*id    id > $     $          Pop id    * < id
$=*      * > $      $          Pop *     = < *
$=       = > $      $          Pop =     $ < =
$                   $          Accept

Input string: id=id*id

STACK    RELATION   INPUT       ACTION    CHECK AFTER POP
$        $ < id     id=id*id$   Shift id
$id      id > =     =id*id$     Pop id    $ < id
$        $ < =      =id*id$     Shift =
$=       = < id     id*id$      Shift id
$=id     id e *     *id$        error


[Figure: precedence graph for the table above, with nodes fid, gid, f*, g*, f=, g=, f$, g$.]

The resulting precedence functions (longest path lengths in the graph) are:

     id   =    *    $
f    2    1    2    0
g    3    1    3    0

For example, the longest path from gid is gid → f* → g= → f$ (length 3). These values agree with every entry of the table, e.g. * > = gives f(*) = 2 > g(=) = 1.
Problem-1
Implement an SLR parser for the given grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Step 1: Augment the grammar G by adding a new start symbol E', giving G':
0. E' → E

Goal
Our goal is to find an E followed by $, that is, to reach the item E' → E • with $ as the next input.
Whenever we are about to reduce using rule 0: Accept! The parse is finished.


Step 2: Construction of LR(0) items (C):
Canonical collection of sets of LR(0) items for the augmented grammar G'.

GOTO(I, X) → move the • over the symbol X (terminal or non-terminal) in every item of I that has X right after the •, then take the closure.

I0 : E' → • E
     E → • E + T
     E → • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id

GOTO(I0, E) = I1 : E' → E •
                   E → E • + T

GOTO(I0, T) = I2 : E → T •
                   T → T • * F

GOTO(I0, F) = I3 : T → F •

GOTO(I0, ( ) = I4 : F → ( • E )
                    E → • E + T
                    E → • T
                    T → • T * F
                    T → • F
                    F → • ( E )
                    F → • id

GOTO(I0, id) = I5 : F → id •
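The closure and GOTO operations can be sketched in Python as below. An item is represented as a pair (rule number, dot position); this encoding, the symbol order in SYMBOLS (chosen so that the state numbers come out as I0 to I11) and all function names are assumptions made for illustration.

```python
# Augmented expression grammar; rule 0 is E' -> E.
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]


def closure(items):
    """Add B -> . gamma for every nonterminal B that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        body = GRAMMAR[rule][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol`, then take the closure."""
    moved = {
        (rule, dot + 1)
        for rule, dot in items
        if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol
    }
    return closure(moved)


def canonical_collection():
    """Return (list of item sets, transitions {(state, symbol): state})."""
    states = [closure({(0, 0)})]
    transitions = {}
    for i, state in enumerate(states):  # the list grows while we iterate
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if not target:
                continue
            if target not in states:
                states.append(target)
            transitions[(i, symbol)] = states.index(target)
    return states, transitions


STATES, TRANSITIONS = canonical_collection()
```

With this symbol order, STATES[4] is the item set reached from I0 on "(", matching I4.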
DFA (transitions out of I0)
I0 --E--> I1
I0 --T--> I2
I0 --F--> I3
I0 --(--> I4
I0 --id--> I5


Transition from item set I1
I1 : E' → E •            (Accept)
     E → E • + T
GOTO(I1, +) = I6 : E → E + • T
                   T → • T * F
                   T → • F
                   F → • ( E )
                   F → • id

Transition from item set I2
I2 : E → T •             (Rule 2 ready for reduction)
     T → T • * F
GOTO(I2, *) = I7 : T → T * • F
                   F → • ( E )
                   F → • id

Transition from item set I3
I3 : T → F •             (Rule 4 ready for reduction)


Transition from item set I4
I4 : F → ( • E )
     E → • E + T
     E → • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id
GOTO(I4, E) = I8 : F → ( E • )
                   E → E • + T
GOTO(I4, T) = I2 : E → T •
                   T → T • * F
GOTO(I4, F) = I3
GOTO(I4, ( ) = I4
GOTO(I4, id) = I5

Transition from item set I5
I5 : F → id •            (Rule 6 ready for reduction)


DFA (after adding I6, I7 and I8)
I1 --+--> I6
I2 --*--> I7
I4 --E--> I8
I4 --T--> I2
I4 --F--> I3
I4 --(--> I4
I4 --id--> I5
Transition from item set I6
I6 : E → E + • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id
GOTO(I6, T) = I9 : E → E + T •     (Rule 1 ready for reduction)
                   T → T • * F
GOTO(I6, F) = I3
GOTO(I6, ( ) = I4
GOTO(I6, id) = I5

Transition from item set I7
I7 : T → T * • F
     F → • ( E )
     F → • id
GOTO(I7, F) = I10 : T → T * F •    (Rule 3 ready for reduction)
GOTO(I7, ( ) = I4
GOTO(I7, id) = I5
Transition from item set I8
I8 : F → ( E • )
     E → E • + T
GOTO(I8, ) ) = I11 : F → ( E ) •   (Rule 5 ready for reduction)
GOTO(I8, +) = I6 : E → E + • T
                   T → • T * F
                   T → • F
                   F → • ( E )
                   F → • id

Transition from item set I9
I9 : E → E + T •
     T → T • * F
GOTO(I9, *) = I7 : T → T * • F
                   F → • ( E )
                   F → • id


DFA (complete)
From    On symbol → To
I0      E → I1,  T → I2,  F → I3,  ( → I4,  id → I5
I1      + → I6
I2      * → I7
I4      E → I8,  T → I2,  F → I3,  ( → I4,  id → I5
I6      T → I9,  F → I3,  ( → I4,  id → I5
I7      F → I10, ( → I4,  id → I5
I8      ) → I11, + → I6
I9      * → I7
I3, I5, I10 and I11 have no outgoing transitions.
Collection of sets of items

I0 : E' → • E
     E → • E + T
     E → • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id

I1 : E' → E •            (Goal! Parser announces accept)
     E → E • + T

I2 : E → T •             (Rule 2 ready for reduction)
     T → T • * F

I3 : T → F •             (Rule 4 ready for reduction)

I4 : F → ( • E )
     E → • E + T
     E → • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id

I5 : F → id •            (Rule 6 ready for reduction)

I6 : E → E + • T
     T → • T * F
     T → • F
     F → • ( E )
     F → • id

I7 : T → T * • F
     F → • ( E )
     F → • id

I8 : F → ( E • )
     E → E • + T

I9 : E → E + T •         (Rule 1 ready for reduction)
     T → T • * F

I10 : T → T * F •        (Rule 3 ready for reduction)

I11 : F → ( E ) •        (Rule 5 ready for reduction)
Parsing Table

Productions:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

FOLLOW(E) = {+, ), $}
FOLLOW(T) = {*, +, ), $}
FOLLOW(F) = {*, +, ), $}

                ACTION                         GOTO
State   id    +     *     (     )     $       E    T    F
0       s5                s4                  1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                  8    2    3
5             r6    r6          r6    r6
6       s5                s4                       9    3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Blank entries are errors.
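The FOLLOW sets that decide where the reduce entries go can be computed by fixed-point iteration. This is a minimal sketch assuming a grammar without ε-productions (true for this expression grammar); first_sets and follow_sets are illustrative names.

```python
# Expression grammar (unaugmented): nonterminal -> list of right-hand sides.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}


def first_sets(grammar):
    """FIRST for each nonterminal; with no epsilon, only the first symbol matters."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                sym = body[0]
                new = first[sym] if sym in grammar else {sym}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW for each nonterminal, with $ marking end of input."""
    first = first_sets(grammar)
    follow = {a: set() for a in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # terminals have no FOLLOW set
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        # sym ends the body, so it inherits FOLLOW(head).
                        new = follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, "E")
```

In an SLR table, state i gets "reduce by A → α" on exactly the terminals in FOLLOW(A) when A → α • is in Ii.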
Moves of LR parser on id*id+id

Stack                 Input        Action
0                     id*id+id$    s5   Shift 5
0 id 5                *id+id$      r6   Reduce by F→id
0 F 3                 *id+id$      r4   Reduce by T→F
0 T 2                 *id+id$      s7   Shift 7
0 T 2 * 7             id+id$       s5   Shift 5
0 T 2 * 7 id 5        +id$         r6   Reduce by F→id
0 T 2 * 7 F 10        +id$         r3   Reduce by T→T*F
0 T 2                 +id$         r2   Reduce by E→T
0 E 1                 +id$         s6   Shift 6
0 E 1 + 6             id$          s5   Shift 5
0 E 1 + 6 id 5        $            r6   Reduce by F→id
0 E 1 + 6 F 3         $            r4   Reduce by T→F
0 E 1 + 6 T 9         $            r1   Reduce by E→E+T
0 E 1                 $            acc  Accept
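The moves above follow the standard table-driven LR loop, sketched below with the SLR table for the expression grammar hard-coded. Unlike the slides, the stack here holds states only (the grammar symbols are implied by the states); the dictionary layout and the name lr_parse are assumptions for illustration.

```python
# Rule number -> (left-hand side, length of right-hand side).
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

REDUCE_ON = ("+", ")", "$")  # FOLLOW(E); FOLLOW(T) and FOLLOW(F) also contain "*"
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"*": "s7", **{t: "r2" for t in REDUCE_ON}},
    3: {"*": "r4", **{t: "r4" for t in REDUCE_ON}},
    4: {"id": "s5", "(": "s4"},
    5: {"*": "r6", **{t: "r6" for t in REDUCE_ON}},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"*": "s7", **{t: "r1" for t in REDUCE_ON}},
    10: {"*": "r3", **{t: "r3" for t in REDUCE_ON}},
    11: {"*": "r5", **{t: "r5" for t in REDUCE_ON}},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return (accepted, list of actions taken) for a list of terminals."""
    stack = [0]
    tokens = list(tokens) + ["$"]
    pos = 0
    moves = []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:  # blank table entry
            moves.append("error")
            return False, moves
        moves.append(act)
        if act == "acc":
            return True, moves
        if act[0] == "s":
            stack.append(int(act[1:]))  # shift: push the new state
            pos += 1
        else:
            head, length = RULES[int(act[1:])]
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1]][head])


ACCEPTED, MOVES = lr_parse(["id", "*", "id", "+", "id"])
```

The driver itself is grammar-independent: only RULES, ACTION and GOTO change from one grammar to another.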


Problem-2
Construct the SLR parsing table and parse the
string abab for the following grammar
S -> AA
A -> aA | b
Soln:
1. Augmented Grammar
S' -> S
S -> AA
A -> aA | b

LR(0) Items
I0 : S' -> • S
     S -> • AA
     A -> • aA
     A -> • b

Goto (I0, S) = I1 : S' -> S •

Goto (I0, A) = I2 : S -> A • A
                    A -> • aA
                    A -> • b

Goto (I0, a) = I3 : A -> a • A
                    A -> • aA
                    A -> • b

Goto (I0, b) = I4 : A -> b •

Goto (I2, A) = I5 : S -> AA •
Goto (I2, a) = I3
Goto (I2, b) = I4

Goto (I3, A) = I6 : A -> aA •
Goto (I3, a) = I3
Goto (I3, b) = I4
SLR Parsing Table

Productions:
S -> AA    (1)
A -> aA    (2)
A -> b     (3)

State    Action                  Goto
         a     b     $           S    A
0        S3    S4                1    2
1                    Accept
2        S3    S4                     5
3        S3    S4                     6
4        R3    R3    R3
5                    R1
6        R2    R2    R2

Reduce actions:
I4 : A -> b •     FOLLOW(A) = FIRST(A) ∪ FOLLOW(S) = {a, b, $}
I5 : S -> AA •    FOLLOW(S) = {$}
I6 : A -> aA •    FOLLOW(A) = {a, b, $}
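The placement of the reduce entries follows the SLR rule: a state containing a completed item A -> α • reduces by that rule on every terminal in FOLLOW(A). A small sketch for this grammar, with the completed items and FOLLOW sets copied from the slide; COMPLETED and reduce_entries are hypothetical names.

```python
# Rule number -> (left-hand side, right-hand side).
RULES = {1: ("S", "AA"), 2: ("A", "aA"), 3: ("A", "b")}
FOLLOW = {"S": {"$"}, "A": {"a", "b", "$"}}
# State -> rule whose item has the dot at the end (I4, I5, I6).
COMPLETED = {4: 3, 5: 1, 6: 2}


def reduce_entries(completed, rules, follow):
    """Return {(state, terminal): 'Rn'} for every SLR reduce entry."""
    table = {}
    for state, rule in completed.items():
        head = rules[rule][0]
        for terminal in follow[head]:
            table[(state, terminal)] = f"R{rule}"
    return table


REDUCES = reduce_entries(COMPLETED, RULES, FOLLOW)
```

State 5 gets a single entry because FOLLOW(S) contains only $, while states 4 and 6 get one entry per terminal in FOLLOW(A).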
Parsing
Actions : Shift, Reduce, Accept, Error

Stack              Input      Action
0                  abab$      Shift (S3)
0 a 3              bab$       Shift (S4)
0 a 3 b 4          ab$        Reduce (R3) A -> b
0 a 3 A 6          ab$        Reduce (R2) A -> aA
0 A 2              ab$        Shift (S3)
0 A 2 a 3          b$         Shift (S4)
0 A 2 a 3 b 4      $          Reduce (R3) A -> b
0 A 2 a 3 A 6      $          Reduce (R2) A -> aA
0 A 2 A 5          $          Reduce (R1) S -> AA
0 S 1              $          Accept
Thank you
