
Parsing

Here are the steps to transform the grammar into LL: 1. Identify and remove left recursion: The grammar does not contain left recursion. 2. Perform left factoring if possible: Left factoring is not possible since there is no common left factor between the productions for E and T. 3. Introduce nullable/first/follow sets: Calculate the nullable, first, follow sets for each non-terminal. 4. Construct LL(1) parsing table: The grammar contains middle recursion and is not LL(1). It cannot be transformed into LL(1) grammar since middle recursion cannot be eliminated by left factoring or introducing iteration. Therefore, this grammar cannot be

Uploaded by

Saif Ullah
Copyright
© © All Rights Reserved

Top Down Parser

• The parse tree is constructed
– From the top
– From left to right

• Terminals are seen in order of appearance in the token stream:
t2 t5 t6 t8 t9

Top-down parser
• Recursive-Descent Parsing
– Backtracking is needed (if a choice of a production rule does not work, we backtrack to try other alternatives).
– It is a general parsing technique, but not widely used.
– Not efficient.
• Predictive Parsing
– No backtracking.
– Efficient.
– Needs a special form of grammars (LL(1) grammars).
– The non-recursive (table-driven) predictive parser is also known as the LL(1) parser.
– Recursive predictive parsing is a special form of recursive-descent parsing without backtracking.

• Backtracking is needed.
• It tries to find the left-most derivation.

S → aBc
B → bc | b

Input: abc

First try, B → bc (fails, backtrack):

      S
    / | \
   a  B  c
     / \
    b   c

Second try, B → b (succeeds):

      S
    / | \
   a  B  c
      |
      b
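
The backtracking behaviour for S → aBc, B → bc | b can be sketched in C as follows. This is a minimal illustration, not a general backtracking parser: the names match, parse_B, parse_S and accepts are chosen here, and each function simply returns the next input position or -1 on failure.

```c
static const char *input;   /* string being parsed */

/* Match one terminal at position pos; return the next position, or -1. */
static int match(int pos, char c) {
    return (pos >= 0 && input[pos] == c) ? pos + 1 : -1;
}

/* B -> bc (alt 0) | b (alt 1) */
static int parse_B(int pos, int alt) {
    if (alt == 0) return match(match(pos, 'b'), 'c');
    return match(pos, 'b');
}

/* S -> a B c: try each alternative of B in order, backtracking on failure. */
static int parse_S(int pos) {
    int after_a = match(pos, 'a');
    for (int alt = 0; alt < 2; alt++) {
        int end = match(parse_B(after_a, alt), 'c');
        if (end >= 0) return end;   /* this alternative worked */
        /* otherwise back up to after_a and try the next alternative */
    }
    return -1;
}

/* Returns 1 if the whole string derives from S, 0 otherwise. */
int accepts(const char *s) {
    input = s;
    int end = parse_S(0);
    return end >= 0 && s[end] == '\0';
}
```

For the input abc, the first alternative B → bc consumes bc and leaves nothing for the final c, so the loop restarts from the position after a and succeeds with B → b.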

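Consider the following productions:
S → aAb
A → c | cd
Let the input string be acdb.
A recursive-descent parser that tries A → c first matches c, then expects b but sees d, so it must backtrack and try A → cd.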

Consider the following productions:
S → BA | AB
A → a | SA
B → b | SB
w = abab

Parse w using recursive-descent parsing and identify the problem that the recursive-descent parser runs into.

When rewriting a non-terminal in a derivation step, a predictive parser can uniquely choose a production rule by just looking at the current symbol in the input string.

A → α1 | ... | αn          input: ... a .......
                                      ↑
                                current token

• Unlike backtracking recursive descent, a predictive parser can “predict” which production to use.
– By looking at the next few tokens.
– No backtracking.

stmt → if ...... |
       while ...... |
       begin ...... |
       for .....
• When we are trying to rewrite the non-terminal stmt and the current token is if, we have to choose the first production rule.
• In general, when rewriting stmt we can uniquely choose the production rule by just looking at the current token.
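
A small C sketch of this choice is shown here. The Token enumeration and the function name choose_stmt_production are assumptions made for illustration; the point is only that one look at the current token selects the alternative, with no backtracking.

```c
/* Hypothetical token kinds for the stmt example. */
typedef enum { TOK_IF, TOK_WHILE, TOK_BEGIN, TOK_FOR, TOK_OTHER } Token;

/* Returns the 1-based index of the stmt alternative selected by the
   current token, or 0 if no alternative starts with that token. */
int choose_stmt_production(Token current) {
    switch (current) {
    case TOK_IF:    return 1;   /* stmt -> if ...    */
    case TOK_WHILE: return 2;   /* stmt -> while ... */
    case TOK_BEGIN: return 3;   /* stmt -> begin ... */
    case TOK_FOR:   return 4;   /* stmt -> for ...   */
    default:        return 0;   /* syntax error      */
    }
}
```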

A → BC
B → DE
D → FG
F → HI
H → xY

First(A) = {x}

Write the sets for the following grammar:
S → Ty
T → AB
T → sT
A → aA
A → λ
B → bB
B → λ
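
FIRST sets such as First(A) = {x} can be checked mechanically with a fixed-point iteration. The sketch below is one possible way to do it, under assumptions chosen here: symbols are single ASCII characters, upper-case letters are non-terminals, everything else is a terminal, and an empty right-hand side stands for λ. The names Production, first, nullable and compute_first are illustrative.

```c
#include <ctype.h>
#include <string.h>

/* Assumed encoding: upper-case = non-terminal, other chars = terminal,
   empty rhs = lambda. */
typedef struct { char lhs; const char *rhs; } Production;

static int first[26][256];   /* first[N][t] != 0: terminal t is in FIRST(N) */
static int nullable[26];     /* nullable[N] != 0: N can derive lambda */

void compute_first(const Production *p, int n) {
    memset(first, 0, sizeof first);
    memset(nullable, 0, sizeof nullable);
    int changed = 1;
    while (changed) {                       /* repeat until nothing new is added */
        changed = 0;
        for (int i = 0; i < n; i++) {
            int A = p[i].lhs - 'A';
            const char *s = p[i].rhs;
            for (; *s; s++) {
                int c = (unsigned char)*s;
                if (!isupper(c)) {          /* terminal: it starts the rhs */
                    if (!first[A][c]) { first[A][c] = 1; changed = 1; }
                    break;
                }
                int B = c - 'A';            /* non-terminal: copy FIRST(B) */
                for (int t = 0; t < 256; t++)
                    if (first[B][t] && !first[A][t]) { first[A][t] = 1; changed = 1; }
                if (!nullable[B]) break;    /* B cannot vanish: stop here */
            }
            /* reached the end: every symbol was nullable (or rhs is empty) */
            if (*s == '\0' && !nullable[A]) { nullable[A] = 1; changed = 1; }
        }
    }
}
```

After calling compute_first on the five productions A → BC, B → DE, D → FG, F → HI, H → xY, the only terminal recorded in first['A' - 'A'] is x, which agrees with First(A) = {x}.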

• Non-recursive predictive parsing is table-driven.
• It is a top-down parser.
• It is also known as the LL(1) parser.

[Figure: block diagram with the input buffer, the stack, the non-recursive predictive parser, the parsing table, and the output.]

S → Bc | DB
B → ab | cS
D → d | ε
For this grammar:
• Construct FIRST and FOLLOW sets
• Apply the algorithm to calculate the parse table

X     FIRST(X)       FOLLOW(X)
---------------------------------------------------
D     { d, ε }       { a, c }
B     { a, c }       { c, $ }
S     { a, c, d }    { $, c }
Bc    { a, c }
DB    { d, a, c }
ab    { a }
cS    { c }
d     { d }
ε     { ε }

      a         b     c         d      $
---------------------------------------------------
S     Bc, DB          Bc, DB    DB
B
D     ε               ε

Finish filling in the table.
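
The table-filling rule can be sketched in C using the FIRST and FOLLOW sets listed above: for each rule A → α, place it in M[A, a] for every terminal a in FIRST(α), and if ε is in FIRST(α), also in M[A, b] for every b in FOLLOW(A). The encoding is an assumption made here ('e' stands for ε, sets are written as strings), and the names Rule, cell, place and fill_table are illustrative.

```c
#include <string.h>

/* One rule with the FIRST set of its right-hand side, copied from the
   FIRST/FOLLOW table. 'e' stands for epsilon (encoding chosen here). */
typedef struct { char lhs; const char *rhs; const char *first; } Rule;

static const Rule rules[] = {
    {'S', "Bc", "ac"}, {'S', "DB", "dac"},
    {'B', "ab", "a"},  {'B', "cS", "c"},
    {'D', "d",  "d"},  {'D', "",   "e"},
};
enum { NRULES = sizeof rules / sizeof rules[0] };

static const char nonterms[]  = "SBD";
static const char terminals[] = "abcd$";
static const char *follow[]   = { "$c", "c$", "ac" };  /* FOLLOW(S), FOLLOW(B), FOLLOW(D) */

/* cell[i][j] counts the rules placed in M[nonterms[i], terminals[j]]. */
static int cell[3][5];

static int index_of(const char *set, char c) {
    return (int)(strchr(set, c) - set);
}

static void place(char lhs, char terminal) {
    cell[index_of(nonterms, lhs)][index_of(terminals, terminal)]++;
}

/* Fills the table; returns the number of cells holding more than one rule. */
int fill_table(void) {
    memset(cell, 0, sizeof cell);
    for (int r = 0; r < NRULES; r++) {
        char A = rules[r].lhs;
        for (const char *t = rules[r].first; *t; t++) {
            if (*t != 'e') {
                place(A, *t);                  /* a in FIRST(rhs) */
            } else {
                const char *f = follow[index_of(nonterms, A)];
                for (; *f; f++) place(A, *f);  /* epsilon: use FOLLOW(A) */
            }
        }
    }
    int conflicts = 0;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 5; j++)
            if (cell[i][j] > 1) conflicts++;
    return conflicts;
}
```

A cell that receives more than one rule, such as M[S, a] with both S → Bc and S → DB, is a conflict, which means the grammar as written is not LL(1).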


input buffer
• our string to be parsed. We will assume that its end is marked with a special symbol $.
stack
• contains the grammar symbols
• at the bottom of the stack, there is a special end marker symbol $.
• initially the stack contains only the symbol $ and the starting symbol S ($S is the initial stack).
• when the stack is emptied (i.e. only $ is left in the stack), the parsing is completed.

output
• a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.
parsing table
• a two-dimensional array M[A,a]
• each row is a non-terminal symbol
• each column is a terminal symbol or the special symbol $
• each entry holds a production rule.

• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are both $ → the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $) →
   the parser pops X from the stack and advances to the next symbol in the input buffer.
3. If X is a non-terminal →
   the parser looks up M[X,a]. If it holds a production rule X → Y1Y2...Yk, the parser pops X and pushes Yk, Yk-1, ..., Y1 onto the stack, so that Y1 ends up on top. It also outputs the production rule X → Y1Y2...Yk to represent a step of the derivation.
4. None of the above → error
– All empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.
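
The four actions can be sketched as a small table-driven loop in C. To keep the table tiny, this sketch assumes the grammar S → aBa, B → bB | ε from the worked example in these slides, with M[S,a] = S → aBa, M[B,b] = B → bB and M[B,a] = B → ε. The function names table and ll1_parse, the fixed stack size, and the "eps" spelling in the output are choices made here for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Parsing table for S -> aBa, B -> bB | eps (grammar assumed here).
   Returns the right-hand side to push, or NULL for an empty entry. */
static const char *table(char X, char a) {
    if (X == 'S' && a == 'a') return "aBa";
    if (X == 'B' && a == 'b') return "bB";
    if (X == 'B' && a == 'a') return "";        /* B -> eps */
    return NULL;
}

/* Returns 1 if w is accepted, 0 otherwise. When out is non-NULL, the
   productions used are appended to it, e.g. "S->aBa B->bB ". */
int ll1_parse(const char *w, char *out, size_t outsz) {
    char stack[256];
    int top = 0;
    size_t n = strlen(w), i = 0;                /* i: current input symbol */
    if (strchr(w, '$')) return 0;               /* '$' is reserved as end marker */
    if (out && outsz > 0) out[0] = '\0';
    stack[top++] = '$';
    stack[top++] = 'S';                         /* initial stack: $S */
    for (;;) {
        char X = stack[top - 1];
        char a = (i < n) ? w[i] : '$';
        if (X == '$' && a == '$') return 1;     /* action 1: accept */
        if (X == a) { top--; i++; continue; }   /* action 2: match terminal */
        if (X == 'S' || X == 'B') {             /* action 3: expand */
            const char *rhs = table(X, a);
            if (!rhs) return 0;                 /* empty entry: error */
            size_t k = strlen(rhs);
            top--;                              /* pop X */
            if (top + (int)k > (int)sizeof stack) return 0;
            while (k > 0) stack[top++] = rhs[--k];   /* push Yk ... Y1 */
            if (out && outsz > 0) {
                size_t used = strlen(out);
                snprintf(out + used, outsz - used, "%c->%s ", X, *rhs ? rhs : "eps");
            }
            continue;
        }
        return 0;                               /* action 4: error */
    }
}
```

Calling ll1_parse("abba", buf, sizeof buf) accepts the string and records the productions in the order they are applied, which is the left-most derivation.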
stack        input     output
$E           id+id$    E → TE’
$E’T         id+id$    T → FT’
$E’T’F       id+id$    F → id
$E’T’id      id+id$
$E’T’        +id$      T’ → ε
$E’          +id$      E’ → +TE’
$E’T+        +id$
$E’T         id$       T → FT’
$E’T’F       id$       F → id
$E’T’id      id$
$E’T’        $         T’ → ε
$E’          $         E’ → ε
$            $         accept

Parsing table (columns id, + and $):

       id          +            $
E      E → TE’
E’                 E’ → +TE’    E’ → ε
T      T → FT’
T’                 T’ → ε       T’ → ε
F      F → id

S → aBa
B → bB | ε
w = abba

LL(1) Parsing Table
      a           b           $
S     S → aBa
B     B → ε       B → bB

stack     input     output
$S        abba$     S → aBa
$aBa      abba$
$aB       bba$      B → bB
$aBb      bba$
$aB       ba$       B → bB
$aBb      ba$
$aB       a$        B → ε
$a        a$
$         $         accept, successful completion

Outputs: S → aBa   B → bB   B → bB   B → ε

Derivation (left-most): S ⇒ aBa ⇒ abBa ⇒ abbBa ⇒ abba

parse tree

        S
      / | \
     a  B  a
       / \
      b   B
         / \
        b   B
            |
            ε

PROGRAM → begin DECLIST comma STATELIST end
DECLIST → d semi DECLIST
DECLIST → d
STATELIST → s semi STATELIST
STATELIST → s

After left factoring, the grammar is changed to

PROGRAM → begin DECLIST comma STATELIST end
DECLIST → dX
X → semi DECLIST | ε
STATELIST → sY
Y → semi STATELIST | ε

First(X) = {semi, ε}    Follow(X) = {comma}
First(Y) = {semi, ε}    Follow(Y) = {end}

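Write a function for each non-terminal.

/* token holds the current token; lexical() returns the next token from
   the input and error() reports a syntax error. */
void PROGRAM();
void DECLIST();
void X();
void STATELIST();
void Y();

int main()
{
    token = lexical();              /* read the first token */
    PROGRAM();
    return 0;
}

void PROGRAM()
{
    if (token != begin) error();
    token = lexical();
    DECLIST();
    if (token != comma) error();
    token = lexical();
    STATELIST();
    if (token != end) error();
}

void DECLIST()
{
    if (token != d) error();
    token = lexical();
    X();
}

void X()
{
    if (token == semi)              /* X → semi DECLIST */
    {
        token = lexical();
        DECLIST();
    }
    else if (token == comma)        /* X → ε, comma is in Follow(X) */
        ;                           /* do nothing */
    else
        error();
}

void STATELIST()
{
    if (token != s) error();
    token = lexical();
    Y();
}

void Y()
{
    if (token == semi)              /* Y → semi STATELIST */
    {
        token = lexical();
        STATELIST();
    }
    else if (token == end)          /* Y → ε, end is in Follow(Y) */
        ;                           /* do nothing */
    else
        error();
}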
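
The functions above leave token, lexical() and error() undefined. A self-contained version might look like the sketch below, under assumptions made here: the input is an array of Token values terminated by T_EOF, error() only sets a flag instead of aborting, and parse_program (a name chosen for illustration) adds a final check that nothing follows end.

```c
/* Hypothetical token kinds; T_EOF marks the end of the token array. */
typedef enum { T_BEGIN, T_D, T_SEMI, T_COMMA, T_S, T_END, T_EOF } Token;

static const Token *stream;   /* next unread token */
static Token token;           /* current token, as in the slides */
static int failed;            /* set by error() */

static Token lexical(void) {
    Token t = *stream;
    if (t != T_EOF) stream++;         /* never read past T_EOF */
    return t;
}
static void error(void) { failed = 1; }

static void DECLIST(void);
static void STATELIST(void);

static void X(void) {                 /* X -> semi DECLIST | eps */
    if (token == T_SEMI) { token = lexical(); DECLIST(); }
    else if (token != T_COMMA) error();
}
static void DECLIST(void) {           /* DECLIST -> d X */
    if (token != T_D) error();
    token = lexical();
    X();
}
static void Y(void) {                 /* Y -> semi STATELIST | eps */
    if (token == T_SEMI) { token = lexical(); STATELIST(); }
    else if (token != T_END) error();
}
static void STATELIST(void) {         /* STATELIST -> s Y */
    if (token != T_S) error();
    token = lexical();
    Y();
}
static void PROGRAM(void) {           /* PROGRAM -> begin DECLIST comma STATELIST end */
    if (token != T_BEGIN) error();
    token = lexical();
    DECLIST();
    if (token != T_COMMA) error();
    token = lexical();
    STATELIST();
    if (token != T_END) error();
}

/* Returns 1 if the token array is a valid PROGRAM, 0 otherwise. */
int parse_program(const Token *tokens) {
    stream = tokens;
    failed = 0;
    token = lexical();
    PROGRAM();
    return !failed && lexical() == T_EOF;   /* extra check: no trailing tokens */
}
```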

Change the left-factored productions into an extended notation that includes the * (zero or more repetitions):

PROGRAM → begin DECLIST comma STATELIST end
DECLIST → d (semi d)*
STATELIST → s (semi s)*

void DECLIST()
{
    if (token != d) error();
    token = lexical();
    while (token == semi)           /* (semi d)* */
    {
        token = lexical();
        if (token != d) error();
        token = lexical();
    }
}

void STATELIST()
{
    if (token != s) error();
    token = lexical();
    while (token == semi)           /* (semi s)* */
    {
        token = lexical();
        if (token != s) error();
        token = lexical();
    }
}

Removal of recursion is not always possible. A context-free grammar might contain middle recursion, and this cannot be replaced by iteration. For example:

E → E '+' T
E → T
T → T '*' F
T → F
F → '(' E ')'
F → 'x'

In this grammar the left recursion in E and T can be removed, but the recursion through F → '(' E ')', where E sits in the middle of the right-hand side, remains.

Transforming the grammar into LL(1):

E → TX
X → '+' TX | ε
T → FY
Y → '*' FY | ε
F → '(' E ')' | 'x'

Replacing recursion by iteration, where possible, we have

E → T ('+' T)*
T → F ('*' F)*
F → '(' E ')' | 'x'

void E()                            /* E → T ('+' T)* */
{
    T();
    while (token == plus)
    {
        token = lexical();
        T();
    }
}

void T()                            /* T → F ('*' F)* */
{
    F();
    while (token == times)
    {
        token = lexical();
        F();
    }
}

void F()                            /* F → '(' E ')' | 'x' */
{
    if (token == obracket)
    {
        token = lexical();
        E();                        /* middle recursion stays a recursive call */
        if (token == cbracket)
            token = lexical();
        else
            error();
    }
    else if (token == x)
        token = lexical();
    else
        error();
}

int main()
{
    token = lexical();
    E();
    return 0;
}
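
A self-contained variant of the expression parser is sketched below. It assumes each token is a single character ('x', '+', '*', '(' and ')'), that error() only sets a flag, and that the whole string must be consumed; the name parse_expr and the character-based lexical() are choices made here for illustration.

```c
static const char *p;     /* next unread character */
static char token;        /* current token */
static int failed;        /* set by error() */

static char lexical(void) {
    char c = *p;
    if (c != '\0') p++;   /* stay on the terminating '\0' */
    return c;
}
static void error(void) { failed = 1; }

static void E(void);

static void F(void) {                 /* F -> '(' E ')' | 'x' */
    if (token == '(') {
        token = lexical();
        E();                          /* middle recursion: still a call */
        if (token == ')') token = lexical();
        else error();
    } else if (token == 'x') {
        token = lexical();
    } else {
        error();
    }
}

static void T(void) {                 /* T -> F ('*' F)* */
    F();
    while (token == '*') { token = lexical(); F(); }
}

static void E(void) {                 /* E -> T ('+' T)* */
    T();
    while (token == '+') { token = lexical(); T(); }
}

/* Returns 1 if s is a complete expression, 0 otherwise. */
int parse_expr(const char *s) {
    p = s;
    failed = 0;
    token = lexical();
    E();
    return !failed && token == '\0';  /* all input must be consumed */
}
```

The loops in E and T replace the left recursion, while the call to E inside F remains recursive because a parenthesised expression can nest to any depth.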
