Chapter 3 B Top-Down Parsing

The document discusses top-down parsing and recursive descent parsers. It begins by introducing top-down and bottom-up parsing strategies. It then discusses recursive descent parsers, which implement top-down parsing by using a set of mutually recursive functions. Recursive descent parsers are a type of LL(k) parser that uses leftmost derivations with k symbols of lookahead. The document explains how to build a recursive descent parser by first determining the first and follow sets of the grammar's nonterminals. It provides algorithms for calculating first and follow sets and gives an example of applying the algorithms to a sample grammar.

Uploaded by

Sola

Chapter Three (continued)

Top-Down Parsing

1
Objective
At the end of this session, students will be able to:
• Understand the basics of parsing techniques (top-down vs. bottom-up parsing).
• Understand recursive descent parsers: First and Follow sets, how to find the First and Follow sets of a grammar, and LL(1) parse tables.
• Understand grammars that are not LL(1): removing ambiguity, removing left recursion, and left factoring.
• Be familiar with LL(k) grammars.

2
Parsing and Parsers
• Once we have described the syntax of our programming language using a context-free grammar, the next step is to determine whether a string of tokens returned from the lexical analyzer can be derived from that context-free grammar.
• Determining whether a sequence of tokens is syntactically correct is called parsing.
• Two main strategies:
A. Top-down parsing: start at the root of the parse tree and grow toward the leaves.
o Pick a production rule and try to match the input.
o A bad "pick" may require backtracking.
B. Bottom-up parsing: start at the leaves and grow toward the root (earlier parsers, e.g. yacc).
o As input is consumed, encode the possible parse trees in an internal state.

3
Contd.
• Both top-down and bottom-up parsers scan the input from left to right, one symbol at a time.
• Efficient top-down and bottom-up parsers can be implemented by making use of context-free grammars:
o LL grammars for top-down parsing
o LR grammars for bottom-up parsing
• A top-down parser tries to find the leftmost derivation of the given source program.
• A bottom-up parser tries to find the rightmost derivation of the given source program, in reverse order.
Contd.

LL(1) parsers, recursive descent parsers
• Left-to-right input
• Leftmost derivation
• 1 symbol of look-ahead
Grammars that these parsers can handle are called LL(1) grammars.

LR(1) parsers, operator precedence
• Left-to-right input
• Rightmost derivation (in reverse)
• 1 symbol of look-ahead
Grammars that these parsers can handle are called LR(1) grammars.

Also: LL(k), LR(k), SLR, LALR, …

5
Recursive Descent Parsers
• Top-down parsers are usually implemented as a suite of mutually recursive functions that descend through a parse tree for the input string, and as such are called "recursive descent parsers" (RDP).
• Recursive descent parsers fall into a class of parsers known as LL(k) parsers.
• LL(k) stands for Left-to-right scan, Leftmost derivation, k symbols of lookahead.
• We first examine LL(1) parsers: LL parsers with one symbol of lookahead.
• To build the RDP, we first need to create the "First" and "Follow" sets of the non-terminals in the CFG.
Example: a CFG that is not in its RDP form (a) and a CFG in its RDP form (b)

a) CFG not in its RDP form
Terminals = { id, num, while, print, >, {, }, ;, (, ) }
Non-Terminals = { S, E, B, L }
Rules = (1) S → print(E);
        (2) S → while (B) S
        (3) S → { L }
        (4) E → id
        (5) E → num
        (6) B → E > E
        (7) L → SL | ε
Start Symbol = S

b) CFG in its RDP form
Terminals = { e, f, g, h, i }
Non-Terminals = { S', S, A, B, C, D }
Rules = (0) S' → S$
        (1) S → AB
        (2) S → Cf
        (3) A → ef
        (4) A → ε
        (5) B → hg
        (6) C → DD
        (7) C → fi
        (8) D → g
Start Symbol = S'

6
First Sets and Follow Sets

Goal: given productions A → α | β, the parser should be able to choose between α and β.
How can the next input token help us decide?
Solution: FIRST sets
Informally: FIRST(α) is the set of tokens that could appear as the first symbol in a string derived from α.
Def: x ∈ FIRST(α) iff α ⇒* xγ for some string γ

First Sets
• The First set of a non-terminal A is the set of all terminals that can begin a string derived from A.
• If the empty string ε can be derived from A, then ε is also in the First set of A.

7
Contd.
For instance, consider the CFG below ($ is an end-of-file marker, ε means the empty string):
Terminals = { e, f, g, h, i }
Non-Terminals = { S', S, A, B, C, D }
Rules = (0) S' → S$
        (1) S → AB
        (2) S → Cf
        (3) A → ef
        (4) A → ε
        (5) B → hg
        (6) C → DD
        (7) C → fi
        (8) D → g
• In this grammar, the set of all strings derivable from the non-terminal S' is {efhg$, hg$, ggf$, fif$}.
• Thus First(S') = {e, f, g, h}: each element is the first terminal of one of the strings in the above set.
• Similarly, we can derive the First sets of S, A, B, C and D, and of some strings of symbols:
First(S) = {e, f, g, h}    First(DD) = {g}
First(A) = {e, ε}          First(AB) = {e, h}
First(B) = {h}             First(efB) = {e}
First(C) = {f, g}          First(AC) = {e, f, g}
First(D) = {g}

8
Contd.

Follow Sets
• For each non-terminal in a grammar, we can also create a Follow set.
• The Follow set of a non-terminal A is the set of all terminals that can appear immediately after A at some step of a derivation of a valid sentence.
• Take another look at the CFG shown on slide 8 above. What terminals can follow A in a derivation?
o Consider the derivation S' ⇒ S$ ⇒ AB$ ⇒ Ahg$. Since h follows A in this derivation, h is in the Follow set of A. Note: $ is the end-of-file marker.
o What about the non-terminal D? Consider the partial derivation S' ⇒ S$ ⇒ Cf$ ⇒ DDf$ ⇒ Dgf$. In DDf$ the second D is followed by f, and in Dgf$ the first D is followed by g, so both f and g are in the Follow set of D.

9
Contd.
• The Follow sets for all non-terminals in the CFG are shown below:

Follow(S') = { }
Follow(S) = {$}
Follow(A) = {h}
Follow(B) = {$}
Follow(C) = {f}
Follow(D) = {f, g}

10
Finding First and Follow Sets
• To calculate the First set of a non-terminal A, we need to calculate the First set of a string of terminals and non-terminals, since the rules that define what a non-terminal can derive contain both terminals and non-terminals.
• The First set of a string α of terminals and non-terminals can be defined recursively using the following two algorithms:

Algorithm to calculate First(α), for a string α of terminals and non-terminals
• If α = ε, then First(α) = {ε}
• If the first symbol in α is the terminal a, then First(α) = {a}
• If α = Aα' for some non-terminal A and (possibly empty) string of terminals and non-terminals α':
o If First(A) does not contain ε, then First(α) = First(A)
o If First(A) does contain ε, then First(α) = (First(A) - {ε}) ∪ First(α')
11
Contd.
Algorithm to calculate First sets for all non-terminals in CFG G
1. For each non-terminal A in G, set First(A) = { }
2. For each rule A → α (where α is a string of terminals and non-terminals), add all elements of First(α) to First(A). That is:
o If α = ε, add ε to First(A)
o If the first symbol in α is the terminal a, then add a to First(A)
o If α = A1α' for some non-terminal A1, and First(A1) does not contain ε, then add all elements of First(A1) to First(A)
o If α = A1α' for some non-terminal A1, and First(A1) does contain ε, then add all elements of First(A1) (other than ε) to First(A), and recursively add all elements of First(α') to First(A)
3. If any changes were made in step 2, go back to step 2 and repeat

12
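The two algorithms above can be sketched in Python as a fixed-point loop. This is a minimal illustration, not a definitive implementation: the rule encoding (pairs of a left-hand side and a list of symbols, with an empty list standing for ε) and the names `first_of_string` and `compute_first` are assumptions made for this example, applied to the grammar S' → S$, S → AB | Cf, A → ef | ε, B → hg, C → DD | fi, D → g.

```python
EPS = "ε"

# Each rule is (left-hand side, list of right-hand-side symbols); [] encodes ε.
RULES = [
    ("S'", ["S", "$"]),
    ("S", ["A", "B"]),
    ("S", ["C", "f"]),
    ("A", ["e", "f"]),
    ("A", []),
    ("B", ["h", "g"]),
    ("C", ["D", "D"]),
    ("C", ["f", "i"]),
    ("D", ["g"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def first_of_string(symbols, first):
    """First set of a string of symbols, given the current First sets."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:       # a terminal begins the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:         # this non-terminal cannot vanish
            return result
    result.add(EPS)                       # empty string, or every symbol can derive ε
    return result


def compute_first(rules):
    """Repeat step 2 over all rules until a full pass makes no change."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


FIRST = compute_first(RULES)
for nt in ["S'", "S", "A", "B", "C", "D"]:
    print(nt, sorted(FIRST[nt]))
```

Once the loop has converged, `first_of_string` can also be used for arbitrary strings of symbols, for example `first_of_string(["A", "C"], FIRST)`.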

Example
Let G be the CFG given on slide 8 above.
o Initially, for each non-terminal A in G we set First(A) = { }, the empty set.
o We then go through one iteration of the algorithm, which modifies the First sets as follows (a rule adds { } when the First set of its leading non-terminal is still empty at that point):
(0) S' → S$. Add { } to First(S') = { } (no change)
(1) S → AB. Add { } to First(S) = { } (no change)
(2) S → Cf. Add { } to First(S) = { } (no change)
(3) A → ef. Add e to First(A) = {e}
(4) A → ε. Add ε to First(A) = {e, ε}
(5) B → hg. Add h to First(B) = {h}
(6) C → DD. Add { } to First(C) = { } (no change)
(7) C → fi. Add f to First(C) = {f}
(8) D → g. Add g to First(D) = {g}

Non-Terminal   First Set
S'             { }
S              { }
A              {e, ε}
B              {h}
C              {f}
D              {g}

Note: Since there were 5 changes, we need another iteration.

13
Contd.
(0) S' → S$. Add { } to First(S') = { } (no change)
(1) S → AB. Add e, h to First(S) = {e, h}
(2) S → Cf. Add f to First(S) = {e, f, h}
(3) A → ef. Add e to First(A) = {e, ε} (no change)
(4) A → ε. Add ε to First(A) = {e, ε} (no change)
(5) B → hg. Add h to First(B) = {h} (no change)
(6) C → DD. Add g to First(C) = {f, g}
(7) C → fi. Add f to First(C) = {f, g} (no change)
(8) D → g. Add g to First(D) = {g} (no change)

Non-Terminal   First Set
S'             { }
S              {e, f, h}
A              {e, ε}
B              {h}
C              {f, g}
D              {g}

Note: Since there were 3 changes, we need another iteration.

14
Contd.
(0) S' → S$. Add e, f, h to First(S') = {e, f, h}
(1) S → AB. Add e, h to First(S) = {e, f, h} (no change)
(2) S → Cf. Add f, g to First(S) = {e, f, g, h}
(3) A → ef. Add e to First(A) = {e, ε} (no change)
(4) A → ε. Add ε to First(A) = {e, ε} (no change)
(5) B → hg. Add h to First(B) = {h} (no change)
(6) C → DD. Add g to First(C) = {f, g} (no change)
(7) C → fi. Add f to First(C) = {f, g} (no change)
(8) D → g. Add g to First(D) = {g} (no change)

Non-Terminal   First Set
S'             {e, f, h}
S              {e, f, g, h}
A              {e, ε}
B              {h}
C              {f, g}
D              {g}

Note: Since there were 2 changes, we need another iteration.

15
Contd.
(0) S' → S$. Add e, f, g, h to First(S') = {e, f, g, h}
(1) S → AB. Add e, h to First(S) = {e, f, g, h} (no change)
(2) S → Cf. Add f, g to First(S) = {e, f, g, h} (no change)
(3) A → ef. Add e to First(A) = {e, ε} (no change)
(4) A → ε. Add ε to First(A) = {e, ε} (no change)
(5) B → hg. Add h to First(B) = {h} (no change)
(6) C → DD. Add g to First(C) = {f, g} (no change)
(7) C → fi. Add f to First(C) = {f, g} (no change)
(8) D → g. Add g to First(D) = {g} (no change)

Non-Terminal   First Set
S'             {e, f, g, h}
S              {e, f, g, h}
A              {e, ε}
B              {h}
C              {f, g}
D              {g}

Note: Since there was 1 change, we need another iteration.

16
Contd.
(0) S' → S$. Add e, f, g, h to First(S') (no change)
(1) S → AB. Add e, h to First(S) (no change)
(2) S → Cf. Add f, g to First(S) (no change)
(3) A → ef. Add e to First(A) (no change)
(4) A → ε. Add ε to First(A) (no change)
(5) B → hg. Add h to First(B) (no change)
(6) C → DD. Add g to First(C) (no change)
(7) C → fi. Add f to First(C) (no change)
(8) D → g. Add g to First(D) (no change)

Non-Terminal   First Set
S'             {e, f, g, h}
S              {e, f, g, h}
A              {e, ε}
B              {h}
C              {f, g}
D              {g}

Since there were no changes, we stop.
• Note that within each iteration we can examine the rules in any order.
• If we examine the rules in a different order in each iteration, we will still reach the same result, but it may take a different number of iterations.
• Exercise: check whether examining the rules in the order 8, 7, 6, 5, 4, 3, 2, 1, 0 requires fewer iterations.

17
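The effect of rule order can be checked with a short Python sketch. It is only an illustration under stated assumptions: the rule encoding, the helper `first_of` and the function `count_passes` are hypothetical names introduced here, and a "pass" is counted the way the slides count iterations, including the final pass that makes no change.

```python
EPS = "ε"
RULES = [
    ("S'", ["S", "$"]), ("S", ["A", "B"]), ("S", ["C", "f"]),
    ("A", ["e", "f"]), ("A", []), ("B", ["h", "g"]),
    ("C", ["D", "D"]), ("C", ["f", "i"]), ("D", ["g"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def first_of(symbols, first):
    """First set of a string of symbols under the current First sets."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:
            return result | {sym}
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    return result | {EPS}


def count_passes(order):
    """Run the fixed-point loop visiting rules in the given index order."""
    first = {nt: set() for nt in NONTERMINALS}
    passes = 0
    while True:
        passes += 1
        changed = False
        for index in order:
            lhs, rhs = RULES[index]
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
        if not changed:            # this pass made no change, so we stop
            return passes, first


forward_passes, forward_sets = count_passes(range(9))             # rules 0, 1, ..., 8
backward_passes, backward_sets = count_passes(range(8, -1, -1))   # rules 8, 7, ..., 0
print(forward_passes, backward_passes)  # prints "5 2"
```

Both orders reach the same First sets. The reverse order converges sooner here because the rules for D, C, B and A are processed before the rules for S and S' that depend on them.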
Finding Follow Sets for Non-Terminals
• If the grammar contains the rule S → Aa, then a is in the Follow set of A, since a appears immediately after A.
• If the grammar contains the rules:
S → AB
B → a | b
then both a and b are in the Follow set of A. Why? Consider the following two partial derivations:
S ⇒ AB ⇒ Aa
S ⇒ AB ⇒ Ab
So both a and b are in the Follow set of A.
• If the grammar contains the rules:
S → ABC
B → a | b | ε
C → c | d
then a, b, c and d are all in the Follow set of A. Why? Consider the following partial derivations:
S ⇒ ABC ⇒ AaC
S ⇒ ABC ⇒ AbC
S ⇒ ABC ⇒ AC ⇒ Ac
S ⇒ ABC ⇒ AC ⇒ Ad

18
Contd.
• Consider the grammar:
S → Ax
A → C
Then x is in the Follow sets of A and C. Why?
S ⇒ Ax ⇒ Cx
• Consider another grammar:
S → Ax
A → CD
D → ε
Then x is in the Follow sets of A, C and D. Why?
S ⇒ Ax ⇒ CDx ⇒ Cx
• The above examples lead us to the following method for finding the Follow sets for the non-terminals in a CFG.

19

Finding First and Follow Sets
• We can calculate the Follow sets from the First sets by using the algorithm given below:
Algorithm to calculate Follow sets for all non-terminals in CFG G
1. Calculate First(A) for all non-terminals A in G.
2. Set Follow(A) = { } for all non-terminals A in G.
3. For each rule A → γ in G (where A is a non-terminal and γ is a string of terminals and non-terminals), and for each non-terminal A1 in γ:
i. If the rule is of the form A → αA1β, where α and β are (possibly empty) strings of terminals and non-terminals, and First(β) does not contain ε, then add all elements of First(β) to Follow(A1).
ii. If the rule is of the form A → αA1β, where α and β are (possibly empty) strings of terminals and non-terminals, and First(β) does contain ε, then add all elements of First(β) except ε to Follow(A1), and add all elements of Follow(A) to Follow(A1).
4. If any changes were made in step 3, go back to step 3 and repeat.

20
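The Follow algorithm can be sketched in Python in the same fixed-point style. This is a minimal sketch under assumptions: the rule encoding and the names `first_of_string` and `compute_follow` are illustrative, the First sets are taken as given rather than computed (step 1), and the grammar used is S' → S$, S → TU, T → aVa | ε, U → bVT, V → Ub | d.

```python
EPS = "ε"

RULES = [
    ("S'", ["S", "$"]),
    ("S", ["T", "U"]),
    ("T", ["a", "V", "a"]),
    ("T", []),                  # T → ε
    ("U", ["b", "V", "T"]),
    ("V", ["U", "b"]),
    ("V", ["d"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}

# Step 1: First sets of the non-terminals, taken as given for this example.
FIRST = {"S'": {"a", "b"}, "S": {"a", "b"}, "T": {"a", EPS}, "U": {"b"}, "V": {"b", "d"}}


def first_of_string(symbols):
    """First set of a string of symbols, using the fixed FIRST table."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:
            return result | {sym}
        result |= FIRST[sym] - {EPS}
        if EPS not in FIRST[sym]:
            return result
    return result | {EPS}


def compute_follow():
    follow = {nt: set() for nt in NONTERMINALS}           # step 2
    changed = True
    while changed:                                        # step 4
        changed = False
        for lhs, rhs in RULES:                            # step 3
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                beta_first = first_of_string(rhs[i + 1:])  # First(β)
                new = beta_first - {EPS}
                if EPS in beta_first:                     # case ii: β can vanish
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = compute_follow()
for nt in ["S'", "S", "T", "U", "V"]:
    print(nt, sorted(FOLLOW[nt]))
```

The end marker `$` is treated as an ordinary terminal here, so it enters Follow(S) through rule (0) without any special case.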
Example
Consider the following CFG, for which we first calculate the First sets of all non-terminals:
Terminals = { a, b, c, d }
Non-Terminals = { S', S, T, U, V }
Rules = (0) S' → S$
        (1) S → TU
        (2) T → aVa
        (3) T → ε
        (4) U → bVT
        (5) V → Ub
        (6) V → d
Start Symbol = S'

The First sets of the non-terminals are:
Non-Terminal   First Set
S'             {a, b}
S              {a, b}
T              {a, ε}
U              {b}
V              {b, d}

21
Contd.
Initially, for each non-terminal A in G we set Follow(A) = { }, the empty set.
(0) S' → S$    Add {$} to Follow(S) = {$}
(1) S → TU     Add First(U), {b}, to Follow(T) = {b}
               Add Follow(S), {$}, to Follow(U) = {$}
(2) T → aVa    Add {a} to Follow(V) = {a}
(3) T → ε      (no change)
(4) U → bVT    Add First(T) without ε, {a}, to Follow(V) = {a} (no change)
               Add Follow(U), {$}, to Follow(T) = {b, $}
               Add Follow(U), {$}, to Follow(V) = {a, $} (because T can derive ε)
(5) V → Ub     Add {b} to Follow(U) = {b, $}
(6) V → d      (no change)

The Follow sets of the non-terminals after this iteration are:
Non-Terminal   Follow Set
S'             { }
S              {$}
T              {b, $}
U              {b, $}
V              {a, $}

Note: Since there were some changes, we need another iteration.

22
Contd.
(0) S' → S$    Add {$} to Follow(S) = {$} (no change)
(1) S → TU     Add First(U), {b}, to Follow(T) = {b, $} (no change)
               Add Follow(S), {$}, to Follow(U) = {b, $} (no change)
(2) T → aVa    Add {a} to Follow(V) = {a, $} (no change)
(3) T → ε      (no change)
(4) U → bVT    Add First(T) without ε, {a}, to Follow(V) = {a, $} (no change)
               Add Follow(U), {b, $}, to Follow(T) = {b, $} (no change)
               Add Follow(U), {b, $}, to Follow(V) = {a, b, $} (because T can derive ε)
(5) V → Ub     Add {b} to Follow(U) = {b, $} (no change)
(6) V → d      (no change)

The Follow sets of the non-terminals after this iteration are:
Non-Terminal   Follow Set
S'             { }
S              {$}
T              {b, $}
U              {b, $}
V              {a, b, $}

Note: Since there was 1 change, we need another iteration.

23
Contd.
(0) S' → S$    Add {$} to Follow(S) = {$} (no change)
(1) S → TU     Add First(U), {b}, to Follow(T) = {b, $} (no change)
               Add Follow(S), {$}, to Follow(U) = {b, $} (no change)
(2) T → aVa    Add {a} to Follow(V) = {a, b, $} (no change)
(3) T → ε      (no change)
(4) U → bVT    Add First(T) without ε, {a}, to Follow(V) = {a, b, $} (no change)
               Add Follow(U), {b, $}, to Follow(T) = {b, $} (no change)
               Add Follow(U), {b, $}, to Follow(V) = {a, b, $} (no change)
(5) V → Ub     Add {b} to Follow(U) = {b, $} (no change)
(6) V → d      (no change)

The Follow sets of the non-terminals are:
Non-Terminal   Follow Set
S'             { }
S              {$}
T              {b, $}
U              {b, $}
V              {a, b, $}

Note: Since there were no changes, we stop.

24
LL(1) Parse Tables
• Once we have First and Follow sets for all non-terminals in the grammar, we can create a parse table.
• A parse table is a blueprint for the creation of a recursive descent parser (RDP).
o The rows in the parse table are labeled with non-terminals and the columns are labeled with terminals.
o Each entry in the parse table is either empty or contains a grammar rule.
• The rule located at row S, column a of a parse table tells us which rule to apply when we are trying to parse the non-terminal S and the next symbol in the input is an a.
• For instance, for grammar (a) on slide 6, the parse table is:

     id         num        while            print           >    {          }        ;    (    )
S                          S → while(B) S   S → print(E);        S → {L}
E    E → id     E → num
B    B → E>E    B → E>E
L                          L → SL           L → SL               L → SL     L → ε

25
Contd.
• Once we have the parse table for a CFG, creating a recursive descent parser is straightforward.
o We write one function for each non-terminal S in the grammar.
o The row labeled S in the parse table tells us exactly what the function that parses S needs to do.
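As an illustration of "one function per non-terminal", here is a minimal recursive descent recognizer in Python for the grammar S' → S$, S → AB | Cf, A → ef | ε, B → hg, C → DD | fi, D → g. The class and method names are assumptions made for this sketch, tokens are taken to be single characters, and the lookahead tests in each function correspond to the non-empty entries of that non-terminal's parse-table row.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, text):
        self.tokens = list(text)   # assumption: each character is one token
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_S_prime(self):               # S' → S$
        self.parse_S()
        self.match("$")

    def parse_S(self):
        if self.peek() in ("e", "h"):      # S → AB
            self.parse_A()
            self.parse_B()
        elif self.peek() in ("f", "g"):    # S → Cf
            self.parse_C()
            self.match("f")
        else:
            raise ParseError(f"unexpected {self.peek()!r} while parsing S")

    def parse_A(self):
        if self.peek() == "e":             # A → ef
            self.match("e")
            self.match("f")
        elif self.peek() == "h":           # A → ε, chosen because h is in Follow(A)
            pass
        else:
            raise ParseError(f"unexpected {self.peek()!r} while parsing A")

    def parse_B(self):                     # B → hg
        self.match("h")
        self.match("g")

    def parse_C(self):
        if self.peek() == "f":             # C → fi
            self.match("f")
            self.match("i")
        elif self.peek() == "g":           # C → DD
            self.parse_D()
            self.parse_D()
        else:
            raise ParseError(f"unexpected {self.peek()!r} while parsing C")

    def parse_D(self):                     # D → g
        self.match("g")


def accepts(text):
    """Return True if the whole text is derivable from S'."""
    parser = Parser(text)
    try:
        parser.parse_S_prime()
    except ParseError:
        return False
    return parser.pos == len(parser.tokens)


print([accepts(s) for s in ["efhg$", "hg$", "ggf$", "fif$", "fi$"]])
```

The recognizer never backtracks: each function commits to a rule after inspecting a single token, which is what the LL(1) property guarantees is safe.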
Creating Parse Tables
A parse table is created as follows:
1. The rows of the parse table are labeled with the non-terminals of the grammar.
2. The columns of the parse table are labeled with the terminals of the grammar.
3. Each entry of the parse table is either empty or contains a grammar rule.
o Place each rule of the form S → γ in row S, in each column labeled with a terminal in First(γ).
o If First(γ) contains ε, also place the rule in row S, in each column labeled with a terminal in Follow(S).

26
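The table-filling rule can be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: the First and Follow sets are hard-coded for the grammar S' → S$, S → AB | Cf, A → ef | ε, B → hg, C → DD | fi, D → g, and the names `first_of_string` and `build_table` are assumptions. Each table cell holds a list of rule numbers so that duplicate entries, if any, remain visible.

```python
EPS = "ε"

RULES = [
    ("S'", ["S", "$"]), ("S", ["A", "B"]), ("S", ["C", "f"]),
    ("A", ["e", "f"]), ("A", []), ("B", ["h", "g"]),
    ("C", ["D", "D"]), ("C", ["f", "i"]), ("D", ["g"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}

# First and Follow sets of the non-terminals, taken as given.
FIRST = {"S'": {"e", "f", "g", "h"}, "S": {"e", "f", "g", "h"}, "A": {"e", EPS},
         "B": {"h"}, "C": {"f", "g"}, "D": {"g"}}
FOLLOW = {"S'": set(), "S": {"$"}, "A": {"h"}, "B": {"$"}, "C": {"f"}, "D": {"f", "g"}}


def first_of_string(symbols):
    """First set of a rule's right-hand side."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:
            return result | {sym}
        result |= FIRST[sym] - {EPS}
        if EPS not in FIRST[sym]:
            return result
    return result | {EPS}


def build_table():
    """Map (non-terminal, terminal) to the list of rule numbers placed in that cell."""
    table = {}
    for number, (lhs, rhs) in enumerate(RULES):
        firsts = first_of_string(rhs)
        columns = firsts - {EPS}
        if EPS in firsts:                  # the right-hand side can derive ε
            columns |= FOLLOW[lhs]
        for terminal in columns:
            table.setdefault((lhs, terminal), []).append(number)
    return table


TABLE = build_table()
IS_LL1 = all(len(cell) == 1 for cell in TABLE.values())
for (row, column), cell in sorted(TABLE.items()):
    print(row, column, cell)
print("LL(1):", IS_LL1)
```

A grammar passes this check exactly when no cell receives more than one rule, which is the "no duplicate entries" condition for being LL(1).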
Example
Consider again the CFG (b) on slide 6. We have the First and Follow sets of each non-terminal:

Non-terminal   First           Follow
S'             {e, f, g, h}    { }
S              {e, f, g, h}    {$}
A              {e, ε}          {h}
B              {h}             {$}
C              {f, g}          {f}
D              {g}             {f, g}

(0) S' → S$   First(S$) = {e, f, g, h}. S' → S$ goes in row S', columns e, f, g, h
(1) S → AB    First(AB) = {e, h}. S → AB goes in row S, columns e, h
(2) S → Cf    First(Cf) = {f, g}. S → Cf goes in row S, columns f, g
(3) A → ef    First(ef) = {e}. A → ef goes in row A, column e
(4) A → ε     First(ε) = {ε} and Follow(A) = {h}. A → ε goes in row A, column h
(5) B → hg    First(hg) = {h}. B → hg goes in row B, column h
(6) C → DD    First(DD) = {g}. C → DD goes in row C, column g
(7) C → fi    First(fi) = {f}. C → fi goes in row C, column f
(8) D → g     First(g) = {g}. D → g goes in row D, column g
The resulting parse table is shown on the next slide.

27
Contd.

      e          f          g          h          i
S'    S' → S$    S' → S$    S' → S$    S' → S$
S     S → AB     S → Cf     S → Cf     S → AB
A     A → ef                           A → ε
B                                      B → hg
C                C → fi     C → DD
D                           D → g
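Besides guiding a recursive descent parser, a table like the one above can drive a stack-based predictive parser directly. The sketch below is a swapped-in illustration of that table-driven technique, not something the slides themselves present. The dictionary `TABLE` is a hand transcription of the table for the grammar with rule S → Cf, and the function name `parse` and the single-character tokens are assumptions.

```python
# (non-terminal, lookahead) -> right-hand side of the rule to apply; [] encodes ε.
TABLE = {
    ("S'", "e"): ["S", "$"], ("S'", "f"): ["S", "$"],
    ("S'", "g"): ["S", "$"], ("S'", "h"): ["S", "$"],
    ("S", "e"): ["A", "B"], ("S", "h"): ["A", "B"],
    ("S", "f"): ["C", "f"], ("S", "g"): ["C", "f"],
    ("A", "e"): ["e", "f"], ("A", "h"): [],
    ("B", "h"): ["h", "g"],
    ("C", "f"): ["f", "i"], ("C", "g"): ["D", "D"],
    ("D", "g"): ["g"],
}
NONTERMINALS = {row for row, _ in TABLE}


def parse(text, start="S'"):
    """Return the list of rules applied (a leftmost derivation), or None on a syntax error."""
    tokens = list(text)
    pos = 0
    stack = [start]
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:                # empty table cell: syntax error
                return None
            applied.append((top, rhs))
            stack.extend(reversed(rhs))    # leftmost symbol ends up on top of the stack
        else:
            if top != lookahead:           # terminal on the stack must match the input
                return None
            pos += 1
    return applied if pos == len(tokens) else None


for rule in parse("ggf$"):
    print(rule)
```

The sequence of rules returned is the leftmost derivation of the input, which matches the earlier statement that a top-down parser finds the leftmost derivation.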

28
Example 2
Given the following CFG, create its LL(1) parse table:

Terminals = { id, num, (, ), ;, if, else, ',', $ }
Non-Terminals = { S', S, L, C, E }
Rules = (0) S' → S$
        (1) S → id(L);
        (2) S → if(E) S else S
        (3) L → ε
        (4) L → E C
        (5) C → ε
        (6) C → , E C
        (7) E → id
        (8) E → num
Start Symbol = S'

Find the First and Follow sets of each non-terminal:
Non-terminal   First            Follow
S'             {id, if}         { }
S              {id, if}         {$, else}
L              {id, num, ε}     { ) }
C              {',', ε}         { ) }
E              {id, num}        { ), ',' }

29
Contd.
Given the above First and Follow sets, the parse table for the CFG is created as follows:

      id            num        (    )        ;    if                     else    ,
S'    S' → S$                                     S' → S$
S     S → id(L);                                  S → if(E) S else S
L     L → EC        L → EC          L → ε
C                                   C → ε                                        C → ,EC
E     E → id        E → num
Note that we only need to compute Follow sets for an LL(1) parser if the First set of at least one rule's right-hand side contains ε.
o Follow sets are only used in the creation of the parse table for rules of the form S → γ where First(γ) contains ε.
o Follow sets are not necessary if no such rule exists. However, if there is at least one such rule, then we still need to create the Follow sets of all non-terminals in the grammar.

30
LL(1) Parser…
Exercise 2:
Let G be the following grammar:
S → [ SX ] | a
X → ε | +SY | Yb
Y → ε | -SXc
A. Find the FIRST and FOLLOW sets for the non-terminals in this grammar.
B. Construct the predictive parsing table for the grammar above.
C. Show a top-down parse of the string [a+a-ac].
Grammars That Are Not LL(1)
• If we can build an LL(1) parse table for a grammar with no duplicate entries (no cell holding more than one rule), then we say that the grammar is LL(1).
o Unfortunately, not all grammars are LL(1). For instance, the following grammar is not an LL(1) grammar:
Terminals = { id, +, -, *, /, % }
Non-Terminals = { E }
Rules = (0) E → id
        (1) E → E + E | E - E
        (2) E → E * E | E / E | E % E
Start Symbol = E

Parse table for the grammar:
      +    -    *    /    %    id
E                              E → id
                               E → E + E
                               E → E - E
                               E → E * E
                               E → E / E
                               E → E % E
• The parse table has only one row, for the non-terminal E, but that row has 6 entries in the id column.
o Hence we cannot create an LL(1) parser for this grammar. The grammar is, in addition, ambiguous.

32
Ambiguity
• A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar.
• Equivalently, an ambiguous grammar produces more than one leftmost derivation, or more than one rightmost derivation, for the same sentence.
• We should eliminate the ambiguity in the grammar during the design phase of the compiler.
• An unambiguous grammar should be written to eliminate the ambiguity.
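To make "more than one parse tree" concrete, the following Python sketch counts the parse trees of a token sequence under an expression grammar of the form E → E op E | id, with op one of +, -, *, /, %. The function name `count_parse_trees` and the token encoding are assumptions made for illustration. The count is obtained by trying every operator as the root of the tree.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/", "%"}


def count_parse_trees(tokens):
    """Number of distinct parse trees deriving `tokens` from E, for E → E op E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees for the slice tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for m in range(i + 1, j - 1):      # choose the operator at the root
            if tokens[m] in OPERATORS:
                total += count(i, m) * count(m + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))  # prints 2
```

The two trees for id + id * id correspond to grouping the sentence as (id + id) * id and as id + (id * id), so the grammar alone does not fix operator precedence.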
Removing Ambiguity
There are four ways in which ambiguity can creep into a CFG for a programming language:
1. Defining expressions: the straightforward definition of expressions will often lead to ambiguity, such as the grammar that we have seen on slide 32 above.
2. Defining complex variables: complex variables, such as instance variables in classes, fields in records or structures, array subscripts and pointer references, can also lead to ambiguity. Example: V → id | V.V
3. Overlap between specific and general cases. For example, consider the CFG:
Terminals = { id, +, -, *, /, % }
Non-Terminals = { E, T, F }
Rules = (0) E → E+T | E-T | T | id
        (1) T → T*F | T/F | T%F | F | id
The terminal id has several leftmost derivations (and hence several parse trees):
E ⇒ id,  E ⇒ T ⇒ id,  E ⇒ T ⇒ F ⇒ id

34
Contd.
4. Nesting statements: the most common instance of nesting statements causing ambiguity is the infamous "dangling else", whose CFG is shown below:
S → if e then S else S | if e then S
S → a | b
The above CFG has two parse trees for a sentence such as "if e then if e then a else b". Exercise: draw both parse trees.
• It is not always possible to remove ambiguity from a context-free grammar.
o Some languages are inherently ambiguous. That is, there exists a language L such that all CFGs that generate L are ambiguous.
o Inherent ambiguity is not a problem that compiler designers usually need to face: no major programming language is inherently ambiguous.
o There is no algorithm that will always remove ambiguity from a context-free grammar.

35
Left recursion
• A grammar is left recursive if it has a non-terminal A such that there is a derivation A ⇒+ Aα for some string α.
• Top-down parsing methods cannot handle left-recursive grammars,
• so a transformation that eliminates left recursion is needed.
• To eliminate immediate left recursion, the productions
A → Aα | β
can be replaced by the non-left-recursive productions
A → βA'
A' → αA' | ε
Removing Left Recursion
• An unambiguous grammar may still not be LL(1). Consider the unambiguous expression grammar below:
Terminals = { id, +, -, *, /, (, ) }
Non-Terminals = { E, T, F }
Rules = (1) E → E + T
        (2) E → E - T
        (3) E → T
        (4) T → T * F
        (5) T → T / F
        (6) T → F
        (7) F → (E)
        (8) F → id
Start Symbol = E
• Though this CFG is unambiguous, it is not LL(1). For a CFG to be LL(1), it must be possible to decide which rule to apply after looking at only the leftmost symbol of the remaining input.
• On seeing an id, we cannot tell whether we should apply rule (1), (2) or (3).
• The problem with this CFG is that rules (1) and (2) are left-recursive (as are rules (4) and (5)).
• A rule S → α (where S is a non-terminal and α is a string of terminals and non-terminals) is left-recursive if the first symbol of α is S.
• No left-recursive grammar is LL(1).

37
Contd.
Consider the following CFG fragment:
(1) S → Sα
(2) S → β
What strings can be derived from S? Consider the following partial derivation:
S ⇒ Sα ⇒ Sαα ⇒ Sααα ⇒ βααα
Any string that can be derived from S will be a string that can be derived from β, followed by zero or more strings that can be derived from α. Using EBNF notation, we have:
S → β(α)*
Using CFG notation, we have:
S → βA
A → αA
A → ε
We have removed the left recursion in the above example!

38
Contd.
In general, a set of rules of the form:
S → Sα1 ; S → Sα2 ; S → Sα3 ; ….. ; S → Sαn
S → β1 ; S → β2 ; S → β3 ; ….. ; S → βm
can be rewritten as:
S → BA
B → β1 | β2 | β3 | ….. | βm
A → α1A | α2A | α3A | ….. | αnA | ε

Let's take a closer look at the expression grammar:
E → E + T
E → E - T
E → T
Using the above transformation, we get the following CFG, which has no left recursion:
E → TE'
E' → +TE'
E' → -TE'
E' → ε
Using EBNF notation, we have:
E → T((+T) | (-T))*

39
Left Factoring
• Even if a CFG is unambiguous and has no left recursion, it still may not be LL(1).
• When a non-terminal has two or more productions whose right-hand sides start with the same grammar symbols (a common prefix), the grammar is not LL(1) and cannot be used for predictive parsing.
• A predictive parser (a top-down parser without backtracking) insists that the grammar be left-factored.
In general: A → αβ1 | αβ2, where α is non-empty and the first symbols of β1 and β2 are different.

40
Contd.
• When processing α we do not know whether to expand A to αβ1 or to αβ2. But if we rewrite the grammar as follows:
A → αA'
A' → β1 | β2
then we can immediately expand A to αA'.
• Example: given the following grammar:
S → iEtS | iEtSeS | a
E → b
• Left factored, this grammar becomes:
S → iEtSS' | a
S' → eS | ε
E → b

41
Contd.
The following grammar:
stmt → if expr then stmt else stmt
     | if expr then stmt
cannot be parsed by a predictive parser that looks one element ahead.
But the grammar can be rewritten:
stmt → if expr then stmt stmt'
stmt' → else stmt | ε
where ε is the empty string.
Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring.

42
LL(k) Parsers
• An LL(1) parser needs to decide which rule to apply after looking at only one token.
• If more than a single token is required to determine which rule to apply, then the grammar is not LL(1).
For instance, consider the following simple CFG:
Terminals = { a, b, c }
Non-terminals = { S }
Rules = (1) S → abc
        (2) S → acb
Start symbol = S
Using EBNF notation, we get: S → a(bc | cb)
• This grammar is not LL(1), since the LL(1) parse table has duplicate entries:

      a                     b    c
S     S → abc; S → acb

43
Contd.
• When trying to parse a string derived from S, we cannot tell which rule to apply by looking at a single symbol, since all strings derivable from S start with a.
• We could left-factor the grammar to make it LL(1).
• We could also modify our parser so that it examines the first two symbols of the input to determine which rule to apply.
• The resulting parse table would be much larger, as follows:

      aa   ab         ac         ba   bb   bc   ca   cb   cc
S          S → abc    S → acb

44
Contd.
• An LL(k) parser examines the first k symbols in the input before determining which rule to apply.
• In order to create an LL(k) parser, we need to generalize the definitions of First and Follow sets.
• Our definitions of generalized First and Follow sets will use the concept of a k-prefix.
Definition 1 (k-prefix): The k-prefix of a string of terminals w is the string consisting of the first k terminals in w. If │w│ ≤ k, then the k-prefix of w is w.
• The k-prefix of a set of strings is the set of k-prefixes of all strings in the set.

45
Contd.

Definition 2 (First_k): The First_k set of a non-terminal S is the k-prefix of the set of all strings of terminals derivable from S.

The First_k set of a string of terminals and non-terminals γ is the k-prefix of the set of all strings of terminals derivable from γ.

Definition 3 (Follow_k): The Follow_k set of a non-terminal S is the k-prefix of the set of all strings of terminals that follow S in a partial derivation.

46
Algorithm to calculate First_k for non-terminals
1. For each non-terminal S in G, set First_k(S) = { }
2. For each rule S → γ in G, add all elements of k-prefix(First_k(γ)) to First_k(S)
3. If any changes were made in step 2, go back to step 2 and repeat

Algorithm to calculate First_k for a string of terminals and non-terminals:
1. For any terminal a, First_k(a) = {a}
2. For any string of terminals and non-terminals γ = γ1γ2γ3…γn, First_k(γ) = k-prefix(First_k(γ1) ○ First_k(γ2) ○ First_k(γ3) ○ … ○ First_k(γn)), where ○ denotes the pairwise concatenation of the strings in two sets.
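The k-prefix and concatenation operations can be sketched in Python. This is only an illustration under assumptions: strings of terminals are represented as tuples, the names `k_prefix`, `concat` and `first_k_of_string` are made up for this sketch, and the First_k sets of the non-terminals are passed in as a dictionary rather than computed by the fixed-point loop.

```python
def k_prefix(strings, k):
    """k-prefix of a set of strings (each string is a tuple of terminals)."""
    return {s[:k] for s in strings}


def concat(left, right, k):
    """k-prefix of the pairwise concatenation of two sets of strings."""
    return {(x + y)[:k] for x in left for y in right}


def first_k_of_string(symbols, first_k, k):
    """First_k of a string of symbols, given First_k of the non-terminals.

    Any symbol that is not a key of `first_k` is treated as a terminal,
    whose First_k set contains just the one-symbol string.
    """
    result = {()}                                   # start from the empty string
    for sym in symbols:
        result = concat(result, first_k.get(sym, {(sym,)}), k)
    return result


# Hypothetical First_2 sets: A derives "a" or ε, and S derives "abc" or "acb".
FIRST_2 = {"A": {("a",), ()}, "S": {("a", "b"), ("a", "c")}}

print(k_prefix({("a", "b", "c"), ("a",)}, 2))
print(first_k_of_string(["A", "b", "c"], FIRST_2, 2))
print(first_k_of_string(["S", "a"], FIRST_2, 2))
```

Truncating after every concatenation, rather than only at the end, gives the same result and keeps the intermediate sets small.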

47
Algorithm to calculate Follow_k for non-terminals
1. Calculate First_k(S) for all non-terminals S in G
2. Set Follow_k(S) = { } for all non-terminals S in G
3. For each rule S → γ in G:
For each non-terminal S1 in γ, where γ = αS1β, add k-prefix(First_k(β) ○ Follow_k(S)) to Follow_k(S1). If Follow_k(S) = { }, add k-prefix(First_k(β)) to Follow_k(S1).
4. If any changes were made in step 3, go back to step 3 and repeat

48
End of slide!!

No ratings yet
SSC CGL 2023 Tier 1 Question Paper With Answer Key English
1,246 pages
Seigai 201123b
100% (3)
Seigai 201123b
92 pages
A Corpus-Based Study of Vocabulary in Massive Open Online
No ratings yet
A Corpus-Based Study of Vocabulary in Massive Open Online
11 pages
Chapter 2 NGEC 5
No ratings yet
Chapter 2 NGEC 5
14 pages
Linkers and Connectors, SPN N ENG
0% (1)
Linkers and Connectors, SPN N ENG
5 pages
Kanhu Charan Dakua EHS CV
No ratings yet
Kanhu Charan Dakua EHS CV
3 pages
ĐỀ KT GHK2 ANH 9 24-25 - UYEN
No ratings yet
ĐỀ KT GHK2 ANH 9 24-25 - UYEN
3 pages
B1.1 Professional Writing 1 Draft
No ratings yet
B1.1 Professional Writing 1 Draft
5 pages
Drafting Pleading and Conveyance
No ratings yet
Drafting Pleading and Conveyance
6 pages
Discovery A New Animal (Differentiated Activity) Putting All Together Part 1
No ratings yet
Discovery A New Animal (Differentiated Activity) Putting All Together Part 1
4 pages



