unit7

This document discusses predictive parsing, focusing on LL(1) parsers, which use a predictive parsing table and a stack to parse without backtracking. It explains why left recursion must be eliminated and the grammar left-factored before predictive parsing can be applied. It also gives algorithms for constructing parsing tables and for computing the First and Follow sets that LL(1) parsing needs.

Uploaded by

Trịnh Quỳnh

Unit 7

Predictive Parsing

1
Back to the backtrack parser

• Is it possible to predict the production to be applied?


• It depends on the grammar

2
Predictive Parsers

• A parser can "predict" which production to use
  • By looking at the next few tokens
  • No backtracking
• Predictive parsers accept LL(k) grammars
  • The first L means "left-to-right" scan of input
  • The second L means "leftmost derivation"
  • k means "predict based on k tokens of lookahead"
• In practice, LL(1) is used

3
LL(1) parser

• LL(1) parsing is accomplished using a predictive parsing table M and a stack.

4
A stringent condition

The grammar must not be left-recursive, and no two right-hand sides of the productions for a non-terminal may share a common prefix.

Both conditions obviously hold if the grammar is LL(1).

5
Left Recursion
A grammar is left-recursive if it has a non-terminal A such that there is a derivation

A ⇒+ Aα for some string α

Top-down parsing techniques cannot handle left-recursive grammars.
So we have to convert a left-recursive grammar into an equivalent grammar that is not left-recursive.
The left recursion may appear in a single step of the derivation (immediate left recursion), or in more than one step of the derivation.

6
Reminder: Immediate Left Recursion

A → Aα | β    where β does not start with A

⇓ eliminate immediate left recursion

A → β A'
A' → α A' | ε    an equivalent grammar

In general,

A → Aα1 | ... | Aαm | β1 | ... | βn    where β1 ... βn do not start with A

⇓ eliminate immediate left recursion

A → β1 A' | ... | βn A'
A' → α1 A' | ... | αm A' | ε    an equivalent grammar
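The transformation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `eliminate_immediate` and the representation (each alternative is a list of symbols, with the empty list standing for ε) are assumptions made for this example, and the sketch assumes there is no alternative of the form A → A.

```python
def eliminate_immediate(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without immediate left recursion.

    Each alternative is a list of symbols; [] stands for the empty string (epsilon).
    Returns a dict mapping A (and the new A', if one is needed) to their alternatives.
    """
    # The alphas: what follows the leading A in each left-recursive alternative.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The betas: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # no immediate left recursion

    new_nt = nonterminal + "'"  # assumed not to clash with an existing name
    return {
        nonterminal: [beta + [new_nt] for beta in betas],      # A  -> beta A'
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # A' -> alpha A' | epsilon
    }


# E -> E + T | T
result = eliminate_immediate("E", [["E", "+", "T"], ["T"]])
print(result)
```

For E → E + T | T this yields E → T E' and E' → + T E' | ε.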

7
Left recursion - problem

• A grammar may have no immediate left recursion and still be left-recursive.
• So eliminating only the immediate left recursion does not guarantee a grammar that is free of left recursion.

S → Aa | b
A → Sc | d

This grammar is not immediately left-recursive, but it is still left-recursive:

S ⇒ Aa ⇒ Sca    or
A ⇒ Sc ⇒ Aac    gives a left recursion

• So we have to eliminate all left recursion from the grammar

8
Eliminate left recursion - Algorithm

- Arrange non-terminals in some order: A1 ... An
- for i from 1 to n do {
    - for j from 1 to i-1 do {
        replace each production
            Ai → Aj γ
        by
            Ai → α1 γ | ... | αk γ
        where Aj → α1 | ... | αk
      }
    - eliminate immediate left recursion among the Ai productions
  }
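The algorithm can be sketched in Python as follows. The function names and the grammar representation (a dict from non-terminal to a list of alternatives, each a list of symbols, with [] for ε) are assumptions of this sketch. As is usual for this algorithm, it assumes the input grammar has no cycles and no ε-productions.

```python
def eliminate_immediate(nonterminal, alternatives):
    """A -> A alpha | beta  becomes  A -> beta A',  A' -> alpha A' | epsilon."""
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}
    new_nt = nonterminal + "'"  # assumed to be a fresh name
    return {
        nonterminal: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


def eliminate_left_recursion(grammar, order):
    """Apply the ordered substitution algorithm to a copy of `grammar`."""
    g = {nt: [list(alt) for alt in alts] for nt, alts in grammar.items()}
    for i, a_i in enumerate(order):
        for a_j in order[:i]:
            expanded = []
            for alt in g[a_i]:
                if alt and alt[0] == a_j:
                    # Replace Ai -> Aj gamma by Ai -> alpha_k gamma for every Aj -> alpha_k.
                    expanded.extend(alpha + alt[1:] for alpha in g[a_j])
                else:
                    expanded.append(alt)
            g[a_i] = expanded
        g.update(eliminate_immediate(a_i, g[a_i]))
    return g


# S -> A a | b,  A -> S c | d, processed in the order S, A
grammar = {"S": [["A", "a"], ["b"]], "A": [["S", "c"], ["d"]]}
result = eliminate_left_recursion(grammar, ["S", "A"])
for nt, alts in result.items():
    print(nt, "->", " | ".join(" ".join(alt) or "ε" for alt in alts))
```

With the order S, A the sketch first substitutes S into A → Sc, giving A → Aac | bc | d, and then removes the immediate left recursion, giving A → bc A' | d A' and A' → ac A' | ε. A different ordering of the non-terminals would generally produce a different but equivalent grammar.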
9
Example of removing immediate left recursion

E → E+T | T
T → T*F | F
F → id | (E)

⇓ eliminate immediate left recursion

E → T E'
E' → +T E' | ε
T → F T'
T' → *F T' | ε
F → id | (E)

10
Predictive Parsing and Left Factoring

• In the grammar
E → T + E | T
T → int | int * T | ( E )

• Hard to predict because
  • For T, two productions start with int
  • For E, it is not clear how to predict
• A grammar must be left-factored before it is used for predictive parsing

11
Left Factoring

• A predictive parser (a top-down parser without backtracking) insists that the grammar be left-factored.

grammar → a new equivalent grammar suitable for predictive parsing

if_stmt → if expr then stmt else stmt |
          if expr then stmt

• When we see the token if, we cannot know which production rule to choose to rewrite if_stmt in the derivation.

12
Left Factoring (cont'd)

In general,

A → αβ1 | αβ2    where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.

When processing α we cannot know whether to expand
A to αβ1 or
A to αβ2

But if we rewrite the grammar as follows

A → α A'
A' → β1 | β2    we can immediately expand A to αA'

13
Left factoring algorithm

• For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, let us say

A → αβ1 | ... | αβn | γ1 | ... | γm

convert it into

A → αA' | γ1 | ... | γm
A' → β1 | ... | βn
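A possible Python sketch of this step is shown below. It groups the alternatives of a non-terminal by their first symbol, factors out the longest common prefix of each group, and repeats the step on the newly created non-terminals. The names `common_prefix` and `left_factor` and the list-of-symbols representation ([] stands for ε) are assumptions made for illustration.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols that starts every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(nonterminal, alternatives):
    """Left-factor one non-terminal; returns a dict of the resulting productions."""
    used = {nonterminal}
    result = {}
    pending = [(nonterminal, alternatives)]
    while pending:
        name, alts = pending.pop()
        groups = {}  # first symbol -> alternatives starting with it
        for alt in alts:
            groups.setdefault(alt[0] if alt else None, []).append(alt)
        factored = []
        for first_symbol, group in groups.items():
            if first_symbol is None or len(group) == 1:
                factored.extend(group)  # the gammas: nothing to factor
                continue
            alpha = common_prefix(group)
            fresh = name + "'"
            while fresh in used:  # pick an unused primed name
                fresh += "'"
            used.add(fresh)
            factored.append(alpha + [fresh])  # A  -> alpha A'
            pending.append((fresh, [alt[len(alpha):] for alt in group]))  # A' -> betas
        result[name] = factored
    return result


if_rules = left_factor("S", [["if", "E", "then", "S"],
                             ["if", "E", "then", "S", "else", "S"]])
t_rules = left_factor("T", [["int"], ["int", "*", "T"], ["(", "E", ")"]])
print(if_rules)
print(t_rules)
```

For the conditional statement this gives S → if E then S S' and S' → ε | else S; for T → int | int * T | ( E ) it gives T → int T' | ( E ) and T' → ε | * T.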

14
Left Factoring example

S → if E then S | if E then S else S

can be rewritten as

S → if E then S S'
S' → else S | ε

In KPL

IfSt ::= KW_IF Condition KW_THEN Statement ElseSt

ElseSt ::= KW_ELSE Statement

ElseSt ::= ε
Predictive parser

16
Parsing table M

• M[X, token] indicates which production to use if the top of the stack is a nonterminal X and the current token is equal to token;

• in that case we pop X from the stack and push all the right-hand-side symbols of the production M[X, token].

• We use a special symbol $ to denote the end of file. Let S be the start symbol.

17
Predictive Parser
The input contains the string to be parsed, followed by $ (EOF).
The stack contains a sequence of grammar symbols, with #, the bottom-of-stack marker, at the bottom.
Initially the stack contains the start symbol of the grammar on top of #.
The parsing table is a two-dimensional array M[A,a], where A is a nonterminal and a is a terminal or the symbol $.
- The parser is controlled by a program that behaves as follows:
- The program determines X, the symbol on top of the stack, and a, the current input symbol.
- These two symbols determine the action of the parser.
There are three possibilities:
1. If X = # and a = $, the parser halts and announces successful completion of parsing.
2. If X is a terminal and X = a, the parser pops X off the stack and advances the input pointer to the next input symbol (if X ≠ a, the parser reports an error).
3. If X is a nonterminal, the program consults entry M[X,a] of the parsing table M. This entry will be either an X-production of the grammar or an error entry.
   If M[X,a] = {X → UVW}, the parser replaces X on top of the stack by WVU (with U on top).
   If M[X,a] = error, the parser calls an error recovery routine.

18
Parsing table for grammar S → aSb | c

      | a       | b     | c     | $
S     | S → aSb | Error | S → c | Error
a     | Push    | Error | Error | Error
b     | Error   | Push  | Error | Error
c     | Error   | Error | Push  | Error
#     | Error   | Error | Error | Accept

19
LL(1) Parsing Tables. Errors

• Entries marked Error indicate error situations

• Consider the [S,b] entry

• "There is no way to derive a string starting with b from non-terminal S"

20
Using Parsing Tables

• Method similar to recursive descent, except
  • For each non-terminal S
  • We look at the next token a
  • And choose the production shown at [S,a]

• We use a stack to keep track of pending non-terminals
• We reject when we encounter an error state
• We accept when we encounter end-of-input

21
LL(1) Parsing Algorithm

initialize stack = <S #>

repeat
    case stack of
        <X, rest> : if M[X,*next] = X → Y1…Yn
                        then stack ← <Y1…Yn rest>;
                        else error ();          // X is a nonterminal
        <t, rest> : if t == *next++
                        then stack ← <rest>;
                        else error ();          // t is a terminal
until stack == <#>
// accept if *next == $, otherwise error
// *next refers to the input symbol currently being checked
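A small Python sketch of this driver is given below. It assumes the table is a dict keyed by (nonterminal, token) whose values are right-hand sides, with missing keys playing the role of Error entries. The function name `ll1_parse`, the trace format, and the action label "match" (the step the parsing table for S → aSb | c labels Push) are choices made for this illustration only.

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse. Returns a list of (stack, input, action) steps.

    `table` maps (nonterminal, lookahead) to a right-hand side (a list of symbols).
    Raises SyntaxError when an Error entry or a terminal mismatch is reached.
    """
    nonterminals = {nt for nt, _ in table}
    stack = ["#", start]            # the top of the stack is the end of the list
    tokens = list(tokens) + ["$"]   # $ marks the end of the input
    pos = 0
    trace = []
    while True:
        top, look = stack[-1], tokens[pos]
        snapshot = ("".join(reversed(stack)), "".join(tokens[pos:]))
        if top == "#" and look == "$":
            trace.append(snapshot + ("accept",))
            return trace
        if top in nonterminals:
            rhs = table.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push the RHS so its first symbol is on top
            trace.append(snapshot + (f"{top} -> {''.join(rhs) or 'ε'}",))
        elif top == look:
            stack.pop()                  # terminal on top matches the lookahead
            pos += 1
            trace.append(snapshot + ("match",))
        else:
            raise SyntaxError(f"expected {top}, found {look}")


# Non-error entries of the parsing table for S -> aSb | c
TABLE = {("S", "a"): ["a", "S", "b"], ("S", "c"): ["c"]}
trace = ll1_parse(TABLE, "S", "aacbb")
for stack, rest, action in trace:
    print(f"{stack:<8}{rest:<8}{action}")
```

Running the sketch on aacbb takes nine steps, starting with the expansion S → aSb and ending with # on the stack and $ in the input. An input such as aacb raises SyntaxError.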

22
LL(1) Parsing Example for aacbb

Stack  | Input  | Action
S#     | aacbb$ | S → aSb
aSb#   | aacbb$ | push
Sb#    | acbb$  | S → aSb
aSbb#  | acbb$  | push
Sbb#   | cbb$   | S → c
cbb#   | cbb$   | push
bb#    | bb$    | push
b#     | b$     | push
#      | $      | ACCEPT

23
Constructing Parsing Tables

• LL(1) languages are those defined by a parsing table for the LL(1) algorithm
• No table entry can be multiply defined
• We want to generate parsing tables from a CFG

24
Constructing Parsing Tables (Cont.)

• If A → α, where in the row of A do we place α?
• In the column of t (t is a terminal) where t can start a string derived from α
  • α ⇒* t β
  • We say that t ∈ First(α)
• In the column of t if α derives ε and t can follow an A
  • S ⇒* β A t δ
  • We say t ∈ Follow(A)

25
Computing First Sets

Definition: First(X) = { t | X ⇒* tα } ∪ { ε | X ⇒* ε }

Algorithm sketch:
1. for all terminals t do First(t) ← { t }
2. for each production X → ε do add ε to First(X)
3. if X → A1 … An α and ε ∈ First(Ai), 1 ≤ i ≤ n do
   • add First(α) to First(X)
4. for each X → A1 … An s.t. ε ∈ First(Ai), 1 ≤ i ≤ n do
   • add ε to First(X)
5. repeat steps 3 & 4 until no First set can be grown

26
First Sets. Example

• Recall the grammar

E → T E'             T → F T'
E' → + T E' | ε      T' → * F T' | ε
F → ( E ) | int
• First sets
First( ( ) = { ( }        First( T ) = First( F ) = { int, ( }
First( ) ) = { ) }        First( E ) = { int, ( }
First( int ) = { int }    First( E' ) = { +, ε }
First( + ) = { + }        First( T' ) = { *, ε }
First( * ) = { * }
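The First sets of the non-terminals can be checked with a short fixed-point computation. The sketch below assumes the expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | int, written as a dict of alternatives with [] for ε. The names `GRAMMAR`, `EPS` and `first_sets` are made up for this example.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["int"]],
}


def first_sets(grammar):
    """Iterate over the productions until no First set grows."""
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # First(t) = {t} for a terminal t
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = set()
                for symbol in alt:
                    f = first_of(symbol)
                    new |= f - {EPS}
                    if EPS not in f:
                        break          # this symbol cannot vanish, so stop here
                else:
                    new.add(EPS)       # every symbol (or none) can derive ε
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = first_sets(GRAMMAR)
for nt, symbols in FIRST.items():
    print(f"First({nt}) =", symbols)
```

The sketch only reports the sets for non-terminals, since First(t) = { t } for every terminal t.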

27
Computing Follow Sets

• Definition:
Follow(X) = { t | S ⇒* β X t δ }
• Intuition
  • If S is the start symbol then $ ∈ Follow(S)
  • If X → A B then First(B) − {ε} ⊆ Follow(A) and Follow(X) ⊆ Follow(B)
  • Also if B ⇒* ε then Follow(X) ⊆ Follow(A)

28
Computing Follow Sets (Cont.)

Algorithm sketch:

1. Follow(S) contains $.
2. For each production A → α X β
   • add First(β) − {ε} to Follow(X)
3. For each A → α X β where ε ∈ First(β)
   • add Follow(A) to Follow(X)
4. Repeat steps 2 and 3 until no Follow set grows

29
Follow Sets. Example

• Recall the grammar

E → T E'             T → F T'
E' → + T E' | ε      T' → * F T' | ε
F → ( E ) | int
• Follow sets
Follow( + ) = { int, ( }         Follow( * ) = { int, ( }
Follow( ( ) = { int, ( }         Follow( E ) = { ), $ }
Follow( E' ) = { $, ) }          Follow( T ) = { +, ), $ }
Follow( ) ) = { *, +, ), $ }     Follow( T' ) = { +, ), $ }
Follow( int ) = { *, +, ), $ }
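The Follow sets of the non-terminals can be reproduced with the sketch below. It assumes the expression grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | int, and takes the First sets of the non-terminals as given rather than recomputing them. The names `FIRST`, `first_of_string` and `follow_sets` are hypothetical, and, unlike the slide, the sketch computes Follow only for non-terminals.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["int"]],
}

# First sets of the non-terminals, taken as given for this sketch.
FIRST = {
    "E": {"(", "int"}, "E'": {"+", EPS},
    "T": {"(", "int"}, "T'": {"*", EPS},
    "F": {"(", "int"},
}


def first_of_string(symbols):
    """First set of a sequence of symbols (ε is included if all of them can vanish)."""
    result = set()
    for symbol in symbols:
        f = FIRST.get(symbol, {symbol})   # First(t) = {t} for a terminal t
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)
    return result


def follow_sets(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                       # step 1
    changed = True
    while changed:                               # repeat until nothing grows
        changed = False
        for a, alternatives in grammar.items():
            for alt in alternatives:
                for i, x in enumerate(alt):
                    if x not in grammar:
                        continue                 # only non-terminals are tracked
                    rest = first_of_string(alt[i + 1:])
                    new = rest - {EPS}           # step 2
                    if EPS in rest:
                        new |= follow[a]         # step 3
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, "E")
for nt, symbols in FOLLOW.items():
    print(f"Follow({nt}) =", symbols)
```

The sketch also yields Follow(F) = { *, +, ), $ }, which coincides with the sets listed for the terminals ) and int, the last symbols of the two F productions.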

30
Constructing LL(1) Parsing Tables

• Construct a parsing table M for CFG G

• For each production A → α in G do:
  • For each terminal t ∈ First(α) do
    • M[A, t] = A → α
  • If ε ∈ First(α), for each t ∈ Follow(A) do
    • M[A, t] = A → α (α derives ε)
  • If ε ∈ First(α) and $ ∈ Follow(A) do
    • M[A, $] = A → α (α derives ε)

31
Example

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | int

It’s possible to implement a predictive parser for G

32
Parsing table

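      | +         | *         | (       | )      | int     | $
E     |           |           | E → TE' |        | E → TE' |
E'    | E' → +TE' |           |         | E' → ε |         | E' → ε
T     |           |           | T → FT' |        | T → FT' |
T'    | T' → ε    | T' → *FT' |         | T' → ε |         | T' → ε
F     |           |           | F → (E) |        | F → int |
+     | Push      |           |         |        |         |
*     |           | Push      |         |        |         |
(     |           |           | Push    |        |         |
)     |           |           |         | Push   |         |
int   |           |           |         |        | Push    |
#     |           |           |         |        |         | Accept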

Empty cells are “Error”
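The non-terminal rows of this table can be rebuilt mechanically from the First and Follow sets. The sketch below assumes the grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | int and takes the First and Follow sets as given. The names `build_table` and `first_of_string` are hypothetical, and Follow is listed only for the non-terminals that have an ε-alternative, since no other Follow set is consulted.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["int"]],
}
FIRST = {
    "E": {"(", "int"}, "E'": {"+", EPS},
    "T": {"(", "int"}, "T'": {"*", EPS},
    "F": {"(", "int"},
}
FOLLOW = {"E'": {")", "$"}, "T'": {"+", ")", "$"}}


def first_of_string(symbols):
    """First set of a right-hand side; contains ε if the whole side can vanish."""
    result = set()
    for symbol in symbols:
        f = FIRST.get(symbol, {symbol})   # First(t) = {t} for a terminal t
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Fill M[A, t] for every production A -> alpha; fail on a multiply defined entry."""
    table = {}
    for a, alternatives in grammar.items():
        for alt in alternatives:
            f = first_of_string(alt)
            columns = f - {EPS}
            if EPS in f:
                columns |= FOLLOW[a]      # also covers $ when $ ∈ Follow(A)
            for t in columns:
                if (a, t) in table:
                    raise ValueError(f"M[{a}, {t}] is multiply defined: not LL(1)")
                table[(a, t)] = alt
    return table


M = build_table(GRAMMAR)
for (a, t), rhs in sorted(M.items()):
    print(f"M[{a}, {t}] = {a} -> {' '.join(rhs) or EPS}")
```

The sketch produces 13 entries, one per filled cell in the non-terminal rows; any key that is absent from the dict corresponds to an Error cell.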

33
Notes on LL(1) parsing tables

• If any entry is multiply defined then G is not LL(1). This happens, for example,
  • if G is ambiguous
  • if G is left-recursive
  • if G is not left-factored

• Most programming language grammars are not LL(1)

• There are tools that build LL(1) tables

34
