5- Lecture05 - Top-Down Parsing

This document provides an overview of LL(1) predictive parsing, including the construction and use of parsing tables, error handling, and panic-mode recovery strategies. It explains the concepts of First and Follow sets, as well as how to handle syntax errors during parsing. Additionally, it discusses the limitations of LL(1) grammars and presents examples of parsing tables and transition diagrams.

Uploaded by

naimu767

40-414 Compiler Design

Top-Down Parsing

Lecture 5

LL(1) Predictive Parsers

• Parser can “predict” which production to use
– By looking at the next few tokens
– No backtracking

• Predictive parsers accept LL(k) grammars
– The first L means “left-to-right” scan of input
– The second L means “leftmost derivation”
– k means “predict based on k tokens of lookahead”
– In practice, LL(1) is used

Prof. Aiken [slightly modified]


LL(1) Parsing Table Example

• Left-factored grammar
E → T X              X → + E | ε
T → ( E ) | int Y    Y → * T | ε
• The LL(1) parsing table (rows: leftmost non-terminal; columns: next input token; entries: rhs of production to use)

    int     *       +       (       )       $
E   T X                     T X
X                   + E             ε       ε
T   int Y                   ( E )
Y           * T     ε               ε       ε

LL(1) Parsing Table Example (Cont.)

• Consider the [E, int] entry
– “When current non-terminal is E and next input is int, use production E → T X”
– This can generate an int in the first position

• Consider the [Y,+] entry
– “When current non-terminal is Y and current token is +, get rid of Y”
– Y can be followed by + only in a derivation in which Y → ε

LL(1) Parsing Tables. Errors

• Blank entries indicate error situations

• Consider the [E,*] entry
– “There is no way to derive a string starting with * from non-terminal E”

LL(1) Parsing Algorithm
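[Figure: model of the predictive parser. The Input (string + terminator $, e.g. a + b $) and a Stack (holding NT + T symbols of the CFG, e.g. X Y Z, with $ marking the empty stack) feed the Predictive Parsing Program, which produces the Output. The program consults the Parsing Table M[A,a], which says what actions the parser should take based on the stack / input symbol.]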

General parser behavior (X: top of stack, a: current token):

1. When X = a = $: halt, accept, success.
2. When X = a ≠ $: POP X off the stack, advance the input, go to 1.
3. When X is a non-terminal, examine M[X, a]:
– if it is an error, call the recovery routine;
– if M[X, a] = {UVW}, POP X, PUSH W, V, U (in that order, so that U ends up on top), and DO NOT advance the input.

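The three rules above can be sketched as a small table-driven loop. This is only an illustrative sketch: the names `TABLE`, `NONTERMINALS` and `ll1_parse` are made up here, the table is the one from the "LL(1) Parsing Table Example" slide typed in by hand, and errors simply raise an exception instead of calling a recovery routine.

```python
EPS = "ε"  # how an empty right-hand side is printed (naming is an assumption)

# Hand-written parsing table for the grammar used in these slides:
#   E -> T X    X -> + E | ε    T -> ( E ) | int Y    Y -> * T | ε
# An empty list stands for the ε production; a missing key is a blank (error) entry.
TABLE = {
    ("E", "int"): ["T", "X"], ("E", "("): ["T", "X"],
    ("X", "+"): ["+", "E"], ("X", ")"): [], ("X", "$"): [],
    ("T", "int"): ["int", "Y"], ("T", "("): ["(", "E", ")"],
    ("Y", "*"): ["*", "T"], ("Y", "+"): [], ("Y", ")"): [], ("Y", "$"): [],
}
NONTERMINALS = {"E", "X", "T", "Y"}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    stack = ["$", start]              # the top of the stack is the end of the list
    tokens = list(tokens) + ["$"]
    pos = 0
    steps = []
    while True:
        x, a = stack[-1], tokens[pos]
        if x == "$" and a == "$":
            return steps              # rule 1: accept
        if x not in NONTERMINALS:
            if x != a:
                raise SyntaxError(f"expected {x!r}, found {a!r}")
            stack.pop()               # rule 2: match the terminal, advance the input
            pos += 1
            continue
        rhs = TABLE.get((x, a))
        if rhs is None:
            raise SyntaxError(f"blank entry M[{x}, {a}]")
        stack.pop()                   # rule 3: expand, do not advance the input
        stack.extend(reversed(rhs))   # push in reverse so the leftmost symbol is on top
        steps.append((x, " ".join(rhs) or EPS))


if __name__ == "__main__":
    for lhs, rhs in ll1_parse(["int", "*", "int"]):
        print(lhs, "->", rhs)
```

Pushing the right-hand side in reverse is the design choice that keeps the derivation leftmost: the next symbol to be expanded or matched is always the leftmost one of the production just applied.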
LL(1) Parsing Example

Stack          Input          Action
E $            int * int $    T X
T X $          int * int $    int Y
int Y X $      int * int $    terminal
Y X $          * int $        * T
* T X $        * int $        terminal
T X $          int $          int Y
int Y X $      int $          terminal
Y X $          $              ε
X $            $              ε
$              $              ACCEPT

Constructing Parsing Tables: The Intuition

• Consider non-terminal A, production A → α, & token t

• T[A,t] = α in two cases:

• If α →* t β
– α can derive a t in the first position
– We say that t ∈ First(α)

• If A → α and α →* ε and S →* β A t δ
– Useful if stack has A, input is t, and A cannot derive t
– In this case only option is to get rid of A (by deriving ε)
• Can work only if t can follow A in at least one derivation
– We say t ∈ Follow(A)

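The two sets named above can be computed by iterating to a fixed point. The following is a minimal sketch, not the course's own code: the dictionary layout of `GRAMMAR` and the helper names are assumptions, and the grammar is the left-factored one from the earlier table example.

```python
EPS = "ε"  # marker for the empty string (naming is an assumption)

# Non-terminal -> list of alternatives; an empty list is the ε production.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], []],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], []],
}


def first_of_seq(seq, first):
    """First set of a sequence of symbols; symbols not in `first` are terminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)                      # every symbol can vanish (or seq is empty)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                    # repeat until no set grows any more
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_seq(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")            # end of input can follow the start symbol
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue      # Follow is only defined for non-terminals
                    tail = first_of_seq(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:   # the rest can vanish: inherit Follow(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")

if __name__ == "__main__":
    for nt in GRAMMAR:
        print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```

For this grammar the sketch yields Follow(Y) = { +, ), $ }, which is why the Y row of the table has ε entries in exactly those three columns.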
Constructing LL(1) Parsing Tables

• Construct a parsing table T for CFG G

• For each production A → α in G do:
– For each terminal t ∈ First(α) do
• T[A, t] = α
– If ε ∈ First(α), for each t ∈ Follow(A) do
• T[A, t] = α
– If ε ∈ First(α) and $ ∈ Follow(A) do
• T[A, $] = α

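The construction above translates almost line by line into code. In this sketch the First and Follow sets of the non-terminals are written out by hand for the E/X/T/Y grammar (an assumption made to keep the example short), `$` is stored inside the Follow sets so the second and third rules collapse into one, and each table cell holds a list so that a multiply defined entry would show up as a list with more than one element.

```python
EPS = "ε"  # marker for the empty string (naming is an assumption)

# Productions as (A, α) pairs; an empty list is the ε production.
PRODUCTIONS = [
    ("E", ["T", "X"]),
    ("X", ["+", "E"]), ("X", []),
    ("T", ["(", "E", ")"]), ("T", ["int", "Y"]),
    ("Y", ["*", "T"]), ("Y", []),
]
# Hand-written First/Follow sets of the non-terminals for this grammar.
FIRST = {"E": {"int", "("}, "X": {"+", EPS}, "T": {"int", "("}, "Y": {"*", EPS}}
FOLLOW = {"E": {")", "$"}, "X": {")", "$"},
          "T": {"+", ")", "$"}, "Y": {"+", ")", "$"}}


def first_of_seq(seq):
    """First(α) for a right-hand side α; symbols not in FIRST are terminals."""
    out = set()
    for sym in seq:
        f = FIRST.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out


def build_table(productions):
    table = {}
    for a, alpha in productions:
        f = first_of_seq(alpha)
        targets = f - {EPS}            # rule 1: every terminal in First(α)
        if EPS in f:
            targets |= FOLLOW[a]       # rules 2 and 3: Follow(A), including $
        for t in targets:
            table.setdefault((a, t), []).append(alpha)
    return table


TABLE = build_table(PRODUCTIONS)
CONFLICTS = {cell: rhs for cell, rhs in TABLE.items() if len(rhs) > 1}

if __name__ == "__main__":
    for (a, t), rhs in sorted(TABLE.items()):
        print(f"T[{a}, {t}] =", " | ".join(" ".join(r) or EPS for r in rhs))
```

For this grammar `CONFLICTS` comes out empty, which is consistent with the grammar being LL(1).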
Example 1

E → T X              X → + E | ε
T → ( E ) | int Y    Y → * T | ε

    int     *       +       (       )       $
E   T X                     T X
X                   + E             ε       ε
T   int Y                   ( E )
Y           * T     ε               ε       ε

Example 2

S → S a | b
First( S ) = { b }
Follow( S ) = { $, a }

    a       b          $
S           b, S a

Notes on LL(1) Parsing Tables

• If any entry is multiply defined then G is not LL(1). This happens:
– If G is ambiguous
– If G is left recursive
– If G is not left-factored
– And in other cases as well

• Most programming language CFGs are not LL(1)

Notes on LL(1) Grammars

Grammar is LL(1) ⟺ for all A → α | β:

1. First(α) ∩ First(β) = ∅; besides, at most one of α or β can derive ε

2. If α derives ε, then Follow(A) ∩ First(β) = ∅

It may not be possible for a grammar to be manipulated into an LL(1) grammar

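The two conditions can be checked mechanically for every pair of alternatives of a non-terminal. The sketch below is one possible way to do it, assuming the First and Follow sets of the non-terminals are already known; the function and variable names are invented for illustration. It is applied to the left-recursive grammar S → S a | b from Example 2, using the First and Follow sets given on that slide.

```python
from itertools import combinations

EPS = "ε"  # marker for the empty string (naming is an assumption)


def first_of_seq(seq, first):
    """First set of a sequence of symbols; symbols not in `first` are terminals."""
    out = set()
    for sym in seq:
        f = first.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out


def ll1_violations(grammar, first, follow):
    """Return (A, α, β, which condition) for every offending pair of alternatives."""
    problems = []
    for a, alternatives in grammar.items():
        for alpha, beta in combinations(alternatives, 2):
            fa = first_of_seq(alpha, first)
            fb = first_of_seq(beta, first)
            if fa & fb:
                # Condition 1; ε in both sets means both alternatives derive ε.
                problems.append((a, alpha, beta, "condition 1"))
            if (EPS in fa and follow[a] & fb) or (EPS in fb and follow[a] & fa):
                problems.append((a, alpha, beta, "condition 2"))
    return problems


# Example 2: S -> S a | b, with the sets stated on that slide.
EXAMPLE2 = {"S": [("S", "a"), ("b",)]}
EXAMPLE2_FIRST = {"S": {"b"}}
EXAMPLE2_FOLLOW = {"S": {"$", "a"}}

if __name__ == "__main__":
    print(ll1_violations(EXAMPLE2, EXAMPLE2_FIRST, EXAMPLE2_FOLLOW))
```

Both alternatives of S start with b, so condition 1 fails; this is the same conflict that shows up as the doubly defined T[S, b] entry.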
Implementing Panic Mode in LL(1)

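[Figure: the same parser model as before: the Input (a + b $) and the Stack (X Y Z above $) feed the Predictive Parsing Program, which consults the Parsing Table M[A,a] and produces the Output.]

Error situations include:

1. If X is a terminal and it doesn’t match the current token.
2. If M[X, Input] is empty – no allowable actions.
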
Panic-Mode Recovery

• Assume that when a syntax error occurs, non-terminal A is on the top of the stack.

• The choice for a synchronizing set is important.
– Define the synchronizing set of A to be Follow(A). Then skip input until a token in Follow(A) appears, and then pop A from the stack. Resume parsing...
– Add symbols of First(A) to the synchronizing set. In this case, we skip input and, once we find a token in First(A), we resume parsing from A.

Panic-Mode Recovery (Cont.)
Modify the empty cells of the Parsing Table:
1. If M[A, a] = {empty} and a belongs to Follow(A), then we set M[A, a] = “synch”.

Error-recovery strategy:
If A = top-of-the-stack and a = current-token,
1. If A is NT and M[A, a] = {empty}, then skip a from the input.
2. If A is NT and M[A, a] = {synch}, then pop A.
3. If A is a terminal and A != a, then pop A (this is essentially inserting A before a).

Parse Table / Example

     id       +        *        (        )        $
E    T E’                       T E’     synch    synch
E’            + T E’                     ε        ε
T    F T’     synch             F T’     synch    synch
T’            ε        * F T’            ε        ε
F    id       synch    synch    ( E )    synch    synch

For “synch” cells: pop the NT on top of the stack.
For empty cells: skip the current token.

E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id

Parsing Example

Using the parse table above on the input + id * + id $:

Possible Error Msg: “Misplaced +. I am skipping it”

STACK          INPUT            Remark
E $            + id * + id $    error, skip +
E $            id * + id $
T E’ $         id * + id $
F T’ E’ $      id * + id $
id T’ E’ $     id * + id $
T’ E’ $        * + id $
* F T’ E’ $    * + id $
F T’ E’ $      + id $

Parsing Example (Cont.)

Continuing with the same parse table:

Possible Error Msg: “Missing Term”

STACK          INPUT      Remark
F T’ E’ $      + id $     error, M[F,+] = synch, F is popped
T’ E’ $        + id $
E’ $           + id $
+ T E’ $       + id $
T E’ $         id $
F T’ E’ $      id $
id T’ E’ $     id $
T’ E’ $        $
E’ $           $
$              $
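
The recovery strategy and the trace above can be reproduced with a short table-driven sketch. The names (`TABLE`, `SYNCH`, `parse_with_recovery`) are invented for illustration and the non-terminals are spelled with an ASCII apostrophe. Two details are assumptions that the slides do not spell out: leftover input after the start symbol has been fully parsed is skipped, and a blank entry under `$` pops the non-terminal so that the loop always terminates (for this grammar that case never arises).

```python
SYNCH = "synch"
NT = {"E", "E'", "T", "T'", "F"}

# Parse table for E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε,
# F -> ( E ) | id.  An empty list is the ε production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}

# Empty cells whose token is in Follow(A) become "synch"; filled cells are kept.
for nt, tokens_after in FOLLOW.items():
    for t in tokens_after:
        TABLE.setdefault((nt, t), SYNCH)


def parse_with_recovery(tokens, start="E"):
    """Parse to the end of the input and return the list of recovery actions taken."""
    stack = ["$", start]
    tokens = list(tokens) + ["$"]
    pos, log = 0, []
    while True:
        x, a = stack[-1], tokens[pos]
        if x == "$":
            if a == "$":
                return log
            log.append(f"skip {a}")        # assumption: drop trailing input
            pos += 1
        elif x not in NT:
            if x == a:
                pos += 1                   # terminal matches: consume it
            else:
                log.append(f"insert {x}")  # strategy 3: pop the unmatched terminal
            stack.pop()
        else:
            entry = TABLE.get((x, a))
            if entry is None and a != "$":
                log.append(f"skip {a}")    # strategy 1: empty cell, skip the token
                pos += 1
            elif entry is None or entry == SYNCH:
                log.append(f"pop {x}")     # strategy 2: synch cell, pop the NT
                stack.pop()
            else:
                stack.pop()
                stack.extend(reversed(entry))


if __name__ == "__main__":
    print(parse_with_recovery("+ id * + id".split()))
```

On the input `+ id * + id` the sketch takes the same two recovery actions as the trace: it skips the leading `+`, and later pops F when it meets the second `+`.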
Other Parsing Methods

Top-Down Parsing Methods (Cont.)

Transition Diagrams

Transition Diagrams

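E → T E’             T → F T’             F → ( E ) | id
E’ → + T E’ | ε      T’ → * F T’ | ε

• Unlike lexical equivalents, each edge represents a token or a non-terminal
• A transition implies: if the label is a token, match the input; else call the procedure for the non-terminal

[Transition diagrams, written as state --label--> state:]
E:   0 --T--> 1 --E’--> 2
E’:  3 --+--> 4 --T--> 5 --E’--> 6;   3 --ε--> 6
T:   7 --F--> 8 --T’--> 9
T’:  10 --*--> 11 --F--> 12 --T’--> 13;   10 --ε--> 13
F:   14 --(--> 15 --E--> 16 --)--> 17;   14 --id--> 17
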
Transition Diagrams can be Simplified
[Transition diagrams, built up step by step over these slides and written as state --label--> state:]

Step 1, the original diagram for E’:
E’:  3 --+--> 4 --T--> 5 --E’--> 6;   3 --ε--> 6

Step 2, the E’ edge is replaced by an ε-edge back to state 3:
E’:  3 --+--> 4 --T--> 5;   5 --ε--> 3;   3 --ε--> 6

Step 3, state 5 is removed:
E’:  3 --+--> 4;   4 --T--> 3;   3 --ε--> 6

Step 4, the original diagram for E:
E:   0 --T--> 1 --E’--> 2

Step 5, the simplified E’ diagram is substituted into E:
E:   0 --T--> 3;   3 --+--> 4;   4 --T--> 3;   3 --ε--> 6

Step 6, the final diagram for E:
E:   0 --T--> 3;   3 --+--> 0;   3 --ε--> 6

Similar steps for T and T’

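[Transition diagrams, written as state --label--> state:]
T’:  10 --*--> 11 --F--> 12 --T’--> 13;   10 --ε--> 13
T’:  10 --*--> 11;   11 --F--> 10;   10 --ε--> 13

T:   7 --F--> 8 --T’--> 9
T:   7 --F--> 8;   8 --*--> 7;   8 --ε--> 13
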
Simplified Transition diagrams

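[Transition diagrams, written as state --label--> state:]
E:   0 --T--> 3;   3 --+--> 0;   3 --ε--> 6
T:   7 --F--> 8;   8 --*--> 7;   8 --ε--> 13
F:   14 --(--> 15 --E--> 16 --)--> 17;   14 --id--> 17
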
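One way to read the simplified diagrams is as one procedure per non-terminal, where an edge that loops back to the start state becomes a `while` loop. The sketch below assumes that reading: E is "a T, then repeatedly + T", T is "an F, then repeatedly * F", and F is unchanged. The class and method names are illustrative, and errors simply raise `SyntaxError` with no recovery.

```python
class DiagramParser:
    """Recursive-descent recogniser that follows the simplified diagrams."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def tok(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Edge labelled with a token: match it against the input.
        if self.tok != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.tok!r}")
        self.pos += 1

    def parse_E(self):
        # E: follow the T edge, then loop back on every "+".
        self.parse_T()
        while self.tok == "+":
            self.match("+")
            self.parse_T()
        # Leaving the loop corresponds to taking the ε edge to the final state.

    def parse_T(self):
        # T: follow the F edge, then loop back on every "*".
        self.parse_F()
        while self.tok == "*":
            self.match("*")
            self.parse_F()

    def parse_F(self):
        # F: either ( E ) or id.
        if self.tok == "(":
            self.match("(")
            self.parse_E()          # edge labelled with a non-terminal: call it
            self.match(")")
        else:
            self.match("id")


def accepts(tokens):
    """True if the token list is a sentence of the expression grammar."""
    parser = DiagramParser(tokens)
    try:
        parser.parse_E()
        parser.match("$")           # the whole input must be consumed
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(accepts("id + id * id".split()))
    print(accepts("id + * id".split()))
```

Compared with the unsimplified diagrams, this version needs no procedures for E’ and T’, so each chain of `+` or `*` operators is handled by iteration instead of a chain of recursive calls.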
Implementing Panic-Mode Recovery

• The choice for the synchronizing set is important for improving the performance of the panic mode method.

• We define First(A) ∪ Follow(A) as the synchronizing set of non-terminal A.

Implementing Panic-Mode Recovery (Cont.)
Suppose the parser is in diagram A, the current token is a, and a syntax error is detected:
1. If a ∉ Follow(A): report the error by ‘illegal a found on line N’, where N is the line number of token a; then get the next token from the scanner, and then call diagram A.
2. If a ∈ Follow(A): report the error by ‘missing A¹ on line N’, where N is the line number of token a; then resume parsing by exiting from A.

¹ Note that in a real compiler, in the error message, A should be replaced by a simple token that can be derived from A.

Implementing Panic-Mode Recovery (Cont.)

3. Suppose the error has been caused by a mismatch between the current token a and the expected token b on link L in diagram A: report the error by the message ‘missing b on line N’, where N is the line number of token a, and continue the parsing in diagram A from the end of link L.

Question?
Choose the next parse state given the grammar, parse table, and current state below. The initial string is:
if true then { true } else { if false then { false } } $

E → if B then { E } E’ | B | ε
E’ → else { E } | ε
B → true | false

     if                    then   else         {   }   true   false   $
E    if B then { E } E’                            ε   B      B       ε
E’                                else { E }       ε                  ε
B                                                      true   false

Current state:
Stack                              Input
E’ $                               else { if false then { false } } $

Candidate next states:
Stack                              Input
$                                  $
else { E } $                       else { if false then { false } } $
E } $                              if false then { false } } $
else { if B then { E } E’ } $      else { if false then { false } } $

Question?

For the given grammar, find the First and Follow of the non-terminals and the parse table.

S → i E t S S’ | a      First(S) =       Follow(S) =
S’ → e S | ε            First(S’) =      Follow(S’) =
E → b                   First(E) =       Follow(E) =

     a    b    e    i    t    $
S
S’
E

Question?

For the given grammar, find the First and Follow of the non-terminals and the parse table.

E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id

First(E,T,F) =      First(E’) =       First(T’) =
Follow(E) =         Follow(E’) =
Follow(T) =         Follow(T’) =
Follow(F) =

     id   +    *    (    )    $
E
E’
T
T’
F

Question?

• Consider the grammar
E → T X              X → + E | ε
T → ( E ) | int Y    Y → * T | ε

• Convert the given grammar to a transition diagram

• Simplify the diagram (if it is possible)

• Write a step-by-step parsing of input ‘int * int’

• Draw the parse tree of the input
