Ch3_Syntax Analysis
Syntax Analysis
The parser takes the tokens produced by lexical analysis and builds
the syntax tree (parse tree).
The syntax tree can be constructed directly from a context-free
grammar.
The parser reports syntax errors in an intelligible
fashion and recovers from commonly occurring errors so that it can continue
processing the remainder of the program.
Syntax analysis is performed by the syntax
analyzer (parser).
on.
D. The digits 0, 1, ... ,9 .
RMD for - ( id + id )
[Figure: parse trees built step by step for -(id+id); the recoverable sentential forms are -E, -(id+E) and -(id+id).]
[Figure: parse trees for an expression over id, * and +; the layout is not recoverable.]
A → β1A’ | … | βnA’
A’ → α1A’ | … | αnA’ | ε
Eliminating left recursion…
Example
E→E+T | T
T→T*F | F
F→(E) | id
After eliminating the left-recursion the grammar
becomes,
E → TE’
E’→+TE’|ε
T → FT’
T’→*FT’|ε
F→ (E) |id
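The transformation in this example can be sketched in a few lines of Python. This is a minimal sketch, not a full algorithm: it handles only immediate left recursion, it assumes no alternative is ε, and the function name and the list-of-symbols representation of the grammar are illustrative choices rather than anything defined in these slides.

```python
EPS = "ε"

def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A' and A' -> a A' | ε.

    Each body is a list of grammar symbols. Only immediate left
    recursion is handled (an assumption of this sketch).
    """
    new_head = head + "'"
    alphas = [body[1:] for body in bodies if body[0] == head]  # A -> A alpha
    betas = [body for body in bodies if body[0] != head]       # A -> beta
    if not alphas:
        return {head: bodies}  # nothing to do
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[EPS]],
    }

grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

result = {}
for head, bodies in grammar.items():
    result.update(eliminate_immediate_left_recursion(head, bodies))

for head, bodies in result.items():
    print(head, "→", " | ".join(" ".join(body) for body in bodies))
```

Running it on the expression grammar prints the same five productions as the transformed grammar above.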
Left factoring
When a non-terminal has two or more productions
whose right-hand sides start with the same
grammar symbols, the grammar is not LL(1) and
cannot be used for predictive parsing.
A predictive parser (a top-down parser without
backtracking) requires the grammar to be left-factored.
Left factoring…
Given A → αβ1 | αβ2, when processing α we do not know whether to expand
A to αβ1 or to αβ2. If we rewrite the grammar as
follows:
A → αA’
A’ → β1 | β2
we can immediately expand A to αA’ and postpone the choice between β1 and β2.
Left factoring…
Example 1
A → abB | aB | cdg | cdeB | cdfB
After left factoring:
A → aA’ | cdA’’
A’ → bB | B
A’’ → g | eB | fB
Example 2
A → ad | a | ab | abc | b
After left factoring:
A → aA’ | b
A’ → d | ε | bA’’
A’’ → ε | c
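One way to mechanize these examples is sketched below. It is a simplified version of left factoring that groups the alternatives of a non-terminal by their first symbol and factors out the longest prefix common to each group. The function names, the dictionary representation, and the scheme of naming new non-terminals by appending primes are assumptions made for illustration.

```python
EPS = "ε"

def common_prefix(bodies):
    """Longest sequence of symbols shared by the start of every body."""
    prefix = []
    for column in zip(*bodies):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor(grammar):
    """grammar maps a head to a list of bodies (lists of symbols)."""
    result = {}
    pending = list(grammar.items())
    while pending:
        head, bodies = pending.pop(0)
        groups = {}
        for body in bodies:  # group alternatives by first symbol
            groups.setdefault(body[0], []).append(body)
        new_bodies = []
        for group in groups.values():
            if len(group) == 1:
                new_bodies.extend(group)
                continue
            prefix = common_prefix(group)
            fresh = head + "'"  # hypothetical naming scheme: add primes
            while (fresh in result or fresh in grammar
                   or any(fresh == h for h, _ in pending)):
                fresh += "'"
            new_bodies.append(prefix + [fresh])
            # The remainders become the alternatives of the new non-terminal.
            pending.append((fresh, [b[len(prefix):] or [EPS] for b in group]))
        result[head] = new_bodies
    return result

example1 = left_factor({"A": [["a", "b", "B"], ["a", "B"], ["c", "d", "g"],
                              ["c", "d", "e", "B"], ["c", "d", "f", "B"]]})
example2 = left_factor({"A": [["a", "d"], ["a"], ["a", "b"],
                              ["a", "b", "c"], ["b"]]})
for head, bodies in example2.items():
    print(head, "→", " | ".join(" ".join(body) for body in bodies))
```

New non-terminals are processed again, which is why Example 2 yields a second-level factoring of the alternatives that begin with b.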
Context-Free Grammars versus Regular Expressions
Every regular language is a context-free language, but not vice
versa.
Example: the regular expression (a|b)*abb and an equivalent grammar
describe the same language, the set of strings of a's and b's
ending in abb. Such languages can be described either by a
finite automaton or by a PDA.
On the other hand, the language L = {a^n b^n | n ≥ 1}, n a's followed by
the same number of b's, is a prototypical example of a language that
can be described by a grammar but not by a regular expression.
We say that "finite automata cannot count", meaning that a
finite automaton cannot accept a language like {a^n b^n | n ≥ 1}, which
would require it to keep count of the number of a's before it sees
the b's.
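The counting argument can be made concrete with a small recognizer that uses an explicit stack, which is the extra memory a PDA has and a finite automaton lacks. This is an illustrative sketch; the function name is not from the slides.

```python
def is_anbn(s):
    """Recognize {a^n b^n | n >= 1} using a stack, as a PDA would."""
    stack = []
    i = 0
    # Phase 1: push one marker for every leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("a")
        i += 1
    if not stack:
        return False  # n must be at least 1
    # Phase 2: pop one marker for every 'b'.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if the input is consumed and the counts matched.
    return i == len(s) and not stack

for word in ["ab", "aabb", "aab", "abb", "abab", ""]:
    print(repr(word), is_anbn(word))
```

The stack can grow without bound as n grows, which is exactly what a fixed, finite set of states cannot simulate.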
[Figure: classification of parsers]
• Recursive descent (involves backtracking)
• Predictive parsing (parsing without backtracking)
– Recursive predictive
– Non-recursive predictive, or LL(1)
• Operator precedence
• LR parsing
– SLR
– CLR
– LALR
Chapter – 3 : Syntax Analysis 29 Bahir Dar Institute of Technology
Top Down Parsing…
Two types of top-down parsing
• Recursive-Descent Parsing
• Backtracking is needed (If a choice of a production rule does
not work, we backtrack to try other alternatives.)
• It is a general parsing technique, but not widely used
because it is not efficient
• Predictive Parsing
• no backtracking and hence efficient
• needs a special form of grammars (LL(1) grammars).
• Two types
– Recursive Predictive Parsing is a special form of
Recursive Descent Parsing without backtracking.
– Non-Recursive (Table Driven) Predictive Parser is also
known as LL(1) parser.
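A recursive predictive parser can be sketched with one procedure per non-terminal, each choosing its production from the single lookahead token. The sketch below assumes the left-recursion-free expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, and assumes the input is already a list of token names. The function names are illustrative.

```python
def parse(tokens):
    """Return True if the token list is a sentence of the expression grammar."""
    tokens = tokens + ["$"]  # end marker
    pos = 0

    def look():
        return tokens[pos]

    def match(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise SyntaxError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1

    def E():            # E -> T E'
        T()
        E_prime()

    def E_prime():      # E' -> + T E' | ε
        if look() == "+":
            match("+")
            T()
            E_prime()
        # otherwise choose E' -> ε and consume nothing

    def T():            # T -> F T'
        F()
        T_prime()

    def T_prime():      # T' -> * F T' | ε
        if look() == "*":
            match("*")
            F()
            T_prime()

    def F():            # F -> ( E ) | id
        if look() == "(":
            match("(")
            E()
            match(")")
        elif look() == "id":
            match("id")
        else:
            raise SyntaxError(f"unexpected token {look()!r}")

    try:
        E()
        match("$")      # the whole input must be consumed
        return True
    except SyntaxError:
        return False

print(parse(["id", "+", "id", "*", "id"]))
print(parse(["id", "+"]))
```

No procedure ever undoes a choice, so there is no backtracking: the lookahead alone decides which production to apply.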
      id         +            *            (          )         $
E’               E’ → +TE’                            E’ → ε    E’ → ε
T     T → FT’                              T → FT’
T’               T’ → ε       T’ → *FT’               T’ → ε    T’ → ε
F     F → id                               F → (E)
2. If ε is in FIRST(α), then
for each terminal a in FOLLOW(A), add A → α to M[A, a].
All other undefined entries of the parsing table are error entries.
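The table-filling rules can be sketched as follows. The code computes FIRST and FOLLOW by fixed-point iteration and then fills M, placing A → α under every terminal in FIRST(α) and, when ε is in FIRST(α), under every symbol in FOLLOW(A). The grammar is the left-recursion-free expression grammar. All names here are illustrative, and $ is treated as a member of FOLLOW sets (an assumption of this sketch).

```python
EPS, END = "ε", "$"
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish (or the sequence is empty)
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def build_table(first, follow):
    table = {}
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            body_first = first_of_string(body, first)
            targets = body_first - {EPS}
            if EPS in body_first:
                targets |= follow[head]
            for terminal in targets:
                if (head, terminal) in table:
                    raise ValueError("grammar is not LL(1)")
                table[head, terminal] = body
    return table

first = compute_first()
follow = compute_follow(first)
table = build_table(first, follow)
for (head, terminal), body in sorted(table.items()):
    print(f"M[{head}, {terminal}] = {head} → {' '.join(body)}")
```

Any (non-terminal, terminal) pair missing from the dictionary corresponds to an error entry.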
get n-2.
Repeat this, until we reach S.
Shift reduce parser
Shift-reduce parsing is a type of bottom-up parsing that attempts to
construct a parse tree for an input string beginning at the leaves (the
bottom) and working up towards the root (the top).
A shift-reduce parser performs the following basic actions:
1. Shift: Move the next symbol from the input buffer onto the stack.
2. Reduce: If a handle appears on top of the stack, replace it with the
left-hand side of the appropriate production.
3. Accept: If the stack contains only the start symbol and the input buffer is empty,
announce successful completion of parsing.
4. Error: If the parser can neither shift nor reduce, and
cannot accept, it reports a syntax
error.
NB: If a shift-reduce parser cannot be used for a grammar, that grammar is called a
non-LR(k) grammar. An ambiguous grammar can never be an LR grammar.
[Figure: model of an LR parser, showing the input buffer and the stack]
This configuration represents the right-sentential form
X1 X2 … Xm ai ai+1 … an $
Xi is the grammar symbol represented by state Si.
Note: S0 is on the top of the stack at the beginning of parsing.
Behavior of LR parser
LR-parsing algorithm
METHOD: Initially, the parser has s0 on its stack, where s0 is the initial state,
and w$ in the input buffer.
let a be the first symbol of w$;
while (1) { /* repeat forever */
    let s be the state on top of the stack;
    if (ACTION[s, a] = shift t) {
        push t onto the stack;
        let a be the next input symbol;
    } else if (ACTION[s, a] = reduce A → β) { /* reduce the handle β to its head A */
        pop |β| symbols off the stack;
        let state t now be on top of the stack;
        push GOTO[t, A] onto the stack;
        output the production A → β;
    } else if (ACTION[s, a] = accept) break; /* parsing is done */
    else call error-recovery routine;
}
(SLR) Parsing Tables for Expression Grammar
5) F → (E)
6) F → id
State   id    +     *     (     )     $
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5
Steps for Constructing SLR Parsing Tables
in
FOLLOW(A).
If [S’→S.] is in Ii, then set action[i,$] to “accept”.
If any conflicting actions are generated by the above rules, the grammar is
not SLR(1).
3. The goto transitions for state i are constructed for all non-terminals A using the
rule: If goto(Ii, A) = Ij, then goto[i, A] = j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items
containing [S’ → .S].
Example for LR(0) parser
Example of LR(0): Let the grammar G1:
NB: I1, I4, I5, and I6 contain final items (items with the dot at the right end). They determine the
‘reduce’ (ri) actions in the corresponding rows of the action part of the table.
NB: In the LR(0) parsing table, whenever a state contains a final item,
put the reduce action Ri in every column of the action part of that row.
E.g., in row 4, put R3, where 3 is the label (number) of the production in G.
Example of LR(0) parsing Table
Step 6: Check the parser by tracing the stack on the input string abb$.
I7 = Goto(I2, *) = {[T → T * . F], [F → . (E)], [F → . id]}
I8 = Goto(I4, E) = {[F → (E .)], [E → E . + T]}
Goto(I4, T) = {[E → T .], [T → T . * F]} = I2;
Goto(I4, F) = {[T → F .]} = I3;
Goto(I4, () = I4;
Goto(I4, id) = I5;
I9 = Goto(I6, T) = {[E → E + T .], [T → T . * F]}
Goto(I6, F) = I3;
Goto(I6, () = I4;
Goto(I6, id) = I5;
I10 = Goto(I7, F) = {[T → T * F .]}
Goto(I7, () = I4;
Goto(I7, id) = I5;
I11 = Goto(I8, )) = {[F → (E) .]}
Goto(I8, +) = I6;
Goto(I9, *) = I7;
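The item sets listed above can be reproduced mechanically. The sketch below represents an LR(0) item as a pair (production index, dot position) and implements closure, goto, and the construction of the canonical collection for the augmented expression grammar. The representation and all names are assumptions made for illustration. The symbol order in SYMBOLS is chosen so that the states come out numbered like I0 to I11 in this example.

```python
GRAMMAR = [
    ("E'", ("E",)),           # 0: augmented start production
    ("E", ("E", "+", "T")),   # 1
    ("E", ("T",)),            # 2
    ("T", ("T", "*", "F")),   # 3
    ("T", ("F",)),            # 4
    ("F", ("(", "E", ")")),   # 5
    ("F", ("id",)),           # 6
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]

def closure(items):
    """Add [B -> . gamma] for every non-terminal B that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for index, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(prod, dot + 1) for prod, dot in items
             if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}
    return closure(moved)

def canonical_collection():
    states = [closure({(0, 0)})]
    for state in states:          # the list grows while we iterate over it
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

def show(item):
    prod, dot = item
    head, body = GRAMMAR[prod]
    return f"[{head} → {' '.join(body[:dot])} . {' '.join(body[dot:])}]"

states = canonical_collection()
for number, state in enumerate(states):
    print(f"I{number} =", ", ".join(sorted(show(item) for item in state)))
```

A transition such as Goto(I9, *) = I7 can then be checked by comparing `goto(states[9], "*")` with `states[7]`.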
LR(0) automaton
SLR table construction method…
Construct the SLR parsing table for the grammar G1’
Follow(E) = {+, ), $}
Follow(T) = {+, ), $, *}
Follow(F) = {+, ), $, *}
E’ → E
1 E → E + T
2 E → T
3 T → T * F
4 T → F
5 F → (E)
6 F → id
        action                                   goto
State   id    +     *     (     )     $          E    T    F
0       S5                S4                     1    2    3
1             S6                      accept
2             R2    S7          R2    R2
3             R4    R4          R4    R4
4       S5                S4                     8    2    3
5             R6    R6          R6    R6
6       S5                S4                          9    3
7       S5                S4                               10
8             S6                S11
9             R1    S7          R1    R1
10            R3    R3          R3    R3
11            R5    R5          R5    R5
SLR parser…
How a shift/reduce parser parses an input string w = id * id +
id using the parsing table shown above.
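The parse of id * id + id can be followed with a small table-driven driver. The sketch below encodes the SLR table for the expression grammar as Python dictionaries and runs the standard LR loop (shift, reduce, accept, error). The dictionary layout and the names are illustrative assumptions. Production numbers follow the numbering 1 to 6 used for this grammar, and each production is stored as its head together with the length of its body.

```python
# (head, length of body) for productions 1..6 of the expression grammar
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def lr_parse(tokens):
    """Return the production numbers used, in order, or None on a syntax error."""
    stack = [0]                       # stack of states; state 0 is initial
    tokens = tokens + ["$"]
    pos = 0
    output = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        action = ACTION[state].get(lookahead)
        if action is None:            # blank entry: error
            return None
        if action == "acc":
            return output
        if action[0] == "s":          # shift: push the new state, advance input
            stack.append(int(action[1:]))
            pos += 1
        else:                         # reduce by production n
            n = int(action[1:])
            head, length = PRODUCTIONS[n]
            del stack[-length:]       # pop |body| states
            stack.append(GOTO[stack[-1]][head])
            output.append(n)

print(lr_parse(["id", "*", "id", "+", "id"]))  # [6, 4, 6, 3, 2, 6, 4, 1]
```

Read in reverse, the printed sequence of reductions is a rightmost derivation of the input, which is the characteristic property of bottom-up LR parsing.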
LR parsing: Exercise
LALR and CLR parser
NB: LR(0) and SLR(1) parsers use LR(0) items to build the parsing table,
whereas LALR and CLR parsers use LR(1) items to construct the
parsing table.
Reading assignment
• LALR parser and
• CLR parser