CD Unit 2 RV
UNIT-2 (Lecture-1)
Top-down Parsing
When the parser starts constructing the parse tree from the start symbol and then tries to
transform the start symbol to the input, it is called top-down parsing.
Recursive descent parsing : It is a common form of top-down parsing. It is called
recursive because it uses recursive procedures, typically one per non-terminal, to process
the input. In its general form, recursive descent parsing may require backtracking.
Backtracking : If one alternative of a production fails to match the input, the syntax
analyzer goes back and retries with a different alternative of the same non-terminal. This
technique may process the same part of the input more than once to find the right production.
Bottom-up Parsing
Bottom-up parsing is also known as shift-reduce parsing.
Bottom-up parsing is used to construct a parse tree for an input string.
In bottom-up parsing, the parser starts with the input symbols and constructs the parse
tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
Example:
Input string : a + b * c
Production rules:
S→E
E→E+T
E→E*T
E→T
T → id
Read the input from left to right and, at each step, replace a substring that matches the
right side of a production with its left side:
a+b*c
T+b*c
E+b*c
E+T*c
E*c
E*T
E
S
Example:
Production
E→T
T→T*F
T → id
T→F
F → id
Types of bottom-up parsing:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table Driven LR Parsing
a. LR( 0 )
b. SLR( 1 )
c. CLR ( 1 )
d. LALR( 1 )
UNIT-2 (Lecture-2)
Back-tracking
Top-down parsers start from the root node (start symbol) and match the input string against the
production rules, replacing each non-terminal with the right side of a matching rule. To
understand this, take the following example of
CFG:
S → rXd | rZd
X → oa | ea
Z → ai
For the input string read, a top-down parser will behave like this:
It will start with S from the production rules and will match its yield to the left-most letter of the
input, i.e. ‘r’. The very first production of S (S → rXd) matches it. So the top-down parser
advances to the next input letter (i.e. ‘e’). The parser tries to expand non-terminal ‘X’ and
checks its first production from the left (X → oa). It does not match the next input symbol. So
the top-down parser backtracks to try the next production rule of X (X → ea).
Now the parser matches all the input letters in order. The string is accepted.
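The walk-through above can be sketched in Python. This is a minimal illustration under stated assumptions, not a production parser: the names GRAMMAR, derive and accepts are hypothetical, every terminal is a single character, and backtracking is modelled with generators so that a failed alternative simply yields nothing and the next alternative is tried.

```python
# Grammar from the notes: S -> rXd | rZd, X -> oa | ea, Z -> ai
GRAMMAR = {
    "S": [["r", "X", "d"], ["r", "Z", "d"]],
    "X": [["o", "a"], ["e", "a"]],
    "Z": [["a", "i"]],
}

def derive(symbols, text, pos, attempts):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each alternative in order; a failed one yields nothing.
        for alternative in GRAMMAR[head]:
            attempts.append((head, "".join(alternative)))
            for middle in derive(alternative, text, pos, attempts):
                yield from derive(rest, text, middle, attempts)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must equal the current input letter.
        yield from derive(rest, text, pos + 1, attempts)

def accepts(text):
    """Return (accepted, list of productions tried in order)."""
    attempts = []
    ok = any(end == len(text) for end in derive(["S"], text, 0, attempts))
    return ok, attempts

if __name__ == "__main__":
    print(accepts("read"))
```

For the input read, the recorded attempts are S → rXd, then X → oa (which fails on ‘e’), then X → ea, matching the order described above.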
Predictive Parser
A predictive parser is a recursive descent parser that can predict which
production is to be used to expand the current non-terminal. The predictive parser does not
suffer from backtracking.
To accomplish this, the predictive parser uses a look-ahead pointer, which points to the next
input symbols. To make the parser backtracking-free, the predictive parser puts some
constraints on the grammar and accepts only a class of grammars known as LL(k) grammars.
A table-driven predictive parser uses a stack and a parsing table to parse the input and
generate a parse tree.
Both the stack and the input contain an end symbol $ to denote that the stack is empty and the
input is consumed. The parser consults the parsing table to decide what to do for each
combination of stack-top symbol and input symbol.
In recursive descent parsing, the parser may have more than one production to choose from for
a single instance of input, whereas in a predictive parser, each step has at most one production to
choose. There might be instances where no production matches the input,
in which case parsing fails.
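The stack-and-table mechanism can be sketched as follows. This is an illustrative sketch only: it borrows the grammar S → AA, A → aA | b used later in these notes, and the parsing table TABLE was filled in by hand for that grammar (an assumption, since the notes give no LL table). The names TABLE, NONTERMINALS and ll1_parse are hypothetical.

```python
# Hand-built LL(1) table for S -> AA, A -> aA | b (assumed for illustration).
# Key: (non-terminal on top of stack, look-ahead symbol) -> right side to push.
TABLE = {
    ("S", "a"): ["A", "A"],
    ("S", "b"): ["A", "A"],
    ("A", "a"): ["a", "A"],
    ("A", "b"): ["b"],
}
NONTERMINALS = {"S", "A"}

def ll1_parse(text):
    """Return (accepted, productions used in order)."""
    tokens = list(text) + ["$"]   # $ marks the end of the input
    stack = ["$", "S"]            # $ marks the bottom of the stack
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:       # empty table cell: no production applies
                return False, used
            used.append(f"{top} -> {''.join(rhs)}")
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == look:
            pos += 1              # terminal (or $) matches: consume it
        else:
            return False, used
    return pos == len(tokens), used

if __name__ == "__main__":
    print(ll1_parse("abab"))
```

Each table cell holds at most one production, so the parser never has to undo a choice; an empty cell means the input is rejected.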
LL Parser
An LL parser accepts LL grammars. LL grammars are a subset of the context-free grammars, with
some restrictions that simplify the grammar and make implementation easy. An LL
parser can be implemented by either of two approaches: recursive descent or table-driven.
An LL parser is denoted as LL(k). The first L in LL(k) stands for scanning the input from left
to right, the second L stands for left-most derivation, and k is the number of look-ahead
symbols. In practice k is usually 1, in which case the parser is written as LL(1).
UNIT-2 (Lecture-3)
Shift-reduce parsing is a process of reducing a string to the start symbol of a grammar.
Shift-reduce parsing uses a stack to hold grammar symbols and an input tape to hold the string.
Shift-reduce parsing performs two actions, shift and reduce, which is why it is known as
shift-reduce parsing.
At a shift action, the current symbol in the input string is pushed onto the stack.
At each reduction, the symbols on top of the stack that match the right side of a production
are replaced by the non-terminal on the left side of that production.
Example:
Grammar:
S → S+S
S → S-S
S → (S)
S→a
Input string:
a1-(a2+a3)
Parsing table:
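The sequence of shift and reduce actions for this input can be traced with a small Python sketch. It is not the table from the lecture: it assumes a1, a2 and a3 are all instances of the terminal a, and it uses a simplified strategy (reduce whenever the top of the stack matches a right side, otherwise shift) that happens to suffice for this input but is not a general way of resolving conflicts. The names RULES and shift_reduce are hypothetical.

```python
# Grammar from the notes: S -> S+S | S-S | (S) | a
RULES = [
    ("S", ["S", "+", "S"]),
    ("S", ["S", "-", "S"]),
    ("S", ["(", "S", ")"]),
    ("S", ["a"]),
]

def shift_reduce(tokens):
    """Return (accepted, trace); every trace row is (stack, remaining input, action)."""
    stack = ["$"]                       # $ marks the bottom of the stack
    buffer = list(tokens) + ["$"]       # $ marks the end of the input
    pos = 0
    trace = []
    while True:
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                # Reduce: replace the right side on top of the stack by the left side.
                del stack[-len(rhs):]
                stack.append(lhs)
                action = f"reduce {lhs} -> {''.join(rhs)}"
                break
        else:
            if buffer[pos] == "$":
                break                   # nothing to reduce and nothing left to shift
            # Shift: push the current input symbol onto the stack.
            stack.append(buffer[pos])
            action = f"shift {buffer[pos]}"
            pos += 1
        trace.append(("".join(stack), "".join(buffer[pos:]), action))
    return stack == ["$", "S"], trace

if __name__ == "__main__":
    accepted, trace = shift_reduce(["a", "-", "(", "a", "+", "a", ")"])
    for stack, remaining, action in trace:
        print(f"{stack:<10}{remaining:<10}{action}")
    print("accepted" if accepted else "rejected")
```

The input is accepted when the stack holds only $ and the start symbol S and the input is fully consumed.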
Shift-reduce parsing has two main categories:
1. Operator-Precedence Parsing
2. LR-Parser
UNIT-2 (Lecture-4)
Operator precedence parsing is a kind of shift-reduce parsing method. It is applied to a small
class of operator grammars.
Operator precedence can only be established between the terminals of the grammar. It ignores the
non-terminals.
a ⋗ b means that terminal "a" has higher precedence than terminal "b".
a ⋖ b means that terminal "a" has lower precedence than terminal "b".
a ≐ b means that terminals "a" and "b" have the same precedence.
Precedence table:
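Parsing Action
Add the $ symbol at both ends of the given input string.
Scan the input string from left to right until the first ⋗ is encountered.
Scan towards the left over all the equal precedences until the first ⋖ is encountered.
Everything between that ⋖ and the ⋗ is a handle.
$ on $ means parsing is successful.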
Example
Grammar:
E → E+E
E → E*E
E → id
Given string:
w = id + id * id
On the basis of the above tree, we can design the following operator precedence table:
Now let us process the string with the help of the above precedence table:
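The processing steps can be sketched in Python. The precedence relations in PREC below are an assumption: they encode the conventional choice that * binds tighter than +, that both operators are left-associative, and that id has the highest precedence and $ the lowest. The names PREC and op_precedence_parse are hypothetical. Non-terminals are ignored, as stated earlier, so the stack holds terminals only.

```python
# Assumed precedence relations: PREC[a][b] is "<" for a ⋖ b and ">" for a ⋗ b.
# A missing entry means the two terminals may not be adjacent (syntax error).
PREC = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def op_precedence_parse(tokens):
    """Return the terminals of the handles in the order they are reduced."""
    stack = ["$"]
    buffer = list(tokens) + ["$"]
    pos = 0
    reduced = []
    while True:
        top, look = stack[-1], buffer[pos]
        if top == "$" and look == "$":
            return reduced              # $ on $: parsing is successful
        relation = PREC[top].get(look)
        if relation == "<":
            stack.append(look)          # shift
            pos += 1
        elif relation == ">":
            # Reduce: pop until the new top is related by ⋖ to the popped terminal.
            while True:
                popped = stack.pop()
                if PREC[stack[-1]].get(popped) == "<":
                    break
            reduced.append(popped)
        else:
            raise SyntaxError(f"no precedence relation between {top} and {look}")

if __name__ == "__main__":
    print(op_precedence_parse(["id", "+", "id", "*", "id"]))
```

For w = id + id * id, the three id handles are reduced first, then the handle containing *, and finally the handle containing +, which reflects the higher precedence of *.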
UNIT-2 (Lecture-5)
In LR(k) parsing, "L" stands for left-to-right scanning of the input, "R" stands for
constructing a rightmost derivation in reverse, and "k" is the number of look-ahead input
symbols used to make parsing decisions.
LR algorithm:
The LR algorithm requires a stack, an input, an output and a parsing table. In all types of LR
parsing, the input, output and stack are the same, but the parsing table is different.
The input buffer contains the string to be parsed, followed by a $ symbol that marks the end
of the input.
The stack contains a sequence of grammar symbols with a $ at the bottom.
The parsing table is a two-dimensional array. It contains two parts: the Action part and the Go To part.
LR (1) Parsing
Augment Grammar
An augmented grammar G` is generated by adding one more production, S` → S, to the given grammar
G, where S is the start symbol of G. It helps the parser identify when to stop parsing and
announce acceptance of the input.
Example
Given grammar:
S → AA
A → aA | b
Augmented grammar G`:
S` → S
S → AA
A → aA | b
Canonical Collection of LR(0) items
An LR(0) item is a production of G with a dot at some position on the right side of the production.
LR(0) items indicate how much of the input has been scanned up to a given point
in the process of parsing.
Example
Given grammar:
S → AA
A → aA | b
Add the augmented production and insert the '•' symbol at the first position of every production in G:
S` → •S
S → •AA
A → •aA
A → •b
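These four items form the initial state. The remaining states of the canonical collection are obtained with the standard closure and goto operations, which these notes do not spell out. The following Python sketch of those operations is an illustration only: an item is represented as a (production index, dot position) pair, S` is written as S', and the names GRAMMAR, closure, goto and canonical_collection are hypothetical.

```python
# Augmented grammar: S' -> S, S -> AA, A -> aA, A -> b
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A", "A")),
    ("A", ("a", "A")),
    ("A", ("b",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> •γ for every non-terminal X that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None

def canonical_collection():
    """Return (list of states, transitions keyed by (state number, symbol))."""
    states = [closure({(0, 0)})]
    transitions = {}
    for state in states:                 # the list grows while it is traversed
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions

if __name__ == "__main__":
    states, transitions = canonical_collection()
    print(len(states), "states,", len(transitions), "transitions")
```

The state numbers produced by this sketch depend on the order in which symbols are explored, so they need not match the numbering used in a hand-drawn DFA.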
Drawing DFA:
LR(0) Table
If a state goes to another state on a terminal, it corresponds to a shift move.
If a state goes to another state on a variable (non-terminal), it corresponds to a go to move.
If a state contains a final item (an item with the dot at the right end), write the reduce
action for that production across the entire row of that state.
S → AA ... (1)
A → aA ... (2)
A → b ... (3)
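With the productions numbered as above, an LR(0) table and the driver loop that uses it can be sketched in Python. The ACTION and GOTO entries below were derived by hand for this grammar, and the state numbering 0 to 6 is an assumption, since the DFA figure is not reproduced in these notes. The stack holds state numbers only, with state 0 playing the role of the bottom marker. The names PRODUCTIONS, ACTION, GOTO and lr_parse are hypothetical.

```python
# Production number -> (left side, length of right side)
PRODUCTIONS = {1: ("S", 2), 2: ("A", 2), 3: ("A", 1)}

# ("s", n) = shift and go to state n, ("r", k) = reduce by production k.
ACTION = {
    0: {"a": ("s", 3), "b": ("s", 4)},
    1: {"$": ("acc", None)},
    2: {"a": ("s", 3), "b": ("s", 4)},
    3: {"a": ("s", 3), "b": ("s", 4)},
    4: {t: ("r", 3) for t in "ab$"},   # final item A -> b•: reduce in the whole row
    5: {t: ("r", 1) for t in "ab$"},   # final item S -> AA•
    6: {t: ("r", 2) for t in "ab$"},   # final item A -> aA•
}
GOTO = {0: {"S": 1, "A": 2}, 2: {"A": 5}, 3: {"A": 6}}

def lr_parse(text):
    """Return (accepted, production numbers in the order they were reduced)."""
    tokens = list(text) + ["$"]
    stack = [0]
    pos = 0
    reductions = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[pos])
        if entry is None:                 # blank table cell: syntax error
            return False, reductions
        kind, arg = entry
        if kind == "s":
            stack.append(arg)
            pos += 1
        elif kind == "r":
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]           # pop one state per right-side symbol
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(arg)
        else:
            return True, reductions       # accept

if __name__ == "__main__":
    print(lr_parse("aabb"))
```

Read backwards, the list of reductions is a rightmost derivation of the input, which is the sense in which an LR parser traces a rightmost derivation in reverse.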
UNIT-2 (Lecture-6)
UNIT-2 (Lecture-7)
UNIT-2 (Lecture-8)
UNIT-2 (Lecture-9)
UNIT-2 (Lecture-10)