CD Module2 16 03 23 PDF

The document discusses different types of parsers used in compiler design. It describes that a parser breaks down source code into tokens during syntax analysis. There are three main stages - lexical analysis, syntactic analysis, and semantic analysis. Top-down and bottom-up parsers are discussed along with LR, LL, recursive descent, and predictive parsers. Context-free grammars are also summarized.

Uploaded by

Souvik Das

Compiler Design (YCS6003)

Syntax Analysis
Soumya Majumdar
Parser
● A parser is a program that is usually part of a compiler.
● It receives input in the form of sequential source program instructions, interactive online
commands, markup tags or some other defined interface.
● Parsing happens during the analysis stage of compilation.
● In parsing, code is taken from the preprocessor, broken into smaller pieces and analyzed
so other software can understand it. The parser does this by building a data structure out
of the pieces of input.
● Parsing in this broad sense consists of three components, each of which handles a different
stage of the process. The three stages are: lexical analysis, syntactic analysis and semantic
analysis.
Lexical Analysis
● The lexical analyzer, or scanner, takes code from the preprocessor and breaks it into smaller
pieces.
● It groups the input characters into sequences called lexemes, each of which
corresponds to a token.
● Tokens are the units of grammar in the programming language that the compiler understands.
● Lexical analyzers also discard white space characters and comments, and report lexical
errors (such as illegal characters) found in the input.
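As a rough illustration of how lexemes are grouped into tokens, the sketch below scans a tiny made-up token set with Python's `re` module. The token names (NUMBER, IDENT, OP, SKIP), the `#` comment syntax and the `tokenize` function are assumptions chosen for this example, not part of the course material.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+|#[^\n]*"),   # white space and '#' comments are discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs; raise on an unrecognised character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"lexical error at position {pos}: {source[pos]!r}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("count = count + 42  # increment"))
```

Each pair holds the token class and the lexeme that matched it. White space and the comment are dropped, and an unknown character is reported as a lexical error.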
Syntactic Analysis
● Checks the syntactical structure of the input using a data structure called a parse tree or
derivation tree.
● Syntax analyzer uses tokens to construct a parse tree that combines the predefined
grammar of the programming language with the tokens of the input string.
● Syntactic analyzer reports a syntax error if the syntax is incorrect.
Semantic Analysis
● Verifies the parse tree against a symbol table and determines whether it is semantically
consistent. This process is also known as context sensitive analysis.
● Includes data type checking, label checking and flow control checking.
Types of Parser
<sentence> ::= <subject> <verb> <object>

<subject> ::= <article> <noun>

<article> ::= the | a

<noun> ::= dog | cat | person

<verb> ::= pets | fed

<object> ::= <article> <noun>

Types of Parser
● Top-down parsers. These start with the rule at the top, such as <sentence> ::=
<subject> <verb> <object>. Given the input string "The person fed a cat," the parser
starts from this rule and works its way down through the other rules, checking that the
input matches each one. In this case, the first two words, "The person", form a <subject>
because they match the <subject> rule (<article> <noun>), so the parser continues reading
the sentence looking for a <verb>.
● Bottom-up parsers. These start with the input words at the bottom. In this case, the parser
would first recognize "The" as an <article> and "person" as a <noun>, combine them into a
<subject>, and keep combining pieces in this way until the whole input has been reduced to a
<sentence>.
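The top-down reading of the sentence grammar can be sketched as a small recognizer. The word lists come from the grammar above; the function names (`parse_sentence`, `expect`, `noun_phrase`) and the handling of capitalisation and the final full stop are assumptions made for this illustration.

```python
# Word lists taken from the grammar above; everything else is illustrative.
ARTICLES = {"the", "a"}
NOUNS = {"dog", "cat", "person"}
VERBS = {"pets", "fed"}

def parse_sentence(text):
    """Top-down recognizer for <sentence> ::= <subject> <verb> <object>."""
    words = text.lower().rstrip(".").split()
    pos = 0

    def expect(word_set, rule):
        nonlocal pos
        if pos < len(words) and words[pos] in word_set:
            pos += 1
        else:
            found = words[pos] if pos < len(words) else "end of input"
            raise SyntaxError(f"expected {rule}, found {found!r}")

    def noun_phrase():            # <article> <noun>, shared by <subject> and <object>
        expect(ARTICLES, "<article>")
        expect(NOUNS, "<noun>")

    noun_phrase()                 # <subject>
    expect(VERBS, "<verb>")       # <verb>
    noun_phrase()                 # <object>
    if pos != len(words):
        raise SyntaxError(f"unexpected trailing word {words[pos]!r}")
    return True

print(parse_sentence("The person fed a cat"))
```

The recognizer starts at the <sentence> rule and descends into the rules for its parts. It raises a SyntaxError at the first word that does not fit the rule currently being expanded.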
Types of Parser in terms of derivation
● LL parsers: parse input from left to right using a leftmost derivation to match the rules in the
grammar to the input. This process derives a string that validates the input by always expanding the
leftmost nonterminal first.
● LR parsers: parse input from left to right and construct a rightmost derivation in reverse. Instead
of expanding nonterminals, an LR parser reduces substrings of the input to nonterminals until it
reaches the start symbol.
Types of Parser
● Recursive descent parsers: These are top-down parsers built from one procedure per grammar
rule. In their general form they may backtrack: when the alternative chosen at a decision point
fails, the parser returns to that point and tries the next alternative.
● Predictive parser: A predictive parser is a recursive descent parser that does not require
backtracking or backup. At each step, the choice of the rule to be expanded is made based on
the next terminal symbol.
● Earley parsers: These can parse all context-free grammars, unlike LL and LR parsers, which
handle only restricted subclasses. The syntax of most real-world programming languages is
designed to fit the LL or LR classes, so compilers rarely need this generality.
● Shift-reduce parsers: These shift input symbols onto a stack and reduce the top of the stack
to a nonterminal whenever it matches the right-hand side of a grammar rule. This continues
until the whole string has been reduced to the start symbol or an error is found.
Types of Parser
Top-down parser
● When the parser starts constructing the parse tree from the start symbol and then tries to
transform the start symbol into the input, it is called top-down parsing.
● Recursive descent parsing: This is a common form of top-down parsing. It is called recursive
because it uses recursive procedures to process the input. Recursive descent parsing may
require backtracking.
● Backtracking: If one derivation of a production fails, the syntax analyzer restarts the process
using a different alternative of the same production. This technique may process the input
string more than once to determine the right production.
Recursive-descent parser
● Recursive descent is a top-down parsing technique that constructs the parse tree from the top
while reading the input from left to right.
● This parsing technique recursively parses the input to make a parse tree, which may or may not
require back-tracking.
● A form of recursive-descent parsing that does not require any back-tracking is known as
predictive parsing.
● The technique is called recursive because it uses one procedure per nonterminal, and these
procedures call each other recursively, mirroring the recursive nature of the context-free grammar.
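Back-tracking
Grammar:

S → rXd | rZd

X → oa | ea

Z → ai

Input string: read

The parser first tries S → rXd and matches r. It then tries X → oa, which fails because the next
input symbol is e, not o. It backtracks, tries X → ea, which matches, and finally matches d.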
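The back-tracking behaviour on this grammar can be sketched with a small recognizer that tries the alternatives of each nonterminal in order. The grammar is the one from the slide; the functions `match` and `accepts` and the trace format are assumptions made for this illustration, and the approach would loop forever on a left-recursive grammar.

```python
# Grammar from the slide: upper-case letters are nonterminals.
GRAMMAR = {
    "S": ["rXd", "rZd"],
    "X": ["oa", "ea"],
    "Z": ["ai"],
}

def match(symbols, text, pos, trace):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:                                   # nonterminal: try each alternative in order
        for alternative in GRAMMAR[head]:
            trace.append(f"try {head} -> {alternative} at {pos}")
            for after_head in match(alternative, text, pos, trace):
                yield from match(rest, text, after_head, trace)
            # falling through to the next alternative is the backtracking step
    elif pos < len(text) and text[pos] == head:           # terminal: must equal the next input character
        yield from match(rest, text, pos + 1, trace)

def accepts(text):
    """Return (accepted, trace) for the start symbol S."""
    trace = []
    ok = any(end == len(text) for end in match("S", text, 0, trace))
    return ok, trace

ok, trace = accepts("read")
print(ok)
print("\n".join(trace))
```

For the input read, the trace records an attempt at X → oa followed by a second attempt at X → ea from the same input position, which is the backtracking step.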
Predictive Parser
● A predictive parser is a recursive descent parser that can predict which
production is to be used to expand the current nonterminal.
● A predictive parser does not suffer from backtracking.
● It uses a look-ahead pointer, which points to the next input symbol(s).
● To make the parser back-tracking free, the predictive parser puts some constraints on the grammar
and accepts only a class of grammars known as LL(k) grammars.
● Table-driven predictive parsing uses a stack and a parsing table to parse the input and generate a
parse tree.
● Both the stack and the input contain an end symbol $, which marks the bottom of the stack and the
end of the input.
● The parser refers to the parsing table to decide what to do for each combination of input symbol
and stack element.
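The stack-and-table mechanism can be sketched as follows. The grammar S → 0S0 | 1S1 | 2 is chosen here only because its parsing table is tiny. The table layout, the function name `predictive_parse` and the error messages are assumptions for this illustration.

```python
# Parsing table M[nonterminal][lookahead] -> right-hand side to push.
# Grammar (an assumption for this sketch): S -> 0 S 0 | 1 S 1 | 2
TABLE = {
    "S": {"0": ["0", "S", "0"], "1": ["1", "S", "1"], "2": ["2"]},
}

def predictive_parse(text, start="S"):
    """Return the list of productions applied, or raise SyntaxError."""
    tokens = list(text) + ["$"]          # '$' marks the end of the input
    stack = ["$", start]                 # '$' marks the bottom of the stack
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in TABLE:                                  # nonterminal: consult the table
            rhs = TABLE[top].get(lookahead)
            if rhs is None:
                raise SyntaxError(f"no rule for {top} on {lookahead!r} at position {pos}")
            applied.append(f"{top} -> {' '.join(rhs)}")
            stack.extend(reversed(rhs))                   # leftmost symbol ends up on top
        elif top == lookahead:                            # terminal (or '$'): match and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r} at position {pos}")
    if pos != len(tokens):                                # guards against a stray '$' inside the input
        raise SyntaxError("input continues after the end marker")
    return applied

print(predictive_parse("10201"))
```

Each table entry holds at most one production, so the parser never has to undo a choice. A missing entry or a terminal mismatch is reported immediately as a syntax error.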
Recursive-descent vs Predictive Parser
● In recursive descent parsing, the parser may have more than one production to choose from for a
single instance of input.
● In a predictive parser, each step has at most one production to choose.
● There might be instances where no production matches the input string, causing the
parsing procedure to fail.
LL Parser
● An LL parser accepts LL grammars.
● LL grammars are a subset of the context-free grammars, with some restrictions that simplify
the grammar and make the parser easy to implement.
● An LL parser can be implemented by either of two algorithms: recursive descent or
table-driven.
● An LL parser is denoted as LL(k). The first L in LL(k) stands for parsing the input from left to
right, the second L stands for left-most derivation, and k is the number of look-ahead symbols.
In practice k = 1 is most common, and the parser is then written LL(1).
Bottom-up Parser
● Bottom-up parsing starts with the input symbols and tries to construct the parse tree up to the
start symbol.
● Bottom-up parsing starts from the leaf nodes of a tree and works in an upward direction until it
reaches the root node.
Bottom-up Parser
Grammar:

1. S → S+S
2. S → S-S
3. S → (S)
4. S → a

Input string:

a1-(a2+a3)   (the subscripts only distinguish the three occurrences of the terminal a)
Bottom-up Parser
Parsing table: (table not reproduced in this copy)
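The shift and reduce steps for this input can be sketched with a naive stack-based recognizer. The productions are the four from the slide. The reduce-whenever-possible strategy is an assumption of this sketch, not the table-driven method of an LR parser: it happens to work for this grammar, which is ambiguous, and it simply groups operators from the left. The names `shift_reduce` and `reduce_once` are illustrative.

```python
# Productions from the slide, as (left-hand side, right-hand side) pairs.
PRODUCTIONS = [
    ("S", ["S", "+", "S"]),
    ("S", ["S", "-", "S"]),
    ("S", ["(", "S", ")"]),
    ("S", ["a"]),
]

def shift_reduce(tokens):
    """Return (accepted, trace) using a reduce-whenever-possible strategy."""
    stack, trace = [], []
    remaining = list(tokens)

    def reduce_once():
        for lhs, rhs in PRODUCTIONS:
            if stack[-len(rhs):] == rhs:                  # right-hand side is on top of the stack
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append((f"reduce {lhs} -> {''.join(rhs)}", "".join(stack)))
                return True
        return False

    while remaining:
        stack.append(remaining.pop(0))                    # shift the next input symbol
        trace.append(("shift", "".join(stack)))
        while reduce_once():                              # reduce as long as a handle is on top
            pass
    return stack == ["S"], trace

accepted, trace = shift_reduce(list("a-(a+a)"))
for action, stack_contents in trace:
    print(f"{action:20} {stack_contents}")
print("accepted:", accepted)
```

Each printed row shows the action taken and the stack contents after it. The input is accepted when the stack holds only the start symbol S after the last symbol has been consumed.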
Context free grammar
● A context free grammar (CFG) is a formal grammar which is used to generate all the possible
strings of a given formal language. It is defined as a 4-tuple:

G=(V,T,P,S)

● G is the grammar, which consists of a set of production rules. It is used to generate the strings
of a language.
● T is the finite set of terminal symbols, conventionally denoted by lower case letters.
● V is the finite set of non-terminal symbols, conventionally denoted by capital letters.
● P is a set of production rules, which are used for replacing non-terminal symbols (on the left side
of a production) in a string with strings of terminals and/or non-terminals (on the right side of
the production).
● S is the start symbol, from which strings are derived.
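The 4-tuple definition maps directly onto a small data structure. In the sketch below the class `Grammar`, the helpers `is_well_formed` and `leftmost_step`, and the example grammar S → ( S ) | a are all assumptions introduced for illustration.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    V: frozenset        # non-terminals
    T: frozenset        # terminals
    P: tuple            # productions as (head, body) pairs; body is a tuple of symbols
    S: str              # start symbol

def is_well_formed(g):
    """Check the basic constraints of the 4-tuple definition."""
    if g.V & g.T or g.S not in g.V:
        return False
    symbols = g.V | g.T
    return all(head in g.V and all(sym in symbols for sym in body) for head, body in g.P)

def leftmost_step(g, form, production):
    """Replace the leftmost non-terminal in `form` using `production`."""
    head, body = production
    for i, sym in enumerate(form):
        if sym in g.V:
            if sym != head:
                raise ValueError(f"leftmost non-terminal is {sym}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to expand")

# Example grammar (an illustrative choice): S -> ( S ) | a
G = Grammar(
    V=frozenset({"S"}),
    T=frozenset({"(", ")", "a"}),
    P=(("S", ("(", "S", ")")), ("S", ("a",))),
    S="S",
)

form = (G.S,)
form = leftmost_step(G, form, G.P[0])    # S => ( S )
form = leftmost_step(G, form, G.P[1])    # ( S ) => ( a )
print(is_well_formed(G), "".join(form))
```

Starting from the start symbol and applying productions one at a time derives the string (a). Each step replaces a non-terminal with the right-hand side of one of its productions.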
Context free grammar
● A context free grammar consists of terminals, non-terminals, a start symbol and productions.
● Terminal: the basic symbols from which strings are formed (also called "token names").
● Non-terminal: syntactic variables that denote sets of strings. The sets of strings denoted by
non-terminals help to define the language generated by the grammar.
● Start symbol: one non-terminal is distinguished as the start symbol.
● Production: specifies the manner in which terminals and non-terminals can be combined to form
strings.
● A production consists of (i) a non-terminal symbol (the head, or left side, of the production),
(ii) the symbol -> or ::=, and (iii) a sequence of terminals and/or non-terminals (the body, or
right side, of the production).
LR Parser
● A bottom-up parser for context-free grammars that is widely used by
programming language compilers and other associated tools.
● An LR parser reads its input from left to right and produces a right-most derivation in reverse.
● It is called a bottom-up parser because it builds the parse tree up from the leaves, reducing the
input step by step to the top-level grammar symbol.
● LR parsers are the most powerful of the deterministic parsers used in practice.
● LR(k) parser: here the L refers to left-to-right scanning of the input and the R to constructing
a rightmost derivation in reverse.
● k refers to the number of input symbols of lookahead that are used in making parsing decisions.
LR Parser advantages
● Can be constructed to recognise virtually all programming language constructs for which a
context free grammar can be written.
● The LR parsing method is the most general non-backtracking shift-reduce parsing method.
● An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right
scan of the input.

Disadvantage: it is too much work to construct an LR parser by hand for a typical programming
language grammar.
LR(0) item
● An LR(0) item is a production of the grammar with exactly one dot on the right-hand side.
● For example, production T → T * F leads to four LR(0) items:

T → ⋅ T * F
T → T ⋅ * F
T → T * ⋅ F
T → T * F ⋅

● What is to the left of the dot has just been read, and the parser is ready to read the remainder,
after the dot.
● Two LR(0) items that come from the same production but have the dot in different places are
considered different LR(0) items.
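Enumerating the items of a production amounts to placing the dot at every position of the right-hand side. The sketch below does this for T → T * F. The representation of an item as a (head, body, dot position) triple and the function names `lr0_items` and `show` are assumptions for illustration.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as (head, body, dot_position)."""
    return [(head, tuple(body), dot) for dot in range(len(body) + 1)]

def show(item):
    """Format an item with the dot at its position on the right-hand side."""
    head, body, dot = item
    return f"{head} → {' '.join(body[:dot] + ('⋅',) + body[dot:])}"

for item in lr0_items("T", ["T", "*", "F"]):
    print(show(item))
```

A production with n symbols on its right-hand side yields n + 1 items, which gives the four items listed for T → T * F.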
Closure of LR(0) item
Let S be a set of LR(0) items. The following rules tell how to build closure(S), the closure of S:
keep adding LR(0) items until there are no more to add.

● All members of S are in the closure(S).

● Suppose closure(S) contains item A → α⋅Bβ, where B is a nonterminal. Find all productions B
→ γ1, …, B → γn with B on the left-hand side. Add LR(0) items B → ⋅γ1, … B → ⋅γn to
closure(S).
Closure of LR(0) item
For example, let's take the closure of set {E → E + ⋅ T}.

Since there is an item with a dot immediately before nonterminal T, we add T → ⋅ F and T → ⋅ T * F.
The set now contains the following LR(0) items.

E → E + ⋅ T
T → ⋅ F
T → ⋅ T * F
Closure of LR(0) item
Now there is an item in the set with a dot immediately followed by F. So we add items F → ⋅ n and F →
⋅ ( E ). The set now contains the following items.

E → E + ⋅ T

T → ⋅ F

T → ⋅ T * F

F → ⋅ n

F → ⋅ ( E )

No more LR(0) items need to be added, so the closure is finished.
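The closure rules can be sketched as a worklist loop. The productions for T and F are the ones used in the example. The productions for E (E → E + T and E → T) are assumed here to complete the usual expression grammar, and the item representation and the function name `closure` are illustrative.

```python
# Expression grammar implied by the example (the E productions are assumed):
#   E -> E + T | T,   T -> T * F | F,   F -> n | ( E )
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("n",), ("(", "E", ")")],
}

def closure(items):
    """Items are (head, body, dot) triples; returns the closure as a set."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:      # dot sits before a nonterminal B
            for production in GRAMMAR[body[dot]]:
                new_item = (body[dot], production, 0)     # B -> . gamma
                if new_item not in result:
                    result.add(new_item)
                    worklist.append(new_item)
    return result

start = {("E", ("E", "+", "T"), 2)}
for head, body, dot in sorted(closure(start)):
    print(head, "→", " ".join(body[:dot] + ("⋅",) + body[dot:]))
```

Running it on {E → E + ⋅ T} produces the same five items as the worked example. The loop stops once no new item can be added.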

Closure of LR(0) item
What is the point of the closure?

● LR(0) item E → E + ⋅ T indicates that the parser has just finished reading an expression
followed by a + sign. In fact, E + are the top two symbols on the stack.
● Now, the parser is looking to see if there is a T next. (It does not predict that there is a T next. It
is just considering that as a possibility.)
● But that means it should be looking for something that is the right-hand side of a production for
T. So we add items for T with the dot at the beginning.
Problem 1
Consider the following grammar-

E → E – E

E → E x E

E → id

Parse the input string id – id x id using a shift-reduce parser.

Problem 2
Consider the following grammar-

S → ( L ) | a

L → L , S | S

Parse the input string ( a , ( a , a ) ) using a shift-reduce parser.

Problem 3
Considering the string “10201”, design a shift-reduce parser for the following grammar-

S → 0S0 | 1S1 | 2
