Lecture 03

Syntax analysis is the second phase of a compiler that checks the source code for grammatical structure and ensures adherence to language grammar rules. Parsing, a key component of this phase, involves analyzing token sequences to construct parse trees and facilitate semantic analysis. Context-Free Grammar (CFG) is used to define the syntax of programming languages, consisting of production rules that describe how valid strings can be generated.


Syntax Analysis

Overview of Syntax Analysis


• Definition:
Syntax analysis is the second phase of the compiler.
It checks the source code for its grammatical structure.

• Purpose:
Ensures that the program follows the grammar rules of the source language.
Role of Parsing in Compiler Design
Parsing is the process of analyzing a sequence of tokens to determine its
grammatical structure based on a given formal grammar.

Main Responsibilities:
• Check for syntax errors: ensures the program adheres to the syntax rules of the language.
• Construct the parse tree: represents the structure of the source code hierarchically.
• Facilitate semantic analysis: provides structured input for the next compiler phase (semantic analysis).
• Assist in code generation: helps in translating the parse tree into intermediate code or machine code.
• Example: converting a mathematical expression into a parse tree and using it for further analysis.
Position of the parser in the compiler model
(The parser receives the token stream produced by the lexical analyzer and builds a parse tree that is passed on to the rest of the front end.)

❑ Parsing Approaches
• Top-down
• Bottom-up
Context-Free Grammar (CFG)
A Context-Free Grammar (CFG) is a formal system used to define the syntax of programming languages. It consists of a set of production rules that describe how strings in a language can be generated.
• A formal way to describe the syntax of programming languages.
• Defines the syntactic structure of valid strings in a language.
• Composed of rules, called productions.

Notation:
G = (V, Σ, P, S)
A CFG is defined by a 4-tuple:
• V: a set of non-terminal symbols (also called variables).
• Σ: a set of terminal symbols (disjoint from V).
• P: a set of production rules, where each rule maps a non-terminal to a string of terminals and/or non-terminals.
• S: the start symbol, which is a non-terminal.
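As a concrete illustration, the four components can be written down as plain Python data. This is only a sketch: the dictionary keys are illustrative names, and the grammar shown (S → aS | ε over Σ = {a}) is the one used in a later example.

# A minimal sketch of the 4-tuple G = (V, Σ, P, S) as plain Python data.
grammar = {
    "non_terminals": {"S"},             # V: variables
    "terminals": {"a"},                 # Σ: terminal symbols, disjoint from V
    "productions": {"S": ["aS", ""]},   # P: head -> list of alternative bodies ("" stands for ε)
    "start": "S",                       # S: start symbol
}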
Example: CFG for a simple arithmetic expression (the full grammar appears below).
Formal Definition of CFG
1. Terminals (T) are the basic symbols from which strings are formed.
• The term "token name" is a synonym for "terminal", and we will frequently use the word "token" for terminal.
• Terminals are the first components of the tokens output by the lexical analyzer.
• Examples: the keywords if and else, and the symbols ( and ).

stmt → if ( expr ) stmt else stmt
2. Non-terminals (V) are syntactic variables that denote sets of strings.
• Non-terminals: stmt, expr
• The sets of strings denoted by non-terminals help define the language generated by the grammar.
• Non-terminals impose a hierarchical structure on the language that is key to syntax analysis and translation.

stmt → if ( expr ) stmt else stmt

3. Start symbol (S): one non-terminal is distinguished as the start symbol.
✔ The set of strings it denotes is the language generated by the grammar.
✔ Conventionally, the productions for the start symbol are listed first.

stmt → if ( expr ) stmt else stmt
4. Productions (P) of a grammar specify the manner in which the terminals and non-terminals can be combined to form strings. Each production consists of:
• (a) A non-terminal called the head or left side of the production; the production defines some of the strings denoted by the head.
• (b) The symbol →. Sometimes ::= has been used in place of the arrow.
• (c) A body or right side consisting of zero or more terminals and non-terminals.
Example: Grammar for simple arithmetic expressions

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id

Terminals: id, +, -, *, /, (, )
Non-terminals: expression, term, factor
Start symbol: expression
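The same grammar can also be written directly as data. This is only a sketch (the variable names are illustrative), but it shows how the terminals fall out as "every symbol that is not the head of some production".

# The arithmetic-expression grammar as plain Python data (a sketch).
# Each non-terminal maps to its list of alternative bodies; a body is a list of symbols.
productions = {
    "expression": [["expression", "+", "term"],
                   ["expression", "-", "term"],
                   ["term"]],
    "term":       [["term", "*", "factor"],
                   ["term", "/", "factor"],
                   ["factor"]],
    "factor":     [["(", "expression", ")"],
                   ["id"]],
}
start_symbol = "expression"
non_terminals = set(productions)                      # {"expression", "term", "factor"}
terminals = {sym for bodies in productions.values()   # every symbol that is not a head
             for body in bodies for sym in body} - non_terminals
print(terminals)                                      # {'id', '+', '-', '*', '/', '(', ')'}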
Notational Conventions
1. Terminals:
- Lowercase letters early in the alphabet: a, b, c.
- Operator symbols: +, *, etc.
- Punctuation symbols: parentheses, comma, etc.
- Digits: 0, 1, ..., 9.
- Boldface strings such as id or if, each of which represents a single terminal symbol.
2. Non-terminals:
- Uppercase letters early in the alphabet: A, B, C.
- The letter S, which, when it appears, is usually the start symbol.
- Lowercase, italic names such as expr, stmt.
- Uppercase letters such as E, T, F.
3. Uppercase letters late in the alphabet, such as X, Y, Z, represent grammar symbols (either terminals or non-terminals).
4. Lowercase letters late in the alphabet, such as u, v, ..., z, represent (possibly empty) strings of terminals.
5. Lowercase Greek letters α, β, δ represent (possibly empty) strings of grammar symbols. Thus, a generic production can be written as A → α, where A is the head and α the body.
6. A set of productions A → α1, A → α2, ..., A → αk with a common head A (call them A-productions) may be written A → α1 | α2 | ... | αk. Call α1, α2, ..., αk the alternatives for A.
7. Unless stated otherwise, the head of the first production is the start symbol.
Example
Construct the CFG for the language having any number of a's over the set Σ = {a}.
Solution:
r.e. = a*
The production rules for this regular expression are:
S → aS      (rule 1)
S → ε       (rule 2)
To derive the string "aaaaaa", we start with the start symbol:
S
⇒ aS        (rule 1)
⇒ aaS       (rule 1)
⇒ aaaS      (rule 1)
⇒ aaaaS     (rule 1)
⇒ aaaaaS    (rule 1)
⇒ aaaaaaS   (rule 1)
⇒ aaaaaa    (rule 2, S → ε)
The r.e. a* generates the set of strings {ε, a, aa, aaa, ...}. We can derive the null string because S is the start symbol and rule 2 gives S → ε.
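The derivation above is mechanical enough to script. A minimal sketch (the function name is illustrative) that applies rule 1 n times and then rule 2:

# Derive a string of n a's from S -> aS | ε: apply rule 1 n times, then rule 2.
def derive_a_star(n):
    form = "S"
    steps = [form]
    for _ in range(n):                        # rule 1: S -> aS
        form = form.replace("S", "aS", 1)
        steps.append(form)
    steps.append(form.replace("S", "", 1))    # rule 2: S -> ε
    return steps

print(" => ".join(derive_a_star(6)))          # S => aS => ... => aaaaaaS => aaaaaa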
Construct a CFG for the language L = { aⁿb²ⁿ | n ≥ 1 }.
Solution:
The strings that can be generated for this language are {abb, aabbbb, aaabbbbbb, ...}.
The grammar could be:

S → aSbb | abb

To derive the string "aabbbb", we start with the start symbol:
S ⇒ aSbb ⇒ aabbbb
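A similar sketch for this grammar (the helper name is illustrative): apply S → aSbb (n - 1) times, then finish with S → abb.

# Derive a^n b^(2n) from S -> aSbb | abb.
def derive_anb2n(n):
    form = "S"
    for _ in range(n - 1):                    # S -> aSbb
        form = form.replace("S", "aSbb", 1)
    return form.replace("S", "abb", 1)        # S -> abb

print(derive_anb2n(2))                        # aabbbb
print(derive_anb2n(3) == "a" * 3 + "b" * 6)   # True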
Example 3:
Construct a CFG for the language L = { wcwᴿ | w ∈ (a, b)* }.
Solution:
The strings that can be generated for this language are {aacaa, bcb, abcba, bacab, abbcbba, ...}.
The grammar could be:
1. S → aSa    (rule 1)
2. S → bSb    (rule 2)
3. S → c      (rule 3)
To derive the string "abbcbba", we start with the start symbol:
S ⇒ aSa        (rule 1)
  ⇒ abSba      (rule 2)
  ⇒ abbSbba    (rule 2)
  ⇒ abbcbba    (rule 3)
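The choice of rule 1 or rule 2 at each step simply mirrors the symbols of w, which makes the derivation easy to sketch in code (the function name is illustrative):

# Derive w c w^R from S -> aSa | bSb | c by following the symbols of w.
def derive_wcwr(w):
    form = "S"
    for ch in w:                                 # rule 1 (for a) or rule 2 (for b)
        form = form.replace("S", ch + "S" + ch, 1)
    return form.replace("S", "c", 1)             # rule 3

print(derive_wcwr("abb"))                        # abbcbba
print(derive_wcwr("ba"))                         # bacab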
Derivation

A derivation is the process of using the production rules of the grammar to derive the input string. At each step, the parser must make two decisions:
1. Which non-terminal to replace. There are two standard options:
   a) Left-most derivation: at each step, the leftmost non-terminal in the sentential form is replaced.
   b) Right-most derivation: at each step, the rightmost non-terminal in the sentential form is replaced.
2. Which production rule to use when replacing that non-terminal.
Example:
Production rules:
• S → S + S
• S → S - S
• S → a | b | c
Input:
a - b + c
The left-most derivation is:
S ⇒ S + S
  ⇒ S - S + S
  ⇒ a - S + S
  ⇒ a - b + S
  ⇒ a - b + c
Example:
Production rules:
• S → S + S
• S → S - S
• S → a | b | c
Input:
a - b + c
The right-most derivation is:
S ⇒ S - S
  ⇒ S - S + S
  ⇒ S - S + c
  ⇒ S - b + c
  ⇒ a - b + c
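The only difference between the two strategies is which occurrence of the non-terminal gets replaced at each step. A small sketch (names illustrative) that reproduces the left-most derivation above:

# Replace the leftmost (or rightmost) occurrence of the non-terminal S with a rule body.
def apply(form, body, leftmost=True):
    i = form.find("S") if leftmost else form.rfind("S")
    return form[:i] + body + form[i + 1:]

# Leftmost derivation of "a - b + c" using S -> S + S | S - S | a | b | c
form = "S"
for body in ["S + S", "S - S", "a", "b", "c"]:
    form = apply(form, body, leftmost=True)
    print(form)
# S + S
# S - S + S
# a - S + S
# a - b + S
# a - b + c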
Parse Tree
• Definition: A hierarchical tree that represents the derivation of a string according to a grammar.
A parse tree has the following properties:
• The root node is always labelled with the start symbol.
• The derivation is read from left to right.
• The leaf nodes are always terminals.
• The interior nodes are always non-terminals.
Example: Illustration of a parse tree for an arithmetic expression.
Input string: id + id * id
Grammar rules:
• E → E + T | T
• T → T * F | F
• F → id | ( E )
Parse tree:
          E
        / | \
       E  +  T
       |    /|\
       T   T * F
       |   |   |
       F   F   id
       |   |
       id  id
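To make the tree concrete, here is a small sketch (the Node class is illustrative, not part of the lecture) that builds the same parse tree by hand and checks that its leaves, read left to right, spell out the input string.

# Build the parse tree for "id + id * id" by hand and read back its leaves.
class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

    def leaves(self):
        if not self.children:                 # a leaf: a terminal symbol
            return [self.label]
        return [leaf for c in self.children for leaf in c.leaves()]

# E -> E + T ; the left E derives "id", the right T derives "id * id"
tree = Node("E", [
    Node("E", [Node("T", [Node("F", [Node("id")])])]),
    Node("+"),
    Node("T", [Node("T", [Node("F", [Node("id")])]),
               Node("*"),
               Node("F", [Node("id")])]),
])

print(" ".join(tree.leaves()))                # id + id * id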
Production rules:
1. E → E + E
2. E → E * E
3. E → a | b | c
Input:
a*b+c

Draw a derivation tree for the string "bab" from the CFG given by
S → bSb | a | b

The derivation tree for the string "bab" is as follows (it corresponds to the derivation S ⇒ bSb ⇒ bab):
Construct a derivation tree for the string "aabbabba" for the CFG given by:
1. S → aB | bA
2. A → a | aS | bAA
3. B → b | bS | aBB
The derivation tree is as follows:
Example
Show the derivation tree for the string "aabbbb" with the following grammar (a brute-force derivation search is sketched below).
1. S → AB | ε
2. A → aB
3. B → Sb
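Finding such a derivation by hand takes a little trial and error. Here is a brute-force sketch (not an efficient parser; the length bound is chosen just for this example) that searches leftmost derivations breadth-first:

from collections import deque

# Grammar: S -> AB | ε, A -> aB, B -> Sb (non-terminals are the uppercase letters)
productions = {"S": ["AB", ""], "A": ["aB"], "B": ["Sb"]}

def find_derivation(target, start="S", max_len=10):
    """Breadth-first search over leftmost derivations; returns one derivation or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        form, steps = queue.popleft()
        if form == target:
            return steps
        # locate the leftmost non-terminal; if none, this form is a dead end
        i = next((k for k, ch in enumerate(form) if ch.isupper()), None)
        if i is None:
            continue
        for body in productions[form[i]]:
            new = form[:i] + body + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append((new, steps + [new]))
    return None

print(" => ".join(find_derivation("aabbbb")))   # prints one leftmost derivation ending in aabbbb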
Parse Tree & Derivation
Example: a leftmost derivation (LMD) of -(id + id).
Assuming the usual expression grammar E → E + E | E * E | - E | ( E ) | id, the leftmost derivation is:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)
Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree for some input string. If the grammar is not ambiguous, it is called unambiguous.
An ambiguous grammar is not suitable for compiler construction. No method can automatically detect and remove ambiguity in general, but we can often eliminate it by rewriting the grammar unambiguously.
Example 1:
Let us consider a grammar G with the production rules
1. E → I
2. E → E + E
3. E → E * E
4. E → ( E )
5. I → ε | 0 | 1 | 2 | ... | 9
For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation:
Example 2:
Check whether the given grammar G is ambiguous or not.
• E → E + E
• E → E - E
• E → id
Derive the string "id + id - id" from the above grammar.

First leftmost derivation:
E ⇒ E + E
  ⇒ id + E
  ⇒ id + E - E
  ⇒ id + id - E
  ⇒ id + id - id

Second leftmost derivation:
E ⇒ E - E
  ⇒ E + E - E
  ⇒ id + E - E
  ⇒ id + id - E
  ⇒ id + id - id

Since "id + id - id" has two distinct leftmost derivations (and therefore two parse trees), the grammar G is ambiguous.
Grammar for mathematical expressions (slides adapted from Prof. Busch, LSU; figures not reproduced here)

The original slides show the expression grammar and example strings it generates (with a non-terminal that denotes any number), a leftmost derivation of an example expression, another leftmost derivation of the same expression, the two resulting derivation trees (a "good" tree and a "bad" tree), and the computation of the expression's result using each tree.
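Those slides make the practical point that an ambiguous grammar's two trees can evaluate to different results. A small sketch, using 3 * 2 + 5 from Example 1 as a stand-in expression since the slide figures are not reproduced:

import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tree):
    """A tree is either a number (leaf) or a (left, op, right) triple (interior node)."""
    if isinstance(tree, tuple):
        left, op, right = tree
        return OPS[op](evaluate(left), evaluate(right))
    return tree

good_tree = ((3, "*", 2), "+", 5)   # groups * below +, matching the usual precedence
bad_tree  = (3, "*", (2, "+", 5))   # the other parse tree groups + below *
print(evaluate(good_tree), evaluate(bad_tree))   # 11 21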


Types of Parsers

• Top-Down Parsing:
• Builds the parse tree from the root (start symbol) down to the leaves.
• Examples: Recursive Descent Parser, LL Parser (see the sketch below).
• Bottom-Up Parsing:
• Builds the parse tree from the leaves up to the root.
• Examples: Shift-Reduce Parser, LR Parser.
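As an illustration of top-down parsing (a sketch, not code from the lecture), here is a minimal recursive-descent parser for the arithmetic grammar used earlier, rewritten without left recursion so it can be parsed top-down; the token list stands in for the output of the lexical analyzer.

# Recursive-descent (top-down) parser for:
#   E -> T { (+|-) T }     T -> F { (*|/) F }     F -> ( E ) | id
def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {peek()!r}")
        pos += 1

    def factor():                            # F -> ( E ) | id
        if peek() == "(":
            eat("("); node = expression(); eat(")")
            return node
        eat("id")
        return "id"

    def term():                              # T -> F { (*|/) F }
        node = factor()
        while peek() in ("*", "/"):
            op = peek(); eat(op)
            node = (op, node, factor())
        return node

    def expression():                        # E -> T { (+|-) T }
        node = term()
        while peek() in ("+", "-"):
            op = peek(); eat(op)
            node = (op, node, term())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse(["id", "+", "id", "*", "id"]))   # ('+', 'id', ('*', 'id', 'id'))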
