Pcdunit2 Continuation

The document discusses the specification and recognition of tokens using regular expressions, defining key concepts such as alphabets, strings, languages, and operations on languages. It explains the rules for constructing regular expressions, provides examples, and illustrates the use of transition diagrams for lexical analysis. Additionally, it contrasts parse trees with syntax trees, highlighting their differences and uses in representing grammatical structures.

Uploaded by

venkat Mohan

Specification and Recognition of Tokens
Regular Expression
• A regular expression is a pattern which
specifies a set of strings of characters; it is
said to match certain strings.
• A regular expression is a declarative way of
defining (describing) a regular language.
Example
• letter ( letter | digit )*
Terminology of Languages
• Alphabet : a finite set of symbols (ASCII characters)
• String:
– Finite sequence of symbols over an alphabet
– "Sentence" and "word" are also used as synonyms for "string"
– ε is the empty string
– |s| is the length of string s
• Language: a set of strings over some fixed alphabet
– ∅, the empty set, is a language
– {ε}, the set containing only the empty string, is a language
• Operators on Strings:
– Concatenation: xy represents the concatenation of strings x and y
– sε = s and εs = s
– s^n = s s s … s (n times); s^0 = ε
Operations on Languages
• Concatenation:
– L1L2 = { s1s2 | s1 ∈ L1 and s2 ∈ L2 }
• Union:
– L1 ∪ L2 = { s | s ∈ L1 or s ∈ L2 }
• Exponentiation:
– L^0 = {ε}, L^1 = L, L^2 = LL
• Kleene Closure
– $L^* = \bigcup_{i=0}^{\infty} L^i$
– L* denotes "zero or more concatenations of" L
• Positive Closure
– $L^+ = \bigcup_{i=1}^{\infty} L^i$
– L+ denotes "one or more concatenations of" L
Example
• L1 = {a,b,c,d} L2 = {1,2}

• L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2}

• L1 ∪ L2 = {a,b,c,d,1,2}

• L1^3 = all strings of length three over {a,b,c,d}

• L1* = all strings over {a,b,c,d}, including the empty string
• L1+ = all strings over {a,b,c,d} except the empty string
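The operations above can be tried out directly on finite languages. The following is a minimal sketch, assuming languages are represented as Python sets of strings and that "" stands for ε; the helper names (concat, power, closure_up_to) are hypothetical. Because L* and L+ are infinite, the sketch only builds the union of powers up to a chosen bound.

```python
from itertools import product

def concat(l1, l2):
    """L1L2 = { s1s2 | s1 in L1 and s2 in L2 }."""
    return {s1 + s2 for s1, s2 in product(l1, l2)}

def power(lang, n):
    """L^0 = {ε}; L^n is L concatenated with itself n times."""
    result = {""}  # "" plays the role of ε
    for _ in range(n):
        result = concat(result, lang)
    return result

def closure_up_to(lang, max_power, positive=False):
    """Finite approximation of L* (or L+): union of L^i for i up to max_power."""
    start = 1 if positive else 0
    result = set()
    for i in range(start, max_power + 1):
        result |= power(lang, i)
    return result

L1 = {"a", "b", "c", "d"}
L2 = {"1", "2"}

print(sorted(concat(L1, L2)))                       # L1L2
print(sorted(L1 | L2))                              # L1 ∪ L2
print(len(power(L1, 3)))                            # 4 * 4 * 4 = 64 strings
print("" in closure_up_to(L1, 2))                   # True: L1* contains ε
print("" in closure_up_to(L1, 2, positive=True))    # False: L1+ does not
```

Union needs no helper because it is exactly the set union operator.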
Regular Set
• A language that can be defined by a
regular expression is called a regular set.
• A language that can be defined by a
context-free grammar is called a context-free language.
• The set of regular sets ⊂ the set of
context-free languages
Rule for Regular Expressions
• The rules that define the regular expressions over
alphabet Σ are as follows.
1. ε is a regular expression denoting {ε}
2. If a is a symbol in Σ, then a is a regular
expression denoting {a}
3. Suppose r and s are regular expressions denoting the
languages L(r) and L(s). Then:
a) (r) | (s) is a regular expression denoting
L(r) ∪ L(s)
b) (r)(s) is a regular expression denoting L(r)L(s)
c) (r)* is a regular expression denoting (L(r))*
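The rules above define L(r) by recursion on the structure of r, which translates almost line for line into code. The sketch below assumes a hypothetical tuple encoding of regular expressions ("eps", "sym", "alt", "cat", "star") and computes only the strings of L(r) up to a length bound, since L(r) may be infinite.

```python
def denote(r, max_len):
    """Strings of L(r) with length <= max_len, following rules 1 to 3."""
    kind = r[0]
    if kind == "eps":                      # rule 1: ε denotes {ε}
        return {""}
    if kind == "sym":                      # rule 2: a denotes {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "alt":                      # rule 3a: L(r) ∪ L(s)
        return denote(r[1], max_len) | denote(r[2], max_len)
    if kind == "cat":                      # rule 3b: L(r)L(s)
        left, right = denote(r[1], max_len), denote(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                     # rule 3c: (L(r))*
        base = denote(r[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:                    # keep concatenating until nothing new fits
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node kind: {kind!r}")

a = ("sym", "a")
b = ("sym", "b")
a_or_b = ("alt", a, b)

print(sorted(denote(("cat", a_or_b, a_or_b), 2)))  # ['aa', 'ab', 'ba', 'bb']
print(sorted(denote(("star", a), 3)))              # ['', 'a', 'aa', 'aaa']
```

The length bound is only there to keep the sets finite; it is not part of the definition.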
Rule for Regular Expressions (Cont.)

Unnecessary parentheses can be avoided in regular expressions if we adopt the following conventions:
1. The unary operator * has the highest precedence and is left associative.
2. Concatenation has the second highest precedence and is left associative.
3. | has the lowest precedence and is left associative.
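Python's re module happens to follow the same precedence conventions, so it can be used to see how a pattern without parentheses is grouped. In this small sketch the sample strings are arbitrary choices; a|bc* is compared with its fully parenthesized reading and with a different grouping.

```python
import re

implicit = re.compile(r"a|bc*")
explicit = re.compile(r"(a)|((b)((c)*))")   # the reading given by the conventions
other = re.compile(r"((a|b)c)*")            # a different grouping, for contrast

samples = ["a", "b", "bc", "bccc", "ac", "acbc", ""]
for s in samples:
    print(repr(s),
          bool(implicit.fullmatch(s)),
          bool(explicit.fullmatch(s)),
          bool(other.fullmatch(s)))
```

The first two columns agree on every sample, while the third differs (for example on "ac"), which shows that the grouping changes the language.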
Rule for Regular Expressions (Example)
• Let Σ = {a, b}
1. a | b denotes {a, b}
2. (a | b)(a | b) denotes {aa, ab, ba, bb}, the set of
all strings of a’s and b’s of length two.
3. a* denotes {ε, a, aa, aaa, …}, the set of all
strings of zero or more a’s.
4. (a | b)* denotes the set of all strings containing
zero or more instances of a or b.
5. a | a*b denotes the set containing the string a and all
strings consisting of zero or more a’s followed by a b.
Notational Shorthands
1. One or more instances: +
– a+ : the set of all strings of one or more a’s
– r+ = r r*, r* = r+ | ε
2. Zero or one instance: ?
– r? = r | ε
digit → 0 | 1 | … | 9
digits → digit+
optional_fraction → (. digits)?
optional_exponent → (E (+ | -)? digits)?
num → digits optional_fraction optional_exponent
3. Character class:
– [abc] = a | b | c
– [a-z] = a | b | … | z
– id → [A-Za-z][A-Za-z0-9]*
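These shorthands map directly onto the syntax of practical regex libraries. The sketch below uses Python's re module, with the regular definitions written as string constants; the constant names and sample inputs are illustrative choices.

```python
import re

# Regular definitions, built up from smaller pieces as on the slide.
DIGITS = r"[0-9]+"                               # digits -> digit+
OPTIONAL_FRACTION = rf"(\.{DIGITS})?"            # (. digits)?
OPTIONAL_EXPONENT = rf"(E[+-]?{DIGITS})?"        # (E (+|-)? digits)?
NUM = re.compile(DIGITS + OPTIONAL_FRACTION + OPTIONAL_EXPONENT)
ID = re.compile(r"[A-Za-z][A-Za-z0-9]*")         # character classes

for text in ["42", "3.14", "6.02E+23", "1E5", ".5", "1.", "x1", "1x"]:
    is_num = bool(NUM.fullmatch(text))
    is_id = bool(ID.fullmatch(text))
    print(f"{text!r}: num={is_num} id={is_id}")
```

With these definitions ".5" and "1." are not numbers, because digits requires at least one digit on each side of the point.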
Recognition of Tokens
• Consider the following grammar fragment:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | ε
expr → term relop term
     | term
term → id
     | num
Recognition of Tokens (Cont.)
• The regular definitions for tokens are as follows:
if → if
then → then
else → else
relop → < | <= | = | <> | > | >=
id → letter (letter | digit)*
num → digit+ (. digit+)? (E (+ | -)? digit+)?
delim → blank | tab | newline
ws → delim+
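The regular definitions above are enough to drive a small scanner. The following sketch uses Python's re module rather than hand-built transition diagrams. The names TOKEN_SPEC, KEYWORDS and tokenize are hypothetical, and treating keywords as identifiers that are looked up afterwards is one common design choice, not something the slides prescribe.

```python
import re

KEYWORDS = {"if", "then", "else"}

# One named group per token class; longer relops are listed before their prefixes.
TOKEN_SPEC = [
    ("ws",    r"[ \t\n]+"),
    ("num",   r"[0-9]+(?:\.[0-9]+)?(?:E[+-]?[0-9]+)?"),
    ("id",    r"[A-Za-z][A-Za-z0-9]*"),
    ("relop", r"<=|<>|<|>=|>|="),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (token, lexeme) pairs; white space is skipped."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "ws":
            continue                      # ws is recognized but not returned
        if kind == "id" and lexeme in KEYWORDS:
            kind = lexeme                 # if, then, else are their own tokens
        tokens.append((kind, lexeme))
    return tokens

for token in tokenize("if x1 <= 10 then y = 2.5E3 else y <> 0"):
    print(token)
```

Each call to MASTER.match takes the longest lexeme the matching alternative allows, which mirrors how the scanner keeps reading until no further edge applies.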
Regular-expression Patterns for Tokens
Transition Diagrams
• A lexical analyzer uses a transition diagram to
keep track of information about the characters
that are seen as the forward pointer scans
the input.
• Positions in a transition diagram are drawn
as circles and are called states. The states
are connected by arrows, called edges.
• A double circle indicates an accepting state,
a state in which a token has been found.
• A * on an accepting state indicates that input
retraction must take place.
Transition Diagrams for >=

• Start state: state 0 in the diagram.

• If the input character is >, go to state 6.
• "other" refers to any character that is not
indicated by any of the other edges leaving state s.
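A transition diagram can be coded as a loop over a state variable. The sketch below covers all six relational operators. States 0 and 6 come from the text above; the remaining state numbers and the attribute names (LT, LE, EQ, NE, GT, GE) are assumptions that follow the usual textbook numbering. Retraction is modeled by not consuming the "other" character.

```python
def relop(text, pos=0):
    """Return (attribute, next_pos) for a relational operator at text[pos:], or None."""
    state = 0
    forward = pos                      # the forward pointer
    while True:
        # "" stands for end of input and is treated as an "other" character.
        ch = text[forward] if forward < len(text) else ""
        if state == 0:
            if ch == "<":
                state = 1
            elif ch == "=":
                return ("EQ", forward + 1)      # assumed state 5
            elif ch == ">":
                state = 6
            else:
                return None                     # no relop starts here
            forward += 1
        elif state == 1:
            if ch == "=":
                return ("LE", forward + 1)      # assumed state 2
            if ch == ">":
                return ("NE", forward + 1)      # assumed state 3
            return ("LT", forward)              # assumed state 4*: retract
        elif state == 6:
            if ch == "=":
                return ("GE", forward + 1)      # assumed state 7
            return ("GT", forward)              # assumed state 8*: retract

for sample in [">=", ">5", "<>", "<", "=", "x"]:
    print(repr(sample), relop(sample))
```

For ">5" the result is ("GT", 1): the character 5 was examined but left in the input for the next token, which is exactly what the * marking means.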
Transition Diagrams for Relational Operators

[Figure: transition diagram for relational operators; table columns: token, attribute-value]
Transition Diagrams for Identifiers and Keywords

Transition Diagrams for White Space

Transition Diagrams for Unsigned Numbers

[Figure: transition diagrams for unsigned numbers; the accepting states are labeled install_num()]
Write an RE for the language L accepting all strings
ending with 00 over the alphabet Σ = {0, 1}.
(0|1)*00

Construct an RE for the language L accepting the strings that
start with 1 and end with 0 over the alphabet Σ = {0, 1}.
R = 1(0|1)*0

Write an RE to denote a language over Σ = {a, b, c}
such that every string has at least one a,
followed by at least one b, followed by at least one c.
a+b+c+
Write an RE to denote a language L over Σ = {a, b} such that the third character from the
right end of the string is always a.

(a|b)*a(a|b)(a|b)

Construct an RE for the language L that accepts all strings with at least two b’s over
the alphabet Σ = {a, b}.

(a|b)*b(a|b)*b(a|b)*

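Write an RE that denotes the language over Σ = {1} of all strings of odd length.

The even-length strings ε, 11, 1111, 111111, … are denoted by (11)*,
so the odd-length strings are denoted by 1(11)*.
Construct an RE for the language L over Σ = {a, b} in which the
total number of a’s is divisible by 3.
Over {a} alone, (aaa)* gives a number of a’s divisible by 3 (zero included).
Allowing b’s anywhere between the a’s: R = b*(ab*ab*ab*)*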
Write an RE for the language of strings over {a, b, c} that
contain exactly one b.
R = (a|c)*b(a|c)*

Write an RE for the language of strings over {a, b, c} that
contain no two consecutive b’s.
(b|ε)(a|c|ab|cb)*
Write an RE for the language of strings over the alphabet
{a, b, c} containing an even number of a’s.
((b|c)*a(b|c)*a)*(b|c)*
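Answers to exercises like these are easy to get subtly wrong, so it helps to check them by brute force: enumerate every string over the alphabet up to some length and compare the regular expression against a direct statement of the property. The sketch below does this with Python's re module. The helper names and the length bound are arbitrary choices, union is written as |, and (b|) stands for (b|ε). A clean run up to a bound is evidence, not a proof.

```python
import re
from itertools import product

def strings(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def mismatches(alphabet, pattern, predicate, max_len=7):
    """Strings on which the pattern and the intended property disagree."""
    compiled = re.compile(pattern)
    return [s for s in strings(alphabet, max_len)
            if bool(compiled.fullmatch(s)) != predicate(s)]

EXERCISES = [
    ("ends with 00", "01", r"(0|1)*00", lambda s: s.endswith("00")),
    ("starts with 1, ends with 0", "01", r"1(0|1)*0",
     lambda s: s.startswith("1") and s.endswith("0")),
    ("third from the right is a", "ab", r"(a|b)*a(a|b)(a|b)",
     lambda s: len(s) >= 3 and s[-3] == "a"),
    ("at least two b's", "ab", r"(a|b)*b(a|b)*b(a|b)*", lambda s: s.count("b") >= 2),
    ("odd length over {1}", "1", r"1(11)*", lambda s: len(s) % 2 == 1),
    ("number of a's divisible by 3", "ab", r"b*(ab*ab*ab*)*",
     lambda s: s.count("a") % 3 == 0),
    ("exactly one b", "abc", r"(a|c)*b(a|c)*", lambda s: s.count("b") == 1),
    ("no two consecutive b's", "abc", r"(b|)(a|c|ab|cb)*", lambda s: "bb" not in s),
    ("even number of a's", "abc", r"((b|c)*a(b|c)*a)*(b|c)*",
     lambda s: s.count("a") % 2 == 0),
]

for name, alphabet, pattern, predicate in EXERCISES:
    print(name, "->", mismatches(alphabet, pattern, predicate) or "no mismatches")

# A closure with + instead of * misses the strings that contain no a at all.
print(mismatches("ab", r"(b*ab*ab*ab*)+", lambda s: s.count("a") % 3 == 0, 4))
# ['', 'b', 'bb', 'bbb', 'bbbb']
```

The last line illustrates why the divisible-by-3 answer has to admit zero repetitions: zero a's is also a multiple of three.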
Describe the languages denoted by the following REs:
1. 0(0|1)*0
2. (0|1)*0(0|1)(0|1)
3. 0*10*10*10* (or, over {a, b}, a*ba*ba*ba*)
4. (00|11)*((01|10)(00|11)*(01|10)(00|11)*)*

1. The set of all strings of 0’s and 1’s, of length at least two, that start and end with 0.

2. The set of all strings of 0’s and 1’s whose third symbol from the right end is 0.

3. The set of all strings of 0’s and 1’s that contain exactly three 1’s.

4. The set of all strings of 0’s and 1’s with an even number of 0’s and an even number of 1’s.
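The description "even number of 0's and even number of 1's" can be cross-checked against a four-state automaton whose state records the two parities. The sketch below compares such an automaton with the usual pair-based regular expression for this language, (00|11)*((01|10)(00|11)*(01|10)(00|11)*)*, on all strings up to length 10. That form of the expression, the function names and the length bound are assumptions of this sketch.

```python
import re
from itertools import product

# Pair-based pattern: 00 and 11 keep both parities, 01 and 10 flip both,
# so the mixed pairs must occur an even number of times.
EVEN_EVEN = re.compile(r"(00|11)*((01|10)(00|11)*(01|10)(00|11)*)*")

def even_zeros_and_ones(s):
    """Four-state DFA: the state is (parity of 0's, parity of 1's)."""
    zeros, ones = 0, 0
    for ch in s:
        if ch == "0":
            zeros ^= 1
        else:
            ones ^= 1
    return (zeros, ones) == (0, 0)      # accepting state: both counts even

def all_strings(max_len):
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

disagreements = [s for s in all_strings(10)
                 if bool(EVEN_EVEN.fullmatch(s)) != even_zeros_and_ones(s)]
print(disagreements)   # an empty list means the two agree up to length 10
```

A string such as 01 has even length but odd counts of both symbols, so any expression that accepts every even-length string does not describe this language.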

(0|1)*0((0|1)(0|1)(0|1))*0(0|1)*

Parse Trees Vs Syntax Trees
Syntax Trees
Syntax trees are abstract, compact representations of parse trees. They are also called
Abstract Syntax Trees.

Parse Tree | Syntax Tree
A parse tree is a graphical representation of the replacement process in a derivation. | A syntax tree is the compact form of a parse tree.
Each interior node represents a grammar rule. | Each interior node represents an operator.
Each leaf node represents a terminal. | Each leaf node represents an operand.
Parse trees provide every characteristic detail of the real syntax. | Syntax trees do not provide every characteristic detail of the real syntax.
Parse trees are comparatively less dense than syntax trees. | Syntax trees are comparatively more dense than parse trees.
Consider the following grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
( a + b ) * ( c - d ) + ( ( e / f ) * ( a + b ) )
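The difference between the two trees shows up clearly when both are built for the same input. The sketch below is a small hand-written parser for the grammar above, restricted to + and * as in the productions, so the - and / of the sample expression are not handled. The left-recursive rules are implemented with loops, and the parse tree is assembled as the left-recursive derivation would produce it. The tuple representation and all function names are hypothetical.

```python
import re

def tokenize(text):
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|[+*()]", text)
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("input contains a symbol outside the grammar")
    return tokens

class Parser:
    """Every parsing method returns a pair (parse_tree, syntax_tree)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):
        parse, ast = self.term()
        parse = ("E", parse)                        # E -> T
        while self.peek() == "+":
            self.take("+")
            right_parse, right_ast = self.term()
            parse = ("E", parse, "+", right_parse)  # E -> E + T
            ast = ("+", ast, right_ast)
        return parse, ast

    def term(self):
        parse, ast = self.factor()
        parse = ("T", parse)                        # T -> F
        while self.peek() == "*":
            self.take("*")
            right_parse, right_ast = self.factor()
            parse = ("T", parse, "*", right_parse)  # T -> T * F
            ast = ("*", ast, right_ast)
        return parse, ast

    def factor(self):
        if self.peek() == "(":
            self.take("(")
            parse, ast = self.expr()
            self.take(")")
            return ("F", "(", parse, ")"), ast      # parentheses vanish from the AST
        name = self.take()
        if not name[0].isalpha():
            raise SyntaxError(f"expected id, got {name!r}")
        return ("F", name), name                    # F -> id

def parse(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return result

def count_nodes(tree):
    """A string is a leaf; a tuple is a labeled node followed by its children."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(count_nodes(child) for child in tree[1:])

parse_tree, syntax_tree = parse("(a+b)*c")
print(syntax_tree)                                         # ('*', ('+', 'a', 'b'), 'c')
print(count_nodes(parse_tree), count_nodes(syntax_tree))   # 18 5
```

For (a+b)*c the parse tree has 18 nodes (every E, T, F and parenthesis is kept), while the syntax tree has 5: two operators and three operands.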
