Specification of Tokens Using Regular Expressions

Regular expressions can be used to specify patterns for tokens. They are built recursively from smaller expressions using rules like:
- A symbol denotes the language containing that single one-character string
- Union (|) combines alternative languages
- Concatenation (.) combines sequential languages
- Kleene star (*) represents zero or more repetitions
Precedence is * highest, then concatenation, then union. Regular expressions define languages, and names can be given to subexpressions for readability.

Uploaded by

Sam Alex

SPECIFICATION OF TOKENS USING REGULAR EXPRESSIONS
• Strings and Languages
– An alphabet is any finite set of symbols.
– Examples of symbols are letters, digits, and punctuation.
– The set {0,1} is the binary alphabet.
• A string over an alphabet is a finite sequence of symbols drawn from that alphabet.
– "Sentence" and "word" are often used as synonyms for "string."
– The empty string, denoted ε, is the string of length zero.
• A language is any countable set of strings over some fixed alphabet.
– The empty set ∅ and the set {ε}, containing only the empty string, are both languages.
REGULAR EXPRESSION
Regular expressions are an important notation for specifying lexeme patterns. They are effective in specifying the types of patterns that we need for tokens.

The regular expression for identifiers is

identifier = letter (letter | digit)*
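One way to read the identifier pattern is as a direct check on a string: the first symbol must be a letter, and every later symbol must be a letter or a digit. The sketch below spells that out by hand, assuming "letter" means an ASCII letter (the slides do not say which letters are intended).

```python
import string

LETTERS = set(string.ascii_letters)   # assumption: ASCII letters only
DIGITS = set(string.digits)

def is_identifier(text):
    """Check text against: identifier = letter (letter | digit)*"""
    if not text or text[0] not in LETTERS:
        return False                   # must start with a letter
    # (letter | digit)* : zero or more letters or digits may follow
    return all(ch in LETTERS or ch in DIGITS for ch in text[1:])

for candidate in ["x", "rate60", "60rate", ""]:
    print(repr(candidate), is_identifier(candidate))
```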
– Regular expressions are built recursively out of smaller regular expressions, using the rules described below.
– Regular expression construction rules
• ε is a regular expression denoting {ε}, that is, the language containing only the empty string.
• If a is a symbol in Σ (the alphabet), then a is a regular expression denoting {a}, the language containing only the one-symbol string a.
• If r and s are regular expressions denoting languages L(r) and L(s) respectively, then
» (r)|(s) is a regular expression denoting L(r) ∪ L(s)
» (r).(s) is a regular expression denoting L(r)L(s), the concatenation of the two languages
» (r)* is a regular expression denoting (L(r))*
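The three language operations behind these rules can be sketched on finite sets of strings. Because (L(r))* is infinite whenever L(r) contains a non-empty string, the `star` helper below only approximates it by capping the number of repetitions; the function names and the `max_reps` parameter are illustrative choices, not part of the slides.

```python
def union(L1, L2):
    """L1 ∪ L2: strings that belong to either language."""
    return L1 | L2

def concat(L1, L2):
    """L1 L2: every string of L1 followed by every string of L2."""
    return {x + y for x in L1 for y in L2}

def star(L, max_reps):
    """Bounded approximation of L*: at most max_reps strings from L, concatenated."""
    result = {""}              # zero repetitions gives the empty string
    current = {""}
    for _ in range(max_reps):
        current = concat(current, L)
        result |= current
    return result

A, B = {"a"}, {"b"}
print(sorted(union(A, B)))                         # language of a|b
print(sorted(concat(union(A, B), union(A, B))))    # language of (a|b)(a|b)
print(sorted(star(A, 3)))                          # a* cut off at three repetitions
```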
• Precedence of operations
» The unary operator * has the highest precedence and is left associative.
» Concatenation has the second highest precedence and is left associative.
» | has the lowest precedence and is left associative.
» Under these conventions, (a)|((b)*(c)) can be written as a|b*c.
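Python's `re` module happens to use the same relative precedence for these three operators, so it can serve as a convenient (if not formally identical) stand-in for checking how an unparenthesized expression groups. The sketch below compares `a|bc*` with its fully parenthesized form and with a different grouping; the sample strings are arbitrary.

```python
import re

implicit = re.compile(r"a|bc*")              # relies on precedence rules
explicit = re.compile(r"(a)|((b)((c)*))")    # the grouping those rules imply
grouped_union = re.compile(r"((a|b)c)*")     # a different grouping, for contrast

samples = ["", "a", "b", "bc", "bcc", "ac", "bcbc"]
for s in samples:
    print(
        repr(s),
        bool(implicit.fullmatch(s)),
        bool(explicit.fullmatch(s)),
        bool(grouped_union.fullmatch(s)),
    )
```

The first two columns agree on every sample, while the third differs on strings such as "ac" and "bcbc".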
• For any regular expressions R, S and T, the following axioms hold:
» R|S = S|R (| is commutative)
» R|(S|T) = (R|S)|T (| is associative)
» R(ST) = (RS)T (concatenation is associative)
» R(S|T) = RS|RT (concatenation distributes over |)
» εR = Rε = R (ε is the identity for concatenation)
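These laws can be spot-checked on small finite languages, treating | as set union and concatenation as pairwise string joining. This is only a sanity check on one arbitrary choice of R, S and T, not a proof; the sample languages are made up for the example.

```python
def concat(L1, L2):
    """Concatenation of two languages given as sets of strings."""
    return {x + y for x in L1 for y in L2}

# Arbitrary finite languages standing in for L(R), L(S) and L(T).
R, S, T = {"a", "ab"}, {"b"}, {"", "c"}
EPS = {""}  # the language denoted by ε

laws = {
    "union commutative": (R | S) == (S | R),
    "union associative": (R | (S | T)) == ((R | S) | T),
    "concat associative": concat(R, concat(S, T)) == concat(concat(R, S), T),
    "concat distributes": concat(R, S | T) == (concat(R, S) | concat(R, T)),
    "epsilon identity": concat(EPS, R) == R == concat(R, EPS),
}
for name, holds in laws.items():
    print(name, holds)

# Concatenation is not commutative in general.
print(concat(R, S) == concat(S, R))
```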
• The regular expression a|b denotes the language {a, b}.
• (a|b)(a|b) denotes {aa, ab, ba, bb}, the set of all strings of length two over the alphabet {a, b}.
• a* denotes the language consisting of all strings of zero or more a's, that is, {ε, a, aa, aaa, ...}.
• (a|b)* denotes the set of all strings consisting of zero or more instances of a or b, that is, all strings over {a, b}: {ε, a, b, aa, ab, ba, bb, ...}.
• a|a*b denotes the language {a, b, ab, aab, aaab, ...}, that is, the string a together with all strings of zero or more a's followed by b.
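The example languages above can be listed mechanically by trying every string over {a, b} up to some length against each expression. The sketch below does this with Python's `re.fullmatch`; the helper `language` and its `max_len` cutoff are assumptions of this example, needed because most of these languages are infinite.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=3):
    """List, shortest first, the strings up to max_len that the pattern matches."""
    compiled = re.compile(pattern)
    matches = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if compiled.fullmatch(s):
                matches.append(s)
    return matches

print(language("a|b"))                 # ['a', 'b']
print(language("(a|b)(a|b)"))          # ['aa', 'ab', 'ba', 'bb']
print(language("a*"))                  # ['', 'a', 'aa', 'aaa']
print(language("(a|b)*", max_len=2))   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(language("a|a*b"))               # ['a', 'b', 'ab', 'aab']
```

The empty string `''` in the output plays the role of ε.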
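REGULAR DEFINITION
• We may wish to give names to certain regular expressions and use those names in subsequent expressions as if the names were themselves symbols.
• A regular definition is a sequence of definitions of the form d_i -> r_i, where each d_i is a new name and each r_i is a regular expression over the alphabet and the previously defined names.
• For example, for the language of C identifiers:
  letter_ -> A|B|…|Z|a|b|…|z|_
  digit -> 0|1|…|9
  id -> letter_(letter_|digit)*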
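A regular definition can be mimicked in Python by building pattern strings from earlier named pattern strings. In the sketch below the names `letter_`, `digit` and `id_` mirror the definition above (`id_` carries a trailing underscore only because `id` is a Python built-in), and the elided ranges are filled with the ASCII letters and digits, which is an assumption about what the ellipses stand for.

```python
import re
import string

# letter_ -> A|B|...|Z|a|b|...|z|_   (assumed to mean the ASCII letters plus underscore)
letter_ = "(" + "|".join(string.ascii_letters + "_") + ")"
# digit -> 0|1|...|9
digit = "(" + "|".join(string.digits) + ")"
# id -> letter_(letter_|digit)*   built from the two names defined above
id_ = f"{letter_}({letter_}|{digit})*"

ID = re.compile(id_)

for lexeme in ["_count1", "total", "9lives"]:
    print(lexeme, ID.fullmatch(lexeme) is not None)
```

Each name is wrapped in parentheses so that substituting it into a later expression cannot change how the operators group.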
EXTENSION OF REGULAR EXPRESSION
• + one or more instances
• * zero or more instances
• ? zero or one instance
• [ ] character classes, e.g. [a-z]

• ws -> (blank|tab|newline)+
• When ws is recognized, the lexical analyzer returns nothing to the parser; it restarts scanning at the character following the white space.
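The whitespace rule can be sketched as a small scanner loop: try the token patterns at the current position, and when the match is ws, emit nothing and continue from the character after it. The token set below (`ws`, `id`, `number`) and the function `tokenize` are hypothetical, chosen only to exercise the +, * and [ ] notations; a real lexer would define many more tokens.

```python
import re

# Hypothetical token patterns written with the extended notation.
TOKEN_SPEC = [
    ("ws", r"[ \t\n]+"),                 # (blank|tab|newline)+
    ("id", r"[A-Za-z_][A-Za-z_0-9]*"),   # letter_(letter_|digit)*
    ("number", r"[0-9]+"),               # digit+
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return (token name, lexeme) pairs, discarding white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "ws":      # ws is recognized but nothing is returned
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()                # restart at the character after the lexeme
    return tokens

print(tokenize("x1 \t 42\nrate"))
```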
