Lec 4

Uploaded by

Mohammad Humayun

 Specification of Tokens

 Strings and Languages

 Operations on Languages
 Regular Expressions
 Regular Definitions
 Extensions of Regular Expressions

Specification of Tokens
 Before specifying tokens, we first need to distinguish finite sets
from infinite sets and to use the notion of a countable set.

 A countable set is either a finite set or an infinite set whose
elements can be listed one by one, i.e., put in one-to-one
correspondence with the natural numbers.

 The set of rational numbers, i.e., fractions in lowest terms, is
countable.

 The set of real numbers is uncountable: it is strictly bigger than
the set of natural numbers, so no such listing exists.
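One way to see that the rationals are countable is to list them. The sketch below is an illustration only: the function name `positive_rationals` is hypothetical, and it is restricted to positive fractions for brevity. It visits fractions p/q in order of increasing p + q, so every fraction in lowest terms is reached after finitely many steps.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def positive_rationals():
    """Yield every positive fraction in lowest terms exactly once."""
    total = 2  # current value of p + q
    while True:
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:  # skip fractions that are not in lowest terms
                yield Fraction(p, q)
        total += 1


# The listing never ends, so we only look at a finite piece of it.
first_ten = list(islice(positive_rationals(), 10))
print(first_ten)
```

No such listing is possible for the real numbers, which is what uncountable means.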

Strings and Languages
 Strings and languages rest on a handful of definitions:

 Def: An alphabet is a finite set of symbols.
Ex: {0,1}, presumably ∅ (the empty set), ASCII, Unicode, EBCDIC.

 Def: A string over an alphabet is a finite sequence of symbols from
that alphabet. Strings are often called words or sentences.
The empty string, which has no symbols, is written ε.
Ex: Strings over {0,1}: ε, 0, 1, 111010.
Strings over ASCII: ε, system, the string consisting of 3 blanks.
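The definitions above can be tried out directly. The following is a minimal sketch, assuming an alphabet is modelled as a Python set of one-character strings and ε as the empty string `""`; the helper name `strings_over` is hypothetical. It lists the strings over an alphabet shortest first, which also shows that the set of all strings over a finite alphabet is countable.

```python
from itertools import count, islice, product


def strings_over(alphabet):
    """Yield all strings over `alphabet`, shortest first (ε is '')."""
    symbols = sorted(alphabet)
    if not symbols:
        yield ""  # over the empty alphabet, ε is the only string
        return
    for length in count(0):
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


# The first few strings over the alphabet {0,1}.
first_seven = list(islice(strings_over({"0", "1"}), 7))
print(first_seven)
```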

Strings and Languages..
 Def: A language over an alphabet is a countable set of strings over
the alphabet.
Ex: The set of all grammatical English sentences with five, eight, or
twelve words is a language over ASCII.

 Def: The concatenation of strings s and t is the string formed by
appending t to s. It is written st.
Ex: εs = sε = s for any string s.

 Def: The length of a string is the number of symbols (counting
duplicates) in the string.
Ex: The length of vciit, written |vciit|, is 5.
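As a quick check of concatenation and length, here is a small sketch that assumes Python strings stand in for strings over ASCII, with `+` as concatenation and `len` as length. The names `EPSILON`, `s`, `t`, and `st` are chosen for this illustration.

```python
EPSILON = ""  # the empty string ε

s, t = "lexical", "analysis"
st = s + t  # the concatenation st

# ε is the identity for concatenation: εs = sε = s.
identity_holds = (EPSILON + s == s) and (s + EPSILON == s)

# Lengths add up under concatenation: |st| = |s| + |t|.
lengths_add = len(st) == len(s) + len(t)

print(st, identity_holds, lengths_add, len("vciit"))
```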

Strings and Languages...
 Def: A prefix of string s is any string obtained by removing zero or
more symbols from the end of s.
Ex: ban, banana, and ε are prefixes of banana.

 Def: A suffix of string s is any string obtained by removing zero or
more symbols from the beginning of s.
Ex: nana, banana, and ε are suffixes of banana.

 Def: A substring of s is any string obtained by deleting any prefix
and any suffix from s.
Ex: banana, nan, and ε are substrings of banana.

Strings and Languages...
 Def: The proper prefixes, suffixes, and substrings of a string s are
those prefixes, suffixes, and substrings, respectively, of s that
are neither ε nor equal to s itself.
Ex: ban is a proper prefix of banana.

 Def: A subsequence of s is any string formed by deleting zero or
more not necessarily consecutive positions of s.
Ex: baan is a subsequence of banana.
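The definitions of prefix, suffix, substring, proper part, and subsequence translate almost word for word into code. This is a sketch under the assumption that strings are Python `str` values and ε is `""`; all function names are hypothetical.

```python
def prefixes(s):
    """Remove zero or more symbols from the end of s."""
    return {s[:i] for i in range(len(s) + 1)}


def suffixes(s):
    """Remove zero or more symbols from the beginning of s."""
    return {s[i:] for i in range(len(s) + 1)}


def substrings(s):
    """Delete any prefix and any suffix from s."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def proper(parts, s):
    """Keep only the parts that are neither ε nor s itself."""
    return parts - {"", s}


def is_subsequence(u, s):
    """True if u can be formed by deleting positions of s."""
    remaining = iter(s)
    # `ch in remaining` consumes the iterator up to the first match,
    # so the symbols of u must appear in s in the same order.
    return all(ch in remaining for ch in u)


print(sorted(prefixes("banana"), key=len))
print(is_subsequence("baan", "banana"))
```

Note that baan is a subsequence of banana but not a substring, because the deleted positions are not all at the two ends.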

Operations on Languages
 In lexical analysis, the most important operations on languages are
union, concatenation, and closure, which are defined as follows:

 The union of L1 and L2, written L1 ∪ L2, is simply the set-theoretic
union, i.e., it consists of all words (strings) in either L1 or L2.
Ex: The union of {English sentences with one, three, or five words} with
{English sentences with two or four words} is {English sentences with five
or fewer words}.

Operations on Languages..
 The concatenation of L1 and L2 is the set of all strings st, where s is
a string of L1 and t is a string of L2.
Ex: The concatenation of {a,b,c} and {1,2} is {a1,a2,b1,b2,c1,c2}.

 As with strings, it is natural to define powers of a language L:
L^0 = {ε}, which is not ∅, and L^(i+1) = L^i L.
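Union, concatenation, and powers of languages can be sketched with Python sets. This assumes finite languages represented as sets of `str`, with ε as `""`; the function names `union`, `concat`, and `power` are hypothetical.

```python
def union(l1, l2):
    """All strings that are in either language."""
    return l1 | l2


def concat(l1, l2):
    """All strings st with s taken from l1 and t taken from l2."""
    return {s + t for s in l1 for t in l2}


def power(lang, i):
    """The i-th power of lang: i copies of lang concatenated together."""
    result = {""}  # the zeroth power is {ε}, not the empty set
    for _ in range(i):
        result = concat(result, lang)
    return result


print(sorted(concat({"a", "b", "c"}, {"1", "2"})))
print(sorted(power({"a", "b"}, 2)))
```

The difference between {ε} and ∅ shows up in concatenation: concatenating with {ε} leaves a language unchanged, while concatenating with ∅ gives ∅.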

Operations on Languages..
 The (Kleene) closure of L, denoted L*, is L^0 ∪ L^1 ∪ L^2 ∪ ...,
where L^i is the i-th power of L.

 The positive closure of L, denoted L+, is L^1 ∪ L^2 ∪ ...

 Ex:
{a,b}* is {ε,a,b,aa,ab,ba,bb,aaa,aab,aba,abb,baa,bab,bba,bbb,...}
{a,b}+ is {a,b,aa,ab,ba,bb,aaa,aab,aba,abb,baa,bab,bba,bbb,...}
{ε,a,b}* is {ε,a,b,aa,ab,ba,bb,...}.
{ε,a,b}+ is the same as {ε,a,b}*.
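Closures are usually infinite, so a program can only list the part of them up to some length. The sketch below assumes finite languages as Python sets of `str` and ε as `""`; the name `closure_up_to` and its `max_len` and `positive` parameters are hypothetical. It builds the powers of L one at a time and stops as soon as a power contributes no new string within the length bound.

```python
def concat(l1, l2):
    """All strings st with s taken from l1 and t taken from l2."""
    return {s + t for s in l1 for t in l2}


def closure_up_to(lang, max_len, positive=False):
    """Strings of L* (or L+ if positive) whose length is at most max_len."""
    result = set() if positive else {""}  # L* starts from the zeroth power {ε}
    layer = {""}  # current power of L, cut off at max_len
    while True:
        layer = {w for w in concat(layer, lang) if len(w) <= max_len}
        if layer <= result:  # this power adds nothing new, so later ones cannot
            break
        result |= layer
    return result


star = closure_up_to({"a", "b"}, 2)
plus = closure_up_to({"a", "b"}, 2, positive=True)
print(sorted(star, key=lambda w: (len(w), w)))
print(sorted(plus, key=lambda w: (len(w), w)))
```

The only difference between the two results is ε, unless ε already belongs to L, in which case L+ and L* coincide.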

Regular Expressions
 A regular expression is a sequence of characters that forms a
search pattern, mainly for use in pattern matching with strings.

 The regular expressions over an alphabet consist of the symbols of
the alphabet together with expressions built from them using union,
concatenation, and closure (*); the precise definition is recursive.

 Each regular expression r denotes a language L(r), which is also
defined recursively from the languages denoted by r's
subexpressions.

Regular Expressions..
 The rules that define the regular expressions over some alphabet Σ
and the languages that those expressions denote are:

 BASIS: There are two rules that form the basis:

1. ε is a regular expression, and L(ε) is {ε}, that is, the
language whose sole member is the empty string.

2. If a is a symbol in Σ, then a is a regular expression, and
L(a) = {a}, that is, the language with one string, of length
one, with a in its one position.

Regular Expressions...
 INDUCTION: There are four parts to the induction whereby larger
regular expressions are built from smaller ones.

Suppose r and s are regular expressions denoting languages L(r)
and L(s), respectively.

1. (r) | (s) is a regular expression denoting the language L(r) ∪ L(s).
2. (r)(s) is a regular expression denoting the language L(r)L(s).
3. (r)* is a regular expression denoting (L(r))*.
4. (r) is a regular expression denoting L(r).
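The basis and induction rules can be mirrored by a small recursive data type and a recursive function that computes L(r). This is a sketch under stated assumptions, not a standard API: the class names `Epsilon`, `Symbol`, `Alt`, `Cat`, `Star` and the function `denote` are hypothetical, and because L(r) may be infinite the function returns only the strings of length at most `max_len`. Rule 4 needs no class of its own, since parentheses only group and the tree structure already records the grouping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epsilon:  # basis rule 1: ε
    pass


@dataclass(frozen=True)
class Symbol:  # basis rule 2: a symbol a of the alphabet
    char: str


@dataclass(frozen=True)
class Alt:  # induction rule 1: (r) | (s)
    left: object
    right: object


@dataclass(frozen=True)
class Cat:  # induction rule 2: (r)(s)
    left: object
    right: object


@dataclass(frozen=True)
class Star:  # induction rule 3: (r)*
    inner: object


def denote(r, max_len):
    """Return the strings of L(r) whose length is at most max_len."""
    if isinstance(r, Epsilon):
        return {""}
    if isinstance(r, Symbol):
        return {r.char} if max_len >= 1 else set()
    if isinstance(r, Alt):
        return denote(r.left, max_len) | denote(r.right, max_len)
    if isinstance(r, Cat):
        left, right = denote(r.left, max_len), denote(r.right, max_len)
        return {s + t for s in left for t in right if len(s + t) <= max_len}
    if isinstance(r, Star):
        base = denote(r.inner, max_len)
        result, frontier = {""}, {""}
        while frontier:  # keep extending only the newly found strings
            frontier = {
                s + t for s in frontier for t in base if len(s + t) <= max_len
            } - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


a, b = Symbol("a"), Symbol("b")
print(sorted(denote(Cat(Alt(a, b), Alt(a, b)), 2)))
print(sorted(denote(Star(a), 3), key=len))
```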

Regular Expressions...
 Regular expressions often contain unnecessary pairs of
parentheses. We may drop certain pairs of parentheses if we
adopt the conventions that:

a) The unary operator * has highest precedence and is left associative.
b) Concatenation has second highest precedence and is left associative.
c) | has lowest precedence and is left associative.

Ex: Under these conventions, (a)|((b)*(c)) may be written a|b*c.
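The effect of these conventions can be checked with Python's `re` module, used here only as a convenient stand-in: its pattern syntax is much richer than the notation on these slides, but for `*`, concatenation, and `|` it follows the same precedence. The helper name `matched` and the test alphabet {a, b, c} are assumptions made for this illustration.

```python
import re
from itertools import product


def matched(pattern, alphabet="abc", max_len=3):
    """All strings over `alphabet` up to max_len that the pattern matches fully."""
    compiled = re.compile(pattern)
    words = (
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
    )
    return {w for w in words if compiled.fullmatch(w)}


without_parens = matched("a|b*c")
fully_parenthesized = matched("(a)|((b)*(c))")
other_grouping = matched("(a|b)*c")

print(sorted(without_parens, key=lambda w: (len(w), w)))
print(without_parens == fully_parenthesized)
print(without_parens == other_grouping)
```

Dropping the parentheses leaves the language unchanged, while grouping a|b under the star gives a different language (for example, it contains ac but not a).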

Regular Expressions...
 Ex: Let Σ = {a, b}.

1. The regular expression a | b denotes the language {a, b}.

2. (a|b)(a|b) denotes {aa, ab, ba, bb}, the language of all strings of
length two over the alphabet Σ.
Another regular expression for the same language is
aa | ab | ba | bb.

3. a* denotes the language consisting of all strings of zero or more
a's, that is, {ε, a, aa, aaa, ... }.

Regular Expressions...
4. (a|b)* denotes the set of all strings consisting of zero or
more instances of a or b, that is, all strings of a's and b's:
{ε, a, b, aa, ab, ba, bb, aaa, ... }.
Another regular expression for the same language is (a*b*)*.

5. a | a*b denotes the language {a, b, ab, aab, aaab, ... }, that is,
the string a and all strings consisting of zero or more a's and
ending in b.
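The five examples can be spot-checked by listing, for each expression, every string over Σ = {a, b} up to a small length that it matches. The sketch uses Python's `re.fullmatch` as a stand-in for the textbook notation (the operators used here behave the same way); the helper name `language` and the length bound are assumptions of this illustration, so agreement up to the bound is evidence, not a proof, that two expressions denote the same language.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=4):
    """Strings over `alphabet` of length <= max_len that the pattern matches fully."""
    compiled = re.compile(pattern)
    words = (
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
    )
    return {w for w in words if compiled.fullmatch(w)}


print(sorted(language("a|b")))                       # example 1
print(sorted(language("(a|b)(a|b)")))                # example 2
print(sorted(language("a*"), key=len))               # example 3
print(language("(a|b)*") == language("(a*b*)*"))     # example 4
print(sorted(language("a|a*b", max_len=3), key=len)) # example 5
```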

Thank You
