Chap-2 2 (RegularExpression)
Regular Expressions
▪ A lexical analyzer is
also known as a
lexical scanner.
average = (sum/count)
Lexeme     Token
average    identifier
=          assignment operator
(          open parenthesis
sum        identifier
/          division operator
count      identifier
)          close parenthesis
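The token breakdown above can be reproduced with a small scanner. The following is a minimal sketch in Python, assuming one regular expression per token class. The class names (IDENTIFIER, ASSIGN, and so on) and the function tokenize are illustrative choices, not taken from the slides.

```python
import re

# Hypothetical token classes for this sketch: (class name, regular expression).
TOKEN_SPEC = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("DIVIDE", r"/"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not a token itself
]

# One master pattern: each class becomes a named alternative.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token class) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


for lexeme, kind in tokenize("average = (sum/count)"):
    print(f"{lexeme:10} {kind}")
```

A production scanner would usually be generated from such a specification rather than written by hand, but the idea is the same: each token class is described by a regular expression.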
Valid Token/string
( a ( b + c )* )* d
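Given a finite alphabet Σ, the following constants are
defined as regular expressions:
• (empty set) ∅, denoting the set ∅.
• (empty string) ε, denoting the set {ε} that contains only the empty string.
• (literal character) a in Σ, denoting the set {a} that contains only the character a.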
L1 is {0,10,1011}
L2 is {ε, 0, 00, 000, 0000, 00000, …}
A language is a set of strings
REGULAR EXPRESSIONS
1. Union.
2. Concatenation.
3. Kleene Star.
Language Operators
The union of two sets is that set which contains all the
elements in each of the two sets and nothing else.
For example, {abc, ab, ba} + {ba, bb} = {abc, ab, ba, bb}
Concatenation joins two strings end to end.
For example,
abc . ba = abcba
Concatenating any string s with the empty string ε leaves it unchanged: s . ε = ε . s = s.
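Union and concatenation are easy to try out on small finite languages. The following is a rough sketch in Python, assuming a language is modelled as a set of strings and ε as the empty string "". The helper names union and concat are made up for this illustration.

```python
EPSILON = ""  # the empty string plays the role of ε


def union(A, B):
    """All strings that are in A or in B (written A + B in the slides)."""
    return A | B


def concat(A, B):
    """Every string of A followed by every string of B (written A . B)."""
    return {s + t for s in A for t in B}


print(sorted(union({"abc", "ab", "ba"}, {"ba", "bb"})))  # ['ab', 'abc', 'ba', 'bb']
print(concat({"abc"}, {"ba"}))                           # {'abcba'}
print(concat({"abc"}, {EPSILON}))                        # {'abc'}
```

The last line shows the identity s . ε = s: concatenating with the language {ε} changes nothing.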
If L is a language, we define:
L^0 = {ε}
L^1 = L
L^2 = L . L
L^3 = L . L . L
L^n = L . L^(n-1)
L* = L^0 + L^1 + L^2 + L^3 + L^4 + L^5 + ...
Here “i” is the number of strings from the parent language L that are
concatenated to produce each string in the language L^i.
• L^0 = {ε}: concatenating zero strings yields only the empty string.
Operator precedence, highest to lowest: Kleene star (*), concatenation (.), union (+).
Example:
01* + 1 = ( 0 . ((1)*) ) + 1
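The grouping in this example can be checked mechanically. The sketch below uses Python's re module, where union is written | instead of +, and compares the languages of differently parenthesized patterns on all binary strings up to a small length. The helper language and the length bound are assumptions made for the illustration, so agreement here is evidence rather than a proof.

```python
import re
from itertools import product


def language(pattern, alphabet="01", max_len=4):
    """All strings over the alphabet, up to max_len, that the pattern matches entirely."""
    regex = re.compile(pattern)
    words = ("".join(letters)
             for n in range(max_len + 1)
             for letters in product(alphabet, repeat=n))
    return {w for w in words if regex.fullmatch(w)}


implicit = language(r"01*|1")       # 01* + 1 with no parentheses
explicit = language(r"(0(1*))|1")   # grouping given by the precedence rules
wrong = language(r"(01)*|1")        # star applied to 01 as a whole

print(implicit == explicit)  # True
print(implicit == wrong)     # False
print(sorted(implicit, key=lambda w: (len(w), w)))  # ['0', '1', '01', '011', '0111']
```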
(3) Kleene Closure (the * operator)
Example: L = {1, 00}
L^0 = {ε}
L^1 = {1, 00}
L^2 = {11, 100, 001, 0000}
L^3 = {111, 1100, 1001, 10000, 000000, 00001, 00100, 0011}
…
L* = L^0 ∪ L^1 ∪ L^2 ∪ …
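The powers in this example can be recomputed directly from the definition. The following is a small sketch in Python, assuming languages are sets of strings and ε is the empty string. Since L* is infinite, the hypothetical helper star_up_to only collects the powers up to a chosen bound k.

```python
def concat(A, B):
    """Every string of A followed by every string of B."""
    return {s + t for s in A for t in B}


def power(L, n):
    """n-th power of L: the zeroth power is {ε}, each further power prepends L."""
    result = {""}
    for _ in range(n):
        result = concat(L, result)
    return result


def star_up_to(L, k):
    """Finite approximation of L*: union of the powers 0 through k."""
    out = set()
    for i in range(k + 1):
        out |= power(L, i)
    return out


L = {"1", "00"}
for i in range(4):
    print(i, sorted(power(L, i), key=lambda s: (len(s), s)))
print(len(star_up_to(L, 3)))  # 15
```

Each power here has 2^i distinct strings (1, 2, 4, 8), so the union of the first four powers contains 15 strings.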
Examples
Regular expression    Language
a|b                   L = {a, b}
(a | b)(a | b)        L = {aa, ab, ba, bb}
a*                    L = {ε, a, aa, aaa, ...}
(a | b)*              The set of all strings of a’s and b’s, including the empty string.
                      L = {ε, a, b, aa, ab, ba, bb, aaa, …}
a | a*b               The set containing the string a and all strings consisting of
                      zero or more a’s followed by b.
                      L = {a, b, ab, aab, …, a…ab}
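The languages in these examples can be listed by brute force. The following sketch uses Python's re module to enumerate every string over {a, b} up to a small length and keep those that a pattern matches in full. The function name shortlex_language and the length bound are assumptions of this illustration.

```python
import re
from itertools import product


def shortlex_language(pattern, alphabet, max_len):
    """Strings over the alphabet (shortest first) that the pattern matches entirely."""
    regex = re.compile(pattern)
    result = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if regex.fullmatch(word):
                result.append(word)
    return result


for pattern in ["a|b", "(a|b)(a|b)", "a*", "(a|b)*", "a|a*b"]:
    # The empty string ε is shown as '' in the printed lists.
    print(f"{pattern:12}", shortlex_language(pattern, "ab", 3))
```

Only strings up to length 3 are printed, so the infinite languages appear truncated.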
Examples
(a|b)* denotes the set of all strings with no symbols other than "a"
and "b", including the empty string: {ε, "a", "b", "aa", "ab", "ba", "bb",
"aaa", …}
ab*(c|ε) denotes the set of strings starting with "a", then zero or
more "b"s and finally optionally a "c": {"a", "ac", "ab", "abc", "abb",
"abbc", …}
Algebraic Laws of Regular Expressions
Commutative:
• E+F = F+E
Associative:
• (E+F)+G = E+(F+G)
• (EF)G = E(FG)
Identity:
• E+∅ = E
• εE = Eε = E
Annihilator:
• ∅E = E∅ = ∅
Distributive:
• E(F+G) = EF+EG
• (F+G)E = FE+GE
Idempotent:
• E+E = E
Involving Kleene closures:
• (E*)* = E*
• ∅* = ε
• ε* = ε
• E⁺ = EE*
• E? = ε+E
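Several of these laws can be sanity-checked on concrete expressions. The sketch below, in Python, picks three sample expressions for E, F and G (arbitrary choices for this illustration) and compares both sides of each law on all strings over {a, b} up to length 5. In Python's re syntax union is written |. A bounded comparison like this can refute a claimed law but cannot prove one.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=5):
    """All strings over the alphabet, up to max_len, that the pattern matches entirely."""
    regex = re.compile(pattern)
    words = ("".join(letters)
             for n in range(max_len + 1)
             for letters in product(alphabet, repeat=n))
    return {w for w in words if regex.fullmatch(w)}


def same_language(left, right):
    return language(left) == language(right)


# Hypothetical sample expressions, each wrapped in a non-capturing group.
E, F, G = "(?:ab)", "(?:a|bb)", "(?:b*)"

LAWS = {
    "E+F = F+E": (f"{E}|{F}", f"{F}|{E}"),
    "E(F+G) = EF+EG": (f"{E}(?:{F}|{G})", f"{E}{F}|{E}{G}"),
    "(F+G)E = FE+GE": (f"(?:{F}|{G}){E}", f"{F}{E}|{G}{E}"),
    "E+E = E": (f"{E}|{E}", E),
    "(E*)* = E*": (f"(?:{E}*)*", f"{E}*"),
    "E+ = EE*": (f"{E}+", f"{E}{E}*"),
    "E? = ε+E": (f"{E}?", f"(?:|{E})"),
}

for law, (left, right) in LAWS.items():
    print(f"{law:16}", same_language(left, right))

# Concatenation is not commutative, so this comparison fails.
print("EF = FE         ", same_language(f"{E}{F}", f"{F}{E}"))
```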
Summary
• Regular expressions are formulas built from three operations
on languages: union, concatenation, and Kleene star.
• Union: the union of two sets is the set that contains all the
elements of each of the two sets and nothing else. It is
written with a ‘+’.
• For example: {abc, ab, ba} + {ba, bb} = {abc, ab, ba, bb}
• Concatenation: concatenates each string in one set with each
string in the other set. It is written with a ‘.’.
• For example: {ab, a, c} . {b} = {ab.b, a.b, c.b} = {abb, ab, cb}
• Kleene star: generates zero or more concatenations of strings
from the language to which it is applied. It is written
with a ‘*’.
• For example: a* = {ε, a, aa, aaa, aaaa, aaaaa, …}
◼ Optional and repeated characters: ?, * and +
➢ ? (0 or 1)
◼ /colou?r/ ➔ color or colour
➢ * (0 or more)
◼ /oo*h!/ ➔ oh! or ooh! or ooooh!
➢ + (1 or more)
◼ /o+h!/ ➔ oh! or ooh! or ooooh!
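These three operators behave the same way in most practical regex engines. The following is a short sketch using Python's re module. The helper matches and the candidate strings are chosen for this illustration. Matching is case-sensitive by default, so a capitalized "Ooh!" is not accepted by these patterns.

```python
import re


def matches(pattern, text):
    """True if the whole text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None


examples = {
    r"colou?r": ["color", "colour", "colouur"],
    r"oo*h!": ["oh!", "ooh!", "ooooh!", "h!", "Ooh!"],
    r"o+h!": ["oh!", "ooh!", "ooooh!", "h!", "Ooh!"],
}

for pattern, candidates in examples.items():
    for text in candidates:
        print(f"{pattern:10} {text:10} {matches(pattern, text)}")
```

Note that oo*h! and o+h! describe the same language: one o followed by zero or more o's is the same as one or more o's.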
Examples
• For each of the following regular expressions,
list six strings which are in its language.
1. (a(b+c)*)*d
2. (a+b)*(c+d)
3. (a*b*)*
Regular Expression for a Language
Example:
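Strings of a’s and b’s that start and end with a.
a (a | b)* a
(This requires at least two symbols; use a | a (a | b)* a to also accept the one-letter string a.)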
Example:
all strings of lowercase letters in which the letters
are in ascending lexicographic order.
a* b* c* …..z*
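Both expressions can be tried out with Python's re module. This is a minimal sketch: the variable names are invented here, and the ascending-order pattern is assembled from the 26 lowercase letters rather than typed out. The ascending pattern is cross-checked against sorting the letters of each test word.

```python
import re
import string

# a (a | b)* a : starts and ends with a, at least two symbols long.
starts_and_ends_with_a = re.compile(r"a(a|b)*a")

# a* b* c* ... z* : built as the string "a*b*c*...z*".
ascending = re.compile("".join(f"{letter}*" for letter in string.ascii_lowercase))

print(bool(starts_and_ends_with_a.fullmatch("abba")))  # True
print(bool(starts_and_ends_with_a.fullmatch("abab")))  # False
print(bool(starts_and_ends_with_a.fullmatch("a")))     # False: the pattern needs two a's

for word in ["aabcz", "abca", "almost", "zebra", ""]:
    in_order = list(word) == sorted(word)
    print(repr(word), bool(ascending.fullmatch(word)), in_order)
```

The single-letter string a is rejected by a(a|b)*a, which is worth keeping in mind when the intended language should include it.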
Exercises
• Suppose L1 represents the set of all strings
over the alphabet {0, 1} which contain an even
number of ones (even parity). Which of the
following strings belong to L1?
(a) 0101
(b) 110211
(c) 000
(d) 010011
(e) ε
Exercises
• Suppose L2 represents the set of all strings
over the alphabet {a, b, c} which contain an
equal number of a’s, b’s, and c’s. Which of
the following strings belong to L2?
(a) bca
(b) accbab
(c) ε
(d) aaa
(e) aabbcc
Exercises
• Which of the following strings belong to the
language specified by this regular expression:
(a+bb)*a
(a) ε
(b) aaa
(c) ba
(d) bba
(e) abba
TRUE OR FALSE?
Let R and S be two regular expressions.
Then:
1. ((R*)*)* = R* ?
2. (R+S)* = R* + S* ?
1. Regular expression for strings made of zero or more triples of 0’s and any number of 1’s.
8. Regular expression for strings with an odd number of 0’s or an odd number of 1’s.