Unit-III (Regular Expression)

Uploaded by

Sabu Dahal
Copyright
© © All Rights Reserved

Unit III

Regular Expressions

Prepared By:
Ghanashyam BK

Regular Language
A language is said to be a REGULAR LANGUAGE if
and only if some Finite State Machine recognizes it.
So what languages are NOT REGULAR?
The languages
which are not recognized by any FSM,
which require unbounded memory to recognize.

Regular Expressions
Regular expressions are algebraic expressions used for representing
regular languages, the languages accepted by finite
automata.
They offer a declarative way to express the strings we want
to accept.
Many systems use regular expressions as an input
language:
Search commands such as UNIX grep.
Lexical analyzer generators such as LEX or FLEX.
A lexical analyzer is the component of a compiler that breaks
the source program into logical units called tokens.

Regular Expressions
Each regular expression ‘r’ denotes a language L(r).
The defining rules specify how L(r) is formed by combining, in
various ways, the languages denoted by the subexpressions of r.
Method:
Let Σ be an alphabet. The regular expressions over the alphabet Σ
are defined inductively as follows:
Basis steps:
Φ is a regular expression representing the empty language.
Є is a regular expression representing the language containing only
the empty string, i.e. {Є}.
If ‘a’ is a symbol in Σ, then ‘a’ is a regular expression
representing the language {a}.

Regular Expressions
The following operations over basic regular expressions define
the complex regular expressions:
If ‘r’ and ‘s’ are regular expressions representing the
languages L(r) and L(s), then
r U s is a regular expression denoting the language L(r) U L(s).
r.s is a regular expression denoting the language L(r).L(s).
r* is a regular expression denoting the language (L(r))*.
(r) is a regular expression denoting the language L(r).
Note: any expression obtained from Φ, Є, a using the above
operations, with parentheses where required, is a regular
expression.

Regular Operators
Basically, there are three operators that are used to
generate the languages that are regular.
Union (U / | / +): If L1 and L2 are any two regular
languages then
L1 U L2 = {s | s ∈ L1 or s ∈ L2}
For example:
L1 = {00, 11}, L2 = {Є, 10} then
L1 U L2 = {Є, 00, 11, 10}

Regular Operators
Concatenation (.):
If L1 and L2 are any two regular languages then,
L1.L2 = {l1.l2 | l1 ∈ L1 and l2 ∈ L2}
For example:
L1 = {00, 11} and L2 = {Є, 10} then
L1.L2 = {00, 11, 0010, 1110}
L2.L1 = {00, 11, 1000, 1011}
So L1.L2 ≠ L2.L1
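
The union and concatenation examples above can be checked mechanically. The sketch below models finite languages as Python sets of strings; the helper names `union` and `concat` are illustrative choices, not part of the slides.

```python
def union(l1, l2):
    """L1 U L2: strings that belong to either language."""
    return l1 | l2


def concat(l1, l2):
    """L1.L2: every string of L1 followed by every string of L2."""
    return {x + y for x in l1 for y in l2}


# The empty Python string "" stands for the empty string Є.
L1 = {"00", "11"}
L2 = {"", "10"}

print(sorted(union(L1, L2)))             # ['', '00', '10', '11']
print(sorted(concat(L1, L2)))            # ['00', '0010', '11', '1110']
print(sorted(concat(L2, L1)))            # ['00', '1000', '1011', '11']
print(concat(L1, L2) == concat(L2, L1))  # False
```

The last line confirms that concatenation of languages is not commutative for this pair.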

Regular Operators
Kleene Closure (*):
If L is any regular language then,
L* = ⋃ L^i (i ≥ 0) = L^0 U L^1 U L^2 U …, where L^0 = {Є} and L^i = L.L^(i-1)
Precedence of regular operators:
The star operator has the highest precedence, i.e. it applies
only to the smallest well-formed RE to its left.
Next precedence is taken by the concatenation operator.
Finally, unions are taken.
For example, 01* + 1 is read as (0.(1*)) + 1.
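
Since L* is an infinite union, it can only be enumerated up to a bound. The sketch below computes L^0 U L^1 U … U L^k for a small k; the function names `power` and `star_up_to` and the sample language are assumptions made for illustration.

```python
def concat(l1, l2):
    """L1.L2 for finite languages given as sets of strings."""
    return {x + y for x in l1 for y in l2}


def power(lang, i):
    """L^i, with L^0 = {Є} (the set containing only the empty string)."""
    result = {""}
    for _ in range(i):
        result = concat(result, lang)
    return result


def star_up_to(lang, max_power):
    """Finite approximation of L*: the union of L^0 .. L^max_power."""
    out = set()
    for i in range(max_power + 1):
        out |= power(lang, i)
    return out


L = {"0", "11"}
print(sorted(star_up_to(L, 2)))
# ['', '0', '00', '011', '11', '110', '1111']
print(star_up_to(set(), 3))   # {''}: the closure of the empty language is {Є}
```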

Regular Languages
Let Σ be an alphabet. The class of regular languages
over Σ is defined inductively as:
Φ, the empty language, is a regular language.
{Є}, the language containing only the empty string, is a
regular language.
For each a ∈ Σ, {a} is a regular language.
If L1, L2, …, Ln are regular languages, then so is
L1 U L2 U … U Ln.
If L1, L2, L3, …, Ln are regular languages, then so
is L1.L2.L3…Ln.
If L is a regular language, then so is L*.

Applications of Regular Languages
Validation:
Determining that a string complies with a set of
formatting constraints, such as email address validation or
password validation.
Search and Selection:
Identifying a subset of items from a larger set on the
basis of a pattern match.
Tokenization:
Converting a sequence of characters into words or tokens
(like keywords, identifiers) for later interpretation.
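
The three applications can be illustrated with Python's standard `re` module. This is a minimal sketch: the identifier pattern, the sample log lines and the tiny token set are all assumptions chosen for the example, not a real validator or lexer.

```python
import re

# Validation: does the whole string have the shape of an identifier?
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_identifier(s):
    return IDENT.fullmatch(s) is not None

# Search and selection: keep only the lines matching a pattern.
def select_errors(lines):
    return [ln for ln in lines if re.search(r"^error:", ln)]

# Tokenization: split text into (kind, lexeme) pairs.
TOKEN = re.compile(
    r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[+\-*/=()]))"
)

def tokenize(text):
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        # lastgroup names the alternative (NUM, ID or OP) that matched.
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(is_identifier("count_1"), is_identifier("1count"))   # True False
print(select_errors(["error: disk full", "ok", "error: timeout"]))
print(tokenize("x = 42 + y1"))
# [('ID', 'x'), ('OP', '='), ('NUM', '42'), ('OP', '+'), ('ID', 'y1')]
```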

Algebraic Rules for Regular Expressions
Commutativity:
An operator is commutative if we can switch the order
of its operands and get the same result.
The union of regular expressions is commutative (r + s = s + r), but
concatenation of regular expressions is not (in general r.s ≠ s.r).
Associativity:
Both union and concatenation of regular
expressions are associative,
i.e. if t, r, s are regular expressions representing regular
languages L(t), L(r) and L(s) then,
t + (r + s) = (t + r) + s and t.(r.s) = (t.r).s

Algebraic Rules for Regular Expressions
Distributive law:
For any regular expressions r, s, t representing regular
languages L(r), L(s) and L(t),
r(s+t) = rs + rt ------ left distribution
(s+t)r = sr + tr ------ right distribution
Identity law:
Φ is the identity for union, i.e. for any regular expression r
representing the regular language L(r),
r + Φ = Φ + r = r, i.e. Φ U r = r.
Є is the identity for concatenation, i.e. Є.r = r = r.Є

Algebraic Rules for Regular Expressions
Annihilator:
An annihilator for an operator is a value such that when
the operator is applied to the annihilator and some other
value, the result is the annihilator.
Φ is the annihilator for concatenation,
i.e. Φ.r = r.Φ = Φ
Idempotent law of union:
For any regular expression r representing the regular
language L(r), r + r = r.
This is the idempotent law of union.

Algebraic Rules for Regular Expressions
Law of closure:
For any regular expression r representing the regular
language L(r),
(r*)* = r*
Closure of Φ: Φ* = Є
Closure of Є: Є* = Є
Positive closure of r: r+ = rr*.

Regular Expressions Examples
Consider Σ = {0, 1}, then some regular expressions over
Σ are:
0*10* is the RE that represents the language {w | w contains a
single 1}
Σ*1Σ* is the RE for the language {w | w contains at least one
1}
Σ*001Σ* is the RE for {w | w contains the string 001 as a substring}
(ΣΣ)* or ((0+1).(0+1))* is the RE for {w | w is a string of
even length}
1*(01*01*)* is the RE for {w | w is a string containing an even
number of zeros}

Regular Expressions Examples
0*10*10*10* is the RE for {w | w is a string with exactly
three 1’s}
For strings that have either 001 or 100 as a substring, the
regular expression is (1+0)*.001.(1+0)* + (1+0)*.100.(1+0)*
For strings that have at most two 0’s in them, the
regular expression is 1*.(0+Є).1*.(0+Є).1*
For strings ending with 11, the regular expression
is (1+0)*.11
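
Example expressions like these can be sanity-checked by brute force. The sketch below translates each expression into Python `re` syntax (union `+` becomes `|`, Σ becomes `[01]`, and `(0+Є)` becomes `0?`) and compares it with a direct test of the described property on every binary string up to a chosen length. The translation and the helper names are assumptions of this sketch; agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product


def all_strings(max_len):
    """Every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


# (pattern in Python syntax, property the pattern is claimed to describe)
CASES = [
    (r"0*10*",          lambda w: w.count("1") == 1),
    (r"[01]*1[01]*",    lambda w: "1" in w),
    (r"[01]*001[01]*",  lambda w: "001" in w),
    (r"([01][01])*",    lambda w: len(w) % 2 == 0),
    (r"1*(01*01*)*",    lambda w: w.count("0") % 2 == 0),
    (r"0*10*10*10*",    lambda w: w.count("1") == 3),
    (r"[01]*001[01]*|[01]*100[01]*", lambda w: "001" in w or "100" in w),
    (r"1*0?1*0?1*",     lambda w: w.count("0") <= 2),
    (r"[01]*11",        lambda w: w.endswith("11")),
]


def first_mismatch(max_len=8):
    """Return (pattern, string) for the first disagreement, else None."""
    for pattern, has_property in CASES:
        compiled = re.compile(pattern)
        for w in all_strings(max_len):
            if (compiled.fullmatch(w) is not None) != has_property(w):
                return pattern, w
    return None


print(first_mismatch())   # None: no disagreement up to length 8
```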

Finite Automata and Regular Expressions
In order to show that REs define the same class of
languages as finite automata, we must show that:
Any language defined by one of these finite automata is
also defined by an RE.
Every language defined by an RE is also defined by one of
these finite automata.

Reduction of Regular Expression to Є-NFA
We can show that every language L(R) for some RE R
is also the language L(E) of some Є-NFA E.
Together with the converse direction, this says that REs and Є-NFAs
are equivalent in terms of language representation.
Theorem 1
For any regular expression r, there is an Є-NFA that
accepts the same language represented by r.
Proof:
Let L = L(r) be the language for regular expression r.
We have to show that there is an Є-NFA E such that
L(E) = L.

Reduction of Regular Expression to Є-NFA
The proof is by structural induction on r, following the
recursive definition of regular expressions.
We know that Φ, Є and ‘a’ are regular expressions representing
the languages Φ (the empty language), {Є} (the language containing
only the empty string) and {a} respectively.
The Є-NFAs accepting these languages can be constructed as:

This forms the basis step.


Reduction of Regular Expression to Є-NFA
The induction step is shown below.
Let r1 and r2 be regular expressions for the languages
L(r1) and L(r2), and let r be a regular expression built from them.
For union ‘+’: by the induction hypothesis we can construct
Є-NFAs for r1 and r2. Let these Є-NFAs be M1 and M2
respectively.

Reduction of Regular Expression to Є-NFA
Then the Є-NFA for r = r1 + r2 can be constructed as:

The language of this automaton is L(r1) U L(r2), which
is also the language represented by the expression r1 + r2.
For concatenation ‘.’: the Є-NFA for r = r1.r2 can be
constructed as:

Reduction of Regular Expression to Є-NFA
Here, the path from the start state to the accepting state goes first
through the automaton for r1, where it must follow a
path labeled by a string in L(r1), and
then through the automaton for r2, where it follows a
path labeled by a string in L(r2).
Thus, the language accepted by the above automaton is
L(r1).L(r2).

Reduction of Regular Expression to Є-NFA
For * (Kleene closure):
The Є-NFA for r* can be constructed as:

Clearly the language of this Є-NFA is L(r*), as it accepts
Є as well as the strings in L(r), L(r)L(r), L(r)L(r)L(r)
and so on, thus covering all strings in L(r*).
This completes the proof.
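
The inductive construction in the proof can be sketched in code. The version below is one common variant (a Thompson-style construction with a fresh start and accept state for every operator), so its automata may carry more Є-moves than the diagrams on the slides while accepting the same languages. The tuple encoding of expressions and all names (`Builder`, `closure`, `accepts`) are assumptions of this sketch.

```python
from itertools import count

EPS = None  # label used for Є-moves


class Builder:
    """Collects the edges of one Є-NFA while sub-automata are combined."""

    def __init__(self):
        self.edges = []       # (source, label, target) triples
        self._ids = count()   # supplies fresh state numbers

    def build(self, r):
        """Return (start, accept) of an Є-NFA for the expression tree r."""
        start, accept = next(self._ids), next(self._ids)
        kind = r[0]
        if kind == "empty":                  # Φ: no path reaches accept
            pass
        elif kind == "eps":                  # Є
            self.edges.append((start, EPS, accept))
        elif kind == "sym":                  # a single symbol
            self.edges.append((start, r[1], accept))
        elif kind == "union":                # r1 + r2
            for sub in (r[1], r[2]):
                s, a = self.build(sub)
                self.edges += [(start, EPS, s), (a, EPS, accept)]
        elif kind == "cat":                  # r1.r2
            s1, a1 = self.build(r[1])
            s2, a2 = self.build(r[2])
            self.edges += [(start, EPS, s1), (a1, EPS, s2), (a2, EPS, accept)]
        elif kind == "star":                 # r1*
            s, a = self.build(r[1])
            self.edges += [(start, EPS, s), (a, EPS, accept),
                           (start, EPS, accept), (a, EPS, s)]
        else:
            raise ValueError(f"unknown node {kind!r}")
        return start, accept


def closure(states, edges):
    """All states reachable from `states` using only Є-moves."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(r, w):
    """Build the Є-NFA for r and simulate it on the string w."""
    builder = Builder()
    start, accept = builder.build(r)
    current = closure({start}, builder.edges)
    for ch in w:
        moved = {dst for src, label, dst in builder.edges
                 if src in current and label == ch}
        current = closure(moved, builder.edges)
    return accept in current


# (0+1)*1(0+1): strings whose second-to-last symbol is 1
BIT = ("union", ("sym", "0"), ("sym", "1"))
R = ("cat", ("star", BIT), ("cat", ("sym", "1"), BIT))
print(accepts(R, "0110"), accepts(R, "01"))   # True False
```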

Examples (Conversion from RE to Є-NFA)
For the regular expression (1+0), the Є-NFA is:

For (0+1)*, the Є-NFA is:

Examples (Conversion from RE to Є-NFA)
For the regular expression (00+1)*10, the Є-NFA is:

Now, find the Є-NFA for the whole regular expression
(0+1)*1(0+1).

Equivalence of Regular Expression and Finite Automata
Discussed in class.

Conversion of DFA to Regular Expression
Arden’s Theorem:
Let p and q be regular expressions over the
alphabet Σ. If L(p) does not contain the empty string Є, then
the equation r = q + rp has the unique solution r = qp*.
Proof:
Here, r = q + rp ……………… (i)
Substituting r = q + rp on the right-hand side
of relation (i) gives
r = q + (q + rp)p
r = q + qp + rp^2 ……………… (ii)

Conversion of DFA to Regular Expression
Again substituting r = q + rp in relation (ii), we get
r = q + qp + (q + rp)p^2
r = q + qp + qp^2 + rp^3
Continuing in the same way, we get
r = q + qp + qp^2 + qp^3 + …
r = q(Є + p + p^2 + p^3 + …)
Thus r = qp*. Proved.

Conversion of DFA to Regular Expression
Use of Arden’s rule to find the regular expression
for a DFA:
To convert a given DFA into a regular expression,
we make the following assumptions about the
transition system:
The transition diagram must not have Є-transitions.
There must be only one initial state.
The vertices (states) of the DFA are
q1, q2, …, qn (any qi may be a final state), with q1 as the initial state.

Conversion of DFA to Regular Expression
wij denotes the regular expression representing the set
of labels of the edges from qi to qj.
Thus we can write the equations
q1 = q1w11 + q2w21 + q3w31 + … + qnwn1 + Є
q2 = q1w12 + q2w22 + q3w32 + … + qnwn2
q3 = q1w13 + q2w23 + q3w33 + … + qnwn3
…
qn = q1w1n + q2w2n + q3w3n + … + qnwnn
Solving these equations for the final states qi in terms of the wij
gives the regular expression equivalent to the given DFA.

Conversion of DFA to Regular Expression
Example: Convert the following DFA into a regular
expression.

Let the equations be:
q1 = Є + q2.1 + q3.0 ……… (i)
q2 = q1.0 ……… (ii)
q3 = q1.1 ……… (iii)
q4 = q2.0 + q3.1 + q4.0 + q4.1 ……… (iv)

Conversion of DFA to Regular Expression
Putting the values of q2 and q3 in (i):
q1 = q1.01 + q1.10 + Є
i.e. q1 = q1(01+10) + Є
i.e. q1 = Є + q1(01+10) (of the form r = q + rp)
i.e. q1 = Є(01+10)* (using Arden’s rule)
Since q1 is the final state, the final regular expression for
the DFA is
Є(01+10)*
= (01+10)*
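
The result can be cross-checked by simulation. The DFA diagram is not reproduced in this text, so the transition table below is read off equations (i) to (iv): a term qi.a in the equation for qj is taken as an edge from qi to qj on symbol a, with q1 as both the start state and the final state. The sketch compares that DFA with the expression (01+10)*, written as `(01|10)*` in Python syntax, on all short binary strings.

```python
import re
from itertools import product

# Transitions implied by equations (i)-(iv); q4 acts as a dead state.
DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q3",
    ("q2", "1"): "q1", ("q2", "0"): "q4",
    ("q3", "0"): "q1", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}

PATTERN = re.compile(r"(01|10)*")


def dfa_accepts(w):
    state = "q1"                      # start state
    for ch in w:
        state = DELTA[(state, ch)]
    return state == "q1"              # q1 is also the final state


def agree(max_len=10):
    """True if DFA and expression agree on every string up to max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            if dfa_accepts(w) != (PATTERN.fullmatch(w) is not None):
                return False
    return True


print(agree())   # True
```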

Exercises
Convert the following DFA into RE.

Representation of Languages
Representations can be formal or informal.
Example (formal): represent a language by a RE or
DFA defining it.
Example (informal): a logical or prose statement
about its strings:
{0^n 1^n | n is a nonnegative integer}
The set of strings consisting of some number of 0’s
followed by the same number of 1’s.

Properties of Regular Languages
Language classes have two important kinds of properties:
Decision properties.
A decision property for a class of languages is an algorithm that
takes a formal description of a language (e.g., a DFA) and tells
whether or not some property holds.
Example: Is language L empty?
Closure properties.
A closure property of a language class says that given languages
in the class, an operator (e.g., union) produces another language in
the same class.
Example: the regular languages are closed under union,
concatenation, and (Kleene) closure; this follows directly from
the RE representation of languages.

Pumping Lemma
We have seen that the class of languages known as regular
languages has at least four different descriptions:
they are the languages accepted by DFAs, by NFAs,
by Є-NFAs, and defined by REs.
Not every language is regular.
To show that a language is not regular, a powerful
technique is the Pumping Lemma.

Pumping Lemma
Statement:
Let L be a regular language. Then there exists an
integer constant n such that for any x ∈ L with |x| ≥ n,
there are strings u, v, w such that x = uvw,
v ≠ Є (i.e. |v| > 0),
|uv| ≤ n, and
u v^k w ∈ L for all k ≥ 0.
Note: Here v is the string that can be pumped, i.e.
repeating v any number of times, or deleting it, keeps
the resulting string in the language.

Pumping Lemma
Proof:
Suppose L is a regular language. Then L is accepted by
some DFA M. Let M have n states, and let x be any string in L of
length |x| = m where m ≥ n (if L has no such string, the
lemma holds trivially).
Now suppose
x = a1a2a3…am, where each ai ∈ Σ is an input
symbol to M. For j = 0, 1, …, m, let qj = δ̂(q0, a1a2…aj) be the
state M is in after reading the first j symbols of x.

Pumping Lemma
Then,
δ̂(q0, x) = δ̂(q0, a1a2…am) [q0 being the start state of M]
= δ̂(q1, a2…am)
= …
= δ̂(qm, Є) = qm [qm being a final state]

Since M has only n states, by the pigeonhole principle the n+1
states q0, q1, …, qn cannot all be distinct, so there exist
i and j with 0 ≤ i < j ≤ n such that qi = qj.

Pumping Lemma
Now we can break x = uvw as
u = a1a2…ai
v = a(i+1)…aj
w = a(j+1)…am
Here |v| = j − i > 0, and choosing i and j among the first n+1
states gives |uv| = j ≤ n.
The string a(i+1)…aj takes M from state qi
back to itself, since qi = qj. So M accepts
a1a2…ai (a(i+1)…aj)^k a(j+1)…am
for all k ≥ 0.
Hence, u v^k w ∈ L for all k ≥ 0.
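
The proof is constructive: running the DFA on x and stopping at the first repeated state yields the split x = uvw. The sketch below does this for a small DFA accepting the strings that end in 11; that DFA, its state names and the function names are assumptions chosen for the illustration.

```python
# Hypothetical 3-state DFA for "strings over {0, 1} ending in 11".
DELTA = {
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s0", ("s1", "1"): "s2",
    ("s2", "0"): "s0", ("s2", "1"): "s2",
}
START, FINALS = "s0", {"s2"}


def accepts(w):
    state = START
    for ch in w:
        state = DELTA[(state, ch)]
    return state in FINALS


def pump_split(x):
    """Split x as (u, v, w) at the first repeated state (pigeonhole)."""
    first_seen = {START: 0}          # state -> number of symbols read
    state = START
    for pos, ch in enumerate(x, start=1):
        state = DELTA[(state, ch)]
        if state in first_seen:
            i, j = first_seen[state], pos
            return x[:i], x[i:j], x[j:]
        first_seen[state] = pos
    return None                      # x is shorter than the state count


u, v, w = pump_split("1011")
print((u, v, w))                                        # ('', '10', '11')
print(all(accepts(u + v * k + w) for k in range(6)))    # True
```

Because a repeat must occur within the first n+1 visited states, the split found this way always satisfies |uv| ≤ n and |v| > 0.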

Application of Pumping Lemma
The pumping lemma is used to prove that a language is not regular.
For example: Show that the language L = {0^r 1^r | r ≥ 0} is
not a regular language.
Solution:
Suppose L is a regular language. Then by the pumping lemma,
every sufficiently long string in L can be written as uvw with
|v| ≥ 1 such that u v^k w ∈ L for all k ≥ 0.
We show that no such decomposition exists.

Application of Pumping Lemma
Case I:
Let v contain 0’s only. Then
suppose u = 0^p, v = 0^q, w = 0^r 1^s;
then we must have p + q + r = s (as uvw is in L) and
q > 0.
Now, u v^k w = 0^p (0^q)^k 0^r 1^s = 0^(p+qk+r) 1^s
The string 0^(p+qk+r) 1^s belongs to L only for k = 1,
since p + qk + r = s holds only when k = 1.
Hence this choice of v contradicts the pumping lemma.

Application of Pumping Lemma
Case II:
Let v contain 1’s only. Then u = 0^p 1^q, v = 1^r, w = 1^s;
then p = q + r + s and r > 0.
Now, u v^k w = 0^p 1^q (1^r)^k 1^s = 0^p 1^(q+rk+s)
The string 0^p 1^(q+rk+s) belongs to L only for k = 1,
since p = q + rk + s holds only when k = 1.
Hence this choice of v also contradicts the pumping lemma.

Application of Pumping Lemma
Case III:
v contains both 0’s and 1’s. Then suppose
u = 0^p, v = 0^q 1^r, w = 1^s;
p + q = r + s, q > 0 and r > 0.
Now, u v^k w = 0^p (0^q 1^r)^k 1^s
For k > 1 this string contains a 0 after a 1, so it does not
belong to L.
Since every case leads to a string outside L, the assumption
fails. Hence the language is not regular.
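
The argument can also be checked by brute force for small values of the pumping constant n. The sketch below takes x = 0^n 1^n and tries every split x = uvw allowed by the lemma (|uv| ≤ n, |v| > 0). Under the |uv| ≤ n condition v can only contain 0’s, so only the first case actually arises. The function names are illustrative, and a finite check like this supports the proof rather than replacing it.

```python
def in_L(s):
    """Membership test for L = {0^r 1^r | r >= 0}."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * half + "1" * half


def splits(x, n):
    """All (u, v, w) with x = uvw, |uv| <= n and |v| > 0."""
    for i in range(n):                    # i = |u|
        for j in range(i + 1, n + 1):     # j = |uv|
            yield x[:i], x[i:j], x[j:]


def every_split_fails(n, ks=(0, 2)):
    """True if each allowed split leaves L for some k in ks."""
    x = "0" * n + "1" * n
    return all(
        any(not in_L(u + v * k + w) for k in ks)
        for u, v, w in splits(x, n)
    )


print(all(every_split_fails(n) for n in range(1, 8)))   # True
```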

Closure Properties of Regular Languages
The union of two regular languages is regular.
The intersection of two regular languages is regular.
The complement of a regular language is regular.
The difference of two regular languages is regular.
The reversal of a regular language is regular.
The closure (star) of a regular language is regular.
The concatenation of two regular languages is regular.
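
Some of these closure properties have short constructive arguments on DFAs. The sketch below shows two standard ones, which are not taken from the slides: swapping final and non-final states of a complete DFA gives the complement, and running two DFAs in parallel (the product construction) gives the intersection or the union. The dictionary layout and the two sample DFAs are assumptions of this sketch.

```python
from itertools import product


def make_dfa(states, alphabet, delta, start, finals):
    return {"states": set(states), "alphabet": set(alphabet),
            "delta": dict(delta), "start": start, "finals": set(finals)}


def accepts(dfa, w):
    q = dfa["start"]
    for ch in w:
        q = dfa["delta"][(q, ch)]
    return q in dfa["finals"]


def complement(dfa):
    """Same DFA with final and non-final states swapped."""
    return make_dfa(dfa["states"], dfa["alphabet"], dfa["delta"],
                    dfa["start"], dfa["states"] - dfa["finals"])


def product_dfa(a, b, mode):
    """mode='and' gives the intersection, mode='or' gives the union."""
    states = set(product(a["states"], b["states"]))
    delta = {((p, q), ch): (a["delta"][(p, ch)], b["delta"][(q, ch)])
             for (p, q) in states for ch in a["alphabet"]}
    if mode == "and":
        finals = {(p, q) for (p, q) in states
                  if p in a["finals"] and q in b["finals"]}
    else:
        finals = {(p, q) for (p, q) in states
                  if p in a["finals"] or q in b["finals"]}
    return make_dfa(states, a["alphabet"], delta,
                    (a["start"], b["start"]), finals)


# Sample DFAs: even number of 0's, and strings ending in 1.
EVEN_ZEROS = make_dfa("eo", "01", {("e", "0"): "o", ("e", "1"): "e",
                                   ("o", "0"): "e", ("o", "1"): "o"}, "e", "e")
ENDS_IN_1 = make_dfa("ny", "01", {("n", "0"): "n", ("n", "1"): "y",
                                  ("y", "0"): "n", ("y", "1"): "y"}, "n", "y")

both = product_dfa(EVEN_ZEROS, ENDS_IN_1, "and")
print(accepts(both, "001"), accepts(both, "01"))   # True False
print(accepts(complement(ENDS_IN_1), "10"))        # True
```

The difference L − M then follows as the intersection of L with the complement of M.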

Properties of Regular Languages over Union (U)
Theorem: If L and M are regular languages, then so
is L U M.
Proof:
Since L and M are regular, they have regular
expressions,
say L = L(R) and M = L(S). Then
L U M = L(R+S) by the definition of the + operator for
regular expressions.

Properties of Regular Languages over Complement

Minimization of Finite State Machines: Table Filling Algorithm
Given a DFA M that accepts a language L(M), we
construct an equivalent DFA M’ with as few states as possible.
Minimization involves identifying the equivalent
states and the distinguishable states.
Equivalent states: Two states p & q are called
equivalent, denoted by p ≡ q, if and only if for
each input string x,
δ̂(p, x) is a final state if and only if δ̂(q, x) is a final
state.
Distinguishable states: Two states p & q are said to be
distinguishable if there exists a string
x such that exactly one of δ̂(p, x) and δ̂(q, x) is a
final state.

Minimization of Finite State Machines: Table Filling Algorithm
The steps of the algorithm for identifying the distinguishable
pairs (p, q) with p ≠ q are:
List all the pairs of states for which p ≠ q.
Make a sequence of passes through the pairs.
On the first pass, mark each pair for which exactly one
element is final (in F).
On each subsequent pass, mark the pair (r, s) if for some a
∈ Σ, δ(r, a) = p and δ(s, a) = q and (p, q) is already
marked.
After a pass in which no new pairs are marked, stop.
The marked pairs (p, q) are then those for which p and q are
not equivalent, and the unmarked pairs are those for which
p ≡ q.

Minimization of Finite State Machines: Table Filling Algorithm
Example

Minimization of Finite State Machines: Table Filling Algorithm
Now to solve this problem, first we determine
whether each pair is distinguishable or not.

Minimization of Finite State Machines: Table Filling Algorithm
For pair (b, a)
(δ(b, 0), δ(a, 0)) = (g, h) – unmarked
(δ(b, 1), δ(a, 1)) = (c, f) – marked
For pair (d, a)
(δ(d, 0), δ(a, 0)) = (c, b) – marked
Therefore (d, a) is distinguishable.
For pair (e, a)
(δ(e, 0), δ(a, 0)) = (h, h) – unmarked
(δ(e, 1), δ(a, 1)) = (f, f) – unmarked
[(e, a) is not distinguishable]

Minimization of Finite State Machines: Table Filling Algorithm
For pair (g, a)
(δ(g, 0), δ(a, 0)) = (a, g) – unmarked
(δ(g, 1), δ(a, 1)) = (e, f) – unmarked
For pair (h, a)
(δ(h, 0), δ(a, 0)) = (g, h) – unmarked
(δ(h, 1), δ(a, 1)) = (c, f) – marked
Therefore (h, a) is distinguishable.
For pair (d, b)
(δ(d, 0), δ(b, 0)) = (c, g) – marked
Therefore (d, b) is distinguishable.

Minimization of Finite State Machines: Table Filling Algorithm
For pair (e, b)
(δ(e, 0), δ(b, 0)) = (h, g) – unmarked
(δ(e, 1), δ(b, 1)) = (f, c) – marked
For pair (f, b)
(δ(f, 0), δ(b, 0)) = (c, g) – marked
For pair (g, b)
(δ(g, 0), δ(b, 0)) = (g, g) – unmarked
(δ(g, 1), δ(b, 1)) = (e, c) – marked
For pair (h, b)
(δ(h, 0), δ(b, 0)) = (g, g) – unmarked
(δ(h, 1), δ(b, 1)) = (c, c) – unmarked

Minimization of Finite State Machines: Table Filling Algorithm
For pair (e, d)
(δ(e, 0), δ(d, 0)) = (h, c) – marked
(e, d) is distinguishable.
For pair (f, d)
(δ(f, 0), δ(d, 0)) = (c, c) – unmarked
(δ(f, 1), δ(d, 1)) = (g, g) – unmarked
For pair (g, d)
(δ(g, 0), δ(d, 0)) = (g, c) – marked
For pair (h, d)
(δ(h, 0), δ(d, 0)) = (g, c) – marked

Minimization of Finite State Machines: Table Filling Algorithm
For pair (f, e)
(δ(f, 0), δ(e, 0)) = (c, h) – marked
For pair (g, e)
(δ(g, 0), δ(e, 0)) = (g, h) – unmarked
(δ(g, 1), δ(e, 1)) = (e, f) – marked
For pair (h, e)
(δ(h, 0), δ(e, 0)) = (g, h) – unmarked
(δ(h, 1), δ(e, 1)) = (c, f) – marked
For pair (g, f)
(δ(g, 0), δ(f, 0)) = (g, c) – marked

Minimization of Finite State Machines: Table Filling Algorithm
For pair (h, f)
(δ(h, 0), δ(f, 0)) = (g, c) – marked
For pair (h, g)
(δ(h, 0), δ(g, 0)) = (g, g) – unmarked
(δ(h, 1), δ(g, 1)) = (c, e) – marked
Repeating the passes until no new pair is marked leaves only
(a, e), (b, h) and (d, f) unmarked. Thus (a, e), (b, h) and (d, f)
are the equivalent pairs of states.
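
The table filling steps can be sketched as a short program. The DFA figure for this example is not reproduced in the text, so the transition table below is an assumption: it follows the transitions quoted in the worked pairs where they agree, takes δ(a, 0) = b as in the (d, a) step, and assumes that c is the only final state with δ(c, 0) = a and δ(c, 1) = c. The function names are illustrative.

```python
from itertools import combinations

# Assumed transition table: state -> (target on 0, target on 1).
TABLE = {
    "a": ("b", "f"), "b": ("g", "c"), "c": ("a", "c"), "d": ("c", "g"),
    "e": ("h", "f"), "f": ("c", "g"), "g": ("g", "e"), "h": ("g", "c"),
}
DELTA = {(q, sym): target
         for q, row in TABLE.items() for sym, target in zip("01", row)}
STATES, ALPHABET, FINALS = sorted(TABLE), "01", {"c"}


def distinguishable_pairs(states, alphabet, delta, finals):
    # First pass: mark pairs in which exactly one state is final.
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in finals) != (p[1] in finals)}
    # Later passes: mark (r, s) if some symbol leads to a marked pair.
    changed = True
    while changed:
        changed = False
        for r, s in combinations(states, 2):
            pair = frozenset((r, s))
            if pair in marked:
                continue
            for a in alphabet:
                successors = frozenset((delta[(r, a)], delta[(s, a)]))
                if successors in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked


def equivalent_pairs(states, alphabet, delta, finals):
    marked = distinguishable_pairs(states, alphabet, delta, finals)
    return sorted(p for p in combinations(sorted(states), 2)
                  if frozenset(p) not in marked)


print(equivalent_pairs(STATES, ALPHABET, DELTA, FINALS))
# [('a', 'e'), ('b', 'h'), ('d', 'f')]
```

Under these assumptions the program reports the same three equivalent pairs as the worked example, so each pair can be merged into a single state of the minimized DFA.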

Minimization of Finite State Machines: Table Filling Algorithm
Hence the minimized DFA is
