Regular Language: Md. Mohibullah, Assistant Professor, Department of CSE, Comilla University


Theory of Computation: Regular Language, Regular Operations & Regular Expressions [Part-01]

Regular Language
A regular language is a language that can be described by a regular expression or, equivalently,
recognized by a deterministic or non-deterministic finite automaton (finite-state machine). A language
is a set of strings made up of characters from a specified alphabet, or set of symbols. Every language
is a subset of the set of all strings over its alphabet, and the regular languages form a proper
subclass of all such languages. Regular languages are used in parsing and designing
programming languages and are one of the first concepts taught in computability courses. They
help computer scientists recognize patterns in data and group certain
computational problems together; once problems are grouped, similar approaches can be used to solve
them. Regular languages are a key topic in computability theory.
Operations on Regular Language
The various operations on regular language are:
Union: If A and B are two regular languages, then their union A ∪ B is also a regular language.
A ∪ B = {w | w is in A or w is in B}, or equivalently A ∪ B = {w : w ∈ A or w ∈ B}.
Concatenation: If A and B are two regular languages, then their concatenation AB is also a regular language.
A ∘ B = {wx | w is in A and x is in B}, or equivalently AB = {wx : w ∈ A and x ∈ B}.
Kleene closure or Star: If A is a regular language, then its Kleene closure A* is also a regular
language.
A* = {u1u2 . . . uk : k ≥ 0 and ui ∈ A for all i = 1, 2, . . . , k}.
In words, A* is obtained by taking any finite number of strings in A and gluing them together.
Observe that k = 0 is allowed; this corresponds to the empty string ε. Thus, ε ∈ A*.
That is, A* consists of zero or more concatenated occurrences of strings from A.

For example: let A = {0, 01} and B = {1, 10}. Then
A ∪ B = {0, 01, 1, 10},
AB = {01, 010, 011, 0110},
and
A* = {ε, 0, 01, 00, 001, 010, 0101, 000, 0001, 00101, . . .}.
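The three operations can be tried out on small finite languages. The following is a minimal Python sketch, not part of the original notes; the function names (union, concat, star_up_to) are illustrative choices, and because A* is usually infinite, the star is only computed up to an assumed length bound.

```python
def union(A, B):
    """A ∪ B: strings that are in A or in B."""
    return A | B

def concat(A, B):
    """AB: every string of A followed by every string of B."""
    return {w + x for w in A for x in B}

def star_up_to(A, max_len):
    """Strings of A* whose length is at most max_len (A* itself is usually infinite)."""
    result = {""}      # k = 0 contributes the empty string
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more piece from A, keeping only unseen ones.
        frontier = {w + u for w in frontier for u in A
                    if len(w + u) <= max_len} - result
        result |= frontier
    return result

A = {"0", "01"}
B = {"1", "10"}
print(sorted(union(A, B)))
print(sorted(concat(A, B)))
print(sorted(star_up_to(A, 3), key=lambda s: (len(s), s)))
```

With the bound 3, the last line lists exactly the members of A* of length at most 3, with the empty string shown as ''.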

The order of precedence for the operations is: Kleene star > concatenation > union. This rule allows us
to use fewer parentheses when writing a regular expression. Here, and in several places below, + is
written for union (∪). For example, a + b*c is the
simplified form of (a + ((b*)c)). Note that (a + b)* is not the same as a + b*; the latter means (a + (b*)).

Regular Expression
o The languages accepted by finite automata can be described by simple expressions called
Regular Expressions. They are a compact and effective way to represent regular languages. In other words,
regular expressions can be thought of as the algebraic description of a regular language.
o A regular expression can also be described as a pattern that defines a set of strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use such patterns for find and find-and-replace operations on strings.
In a regular expression, x* means zero or more occurrences of x. It generates {ε, x, xx, xxx,
xxxx, ......} [the empty string ε is also written Λ or λ]
and x+ (with + written as a superscript) means one or more occurrences of x. It generates {x, xx, xxx, xxxx, ......}
Before formally defining the notion of a regular expression, we give some examples.
Consider the expression
(0 ∪ 1)01∗
The language described by this expression is the set of all binary strings
1. that start with either 0 or 1 (this is indicated by (0 ∪ 1)),
2. for which the second symbol is 0 (this is indicated by 0), and
3. that end with zero or more 1s (this is indicated by 1∗).
That is, the language described by this expression is
{00, 001, 0011, 00111, . . ., 10, 101, 1011, 10111, . . .}.
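The listed strings can be reproduced by brute force. The sketch below is an illustration rather than part of the notes: it assumes Python's re module as the matcher, writes union as | in the pattern, and uses a hypothetical helper language_up_to that tests every binary string up to a chosen length.

```python
import re
from itertools import product

# (0 ∪ 1)01* in Python's regex syntax: union is written |.
PATTERN = re.compile(r"(0|1)01*")

def language_up_to(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that the pattern matches in full."""
    matches = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if pattern.fullmatch(s):
                matches.append(s)
    return matches

print(language_up_to(PATTERN, "01", 4))
# ['00', '10', '001', '101', '0011', '1011']
```

fullmatch is used rather than search because a string belongs to the language only if the whole string fits the expression.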
Here is another example, where the alphabet is {0, 1}:
• The language {1011, 0} is described by the expression 1011 ∪ 0.

Formal (and inductive) definition of regular expressions:
Definition 1:
Let Σ be a non-empty alphabet.
1. ε is a regular expression.
2. ∅ is a regular expression.
3. For each a ∈ Σ, a is a regular expression.
4. If R1 and R2 are regular expressions, then R1 ∪ R2 is a regular expression.
5. If R1 and R2 are regular expressions, then R1R2 is a regular expression.
6. If R is a regular expression, then R∗ is a regular expression.
You can regard 1., 2., and 3. as being the “building blocks” of regular expressions. Items 4., 5.,
and 6. give rules that can be used to combine regular expressions into new (and “larger”) regular
expressions. To give an example, we claim that
(0 ∪ 1)∗101(0 ∪ 1)∗
is a regular expression (where the alphabet Σ is equal to {0, 1}). In order to prove this, we have
to show that this expression can be “built” using the “rules” given in Definition 1. Here we go:
• By 3., 0 is a regular expression.
• By 3., 1 is a regular expression.
• Since 0 and 1 are regular expressions, by 4., 0 ∪ 1 is a regular expression.
• Since 0 ∪ 1 is a regular expression, by 6., (0 ∪ 1)∗ is a regular expression.
• Since 1 and 0 are regular expressions, by 5., 10 is a regular expression.
• Since 10 and 1 are regular expressions, by 5., 101 is a regular expression.
• Since (0 ∪ 1)∗ and 101 are regular expressions, by 5., (0 ∪ 1)∗101 is a regular expression.
• Since (0 ∪ 1)∗101 and (0 ∪ 1)∗ are regular expressions, by 5., (0 ∪ 1)∗101(0 ∪ 1)∗ is a regular
expression.
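The step-by-step construction above mirrors how a regular expression can be built as a tree, one rule at a time. The following Python sketch is only an illustration under assumed names (Epsilon, Empty, Symbol, Union, Concat, Star, show); none of these come from the notes. The printer writes union as + and star as *, and fully parenthesizes so that each application of rules 4 and 5 stays visible.

```python
from dataclasses import dataclass

class Regex:
    """Base class for the six kinds of regular expression in Definition 1."""

@dataclass(frozen=True)
class Epsilon(Regex):      # rule 1
    pass

@dataclass(frozen=True)
class Empty(Regex):        # rule 2
    pass

@dataclass(frozen=True)
class Symbol(Regex):       # rule 3
    char: str

@dataclass(frozen=True)
class Union(Regex):        # rule 4
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Concat(Regex):       # rule 5
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Star(Regex):         # rule 6
    inner: Regex

def show(r):
    """Render with parentheses around every union and concatenation."""
    if isinstance(r, Epsilon):
        return "ε"
    if isinstance(r, Empty):
        return "∅"
    if isinstance(r, Symbol):
        return r.char
    if isinstance(r, Union):
        return f"({show(r.left)} + {show(r.right)})"
    if isinstance(r, Concat):
        return f"({show(r.left)}{show(r.right)})"
    if isinstance(r, Star):
        return f"{show(r.inner)}*"
    raise TypeError(f"not a regular expression: {r!r}")

# The same eight steps as in the text.
zero, one = Symbol("0"), Symbol("1")            # rule 3, twice
zero_or_one = Union(zero, one)                  # rule 4
any_string = Star(zero_or_one)                  # rule 6
ten = Concat(one, zero)                         # rule 5
one_zero_one = Concat(ten, one)                 # rule 5
left_part = Concat(any_string, one_zero_one)    # rule 5
whole = Concat(left_part, any_string)           # rule 5
print(show(whole))
```

Any value that can be built from these six constructors is a regular expression in the sense of Definition 1, which is exactly what the inductive definition says.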
Next, we define the language that is described by a regular expression:
Definition 2
Let Σ be a non-empty alphabet.
1. The regular expression ε describes the language {ε}.
2. The regular expression ∅ describes the language ∅.
3. For each a ∈ Σ, the regular expression a describes the language {a}.

4. Let R1 and R2 be regular expressions and let L1 and L2 be the languages described by them,
respectively. The regular expression R1 ∪ R2 describes the language L1 ∪ L2.
In other words, R1 + R2 is a regular expression denoting the union of L(R1) and L(R2). That is,
L(R1 + R2) = L(R1) ∪ L(R2).
5. Let R1 and R2 be regular expressions and let L1 and L2 be the languages described by them,
respectively. The regular expression R1R2 describes the language L1L2.
In other words, R1R2 is a regular expression denoting the concatenation of L(R1) and L(R2). That
is, L(R1R2) = L(R1)L(R2).
6. Let R be a regular expression and let L be the language described by it. The regular expression
R∗ describes the language L∗.
In other words, R∗ is a regular expression denoting the Kleene closure of L(R). That is,
L(R∗) = L(R)∗.
We consider some examples:
• The regular expression (0 ∪ ε)(1 ∪ ε) describes the language {01, 0, 1, ε}.
• The regular expression 0 ∪ ε describes the language {0, ε}, whereas the regular expression 1∗
describes the language {ε, 1, 11, 111, . . .}. Therefore, the regular expression (0 ∪ ε)1∗ describes
the language
{0, 01, 011, 0111, . . ., ε, 1, 11, 111, . . .}.
Observe that this language is also described by the regular expression 01∗ ∪ 1∗.
• The regular expression 1∗∅ describes the empty language, i.e., the language ∅. (You should
convince yourself that this is correct.)
• The regular expression ∅∗ describes the language {ε}.
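The six clauses of Definition 2 translate almost line by line into a recursive function. The sketch below is an assumption-laden illustration, not the notes' own method: regular expressions are encoded as nested tuples with made-up tags ("eps", "empty", "sym", "union", "concat", "star"), and since L∗ is usually infinite, only strings up to a chosen length are computed.

```python
def language(r, max_len):
    """Strings of length <= max_len in the language described by r (Definition 2).

    r is a nested tuple: ("eps",), ("empty",), ("sym", a),
    ("union", r1, r2), ("concat", r1, r2) or ("star", r1).
    """
    kind = r[0]
    if kind == "eps":                       # clause 1: {ε}
        return {""}
    if kind == "empty":                     # clause 2: ∅
        return set()
    if kind == "sym":                       # clause 3: {a}
        return {r[1]} if len(r[1]) <= max_len else set()
    if kind == "union":                     # clause 4: L1 ∪ L2
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":                    # clause 5: L1L2
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {w + x for w in left for x in right if len(w + x) <= max_len}
    if kind == "star":                      # clause 6: L*
        base = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {w + u for w in frontier for u in base
                        if len(w + u) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")

EPS, ZERO, ONE = ("eps",), ("sym", "0"), ("sym", "1")

print(language(("concat", ("union", ZERO, EPS), ("union", ONE, EPS)), 3))
print(language(("concat", ("union", ZERO, EPS), ("star", ONE)), 3))
print(language(("concat", ("star", ONE), ("empty",)), 3))   # 1*∅
print(language(("star", ("empty",)), 3))                    # ∅*
```

The last two calls show why 1∗∅ is empty (there is no string in ∅ to append) while ∅∗ still contains ε (taking k = 0 pieces).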

Definition 3
Let R1 and R2 be regular expressions and let L1 and L2 be the languages described by them,
respectively. If L1 = L2 (i.e., R1 and R2 describe the same language), then we will write R1 = R2.
Hence, even though (0 ∪ ε)1∗ and 01∗ ∪ 1∗ are different regular expressions, we write
(0 ∪ ε)1∗ = 01∗ ∪ 1∗
because they describe the same language.

Theorem
Let R1, R2, and R3 be regular expressions. The following identities hold:
1. R1∅ = ∅R1 = ∅.
2. R1ε = εR1 = R1.
3. R1 ∪ ∅ = ∅ ∪ R1 = R1.
4. R1 ∪ R1 = R1.
5. R1 ∪ R2 = R2 ∪ R1.
6. R1(R2 ∪ R3) = R1R2 ∪ R1R3.
7. (R1 ∪ R2)R3 = R1R3 ∪ R2R3.
8. R1(R2R3) = (R1R2)R3.
9. ∅∗ = ε.
10. ε∗ = ε.
11. R1∗R1∗ = R1∗.
12. (R1∗)∗ = R1∗.
13. R1R1∗ = R1∗R1.
14. (ε ∪ R1)∗ = R1∗.
15. (ε ∪ R1)(ε ∪ R1)∗ = R1∗.
16. R1∗(ε ∪ R1) = (ε ∪ R1)R1∗ = R1∗.
17. R1∗R2 ∪ R2 = R1∗R2.
18. R1(R2R1)∗ = (R1R2)∗R1.
19. (R1 ∪ R2)∗ = (R1∗R2)∗R1∗ = (R2∗R1)∗R2∗.

We will not present the (boring) proofs of these identities, but urge you to convince yourself
informally that they make perfect sense. To give an example, we mentioned above that
(0 ∪ ε)1∗ = 01∗ ∪ 1∗.
We can verify this identity in the following way:
(0 ∪ ε)1∗ = 01∗ ∪ ε1∗ (by identity 7)
= 01∗ ∪ 1∗ (by identity 2)
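An identity like this can also be sanity-checked by machine. The sketch below is a bounded test, not a proof: it assumes Python's re module, writes union as |, and uses a hypothetical helper same_language_up_to that compares two patterns on every string up to a chosen length.

```python
import re
from itertools import product

def same_language_up_to(regex1, regex2, alphabet, max_len):
    """True if both patterns accept exactly the same strings of length <= max_len."""
    p1, p2 = re.compile(regex1), re.compile(regex2)
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if bool(p1.fullmatch(s)) != bool(p2.fullmatch(s)):
                return False
    return True

# (0 ∪ ε)1* = 01* ∪ 1*; the empty alternative in (0|) plays the role of ε.
print(same_language_up_to(r"(0|)1*", r"01*|1*", "01", 8))       # True
# The identity (R1 ∪ R2)* = (R1* R2)* R1* with R1 = 0 and R2 = 1.
print(same_language_up_to(r"(0|1)*", r"(0*1)*0*", "01", 8))     # True
# A non-identity for contrast: (0 ∪ 1)* versus 0* ∪ 1*.
print(same_language_up_to(r"(0|1)*", r"0*|1*", "01", 8))        # False
```

Agreement up to length 8 does not establish equality of the two languages, but a single disagreement, as in the third call, is enough to refute a claimed identity.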

Exercise 1
Write the regular expression for the language accepting all combinations of a's over the set ∑ =
{a}, or equivalently: write the RE for zero or more a's over the set ∑ = {a}.
Solution:
All combinations of a's means a may occur zero times, once, twice, and so on. If a appears zero times,
we get the null (empty) string. That is, we expect the set {ε, a, aa, aaa, ....}. So, the regular
expression for this is:
RE = a*
That is, the Kleene closure of a [a* = {ε, a, aa, aaa, aaaa, aaaaa, .....}]
Similarly,
zero or more ab's, i.e. (ab)* = {ε, ab, abab, ababab, abababab, .....}
By contrast, ab* means a followed by zero or more b's: ab* = {a, ab, abb, abbb, abbbb, .....}
Exercise 2
Write the regular expression for the language accepting all combinations of a's except the null
string over the set ∑ = {a}, or equivalently: write the RE for one or more a's over the set ∑ = {a}.
Solution:
The regular expression has to be built for the language
L = {a, aa, aaa, ....}
This set does not contain the null string. So, we can write the regular expression as:
RE = a+
Equivalently, aa* or a*a also means one or more a's:
a+ = {a, aa, aaa, aaaa, aaaaa, .....}
aa* = a.{ε, a, aa, aaa, aaaa, aaaaa, .....} = {a, aa, aaa, aaaa, aaaaa, .....}
Similarly, "one or more ab's", i.e. {ab, abab, ababab, .....}, is described by (ab)+ = ab(ab)*.
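The difference between star and plus, and between (ab)* and ab*, is easy to see by testing repeated copies of a string. The sketch below is illustrative only: it assumes Python's re module, whose * and + quantifiers have the same meaning as here, and a hypothetical helper matches_among_powers.

```python
import re

def matches_among_powers(pattern, unit, max_power):
    """Which of ε, unit, unit·unit, ..., unit^max_power the pattern matches in full."""
    return [unit * n for n in range(max_power + 1)
            if re.fullmatch(pattern, unit * n)]

print(matches_among_powers(r"a*", "a", 3))       # ['', 'a', 'aa', 'aaa']
print(matches_among_powers(r"a+", "a", 3))       # ['a', 'aa', 'aaa']
print(matches_among_powers(r"aa*", "a", 3))      # same as a+
print(matches_among_powers(r"a*a", "a", 3))      # same as a+
print(matches_among_powers(r"(ab)*", "ab", 3))   # ['', 'ab', 'abab', 'ababab']
print(matches_among_powers(r"(ab)+", "ab", 3))   # ['ab', 'abab', 'ababab']
print(matches_among_powers(r"ab*", "ab", 3))     # ['ab'], since b* repeats only the b
```

The empty string '' in the output corresponds to ε.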

Exercise 3
Write the regular expression for the language accepting all strings containing any number of a's
and b's, i.e. all strings over ∑ = {a, b}.

Solution:
The regular expression will be:
RE = (a + b)* or RE = (a|b)* or RE = (a ∪ b)*
This gives the set L = {ε, a, aa, b, bb, ab, ba, aba, bab, ......}, i.e. any combination of a's and b's.
(a + b)* generates every combination of a and b, including the null string.
Explanation:
(a + b)⁰ = ε
(a + b)¹ = a + b
(a + b)² = (a + b)(a + b) = aa + ab + ba + bb
(a + b)³ = (a + b)²(a + b) = (aa + ab + ba + bb)(a + b) = aaa + aab + aba + abb + baa + bab + bba + bbb
and so on. (a + b)* is the union of all these powers,
so we get every string over ∑ = {a, b} in this way.
1. Why not a* + b*:
a* + b* = {ε, a, aa, aaa, aaaa, aaaaa, .....} ∪ {ε, b, bb, bbb, bbbb, bbbbb, .....}
Here we do not get the strings ab, ba, aba, baa, bba, aab, abb, bab, etc.
So, a* + b* cannot be the solution.
2. Why not a*b*:
a*b* = {ε, a, aa, aaa, aaaa, aaaaa, .....}{ε, b, bb, bbb, bbbb, bbbbb, .....}
= {ε, b, bb, bbb, ....., a, ab, abb, abbb, ....., aa, aab, aabb, ....., aaa, aaab, aaabb, .....}
Here we do not get the strings ba, aba, baa, bba, bab, etc.
So, a*b* cannot be the solution.
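The missing strings can be listed mechanically. The sketch below is an illustration, not part of the notes: it assumes Python's re module with union written as |, and a hypothetical helper missed_strings that reports every string over {a, b}, up to a chosen length, that a candidate expression fails to match.

```python
import re
from itertools import product

def missed_strings(pattern, alphabet, max_len):
    """Strings over `alphabet` of length <= max_len that the pattern does NOT fully match."""
    missed = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            s = "".join(symbols)
            if not re.fullmatch(pattern, s):
                missed.append(s)
    return missed

print(missed_strings(r"(a|b)*", "ab", 3))   # [] : nothing is missed
print(missed_strings(r"a*|b*", "ab", 2))    # ['ab', 'ba']
print(missed_strings(r"a*b*", "ab", 3))     # ['ba', 'aba', 'baa', 'bab', 'bba']
```

Every string missed by a*b* contains a b somewhere before an a, which is exactly what that expression cannot produce.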

Exercise 4
Write regular expressions for the following languages over ∑ = {0, 1}:
i. all strings starting with 1 and ending with 0
ii. all strings beginning with 1 and ending with 00
iii. all strings ending with either 010 or 0010
iv. all strings beginning with 00

Solution:
i. In every string of the language, the first symbol must be 1 and the last symbol must be 0.
The RE is as follows:
RE = 1(0+1)*0 or RE = 1(0|1)*0
ii. RE = 1(0+1)*00
iii. RE = (0+1)*(010+0010)
iv. RE = 00(0+1)*
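Each answer can be compared against a direct statement of the property it is meant to capture. The sketch below is a bounded check under stated assumptions rather than a proof: it uses Python's re module with union written as |, and the CASES table and disagreements helper are hypothetical names introduced only for this illustration.

```python
import re
from itertools import product

# Each answer in Python syntax, paired with the property it should capture.
CASES = {
    "i":   (r"1(0|1)*0",         lambda s: s.startswith("1") and s.endswith("0")),
    "ii":  (r"1(0|1)*00",        lambda s: s.startswith("1") and s.endswith("00")),
    "iii": (r"(0|1)*(010|0010)", lambda s: s.endswith("010") or s.endswith("0010")),
    "iv":  (r"00(0|1)*",         lambda s: s.startswith("00")),
}

def disagreements(pattern, predicate, max_len):
    """Binary strings of length <= max_len on which the pattern and the predicate differ."""
    differing = []
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            s = "".join(symbols)
            if bool(re.fullmatch(pattern, s)) != predicate(s):
                differing.append(s)
    return differing

for label, (pattern, predicate) in CASES.items():
    print(label, pattern, disagreements(pattern, predicate, 8))
```

An empty list for a case means the expression and the property agree on every binary string of length at most 8.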
