Preliminary
What is Alphabet?
The term alphabet denotes a finite and nonempty set of symbols (or characters). For
example,
(1) If Σ is an alphabet containing all the 26 characters used in the English language, then
Σ is a finite and nonempty set, and Σ={a, b, c, …, z}.
(2) B={0, 1} is an alphabet.
(3) X={1, 2, 3, …} is not an alphabet because it is infinite.
(4) A={ } is not an alphabet because it is empty.
[Figure: transition graph with states q0, q1, q2, and qf; edges labeled a, b, and ab]
Fig. 1 Transition Graph of Deterministic finite automaton
A successful path is a series of edges beginning at some initial state and ending
at a final state.
The concatenation of all the substrings that label the edges in the successful path
is a word accepted by the transition graph and the set of words accepted is the
language of the transition graph.
For example, words “ababa”, “ba”, “bba” are accepted by the transition graph shown in
Fig. 1.
The set of states is V={q0, q1, q2, qf}, the set of final states is {qf}, the initial state is q0,
and the input alphabet is Σ={a, b}.
[Figure: generalized transition graph with states S and F and edge labels 0+1 and 011]
In the above generalized transition graph, set of states is {S, F}, initial state is S, final
state is F and accepted language is (0+1)*011.
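The accepted language (0+1)*011 can be checked mechanically. The following is a minimal sketch, assuming Python's standard re module; the names PATTERN and accepted are illustrative and not part of the original text. The textbook union operator "+" is written "|" in Python's regular-expression syntax.

```python
import re

# (0+1)*011 in textbook notation; "+" (union) becomes "|" in Python regex syntax.
PATTERN = re.compile(r"(0|1)*011")

def accepted(word: str) -> bool:
    """Return True if the whole word belongs to the language (0+1)*011."""
    return PATTERN.fullmatch(word) is not None

for w in ["011", "1011", "0110", ""]:
    print(repr(w), accepted(w))
```

Here fullmatch is used rather than search, because a word is accepted only if the entire word, not merely a part of it, matches the expression.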
Power Set
The power set of a set X is the set of all subsets of X, including X itself and the empty
set ∅. It is denoted by 2^X.
Example: let X={a, b}. Then the power set of X is
2^X = {∅, {a}, {b}, {a, b}}
Number of elements in 2^X: |2^X| = 2^2 = 4
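The power set of a small finite set can be enumerated directly. The sketch below assumes Python's itertools module; the function name power_set is illustrative. Subsets are returned as frozensets so that they could themselves be placed inside a set.

```python
from itertools import chain, combinations

def power_set(xs):
    """Return all subsets of xs as a list of frozensets."""
    items = list(xs)
    # Take every combination of size 0, 1, ..., len(items).
    all_combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in all_combos]

subsets = power_set({"a", "b"})
print(len(subsets))  # number of subsets of a 2-element set
```

For a set with n elements the list has 2^n entries, so this approach is only practical for small n.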
(d) {1, 2}
{1, 2}
(e) 2 {1, 2}
Solution:
(a) X={a, b, {a, b}},
Then 2^X = {all subsets of set X}
= {∅, {a}, {b}, {a, b}, {{a, b}}, {a, {a, b}}, {b, {a, b}}, {a, b, {a, b}}}
2. Length of a String
The length of a string is the number of occurrences (or appearances) of characters or
symbols in that string. If w is a string then its length is denoted by |w|. A single symbol
may occur more than once in a string, and every occurrence is counted.
For example,
(1) w=abcd, then length of w is |w|=4
(2) n=124 is a string, then |n|=3
(3) ε is the empty string and has length zero.
For example,
(1) Let us consider the alphabet Σ={0, 1}, then Σ^1={0, 1}, Σ^2={00, 01, 10, 11}, Σ^3={000,
001, 010, 011, 100, 101, 110, 111}.
|Σ^1| = 2 = 2^1 (No. of strings of length one),
|Σ^2| = 4 = 2^2 (No. of strings of length two), and
|Σ^3| = 8 = 2^3 (No. of strings of length three)
(2) S={a, b, c}, then S^2={aa, ab, ac, bb, ba, bc, cc, ca, cb}, and |S^2| = 9 = 3^2
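The counts in these examples can be reproduced by listing all strings of a given length. This is a minimal sketch, assuming symbols are single characters; the function name strings_of_length is illustrative.

```python
from itertools import product

def strings_of_length(alphabet, k):
    """Return every string of length k over the given alphabet, in sorted symbol order."""
    symbols = sorted(alphabet)
    # product(..., repeat=k) yields every k-tuple of symbols.
    return ["".join(t) for t in product(symbols, repeat=k)]

print(strings_of_length({"0", "1"}, 2))
print(len(strings_of_length({"a", "b", "c"}, 2)))
```

An alphabet with n symbols has n^k strings of length k, which matches 2^1, 2^2, 2^3, and 3^2 in the examples. With k=0 the function returns only the empty string.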
4. Concatenation of Strings
If w1 and w2 are two strings, then the concatenation of string w2 with string w1 is a string
denoted by w1w2. The length of string w1w2 is the sum of the lengths of strings w1 and
w2, i.e.
|w1w2| = |w1| + |w2|
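As a quick illustration, string concatenation and the length rule can be checked in Python, where + joins two strings and len gives the length. The sample strings are arbitrary and chosen only for this sketch.

```python
w1 = "abc"
w2 = "de"

# Concatenation of w2 with w1, written w1w2 in the text.
w = w1 + w2
print(w, len(w))

# The length of the concatenation is the sum of the two lengths.
assert len(w) == len(w1) + len(w2)
```

Concatenating with the empty string leaves a string unchanged, since the empty string contributes length zero.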
6. Substrings of a String
A string obtained by removing a prefix and a suffix (either of which may be empty) from
a string w is called a substring of w. For example, if a string w=xyz, then y is a substring
of w. Every prefix and suffix of a string w is a substring of w, but not every substring of
w is a prefix or suffix of w. For every string w, both w and ε are prefixes, suffixes, and
substrings of w.
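These relationships can be verified on a small example. The sketch below uses Python slicing; the function names prefixes, suffixes, and substrings are illustrative.

```python
def prefixes(w):
    """All prefixes of w, including the empty string and w itself."""
    return {w[:i] for i in range(len(w) + 1)}

def suffixes(w):
    """All suffixes of w, including the empty string and w itself."""
    return {w[i:] for i in range(len(w) + 1)}

def substrings(w):
    """All substrings of w: remove some prefix w[:i] and some suffix w[j:]."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

w = "xyz"
print(sorted(prefixes(w)))
print(sorted(suffixes(w)))
print(sorted(substrings(w)))
```

For w=xyz, the string y appears among the substrings but is neither a prefix nor a suffix, while every prefix and every suffix appears in the set of substrings.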
Kleene Closure
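Let Σ be some alphabet. Then the Kleene closure of Σ is denoted by Σ*, also known as the
reflexive-transitive closure. The Kleene closure of any alphabet contains infinitely many
strings. It is defined as follows:
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
= {set of all words over Σ, including the empty string ε}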
For example,
(1) Let Σ={0, 1}. Then Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …, where
Σ^0 = {ε},
Σ^1 = {0, 1},
Σ^2 = {00, 01, 10, 11}, and so on.
So, Σ* = {ε, 0, 1, 00, 01, 10, 11, …}
(2) S={a}, then S* = {ε, a, aa, aaa, aaaa, aaaaa, …}
(3) Let A=∅, then A* = ∅* = {ε}
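Because the Kleene closure is infinite, it cannot be stored in full, but it can be enumerated lazily in order of increasing length. The following is a minimal sketch, assuming single-character symbols; the generator name kleene_star is illustrative.

```python
from itertools import count, islice, product

def kleene_star(alphabet):
    """Lazily yield the words of the Kleene closure, shortest first."""
    symbols = sorted(alphabet)
    yield ""  # the only word of length zero is the empty string
    if not symbols:
        return  # the closure of the empty set contains only the empty string
    for k in count(1):
        for t in product(symbols, repeat=k):
            yield "".join(t)

# Take only the first seven words; the full enumeration never ends.
print(list(islice(kleene_star({"0", "1"}), 7)))
print(list(kleene_star(set())))
```

The empty string is printed as '' in the output. islice is needed for any nonempty alphabet, since the generator would otherwise run forever.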
Positive Closure
If Σ is an alphabet then the positive closure of Σ is denoted by Σ+ and defined as follows:
Σ+ = Σ* − {ε}
= {set of all words over Σ excluding the empty string ε}
For example,
(1) If Σ={a}, then Σ+={a, aa, aaa, aaaa, aaaaa, …}
(2) If A={0, 1}, then A+={0, 1, 00, 11, 01, 10, 000, 111, 001, …}
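The positive closure can be enumerated in the same lazy fashion by starting from words of length one, so that the empty string never appears. This is a sketch under the same assumption of single-character symbols; the generator name positive_closure is illustrative.

```python
from itertools import count, islice, product

def positive_closure(alphabet):
    """Lazily yield all nonempty words over the alphabet, shortest first."""
    symbols = sorted(alphabet)
    if not symbols:
        return  # no symbols means no nonempty words
    for k in count(1):  # start at length 1 to exclude the empty string
        for t in product(symbols, repeat=k):
            yield "".join(t)

print(list(islice(positive_closure({"a"}), 5)))
print(list(islice(positive_closure({"0", "1"}), 6)))
```

Words are produced in order of length and then alphabetically, so the order differs slightly from the listing in example (2), although the set of words is the same.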
For a language L, L* = ⋃ L^k for k ≥ 0
= L^0 ∪ L^1 ∪ L^2 ∪ …
Formal Languages
Languages are basically a means of communication in the form of symbols. To define the
concept of a Formal Language, we need the ideas of an alphabet and a string. We have seen
the Kleene closure and the positive closure.
A formal language L on the alphabet Σ is a subset of Σ*, i.e.
L ⊆ Σ*
Here a formal language L is defined simply as a collection of strings (words). Is this
definition practical in the real world? On its own, no: without a grammar we cannot
specify the desired language. In everyday life, a language is a means of communicating
our thoughts, desires, facts, etc.
There are many approaches to specifying a language, but one has become standard in recent
years, particularly in Computer Science. This approach is due to the well-known linguistic
theorist Noam Chomsky (1959). He developed it while studying “natural languages”, but it
has since been widely adopted to describe computer languages.
We will discuss this in Chapter 4.