
College of Natural and Computational Science

Department of Computer Science

Module prepared on Automata and Complexity Theory

Prepared By: Getachew Abera (M.Sc. in Computer Science)

Jan 2024
Gambella, Ethiopia

Course Goals or Objectives:
On completion of this course students should be able to:
 Understand concepts in automata theory and the theory of computation
 Study the central concepts of automata theory
 Acquire insights into the relationships among formal languages, formal grammars, and
automata
 Identify different formal language classes and their relationships
 Design grammars and recognizers for different formal languages
 Prove or disprove theorems in automata theory using its properties
 Think analytically and intuitively about problem-solving situations in related areas of
theoretical computer science
 Acquire a full understanding of automata theory as a basis of computer science and
language design
 Have a clear understanding of automata theory concepts such as REs, DFAs, NFAs,
stacks, Turing machines, and grammars
 Be able to design FAs, NFAs, grammars, language models, and the basics of small compilers
 Be able to design sample automata
 Be able to design parsers

Table of Contents
Chapter One .............................................................................................................................................. 5
1.1 Introduction ......................................................................................................................................... 5
1.2 Languages and Grammars................................................................................................................... 6
1.3 Finite automata, Deterministic and Non-deterministic finite automata .............................................. 6
1.3.1 Deterministic Finite Automaton (DFA) ........................................................................... 6
1.3.2 Non-Deterministic Finite State Automata .................................................................... 13
1.3.3 The Equivalence of DFA and NFA ................................................................................... 16
Chapter Two ............................................................................................................................................ 22
2.1 Regular Expression and Regular languages ........................................................................ 22
2.2 Properties of Regular Sets ................................................................................................... 25
2.3 Identities Related to Regular Expressions ........................................................................... 26
2.5 Regular grammar ................................................................................................................. 29
2.6 Chomsky Classification of Grammars ................................................................................ 34
Chapter Three ......................................................................................................................................... 37
3.1 Context free languages ........................................................................................................ 37
3.2 Parsing and ambiguity ......................................................................................................... 38
3.3 Derivation tree or parse tree ................................................................................................ 45
3.4 Left most and right most derivations................................................................................... 46
3.5 Chomsky’s hierarchy of grammars ..................................................................................... 49
Chapter Four ........................................................................................................................................... 56
4.1 Push down automata ......................................................................................................................... 56
4.2 Non-deterministic pushdown automata ............................................................................................ 68
4.3 Push down automata and context free languages .............................................................................. 73
4.4 Deterministic push down automata ................................................................................................... 73
4.5 Deterministic context free languages ................................................................................................ 74
Chapter Five ............................................................................................................................................ 76
5.1 Turing machines .................................................................................................................. 76
5.2 Construction of Turing Machine ......................................................................................... 77
5.3 Turing Decidable and Turing Acceptable ........................................................................... 82
5.4 Language accepted by Turing machine ............................................................................... 84
5.5 Undecidable problems ......................................................................................................... 86

Chapter Six: Computability................................................................................................................. 88
6.1 Introduction ......................................................................................................................... 88
6.2. Recursive languages and recursive Enumerable languages ............................................... 89
Chapter 7: Computational Complexity............................................................................................. 92
7.1 Big-O notations ................................................................................................................... 92
7.2. Class P vs class NP ............................................................................................................ 96
7.4. Cook’s Theorem ............................................................................................................... 100

Chapter One

1.1 Introduction

An automaton is defined as a system in which energy, materials and information
are transformed, transmitted and used for performing some function without direct
human intervention. Examples include automatic packing machines and automatic
photo-printing machines.

The figure below shows the model of a discrete automaton.


Input. At each of the discrete instants of time t1, t2, ..., input values I1, I2, ...,
each of which can take a finite number of fixed values from the input alphabet,
are applied to the input side of the model shown in the figure above.
Output. O1, O2, ..., Oq are the outputs of the model, each of which can take a
finite number of fixed values from the output alphabet.
States. At any instant of time the automaton can be in one of the states
q1, q2, ..., qn.
State relation. The next state of the automaton at any instant of time is
determined by the present state and the present input.
Output relation. The output is related either to the state only, or to both the input
and the state.

The term "Automata" is derived from the Greek word "αὐτόματα" which means "self-acting".
An automaton (Automata in plural) is an abstract self-propelled computing device which
follows a predetermined sequence of operations automatically.
An automaton with a finite number of states is called a Finite Automaton (FA) or Finite State
Machine (FSM).
Alphabets and strings

Alphabet
 Definition − An alphabet is any finite set of symbols.
 Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols.
String
 Definition − A string is a finite sequence of symbols taken from ∑.

 Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d}
Length of a String
 Definition − It is the number of symbols present in a string. (Denoted by |S|).
 Examples −
o If S = ‘cabcad’, |S|= 6
o If |S|= 0, it is called an empty string (Denoted by λ or ε)

1.2 Languages and Grammars


 Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or
infinite.
 Example − If the language takes all possible strings of length 2 over ∑ = {a, b}, then
L = { ab, aa, ba, bb }
Automata

Finite Automaton can be classified into two types −


 Deterministic Finite Automaton (DFA)
 Non-deterministic Finite Automaton (NDFA / NFA)

1.3 Finite automata, Deterministic and Non-deterministic finite automata

1.3.1 Deterministic Finite Automaton (DFA)

In DFA, for each input symbol, one can determine the state to which the machine will move.
Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is
called Deterministic Finite Machine or Deterministic Finite Automaton.

A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −


 Q is a finite set of states.

 ∑ is a finite set of symbols called the alphabet.


 δ is the transition function where δ: Q × ∑ → Q
 q0 is the initial state from where any input is processed (q0 ∈ Q).
 F is a set of final state/states of Q (F ⊆ Q).
Graphical Representation of a DFA

A DFA is represented by a digraph called a state diagram.

 The vertices represent the states.


 The arcs labeled with an input alphabet show the transitions.

 The initial state is denoted by an empty single incoming arc.
 The final state is indicated by double circles.
Example

Let a deterministic finite automaton be →

 Q = {a, b, c},
 ∑ = {0, 1},
 q0 = {a},
 F = {c}, and
Transition function δ as shown by the following table −

Present State Next State for Input 0 Next State for Input 1

a a b

b c a

c b c

Its graphical representation would be as follows −
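
Alongside the state diagram, the machine can be simulated directly. The following is a minimal
Python sketch (the helper name run_dfa and the string encoding of states are our own choices,
not part of the module) that encodes the transition table above and reports whether a string ends
in the final state c.

# Minimal DFA simulator for the example above:
# Q = {a, b, c}, Sigma = {0, 1}, start state a, final states {c}.
delta = {('a', '0'): 'a', ('a', '1'): 'b',
         ('b', '0'): 'c', ('b', '1'): 'a',
         ('c', '0'): 'b', ('c', '1'): 'c'}
start, finals = 'a', {'c'}

def run_dfa(w):
    """Return True if the DFA accepts the string w."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

for w in ['10', '1011', '0', '']:
    print(w, run_dfa(w))   # '10' and '1011' end in c and are accepted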

Transition Systems
A transition graph or transition system is a finite directed labeled graph in which
each vertex or node represents a state, the directed edges indicate the transitions
between states, and the edges are labeled with input/output. An initial state is
represented by a circle with an arrow pointing towards it, a final state by two
concentric circles, and every other state by a single circle.
The transitions from one internal state to another are governed by the transition function δ. For
example, if δ(q0, a) = q1, then when the DFA is in state q0 and the current input symbol is a, the
DFA will go into state q1.
Properties of the Transition Function
Property 1. δ(q, ε) = q in a finite automaton. This means that the state of the
system can be changed only by an input symbol.
Property 2. For all strings w ∈ ∑* and input symbols a:
δ(q, aw) = δ(δ(q, a), w)
δ(q, wa) = δ(δ(q, w), a)
This property gives the state after the automaton reads the first symbol of the
string aw, and the state after the automaton reads all but the last symbol of the
string wa.
Example: Prove that for any transition function δ and for any two input
strings x and y,
δ(q, xy) = δ(δ(q, x), y)      (2.1)
Proof: By induction on |y|, the length of y. When |y| = 1, we have y = a for some input symbol a,
so δ(q, xy) = δ(q, xa) = δ(δ(q, x), a) = δ(δ(q, x), y) by Property 2.
For the inductive step, write y = wa; then δ(q, x(wa)) = δ(δ(q, xw), a) = δ(δ(δ(q, x), w), a)
= δ(δ(q, x), wa), using Property 2 and the induction hypothesis.
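
As an informal illustration of Property 2 and identity (2.1), the extended transition function can
be computed recursively. The short Python sketch below reuses the example DFA defined earlier
(states a, b, c over {0, 1}); the function name delta_hat is our own.

# Extended transition function: delta(q, epsilon) = q; delta(q, aw) = delta(delta(q, a), w).
delta = {('a', '0'): 'a', ('a', '1'): 'b',
         ('b', '0'): 'c', ('b', '1'): 'a',
         ('c', '0'): 'b', ('c', '1'): 'c'}

def delta_hat(q, w):
    if w == '':
        return q
    return delta_hat(delta[(q, w[0])], w[1:])

x, y = '01', '10'
# Identity (2.1): processing xy in one pass equals processing x, then y.
assert delta_hat('a', x + y) == delta_hat(delta_hat('a', x), y)
print(delta_hat('a', '0110'))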

ACCEPTABILITY OF A STRING BY A FINITE AUTOMATON

Consider the finite state machine with ∑ = {a, b} and K = {q0, q1, q2, D}, where q0 ∈ K is the
initial state and q2 is the final state. The transition function δ is given in the table below. Give the
entire sequence of states for the input string aaabb.

State a b
→ q0 q1 D
q1 q1 q2
* q2 D q1
D D D

This machine accepts a sequence of a’s followed by a sequence of b’s, with at least one a and one
b. In the figure and table above, D is a dead state. Start from q0, read a and go to q1; from q1,
read a and stay in q1; read a again and stay in q1; from q1, read b and go to q2; from q2, read b
and stay in q2. State q2 is a final state. The state sequence for the given input is
q0 q1 q1 q1 q2 q2. Strings starting with b are not accepted by the machine, and strings ending
with a are not accepted by the machine.

Consider the finite state machine with ∑ = {0, 1} and K = {q0, q1, q2, q3}, where q0 ∈ K is the
initial state and q0 is also the final state. The transition function δ is given in the table. Give the
entire sequence of states for the input string 110101.

δ(q0, 110101) = δ(q1, 10101) = δ(q0, 0101)
= δ(q2, 101) = δ(q3, 01)
= δ(q1, 1) = δ(q0, ε)
= q0
Hence, the state sequence is given below:
    1      1      0      1      0      1
q0 → q1 → q0 → q2 → q3 → q1 → q0
The initial state is q0 and each input symbol is either 0 or 1. Start from q0, read 1 and go to q1;
from q1, read 1 and go to q0; from q0, read 0 and go to q2; from q2, read 1 and go to q3; from q3,
read 0 and go to q1; from q1, read 1 and go to q0. State q0 is a final state, so 110101 is accepted.

Consider the DFA with ∑ = {a, b} and K = {q0}, where q0 is both the initial state and the final
state.

The empty string is also accepted by this machine: whenever the initial state is a final state, the
machine accepts the empty string. Every string of a’s and b’s is accepted by this machine.

Language: L(M) = { x | x ∈ ∑* }


Consider the DFA with ∑ = {a, b} and K = {q0, q1, D}, where q0 ∈ K is the initial state and q0 is
also the final state.

State a b
→ * q0 q1 D
q1 D q0
D D D

Input string = ababab. Sequences of ab’s are accepted by this machine, including the empty
string, since q0 is final. In the figure and table above, D is a dead state. Start from q0, read a and
go to q1; from q1, read b and go back to q0; in the same manner the machine accepts the string
ababab. State q0 is a final state. Strings starting with b or ending with a are not accepted by the
machine. The language of the machine is L(M) = { (ab)^n | n ≥ 0 }.

Consider the finite state machine with ∑ = {a, b} and K = {q0, q1, q2}, where q0 ∈ K is the
initial state and q0 is also the final state.

State a b
→ * q0 q1 q1
q1 q2 q2
q2 q0 q0

Any string of a’s and b’s whose length is divisible by 3 is accepted by the machine. The language
of the machine is L(M) = { x | x ∈ ∑*, |x| = 3k }, where k is a non-negative integer.

Consider the DFA with ∑ = { A, B, …, Z, 0, 1, …, 9 } and K = { q0, q1, q2, q3, q4, q5, q6 },
where q0 ∈ K is the initial state and F = { q1, q2, q3, q4, q5, q6 }. The maximum length of a
Fortran identifier is 6.

State A-Z 0-9
→ q0 q1 Ø
* q1 q2 q2
* q2 q3 q3
* q3 q4 q4
* q4 q5 q5
* q5 q6 q6

The machine accepts any string that starts with a letter A-Z, followed by letters A-Z or digits 0-9,
with a total length of 1 to 6 characters. Language: L(M) = { x | x starts with a letter A-Z, the
remaining symbols of x are in A-Z or 0-9, and |x| ≤ 6 }.


Example
Let us consider the DFA shown below. From the DFA, the acceptable strings can be derived.

Strings accepted by the above DFA: {0, 00, 11, 010, 101, ...........}
Strings not accepted by the above DFA: {1, 011, 111, ........}
1.3.2 Non-Deterministic Finite State Automata

In NFA, for a particular input symbol, the machine can move to any combination of the states in
the machine. In other words, the exact state to which the machine moves cannot be determined.
Hence, it is called a Non-deterministic Automaton. As it has a finite number of states, the machine
is called a Non-deterministic Finite Machine or Non-deterministic Finite Automaton.

An NFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −


 Q is a finite set of states.
 ∑ is a finite set of symbols called the alphabet.
 δ is the transition function where δ: Q × ∑ → 2^Q
(Here the power set of Q, written 2^Q, is used because in an NDFA a transition from a state
can lead to any combination of states in Q.)
 q0 is the initial state from where any input is processed (q0 ∈ Q).
 F is a set of final state/states of Q (F ⊆ Q).
Let us see the example below, which describes a nondeterministic automaton, since there are two
transitions labeled with ‘a’ out of q0.

Let a non-deterministic finite automaton be →

 Q = {a, b, c}
 ∑ = {0, 1}
 q0 = {a}
 F = {c}
The transition function δ as shown below −

Present State Next State for Input 0 Next State for Input 1

a a, b b

b c a, c

c b, c c

Its graphical representation would be as follows −

Consider the NFA with ∑ = {a, b} and Q = {q0, q1, q2}, where q0 ∈ Q is the initial state and q2 is
the final state; the transition function δ is given in the table. Check whether the given string aaabb
is accepted by the machine or not.

The initial state is q0. There are two choices when reading a from q0: the machine can go to q0 or
to q1. Similarly, there are two choices when reading b from q1: the machine can go to q1 or to q2.
The final state is q2. Strings consisting of a sequence of a’s followed by a sequence of b’s are
accepted by the machine, while strings starting with b or ending with a are not accepted. One
accepting state sequence for the input aaabb is q0 -> q0 -> q0 -> q1 -> q1 -> q2: reading a from
q0 go to q0, reading a again stay in q0, reading the third a go to q1, reading b stay in q1, and
finally reading b from q1 go to q2.
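
This nondeterministic choice can be simulated by tracking the set of states the NFA could be in.
Below is a minimal Python sketch for the machine just described; the transitions not mentioned
in the text (for example, b from q0) are assumed to be empty, and the helper name run_nfa is ours.

# Subset simulation of the NFA above: delta(q0, a) = {q0, q1}, delta(q1, b) = {q1, q2}.
# Transitions not listed in the text are assumed to be the empty set.
delta = {('q0', 'a'): {'q0', 'q1'},
         ('q1', 'b'): {'q1', 'q2'}}
start, finals = 'q0', {'q2'}

def run_nfa(w):
    """Return True if some sequence of choices leads to a final state."""
    current = {start}
    for symbol in w:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & finals)

print(run_nfa('aaabb'))   # True: q0 -> q0 -> q0 -> q1 -> q1 -> q2 is one accepting run
print(run_nfa('ba'))      # False: strings starting with b are rejected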
Example: Ending of Strings

An NFA that accepts all binary strings that end with 101.

Design an NFA for the transition table as given below:

The transition diagram can be drawn by using the mapping function as given in the table.

Design an NFA with ∑ = {0, 1} in which double '1' is followed by double '0'.

Now before double 1, there can be any string of 0 and 1. Similarly, after double 0, there can be
any string of 0 and 1.

1.3.3 The Equivalence of DFA and NFA

Let M = (Q, ∑, δ, q0, F) be an NFA that accepts the language L(M). Then there exists an
equivalent DFA, denoted by M' = (Q', ∑', δ', q0', F'), such that L(M) = L(M'). Any string accepted
by the NFA is also accepted by this DFA. The method of construction is called the subset
construction.

Example 1:

Convert the given NFA to DFA.

For the given transition diagram we will first construct the transition table.

State 0 1

→q0 {q0, q1} {q1}

*q1 ϕ {q0, q1}


Now we will obtain δ' transition for state q0.

δ'([q0], 0) = {q0, q1}
= [q0, q1] (new state generated)
δ'([q0], 1) = {q1} = [q1]

The δ' transition for state q1 is obtained as:

δ'([q1], 0) = ϕ
δ'([q1], 1) = [q0, q1]
Now we will obtain δ' transition on [q0, q1].
δ'([q0, q1], 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0, q1} ∪ ϕ
= {q0, q1}
= [q0, q1]
Similarly,

δ'([q0, q1], 1) = δ(q0, 1) ∪ δ(q1, 1)


= {q1} ∪ {q0, q1}
= {q0, q1}
= [q0, q1]
As in the given NFA, q1 is a final state, then in DFA wherever, q1 exists that state
becomes a final state. Hence in the DFA, final states are [q1] and [q0, q1]. Therefore
set of final states F = {[q1], [q0, q1]}.

The transition table for the constructed DFA will be:

State 0 1

→[q0] [q0, q1] [q1]

*[q1] ϕ [q0, q1]

*[q0, q1] [q0, q1] [q0, q1]


The Transition diagram will be:
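
The conversion above can also be carried out programmatically. Here is a minimal Python sketch
of the subset construction for the NFA of Example 1 (the function name nfa_to_dfa is ours); it
reproduces the DFA transition table shown above, plus the empty (dead) subset ϕ.

# Subset construction for Example 1: delta(q0,0) = {q0,q1}, delta(q0,1) = {q1},
# delta(q1,0) = {}, delta(q1,1) = {q0,q1}; start state q0, final state q1.
nfa_delta = {('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q1'},
             ('q1', '0'): set(),        ('q1', '1'): {'q0', 'q1'}}
nfa_finals = {'q1'}
alphabet = ['0', '1']

def nfa_to_dfa(start):
    dfa_delta, queue = {}, [frozenset([start])]
    while queue:
        subset = queue.pop()
        if subset in dfa_delta:
            continue
        dfa_delta[subset] = {}
        for a in alphabet:
            # The DFA moves to the union of the NFA moves of every state in the subset.
            target = frozenset().union(*(nfa_delta.get((q, a), set()) for q in subset))
            dfa_delta[subset][a] = target
            queue.append(target)
    finals = {s for s in dfa_delta if s & nfa_finals}
    return dfa_delta, finals

dfa_delta, finals = nfa_to_dfa('q0')
for subset, row in dfa_delta.items():
    mark = '*' if subset in finals else ' '
    print(mark, sorted(subset), {a: sorted(t) for a, t in row.items()})
# Expected DFA states: [q0], [q1], [q0, q1] and the empty dead state,
# with [q1] and [q0, q1] final, matching the table above.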

Example 2:

Convert the given NFA to DFA.

For the given transition diagram we will first construct the transition table.

State 0 1

→q0 q0 q1

q1 {q1, q2} q1

*q2 q2 {q1, q2}

Now we will obtain δ' transition for state q0.


δ'([q0], 0) = [q0]
δ'([q0], 1) = [q1]
The δ' transition for state q1 is obtained as:
δ'([q1], 0) = [q1, q2] (new state generated)
δ'([q1], 1) = [q1]
The δ' transition for state q2 is obtained as:
δ'([q2], 0) = [q2]
δ'([q2], 1) = [q1, q2]
Now we will obtain δ' transition on [q1, q2].
δ'([q1, q2], 0) = δ(q1, 0) ∪ δ(q2, 0)
= {q1, q2} ∪ {q2}
= [q1, q2]
δ'([q1, q2], 1) = δ(q1, 1) ∪ δ(q2, 1)
= {q1} ∪ {q1, q2}
= {q1, q2}
= [q1, q2]
The state [q1, q2] is the final state as well because it contains a final state q2. The transition
table for the constructed DFA will be:

State 0 1

→[q0] [q0] [q1]

[q1] [q1, q2] [q1]

*[q2] [q2] [q1, q2]

*[q1, q2] [q1, q2] [q1, q2]


The Transition diagram will be:

Minimization of Finite Automata

Consider a DFA over ∑. Two states p and q are equivalent iff for every string x in ∑*,
δ(p, x) and δ(q, x) are either both final states or both non-final states.

Two states p and q are distinguishable if there exists a string x such that exactly one of δ(p, x)
and δ(q, x) is in F, that is, one of them is a final state and the other is not. Such a string x is
called a distinguishing string.

Suppose p and q are two states, p reads a and goes to r, and similarly q reads a and goes to s, so
that r = δ(p, a) and s = δ(q, a). If p and q are equivalent, then r and s are automatically
equivalent as well: r and s are either both final states or both non-final states.

Conversely, if r and s are distinguishable by some string x, so that exactly one of δ(r, x) and
δ(s, x) is in F, then p and q are automatically distinguishable as well, by the string ax.

We have to follow these steps to minimize the DFA:
Step 1: Remove all the states that are unreachable from the initial state via any sequence of
transitions of the DFA.
Step 2: Draw the transition table for all of the remaining states.
Step 3: Now split the transition table into two tables T1 and T2: T1 contains all final states,
and T2 contains the non-final states.
Step 4: Find similar rows in T1, i.e. two states q and r such that:
1. δ(q, a) = p
2. δ(r, a) = p
That means, find two states which have the same transitions on a and b and remove one of
them.
Step 5: Repeat step 4 until no similar rows remain in the transition table T1.
Step 6: Repeat steps 4 and 5 for table T2 as well.
Step 7: Now combine the reduced T1 and T2 tables. The combined transition table is the
transition table of the minimized DFA.

Example: Consider the DFA shown below.
Step 1: In the given DFA, q2 and q4 are the unreachable states, so remove them.
Step 2: Draw the transition table for the rest of the states.

State 0 1

→q0 q1 q3

q1 q0 q3

*q3 q5 q5

*q5 q5 q5
Step 3: Now divide rows of transition table into two sets as:
1. One set contains those rows, which start from non-final states:

State 0 1

q0 q1 q3

q1 q0 q3

2. The other set contains those rows which start from final states.

State 0 1

q3 q5 q5

q5 q5 q5

Now, eliminate the indistinguishable states.

Since q3 and q5 in set 2 have the same rows for the inputs 0 and 1, remove q5 and replace q5 by
q3 in each remaining row.
Step 4: Set 1 has no similar rows so set 1 will be the same.
Step 5: In set 2, row 1 and row 2 are similar since q3 and q5 transit to the same state on 0 and
1. So skip q5 and then replace q5 by q3 in the rest.

State 0 1

q3 q3 q3

Step 6: Now combine set 1 and set 2 as:

State 0 1

→q0 q1 q3

q1 q0 q3

*q3 q3 q3

Now it is the transition table of minimized DFA.
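
The row-merging procedure of Steps 3-6 can also be expressed as a short program. The following
Python sketch (the function name merge_identical_rows is ours) starts from the transition table
of Step 2 and reproduces the minimized table above.

# Transition table after Step 2 (unreachable states q2 and q4 already removed).
delta = {
    'q0': {'0': 'q1', '1': 'q3'},
    'q1': {'0': 'q0', '1': 'q3'},
    'q3': {'0': 'q5', '1': 'q5'},
    'q5': {'0': 'q5', '1': 'q5'},
}
finals = {'q3', 'q5'}

def merge_identical_rows(delta, finals):
    """Repeatedly merge two states that are both final or both non-final
    and have identical transition rows (the 'similar rows' of Steps 4-6)."""
    delta = {s: dict(row) for s, row in delta.items()}
    changed = True
    while changed:
        changed = False
        states = list(delta)
        for i, p in enumerate(states):
            for q in states[i + 1:]:
                same_kind = (p in finals) == (q in finals)
                if same_kind and delta[p] == delta[q]:
                    # Keep p, drop q, and redirect every edge into q to p.
                    del delta[q]
                    for row in delta.values():
                        for a, t in row.items():
                            if t == q:
                                row[a] = p
                    changed = True
                    break
            if changed:
                break
    return delta

for s, row in sorted(merge_identical_rows(delta, finals).items()):
    print(s, row)
# q0, q1 and q3 remain, with q5 replaced by q3 everywhere,
# matching the minimized table in Step 6.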

Chapter Two

2.1 Regular Expression and Regular languages

o The languages accepted by finite automata can be easily described by simple expressions
called Regular Expressions. They are a very effective way to represent these languages.
o The languages described by regular expressions are referred to as Regular languages.
o A regular expression can also be described as a sequence of patterns that defines a string.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use these patterns for operations on strings.
o Regular expressions are an algebraic way to describe languages.
o They are a way of representing regular languages.
o The algebraic description of regular languages is done using regular expressions.
o They can define the same languages that the various forms of finite automata can describe.
o Regular expressions offer something that finite automata do not, namely a declarative way
to express the strings that we want to accept.

A Regular Expression can be recursively defined as follows −

 ε is a Regular Expression indicates the language containing an empty string. (L (ε) = {ε})

 φ is a Regular Expression denoting an empty language. (L (φ) = { })

 x is a Regular Expression where L = {x}

 If X is a Regular Expression denoting the language L(X) and Y is a Regular Expression


denoting the language L(Y), then

o X + Y is a Regular Expression corresponding to the language L(X) ∪ L(Y),
where L(X + Y) = L(X) ∪ L(Y).

o X . Y is a Regular Expression corresponding to the language L(X) . L(Y),
where L(X.Y) = L(X) . L(Y).

o R* is a Regular Expression corresponding to the language L(R*),
where L(R*) = (L(R))*.

Some RE Examples

Regular Expression      Regular Set

(0 + 10*)               L = { 0, 1, 10, 100, 1000, 10000, … }

(0*10*)                 L = { 1, 01, 10, 010, 0010, … }

(0 + ε)(1 + ε)          L = { ε, 0, 1, 01 }

(a+b)*                  Set of strings of a’s and b’s of any length, including the null string.
                        So L = { ε, a, b, aa, ab, bb, ba, aaa, … }

(a+b)*abb               Set of strings of a’s and b’s ending with the string abb.
                        So L = { abb, aabb, babb, aaabb, ababb, … }

(11)*                   Set consisting of an even number of 1’s, including the empty string.
                        So L = { ε, 11, 1111, 111111, … }

(aa)*(bb)*b             Set of strings consisting of an even number of a’s followed by an odd
                        number of b’s. So L = { b, aab, aabbb, aabbbbb, aaaab, aaaabbb, … }

(aa + ab + ba + bb)*    Strings of a’s and b’s of even length, obtained by concatenating any
                        combination of the strings aa, ab, ba and bb, including null.
                        So L = { ε, aa, ab, ba, bb, aaab, aaba, … }

Example 1

Write the regular expression for the language accepting all combinations of a's, over the set ∑ =
{a}

Solution:

All combinations of a's means a may be zero, single, double and so on. If a is appearing zero
times, that means a null string. That is we expect the set of {ε, a, aa, aaa, ....}. So we give a
regular expression for this as

R = a*

Example 2. Write the regular expression for the language accepting all the string containing any
number of a's and b's.

Solution:

The regular expression will be:

R = (a + b)*

This will give the set as L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, any combination of a and b.

The (a + b)* shows any combination with a and b even a null string.

Example 3. Write the regular expression for the language accepting all the string which are
starting with 1 and ending with 0, over ∑ = {0, 1}.

Solution:

In a regular expression, the first symbol should be 1, and the last symbol should be 0. The r.e. is
as follows:

R = 1 (0+1)* 0
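
As a quick, informal check, the same pattern can be tried with Python's re module (a sketch; the
test strings are our own, and note that the union written '+' in formal regular expressions becomes
'|' in Python's regex syntax):

import re

# Formal expression R = 1(0+1)*0, written with '|' for union in Python's syntax.
pattern = re.compile(r"1(0|1)*0")

for s in ["10", "1010", "110100", "01", "11", "1"]:
    print(s, bool(pattern.fullmatch(s)))
# Strings that start with 1 and end with 0 match; the others do not.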

Example 4. Describe the language denoted by following regular expression

r.e. = (b* (aaa)* b*)*

Solution:

The language can be predicted from the regular expression by finding the meaning of it. We will
first split the regular expression as:

r.e. = (any combination of b's) (aaa)* (any combination of b's)

L = {The language consists of the string in which a's appear triples, there is no restriction on the
number of b's}

Example 5. Write the regular expression for the language of strings starting and ending with a
and having any combination of b's in between.

Solution:

The regular expression will be:

R = a b* a

Example 6. Write the regular expression for the language starting with a but not having
consecutive b's.

Solution: The regular expression has to be built for the language:

L = {a, ab, aba, aab, aaa, abab, .....}

The regular expression for the above language is:

R = (a + ab)*

Regular Sets

Any set that represents the value of the Regular Expression is called a Regular Set.

2.2 Properties of Regular Sets

Property 1. The union of two regular set is regular.


Proof −
Let us take two regular expressions
RE1 = a(aa)* and RE2 = (aa)*
So, L1 = {a, aaa, aaaaa,.....} (Strings of odd length excluding Null)
and L2 ={ ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 ∪ L2 = { ε, a, aa, aaa, aaaa, aaaaa, aaaaaa,.......}
(Strings of all possible lengths including Null)
RE (L1 ∪ L2) = a* (which is a regular expression itself)
Property 2. The intersection of two regular set is regular.
Proof −
Let us take two regular expressions
RE1 = a(a*) and RE2 = (aa)*
So, L1 = { a,aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 ∩ L2 = { aa, aaaa, aaaaaa,.......} (Strings of even length excluding Null)
RE (L1 ∩ L2) = aa(aa)* which is a regular expression itself.
Property 3. The complement of a regular set is regular.
Proof −
Let us take a regular expression −
RE = (aa)*
So, L = {ε, aa, aaaa, aaaaaa, .......} (Strings of even length including Null)
Complement of L is all the strings that is not in L.
So, L’ = {a, aaa, aaaaa, .....} (Strings of odd length excluding Null)
RE (L’) = a(aa)* which is a regular expression itself.
Property 4. The difference of two regular set is regular.
Proof −

Let us take two regular expressions −
RE1 = a (a*) and RE2 = (aa)*
So, L1 = {a, aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 – L2 = {a, aaa, aaaaa, aaaaaaa, ....}
(Strings of all odd lengths excluding Null)
RE (L1 – L2) = a (aa)* which is a regular expression.
Property 5. The reversal of a regular set is regular.
Proof −
We have to prove LR is also regular if L is a regular set.
Let L = {01, 10, 11}
RE (L) = 01 + 10 + 11
LR = {10, 01, 11}
RE (LR) = 10 + 01 + 11 which is regular
Property 6. The closure of a regular set is regular.
Proof −
If L = {a, aaa, aaaaa, .......} (Strings of odd length excluding Null)
i.e., RE (L) = a (aa)*
L* = {ε, a, aa, aaa, aaaa, aaaaa, ……} (Strings of all lengths including Null)
RE (L*) = a*
Property 7. The concatenation of two regular sets is regular.
Proof −
Let RE1 = (0+1)*0 and RE2 = 01(0+1)*
Here, L1 = {0, 00, 10, 000, 010, ......} (Set of strings ending in 0)
and L2 = {01, 010,011,.....} (Set of strings beginning with 01)
Then, L1 L2 = {001,0010,0011,0001,00010,00011,1001,10010,.............}
Set of strings containing 001 as a substring which can be represented by an RE − (0 + 1)*001(0 +
1)*
Hence, proved.
2.3 Identities Related to Regular Expressions

Given R, P, L, Q as regular expressions, the following identities hold −

26
 ∅* = ε
 ε* = ε
 RR* = R*R
 R*R* = R*
 (R*)* = R*
 (PQ)*P =P(QP)*
 (a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
 R + ∅ = ∅ + R = R (The identity for union)
 R ε = ε R = R (The identity for concatenation)
 ∅ L = L ∅ = ∅ (The annihilator for concatenation)
 R + R = R (Idempotent law)
 L (M + N) = LM + LN (Left distributive law)
 (M + N) L = ML + NL (Right distributive law)
 ε + RR* = ε + R*R = R*

2.4 Arden's Theorem

In order to find out a regular expression of a Finite Automaton, we use Arden’s Theorem along
with the properties of regular expressions.
Let P and Q be two regular expressions.
If P does not contain null string, then R = Q + RP has a unique solution that is R = QP*
Proof −
R = Q + (Q + RP)P [After substituting R = Q + RP]
= Q + QP + RP^2
When we substitute the value of R recursively again and again, we get the following equation −
R = Q + QP + QP^2 + QP^3 + …
R = Q (ε + P + P^2 + P^3 + …)
R = QP* [As P* represents (ε + P + P^2 + P^3 + …)]
Assumptions for Applying Arden’s Theorem

 The transition diagram must not have NULL transitions


 It must have only one initial state
Method
Step 1 − Create equations as the following form for all the states of the DFA having n states with
initial state q1.
q1 = q1R11 + q2R21 + … + qnRn1 + ε
q2 = q1R12 + q2R22 + … + qnRn2
...
qn = q1R1n + q2R2n + … + qnRnn
Rij represents the set of labels of edges from qi to qj, if no such edge exists, then Rij = ∅
Step 2 − Solve these equations to get the equation for the final state in terms of Rij

Example 1

Construct a regular expression corresponding to the automata given below −

Here the initial state and final state is q1.

The equations for the three states q1, q2, and q3 are as follows −

q1 = q1a + q3a + ε (the ε term is because q1 is the initial state)

q2 = q1b + q2b + q3b

q3 = q2a

Now, we will solve these three equations −

q2 = q1b + q2b + q3b

= q1b + q2b + (q2a)b (Substituting value of q3)

= q1b + q2(b + ab)

= q1b (b + ab)* (Applying Arden’s Theorem)

q1 = q1a + q3a + ε

= q1a + q2aa + ε (Substituting value of q3)

= q1a + q1b(b + ab)*aa + ε (Substituting value of q2)

= q1(a + b(b + ab)*aa) + ε

= ε (a+ b(b + ab)*aa)*

= (a + b(b + ab)*aa)*

Hence, the regular expression is (a + b(b + ab)*aa)*.

Example 2
Construct a regular expression corresponding to the automata given below −

Solution −
Here the initial state is q1 and the final state is q2
Now we write down the equations −
q1 = q10 + ε
q2 = q11 + q20
q3 = q21 + q30 + q31
Now, we will solve these three equations −
q1 = ε0* [As, εR = R]
So, q1 = 0*
q2 = 0*1 + q20
So, q2 = 0*1(0)* [By Arden’s theorem]
Hence, the regular expression is 0*10*.
2.5 Regular grammar

In the literary sense of the term, grammars denote syntactical rules for conversation in natural
languages. Linguists have attempted to define grammars since the inception of natural
languages like English, Sanskrit, Mandarin, etc.

The theory of formal languages finds its applicability extensively in the fields of Computer
Science. Noam Chomsky gave a mathematical model of grammar in 1956, which is effective for
writing computer languages.

Grammar

A grammar G can be formally written as a 4-tuple (N, T, S, P) where −

 N or VN is a set of variables or non-terminal symbols (the analogue of states in an FA),
represented by upper-case letters.

 T or ∑ is a set of terminal symbols, represented by lower-case letters.

 S is a special variable called the Start symbol, S ∈ N.

 P is the set of production rules for terminals and non-terminals. A production rule has the form
α → β, where α and β are strings on VN ∪ ∑ and at least one symbol of α belongs to VN.

 A production in P is written as LHS → RHS, where the LHS is a single non-terminal and the
RHS is ε, a string of terminals, or a string of terminals and non-terminals.

Example

Grammar G1 −

({S, A, B}, {a, b}, S, {S → AB, A → a, B → b})

Here,

 S, A, and B are Non-terminal symbols;

 a and b are Terminal symbols

 S is the Start symbol, S ∈ N

 Productions, P : S → AB, A → a, B → b

Generating production rules from an FA

N = {q1, q2, q3}, T = {a, b}, S = {q1}
P = {q1 → aq1, q1 → bq2, q2 → aq3, q2 → bq2, q3 → aq1, q3 → bq2}
Derivations from a Grammar

Strings may be derived from other strings using the productions in a grammar.

Example

Let us consider the grammar −

G2 = ({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε } )

Some of the strings that can be derived are −

S ⇒ aAb using production S → aAb

⇒ aaAbb using production aA → aaAb

⇒ aaaAbbb using production aA → aaAb

⇒ aaabbb using production A → ε

We will see another example: "The peacock is a beautiful bird." Starting with the
sentence symbol S, this can be divided into a noun phrase NP1 and a verb phrase VP.
The noun phrase NP1 is split into article one and noun one, where article one is "The" and
noun one is "Peacock". The verb phrase VP is divided into a verb V, article two, an adjective
and noun two N2, where the verb is "is", article two is "a", the adjective is "beautiful" and
noun two is "bird".

<S> → <NP1> <VP>
<NP1> → <Art1> <N1>
<Art1> → The
<N1> → Peacock
<VP> → <V> <Art2> <Adj> <N2>
<V> → is
<Art2> → a
<Adj> → beautiful
<N2> → bird
The single arrow (→) is read as "can be rewritten as", and the double arrow (⇒) as "directly
derives".

<S> ⇒ <NP1> <VP>
⇒ <NP1> <V> <Art2> <Adj> <N2>
⇒ <NP1> <V> a <Adj> <N2>
⇒ <NP1> <V> a beautiful <N2>
⇒ <Art1> <N1> <V> a beautiful <N2>
⇒ The <N1> <V> a beautiful <N2>
⇒ The <N1> is a beautiful <N2>
⇒ The Peacock is a beautiful <N2>
⇒ The Peacock is a beautiful bird.

We will consider one more example: "Venice is a beautiful city."

The productions are:
<S> → <NP1> <VP>
<NP1> → <PropN>
<PropN> → Venice
<VP> → <V> <NP2>
<V> → is
<NP2> → <Art> <Adj> <N2>
<Art> → a
<Adj> → beautiful
<N2> → city

The derivation is:
<S> ⇒ <NP1> <VP>
⇒ <PropN> <VP>
⇒ Venice <VP>
⇒ Venice <V> <NP2>
⇒ Venice is <NP2>
⇒ Venice is <Art> <Adj> <N2>
⇒ Venice is a <Adj> <N2>
⇒ Venice is a beautiful <N2>
⇒ Venice is a beautiful city.

<S> rewritten as <NP1><VP>, then <NP1> rewritten as <PropN>, then <PropN>


rewritten as Venice, then <VP> rewritten as <Verb><NP2>, then <Verb> rewritten
as is, then <NP2> rewritten as <Art><Adj><N2>, then <Art> rewritten as a, then
<Adj> rewritten as beautiful and finally <N2> rewritten as city. So Grammar
generates the sentence Venice is a beautiful city.

Language Generated by a Grammar

Example
If there is a grammar
G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b}

Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted string
is ab, i.e.,
L(G) = {ab}
Example
Suppose we have the following grammar −
G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b}
The language generated by this grammar −
L(G) = {ab, a^2 b, a b^2, a^2 b^2, ………}
= {a^m b^n | m ≥ 1 and n ≥ 1}
Construction of a Grammar Generating a Language
We’ll consider some languages and convert it into a grammar G which produces those
languages.
Example1
Problem − Suppose, L(G) = {a^m b^n | m ≥ 0 and n > 0}. We have to find out the
grammar G which produces L(G).
Solution
Since L(G) = {a^m b^n | m ≥ 0 and n > 0}
the set of strings accepted can be rewritten as −
L(G) = {b, ab,bb, aab, abb, …….}
Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’ including null.
To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions −
S → aS , S → B, B → b and B → bB
S → B → b (Accepted)
S → B → bB → bb (Accepted)
S → aS → aB → ab (Accepted)
S → aS → aaS → aaB → aab(Accepted)
S → aS → aB → abB → abb (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated by the
production set.
Example2
Problem − Suppose, L(G) = {a^m b^n | m > 0 and n ≥ 0}. We have to find out the grammar G
which produces L(G).

Solution −
Since L(G) = {a^m b^n | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as −
L(G) = {a, aa, ab, aaa, aab ,abb, …….}
Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’ including null.
To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions −
S → aA, A → aA , A → B, B → bB ,B → λ
S → aA → aB → aλ → a (Accepted)
S → aA → aaA → aaB → aaλ → aa (Accepted)
S → aA → aB → abB → abλ → ab (Accepted)
S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted)
S → aA → aaA → aaB → aabB → aabλ → aab (Accepted)
S → aA → aB → abB → abbB → abbλ → abb (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated by the
production set.
Hence the grammar −
G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB })
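
To see the grammar of Example 2 in action, its productions can be sampled mechanically. Below
is a small Python sketch (the helper name derive and the dictionary encoding are ours) that
repeatedly rewrites the leftmost non-terminal with a randomly chosen production; every string it
produces has the form a^m b^n with m > 0 and n ≥ 0.

# Grammar of Example 2: S -> aA, A -> aA | B, B -> bB | lambda.
import random

productions = {
    'S': ['aA'],
    'A': ['aA', 'B'],
    'B': ['bB', ''],   # '' stands for the empty string lambda
}

def derive(start='S', max_steps=50):
    form = start
    for _ in range(max_steps):
        # Find the leftmost non-terminal, if any.
        nt = next((c for c in form if c in productions), None)
        if nt is None:
            return form                      # sentential form is all terminals
        form = form.replace(nt, random.choice(productions[nt]), 1)
    return form

random.seed(1)
print(sorted({derive() for _ in range(20)}, key=len))
# Sample output strings such as a, aa, ab, aab, ... all have the form a^m b^n.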
2.6 Chomsky Classification of Grammars

According to Chomsky hierarchy, grammar is divided into 4 types as follows:


1. Type 0 is known as unrestricted grammar and recognized by Turing machine
2. Type 1 is known as context-sensitive grammar and recognized by linear bounded automata
3. Type 2 is known as a context-free grammar and recognized by pushdown automata
4. Type 3 Regular Grammar and recognized by Finite automata (DFA and NFA)

Type 0: Unrestricted grammar; language accepted: recursively enumerable language; automaton: Turing machine
Type 1: Context-sensitive grammar; language accepted: context-sensitive language; automaton: linear-bounded automaton
Type 2: Context-free grammar; language accepted: context-free language; automaton: pushdown automaton
Type 3: Regular grammar; language accepted: regular language; automaton: finite state automaton

Type-0 grammars generate recursively enumerable languages. The productions have no
restrictions; this class includes every phrase structure grammar, i.e. all formal grammars.

They generate the languages that are recognized by a Turing machine.

The productions can be in the form of α → β where α is a string of terminals and non-terminals
with at least one non-terminal and α cannot be null. β is a string of terminals and non-terminals.

Example

S → ACaB

Bc → acB

CB → DB

aD → Db
Type-1 grammars: generate context-sensitive languages.

 First of all Type 1 grammar should be Type 0


The productions must be in the form

α A β → α γ β, with |αAβ| ≤ |αγβ|

where A ∈ N (Non-terminal)

and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)

The strings α and β may be empty, but γ must be non-empty.

The rule S → ε is allowed if S does not appear on the right side of any rule. The languages generated by
these grammars are recognized by a linear bounded automaton.

Example

AB → AbBc , A → bcA , B → b

Type-2 grammars: generate context-free languages.

 First, it should be Type 0 and Type 1.


The productions must be in the form A → γ

where A ∈ N ( single Non-terminal)

and γ ∈ (T ∪ N)* (String of terminals and non-terminals).

The languages generated by these grammars are recognized by a non-deterministic
pushdown automaton.

Example

S → Xa, X → a

X → aX

X → abc

X→ε
Type-3 grammars:

Type-3 grammars must have a single non-terminal on the left-hand side and a right-hand side
consisting of a single terminal or single terminal followed by a single non-terminal.

The productions must be in the form X → a or X → aY

where X, Y ∈ N (Non terminal)

and a ∈ T (Terminal)

The rule S → ε is allowed if S does not appear on the right side of any rule.

Example

X→ε

X → a | aY

Y→b

Chapter Three
3.1 Context free languages

Context free languages


Context-Free Language (CFL) is a language generated by a Context-Free Grammar (CFG). CFG
is a set of rules for deriving (or generating) strings (or sentences) in a language.

A grammar G = (N, T, S, P) is said to be context-free if all productions in P have the form


A → α where A is a Non-terminal [A ∈ N] and α is any string containing Non-terminals and/or
terminals. [α ∈ (N ∪ T)*].

Definition: Let L be a language. Then L is a context-free language if and only if there exists a
context-free grammar G such that L = L(G).

Every regular grammar is context-free, so a regular language is also a context-free one; in fact,
the family of regular languages is a proper subset of the family of context-free languages.
Context-free grammars derive their name from the fact that the substitution of the variable on
the left of a production can be made any time such a variable appears in a sentential form. It
does not depend on the symbols in the rest of the sentential form (the context). This feature is
the consequence of allowing only a single variable on the left side of a production.

Examples of Context-Free languages:


The grammar G = ({S}, {a, b}, S, P) with productions
S → aSa
S → bSb
S→ε
is context-free. A typical derivation in this grammar is
S ⇒ aSa ⇒ aaSaa ⇒ aabSbaa ⇒ aabbaa
This makes it clear that
L(G) = {w w^R : w ∈ {a, b}*}.
The language is context-free

3.2 Parsing and ambiguity

The term parsing describes finding a sequence of productions by which a string w ∈ L(G) is derived.

Given a string w in L (G), we can parse it in a rather obvious fashion; we systematically


construct all possible (say, leftmost) derivations and see whether any of them matches w.

Definition: Let G = (V, T, P, S) be a CFG. A tree is a derivation (or parse) tree if:
– Every vertex has a label from V U T U {ε}
– The label of the root is S
– If a vertex with label A has children with labels X1, X2,…, Xn, from left to right,
then
A –> X1, X2,…, Xn
must be a production in P.
If a vertex has label ε, then that vertex is a leaf and the only child of its parent
More generally, a derivation tree can be defined with any non-terminal as the root.

Example
To generate the language of strings with an equal number of a’s and b’s in the form a^n b^n, the
context-free grammar is defined as G = ({S, A}, {a, b}, S, {S → aAb, A → aAb | ε}).
S ⇒ aAb
⇒ aaAbb      (A → aAb)
⇒ aaaAbbb    (A → aAb)
⇒ aaabbb     (A → ε)
The derived string aaabbb = a^3 b^3 is of the form a^n b^n.

Parsing is the process of analyzing a string of symbols, either in natural or computer language.

Example: Consider the grammar S → SS | aSb | bSa | λ and the string w = aabb. Using exhaustive
search parsing (top-down parsing) we can proceed as follows.
Round one
1.S → SS
2.S → aSb
3.S → bSa
4.S → λ
We can remove the last two, since S → bSa can only yield strings beginning with b and S → λ
yields only the empty string, so neither can lead to aabb.
Round two: yields sentential forms
5.S → SS → SSS
6.S → SS → aSbS
7.S → SS → bSaS
8.S → SS → S
From 2 we can get additional sentential forms
S → aSb → aSSb
S → aSb → aaSbb
S → aSb → abSab
S → aSb → ab
On the next round we get the target string
S → aSb → aaSbb → aabb
Therefore, aabb is in the language generated by the given grammar.
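
The exhaustive (breadth-first) search just described can be sketched in Python as follows; the
pruning rule and the helper names consistent and queue are our own simplifications, not part of
the module.

# Exhaustive search for the grammar S -> SS | aSb | bSa | lambda, target w = "aabb".
from collections import deque

productions = ['SS', 'aSb', 'bSa', '']     # alternatives for S ('' is lambda)
target = 'aabb'

def consistent(form, w):
    """A sentential form can still lead to w only if it is short enough and
    its terminal prefix matches a prefix of w."""
    if len(form.replace('S', '')) > len(w) or len(form) > 2 * len(w) + 2:
        return False
    prefix = form.split('S', 1)[0]
    return w.startswith(prefix)

queue, seen, found = deque(['S']), {'S'}, False
while queue and not found:
    form = queue.popleft()
    i = form.find('S')
    if i < 0:
        found = (form == target)
        continue
    for alt in productions:
        new = form[:i] + alt + form[i + 1:]     # expand the leftmost S
        if new == target:
            found = True
            break
        if consistent(new, target) and new not in seen:
            seen.add(new)
            queue.append(new)

print('aabb derivable:', found)   # expected: True, via S => aSb => aaSbb => aabb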
Ambiguity
A string can be derived using either a leftmost or a rightmost derivation, and the derivation can
be drawn as a derivation tree, also called a parse tree or syntax tree.
 For an unambiguous grammar, the parse tree of a string is unique whether the derivation is
leftmost or rightmost.
 If there exists more than one parse tree for some string of the grammar, that is, more than one
leftmost or rightmost derivation is possible, the grammar is said to be an ambiguous grammar.
Example: the grammar with production
S → aSb|SS|ε , is ambiguous
The sentence aabb has two derivation trees

[Figure: the two derivation trees for aabb, one with root production S → SS and one with root
production S → aSb.]

Let us consider this grammar: E → E + E | E - E | id. We can create two parse trees from this
grammar for the string id + id - id. The following are the two parse trees generated by leftmost
derivation:

From the above grammar, the string "id + id - id" can be derived in 2 ways:
First Leftmost derivation
E→E+E
→ id + E
→ id + E - E
→ id + id - E

→ id + id- id
Second Leftmost derivation
E→E-E
→E+E-E
→ id + E - E
→ id + id - E
→ id + id - id
Both the above parse trees are derived from the same grammar rules but both parse trees are
different. Hence the grammar is ambiguous.
For example, consider the grammar
S → SS
S → aSb
S→ε
The sentence aabb has two derivation trees as shown in the figures below:

Hence the above grammar is an Ambiguous grammar.


Example:
Check whether the given grammar G is ambiguous or not.
A → AA
A → (A)
A→a
Solution:
For the string "a(a)aa" the above grammar can generate two parse trees:

Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.
Ambiguity is a common feature of natural languages, where it is tolerated and dealt with in a
variety of ways. In programming languages, where there should be only one interpretation of
each statement, ambiguity must be removed when possible. Often we can achieve this by
rewriting the grammar in an equivalent, unambiguous form.

Unambiguous Grammar

A grammar is unambiguous if it does not contain ambiguity, that is, if no input string has more
than one leftmost derivation, more than one rightmost derivation, or more than one parse tree.

Example 1:

Consider a grammar G is given as follows:


S → AB | aaB
A → a | Aa
B→b
Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an
unambiguous grammar equivalent to G.
Solution:
Let us derive the string "aab".

As there are two different parse tree for deriving the same string, the given grammar is
ambiguous.
Unambiguous grammar will be:
S → AB
A → Aa | a
B→b
Sentential forms
If at every step the rightmost non-terminal of the sentential form is replaced, the derivation is
called a rightmost derivation; if the leftmost non-terminal is always replaced, it is a leftmost
derivation. Leftmost and rightmost derivations are very important because, when you later study
compilers, top-down parsing follows a leftmost derivation and bottom-up parsing follows a
rightmost derivation.

3.3 Derivation tree or parse tree

A derivation tree is a graphical representation of a derivation using the production rules of the
given CFG. It is also called a parse tree.
Properties:
• Root node indicating start symbol
• Derivation is read from left to right
• Leaf nodes are terminals
• The interior nodes are non terminals
• Example: Let G be the grammar
• S → aB | bA
• A → a | aS | bAA
• B → b | bS | aBB
• For the string baaabbabba find the leftmost derivation, the rightmost derivation and the derivation tree
• First write the production rules separately, as below
• S → aB (rule 1)
• S → bA (rule 2)
• A → a (rule 3)
• A → aS (rule 4)
• A → bAA (rule 5)
• B → b (rule 6)
• B → bS (rule 7)
• B → aBB (rule 8)

3.4 Left most and right most derivations

Generation of a string using the production rules is called derivation. In the derivation of a
string we repeatedly replace a non-terminal using an appropriate rule. If more than one
non-terminal is available, which non-terminal to replace first is decided by one of the two
methods listed below:
Leftmost Derivation
The derivation is said to be leftmost if in each step the leftmost variable in the sentential form is
replaced. If in each step the right most variable is replaced, we call the derivation rightmost.
Example: consider the grammar with productions
S → aAB
A → bBb
B → A|λ

S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB ⇒ abbbbB ⇒ abbbb is a leftmost derivation of the
string abbbb. A rightmost derivation of the same string is
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb ⇒ abbBbb ⇒ abbbb
Rightmost Derivation
It is a derivation in which the rightmost non-terminal is replaced first in the sentential form.
Simplification of context free grammar
Grammars are not always optimized: a grammar may contain unnecessary symbols
(non-terminals), and these increase the length of the grammar.
 Simplification of a grammar means reducing the grammar by removing useless symbols.
This is done in three ways:
i. Removal of useless symbols
ii. Elimination of ε production
iii. Removal of unit production
Removal of Useless Symbols
If a symbol cannot be reached from the start symbol, or does not participate in the derivation of
any string, it is considered useless. Similarly, if a variable cannot be used to derive any string of
terminals, it is useless.
Example:
T → xxY | xbX | xxT
X → xX
Y → xy | y
Z → xz
In the example given above, the variable ‘Z’ will never occur in the derivation of any string,
because Z can never be reached from the starting variable ‘T’. Thus, the production Z → xz is
useless, and we eliminate it. The production X → xX is also useless because it can never
terminate: X never derives a string of terminals, so this production cannot participate in any
derivation of a string.
Elimination of ε Production
All the productions that are of type S → ε are known as ε productions. These types of
productions can be removed only from the grammars that do not happen to generate ε.

Step 1: First find all nullable non-terminal variables, i.e. those which derive ε.
Step 2: For each production X → x, construct all productions X → x', where x' is obtained
from x by removing, in every possible combination, the nullable non-terminals found in the
first step.
Step 3: Combine the result of the second step with the original productions. Remove all the ε
productions.
Example:
Let us remove the production from the CFG given below by preserving its meaning.
S → ABA
A → 0A | ε
B → 1B | ε
Solution:
While removing ε productions, the rules A → ε and B → ε are deleted. To preserve the
CFG’s meaning, we substitute ε for A and for B wherever they appear on a right-hand side.
Let us take
S → ABA
If the first A at RHS is ε. Then,
S → BA
Similarly, if the last A in RHS = ε. Then,
S → AB
If B = ε, then
S → AA
If B and one A are ε, then
S→A
If both A’s are replaced by ε, then
S→B
Now,
S → ABA | AB | BA | AA | A | B (combining with the original production)
Now, let us consider
A → 0A
If we place ε at RHS for A, then
A→0

A → 0A | 0
Similarly, B → 1B | 1
We can rewrite the CFG collectively, with the ε productions removed, as
S → ABA | AB | BA | AA | A | B
A → 0A | 0
B → 1B | 1
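
Steps 1-3 of the ε-elimination procedure can also be sketched programmatically. The Python
fragment below (the helper names and the dictionary encoding are ours) computes the nullable
variables of the example grammar S → ABA, A → 0A | ε, B → 1B | ε and rebuilds its productions.

# epsilon-elimination sketch; non-terminals are single upper-case letters.
from itertools import combinations

grammar = {
    'S': ['ABA'],
    'A': ['0A', ''],   # '' stands for the epsilon production
    'B': ['1B', ''],
}

# Step 1: find the nullable non-terminals.
nullable, changed = set(), True
while changed:
    changed = False
    for head, bodies in grammar.items():
        if head not in nullable and any(all(c in nullable for c in b) for b in bodies):
            nullable.add(head)
            changed = True

# Steps 2-3: drop nullable symbols in every possible combination,
# then discard the epsilon productions themselves.
new_grammar = {}
for head, bodies in grammar.items():
    result = set()
    for body in bodies:
        positions = [i for i, c in enumerate(body) if c in nullable]
        for k in range(len(positions) + 1):
            for drop in combinations(positions, k):
                candidate = ''.join(c for i, c in enumerate(body) if i not in drop)
                if candidate:                    # discard epsilon productions
                    result.add(candidate)
    new_grammar[head] = sorted(result)

print(nullable)        # A, B and S are all nullable (S via A and B)
print(new_grammar)     # S -> ABA|AB|BA|AA|A|B, A -> 0A|0, B -> 1B|1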
Removing Unit Productions
Productions whose right-hand side consists of a single non-terminal are known as unit
productions. To get rid of unit productions, take the following actions:
Step 1: To remove A → B, add production A → x to the grammar rule whenever B → x occurs
in the grammar.
Step 2: Delete A → B from the grammar.
Step 3: Repeat the first and the second steps until all the unit productions are removed.
Example:
S → 0X | 1Y | Z
X → 0S | 00
Y→1|X
Z → 01
Solution:
S → Z is a unit production. However, while removing S → Z, we have to consider what Z gives.
So, we can add a rule to S.
S → 0X | 1Y | 01
Similarly, Y → X is also a unit production, so we can modify it as
Y → 1 | 0S | 00
Thus, we can write CFG finally without the unit production, as follows:
S → 0X | 1Y | 01
X → 0S | 00
Y → 1 | 0S | 00
Z → 01
3.5 Chomsky’s hierarchy of grammars

According to Chomsky hierarchy, grammar is divided into 4 types as follows:

1. Type 0 is known as unrestricted grammar.
2. Type 1 is known as context-sensitive grammar.
3. Type 2 is known as a context-free grammar.
4. Type 3 Regular Grammar.

Type 0: Unrestricted Grammar:


Type-0 grammars include all formal grammars. Type 0 grammar languages are recognized by
Turing machines. These languages are also known as the recursively enumerable languages.
The productions are of the form α → β, where α is a string of terminals and non-terminals
with at least one non-terminal and α cannot be null, and β is a string of terminals and non-
terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db

Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions must be in the form
αAβ→αγβ
where A ∈ N (Non-terminal)
and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)
The strings α and β may be empty, but γ must be non-empty.

The rule S → ε is allowed if S does not appear on the right side of any rule. The languages
generated by these grammars are recognized by a linear bounded automaton.
Example
AB → AbBc
A → bcA
B→b
Type-2 Context Free Grammar:
Type-2 grammars generate context-free languages.
The productions must be in the form A → γ
where A ∈ N (Non terminal)
and γ ∈ (T ∪ N)* (String of terminals and non-terminals).
The languages generated by these grammars are recognized by a non-deterministic
pushdown automaton.
Example
S→Xa
X→a
X → aX
X → abc
X→ε
Type-3 Regular Grammar:
Type-3 grammars generate the regular languages. Such a grammar restricts its rules to a single
non-terminal on the left-hand side and a right-hand side consisting of a single terminal, possibly
followed by a single non-terminal (right regular).

Example

X→ε

X → a | aY

Y→b

The Two Important Normal Forms

Chomsky Normal form

 Symbols on the right of a production are strictly limited.

 Strings on the right of a production consist of no more than two symbols.

Definition 6.4. A CFG is in Chomsky normal form if all productions are of the form:

A → BC or A → a where A, B and C are in V and a is in T

Example: the grammar

S → AS|a,

A → SA|b. is in Chomsky normal form but the grammar

S → AS|AAS

A → SA|aa is not.

Algorithm to Convert into Chomsky Normal Form –

Step 1 − If the start symbol S occurs on some right side, create a new start symbol S’ and a new
production S’→ S.

Step 2 − Remove Null productions.

Step 3 − Remove unit productions.

Step 4 − Replace each production A → B1…Bn where n > 2 with A → B1C, where C →
B2…Bn is a new production. Repeat this step for all productions having more than two symbols on the right side.

Step 5 − If the right side of any production is in the form A → aB where a is a terminal and A,
B are non-terminal, then the production is replaced by A → XB and X → a. Repeat this step for
every production which is in the form A → aB.
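Steps 4 and 5 in particular are purely mechanical. The following Python sketch (our own illustration, not the module's algorithm; it assumes ε- and unit productions have already been removed, that non-terminals are upper-case letters and terminals are lower-case letters, and it invents fresh variable names C1, C2, …) shows how a production set can be binarized and its terminals factored out.

# Sketch of CNF steps 4 and 5 only (illustration; assumptions listed above).
def to_cnf_steps_4_5(productions):
    new_prods, fresh = [], 0

    def new_var():
        nonlocal fresh
        fresh += 1
        return f"C{fresh}"              # hypothetical fresh variable names

    # Step 5 first: replace each terminal in a long body by a new variable.
    term_var, prepared = {}, []
    for head, body in productions:
        if len(body) > 1:
            new_body = []
            for s in body:
                if s.islower():
                    if s not in term_var:
                        term_var[s] = new_var()
                    new_body.append(term_var[s])
                else:
                    new_body.append(s)
            body = tuple(new_body)
        prepared.append((head, body))
    for a, x in term_var.items():
        new_prods.append((x, (a,)))

    # Step 4: break bodies longer than 2 into a chain of binary productions.
    for head, body in prepared:
        while len(body) > 2:
            c = new_var()
            new_prods.append((head, (body[0], c)))
            head, body = c, body[1:]
        new_prods.append((head, body))
    return new_prods

# S0 -> ASA | aB | a, written as (head, body) pairs:
print(to_cnf_steps_4_5([("S0", ("A", "S", "A")), ("S0", ("a", "B")), ("S0", ("a",))]))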

Problem:

Convert the following CFG into CNF

S → ASA | aB, A → B | S, B → b | ε

Solution

(1) Since S appears in R.H.S, we add a new state S0 and S0→S is added to the production set and
it becomes −

S0→S, S→ ASA | aB, A → B | S, B → b | ∈

(2) Now we will remove the null productions −

B → ∈ and A → ∈

After removing B → ε, the production set becomes −

S0→S, S→ ASA | aB | a, A → B | S | ∈, B → b

After removing A → ∈, the production set becomes −

S0→S, S→ ASA | aB | a | AS | SA | S, A → B | S, B → b

(3) Now we will remove the unit productions.

After removing S → S, the production set becomes −

S0→S, S→ ASA | aB | a | AS | SA, A → B | S, B → b

After removing S0→ S, the production set becomes −

S0→ ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA

A → B | S, B → b

After removing A→ B, the production set becomes −

S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA

A→S|b

B→b

After removing A→ S, the production set becomes −

S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA

A → b |ASA | aB | a | AS | SA, B → b

(4) Now we will find the productions with more than two variables on the R.H.S.

Here, S0 → ASA, S → ASA and A → ASA violate the requirement of at most two symbols on the R.H.S.

Hence we apply step 4, which gives the following production set −

S0→ AX | aB | a | AS | SA

S→ AX | aB | a | AS | SA

A → b |AX | aB | a | AS | SA

B→b

X → SA

(5) We have to change the productions S0→ aB, S→ aB, A→ aB

And the final production set becomes −

S0→ AX | YB | a | AS | SA

S→ AX | YB | a | AS | SA

A → b | AX | YB | a | AS | SA

B→b

X → SA

Y→a

Example: Convert the grammar with productions

S → ABa,

A → aab,

B → Ac. To Chomsky normal form.

Step 1:

S → ABBa,

A → BaBaBb,

B → ABc

Ba → a,

Bb → b,

Bc → c.

Step 2: introduce additional variables to get the first two into normal form.

S → AD1

D1 → BBa,

A → BaD2

D2 → BaBb,

B → ABc

Ba → a,

Bb → b,

Bc → c.

Greibach Normal Forms

Definition: a context-free grammar G is said to be in GNF if all productions have the form

A → ax, where a ∈ T and x ∈ V*.

Example: the grammar S → AB,

A→ aA/bB/b,

B→ b

is not in GNF. However, using the substitution rule we get the equivalent grammar

S → aAB/bBB/bB,

A → aA/bB/b,

B → b, which is in GNF.

Review Questions

1. Define context free grammar with example

2. Explain parsing and ambiguity in context free grammar with example

3. Discuss about right most and left most derivation with example

4. Discuss the steps of simplification of CFG with example

5. Discuss about CNF and GNF with examples

Chapter Four

4.1 Push down automata

We know that a CFG is a type 2 grammar, and this class of grammars is recognized by the push down
automata machine. A push down automaton can remember an unbounded amount of information
using its stack. A push down automaton contains different components: a stack, which holds symbols
and supports the two operations PUSH and POP; an input tape, which contains the input string;
and a finite control, which controls the push and pop operations.

Definition of push down automata


A pushdown automaton is a way to implement a context-free grammar in a similar way we
design DFA for a regular grammar. A DFA can remember a finite amount of information, but a
PDA can remember an infinite amount of information.

A pushdown automaton is −

"Finite state machine" + "a stack"


A pushdown automaton has three components −

 an input tape,
 a control unit, and
 a stack with infinite size.
The stack head scans the top symbol of the stack.
A stack does two operations −
 Push − a new symbol is added at the top.
 Pop − the top symbol is read and removed.

A PDA may or may not read an input symbol, but it has to read the top of the stack in every
transition.

Pushdown Automata has extra memory called stack which gives more power than Finite
automata. It is used to recognize context-free languages. Deterministic and Non-Deterministic
PDA: In deterministic PDA, there is only one move from every state on every input symbol but
in Non-Deterministic PDA, there can be more than one move from one state for an input symbol.
Note:

• Power of NPDA is more than DPDA.


• It is not possible to convert every NPDA to corresponding DPDA.

• Language accepted by DPDA is a subset of language accepted by NPDA.
• The languages accepted by DPDA are called DCFL (Deterministic Context-Free Languages)
which are a subset of NCFL (Non-Deterministic CFL) accepted by NPDA.

Push down Automata is defined by the 7-tuple.


M = (Q, Σ, Γ, δ, q0, z0, F)

where,
Q is a finite set of internal states of the control unit.
Σ is a finite set called the input alphabet.
Γ is a finite set called the stack alphabet.
δ is the transition function.
q0 ∈ Q is the initial state of the control unit.
z0 ∈ Γ is the stack start symbol.
F ⊆ Q is the set of final states.

The arguments of δ are (q, a, x)


 The current state of the Control Unit,
 The current input symbol, and
 The current symbol on top of the stack.

The result is a set of pairs (qn, y) where,


 qn is the next state of the Control Unit and
 y is a string which is put on top of the stack in place of the single symbol there before.

For example, suppose the set of transition rules of Push Down Automata contains δ(q1, a, b) =
{(q2, cd), (q3, ε)} then it can be interpreted as:
If at any time the control unit is in state q1, the input symbol read is 'a', and the symbol on top of
the stack is 'b', then one of two things can happen.
 The control unit goes into state q2 and the string 'cd' replaces b on top of the stack
or
 The control unit goes into state q3 with the symbol 'b' removed from the top of the stack.

The following diagram shows a transition in a PDA from a state q1 to state q2, labeled as a, b →
c−

This means at state q1, if we encounter an input string ‘a’ and top symbol of the stack is ‘b’, then
we pop ‘b’, push ‘c’ on top of the stack and move to state q2.
Example : Define the pushdown automata for language {a nbn | n > 0}
Solution : Q = { q0, q1 } and Σ = { a, b } and Γ = { A, Z } and δ is given by :
δ( q0, a, Z) = { ( q0, AZ) }
δ( q0, a, A) = { ( q0, AA) }
δ( q0, b, A) = { ( q1, ∈) }
δ( q1, b, A) = { ( q1, ∈) }
δ( q1, ∈, Z) = { ( q1, ∈) }

Let us see how this automata works for aaabbb.

Explanation : Initially, the state of automata is q0 and symbol on stack is Z and the input is
aaabbb as shown in row 1. On reading ‘a’ (shown in bold in row 2), the state will remain q0

and it will push symbol A on stack. On next ‘a’ (shown in row 3), it will push another symbol
A on stack. After reading 3 a’s, the stack will be AAAZ with A on the top. After reading ‘b’
(as shown in row 5), it will pop A and move to state q1 and stack will be AAZ. When all b’s
are read, the state will be q1 and stack will be Z. In row 8, on input symbol ‘∈’ and Z on stack,
it will pop Z and stack will be empty. This type of acceptance is known as acceptance by
empty stack.
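The behaviour just traced can be reproduced with a small simulator. The Python sketch below is our own illustration (not part of the module): it encodes exactly the transition rules given above for {anbn | n > 0} and searches the reachable configurations, accepting by empty stack.

# Minimal sketch of a non-deterministic PDA simulator, acceptance by empty stack.
# Transitions map (state, input symbol or '', stack top) to a set of
# (next state, string to push) pairs, mirroring the rules above.
delta = {
    ('q0', 'a', 'Z'): {('q0', 'AZ')},
    ('q0', 'a', 'A'): {('q0', 'AA')},
    ('q0', 'b', 'A'): {('q1', '')},
    ('q1', 'b', 'A'): {('q1', '')},
    ('q1', '',  'Z'): {('q1', '')},
}

def accepts_by_empty_stack(word, start='q0', start_stack='Z'):
    todo, seen = [(start, word, start_stack)], set()
    while todo:
        state, rest, stack = todo.pop()
        if (state, rest, stack) in seen:
            continue
        seen.add((state, rest, stack))
        if rest == '' and stack == '':
            return True                    # input consumed and stack empty
        if stack == '':
            continue                       # a move must read the stack top
        top, below = stack[0], stack[1:]
        for nxt, push in delta.get((state, '', top), ()):      # epsilon moves
            todo.append((nxt, rest, push + below))
        if rest:                                                # input moves
            for nxt, push in delta.get((state, rest[0], top), ()):
                todo.append((nxt, rest[1:], push + below))
    return False

print(accepts_by_empty_stack('aaabbb'))  # True
print(accepts_by_empty_stack('aaabb'))   # False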

Pushdown Automata Acceptance

There are two different ways to define PDA acceptability.

Final State Acceptability

In final state acceptability, a PDA accepts a string when, after reading the entire string, the PDA
is in a final state. From the starting state, we can make moves that end up in a final state with any
stack values. The stack values are irrelevant as long as we end up in a final state.

For a PDA (Q, Σ, Γ, δ, q0, Z, F), the language accepted by the set of final states F is −

L(PDA) = {w | (q0, w, Z) ⊢* (q, ε, x), q ∈ F} for any stack string x.

Empty Stack Acceptability

Here a PDA accepts a string when, after reading the entire string, the PDA has emptied its stack.

For a PDA (Q, Σ, Γ, δ, q0, Z, F), the language accepted by the empty stack is −

L(PDA) = {w | (q0, w, Z) ⊢* (q, ε, ε), q ∈ Q}

Example: Construct a PDA that accepts L = {0n 1n | n ≥ 0}?

Solutions

This language accepts L = {ε, 01, 0011, 000111, ............................. }

Here, in this example, the number of ‘0’ and ‘1’ have to be same.

Initially we put a special symbol '$' into the empty stack. Then at state q2, whenever we encounter
input 0, we push 0 onto the stack; this may repeat. When we encounter input 1 with 0 on top, we
pop that 0 and move to state q3. At state q3, each further input 1 with 0 on top pops another 0; this
may also repeat. When the special symbol '$' is finally encountered at the top of the stack, it is
popped out and the PDA goes to the accepting state q4.

Example Construct a PDA that accepts L = { wwR | w = (a+b)* }

Solution:

Initially we put a special symbol ‘$’ into the empty stack. At state q2, the w is being read. In
state q3, each a or b is popped when it matches the input. If any other input is given, the PDA
will go to a dead state. When we reach that special symbol ‘$’, we go to the accepting state q4.

Example: Construct a PDA for language L = {0n1m2m3n | n>=1, m>=1} Approach used in this
PDA – First 0’s are pushed into stack. Then 1’s are pushed into stack. Then for every 2 as input a
1 is popped out of stack. If some 2’s are still left and top of stack is a 0 then string is not accepted
by the PDA. Thereafter if 2’s are finished and top of stack is a 0 then for every 3 as input equal
number of 0’s are popped out of stack. If string is finished and stack is empty then string is
accepted by the PDA otherwise not accepted.

Step-1: On receiving 0 push it onto stack. On receiving 1, push it onto stack and goto next state
Step-2: On receiving 1 push it onto stack. On receiving 2, pop 1 from stack and goto next state
Step-3: On receiving 2 pop 1 from stack. If all the 1’s have been popped out of stack and now
receive 3 then pop a 0 from stack and goto next state
Step-4: On receiving 3 pop 0 from stack. If input is finished and stack is empty then goto last
state and string is accepted

Examples:
Input: 0 0 1 1 1 2 2 2 3 3
Result: ACCEPTED
Input: 0 0 0 1 1 2 2 2 3 3
Result: NOT ACCEPTED
Example: Construct a PDA for language L = {0n1m | n >= 1, m >= 1, m > n+2}

Approach used in this PDA –
First 0’s are pushed into stack. When 0’s are finished, two 1’s are ignored. Thereafter for every 1
as input a 0 is popped out of stack. When stack is empty and still some 1’s are left then all of
them are ignored.
Step-1: On receiving 0 push it onto stack. On receiving 1, ignore it and go to next state
Step-2: On receiving 1, ignore it and go to next state
Step-3: On receiving 1, pop a 0 from top of stack and go to next state
Step-4: On receiving 1, pop a 0 from top of stack. If stack is empty, on receiving 1 ignore it and
go to next state
Step-5: On receiving 1 ignore it. If input is finished then go to last state

Examples:
Input: 0 0 0 1 1 1 1 1 1
Result: ACCEPTED
Input: 0 0 0 0 1 1 1 1
Result: NOT ACCEPTED
Pushdown Automata Acceptance by Final State
We have discussed Pushdown Automata (PDA) and acceptance by empty stack. Now,
in this section, we will discuss how a PDA can accept a CFL based on the final state. Given a PDA P
as: P = (Q, Σ, Γ, δ, q0, Z, F)

The language accepted by P is the set of all strings by consuming which the PDA can move from the
initial state to a final state, irrespective of any symbols left on the stack, which can be depicted as:
L(P) = {w | (q0, w, Z) ⊢* (qf, ε, s)}

Here, from start state q0 and stack symbol Z, the final state qf ∈ F is reached when input w is
consumed. The stack can contain a string s, which is irrelevant, as the final state is reached and w will
be accepted.

Example: Define the pushdown automata for language {anbn | n > 0} using final state.

Q = {q0, q1, q2, q3} and ∑ = {a, b} and Γ = { A, Z } and F={q3} and δ is given by:

δ( q0, a, Z ) = { ( q1, AZ ) }

δ( q1, a, A) = { ( q1, AA ) }

δ( q1, b, A) = { ( q2, ɛ) }

δ( q2, b, A) = { ( q2, ɛ) }

δ( q2, ɛ, Z) = { ( q3, Z) }

Explanation: Initially, the state of automata is q0 and symbol on stack is Z and the input is
aaabbb as shown in row 0. On reading a (shown in bold in row 1), the state will be changed to q1

and it will push symbol A on stack. On next a (shown in row 2), it will push another symbol A
on stack and remain in state q1. After reading 3 a’s, the stack will be AAAZ with A on the top.

After reading b (as shown in row 4), it will pop A and move to state q2 and stack will be AAZ.
When all b’s are read, the state will be q2 and stack will be Z. In row 7, on input symbol ɛ and Z
on stack, it will move to q3. As final state q3 has been reached after processing input, the string
will be accepted.

This type of acceptance is known as acceptance by final state.

Example: We will see how this automaton works for aab:

As we can see in row 4, the input has been processed and the PDA is in state q2, which is a non-final
state, so the string aab will not be accepted.

Construct Pushdown Automata for all length palindrome

A Pushdown Automaton (PDA) is like an epsilon Non deterministic Finite Automata (NFA) with
infinite stack. PDA is a way to implement context free languages. Hence, it is important to learn,
how to draw PDA.

Here, take the example of odd length palindrome:

Que-1: Construct a PDA for language L = {wcw' | w ∈ {0, 1}*} where w' is the reverse of w.

Approach used in this PDA –

Keep on pushing 0's and 1's, no matter what is on the top of the stack, until we reach the middle
element. When the middle element 'c' is scanned, process it without making any change to the
stack. Now if the scanned symbol is '1' and the top of the stack also contains '1', or the scanned
symbol is '0' and the top of the stack also contains '0', then pop the element from the top of the
stack. If the string becomes empty, or the scanned symbol is '$' and the stack becomes empty,
then reach the final state, else move to the dead state.

Step 1: On receiving 0 or 1, keep on pushing it on top of stack without going to next state.

Step 2: On receiving an element ‘c’, move to next state without making any change in stack.

Step 3: On receiving an element, check if symbol scanned is ‘1’ and top of stack also contain ‘1’
or if symbol scanned is ‘0’ and top of stack also contain ‘0’ then pop the element from top of
stack else move to dead state. Keep on repeating step 3 until string becomes empty.

Step 4: Check if symbol scanned is ‘$’ and stack does not contain any element then move to
final state else move to dead state.

Examples:
Input: 1 0 1 0 1 0 1 0 1
Output: ACCEPTED
Input: 1 0 1 0 1 1 1 1 0
Output: NOT ACCEPTED

Now, take the example of even length palindrome:

Que-2: Construct a PDA for language L = {ww' | w ∈ {0, 1}*} where w' is the reverse of w.

Approach used in this PDA –


For construction of even length palindrome, user has to use Non Deterministic Pushdown
Automata (NPDA). A NPDA is basically an NFA with a stack added to it.

The NPDA for this language is identical to the previous one except for epsilon transition.
However, there is a significant difference, that this PDA must guess when to stop pushing
symbols, jump to the final state and start matching off of the stack. Therefore this machine is
decidedly non-deterministic.

Keep on pushing 0’s and 1’s no matter whatever is on the top of stack and at the same time keep
a check on the input string, whether reach to the second half of input string or not. If reach to last
element of first half of the input string then after processing the last element of first half of input
string make an epsilon move and move to next state. Now if scanned symbol is ‘1’ and top of
stack also contain ‘1’ then pop the element from top of stack or if scanned symbol is ‘0’ and top
of stack also contain ‘0’ then pop the element from top of stack. If string becomes empty or
scanned symbol is ‘$’ and stack becomes empty, then reach to final state else move to dead state.

Step 1: On receiving 0 or 1, keep on pushing it on top of stack and at a same time keep on
checking whether reach to second half of input string or not.

Step 2: If reach to last element of first half of input string, then push that element on top of stack
and then make an epsilon move to next state.

Step 3: On receiving an element, check if symbol scanned is ‘1’ and top of stack also contain ‘1’
or if symbol scanned is ‘0’ and top of stack also contain ‘0’ then pop the element from top of
stack else move to dead state. Keep on repeating step 3 until string becomes empty.

Step 4: Check if symbol scanned is ‘$’ and stack does not contain any element then move to
final state else move to dead state.

Examples:

Input: 1 0 0 1 1 1 1 0 0 1
Output: ACCEPTED
Input: 1 0 0 1 1 1
Output: NOT ACCEPTED
Now, take the example of all length palindrome, i.e. a PDA which can accept both odd length
palindrome and even length palindrome:
Que-3: Construct a PDA for language L = {ww' or wcw' | w ∈ {0, 1}*} where w' is the reverse of
w.
Approach used in this PDA –
For construction of all length palindrome, user has to use NPDA.
The approach is similar to above example, except now along with epsilon move now user has to
show one more transition move of symbol ‘c’ i.e. if string is of odd length and if reach to middle
element ‘c’ then just process it and move to next state without making any change in stack.
Step 1: On receiving 0 or 1, keep on pushing it on top of stack and at a same time keep on
checking, if input string is of even length then whether reach to second half of input string or not,
however if the input string is of odd length then keep on checking whether reach to middle
element or not.
Step 2: If input string is of even length and reach to last element of first half of input string, then
push that element on top of stack. Then make an epsilon move to next state or if the input string
is of odd length then on receiving an element ‘c’, move to next state without making any change
in stack.
Step 3: On receiving an element, check if symbol scanned is ‘1’ and top of stack also contain ‘1’
or if symbol scanned is ‘0’ and top of stack also contain ‘0’ then pop the element from top of
stack else move to dead state. Keep on repeating step 3 until string becomes empty.
Step 4: Check if symbol scanned is ‘$’ and stack does not contain any element then move to
final state else move to dead state.

Examples:
Input: 1 1 0 0 1 1 1 1 0 0 1 1
Output: ACCEPTED
Input: 1 0 1 0 1 0 1
Output: ACCEPTED
4.2 Non-deterministic pushdown automata

Pushdown automaton M is formally defined as a 7-tuple M = (Q, Σ, Γ, δ, q0, z0, F)


where,

 Q is the finite set of states

 ∑ is the finite set of input alphabet.

 q0 ∈ Q is the initial state

 F ⊆ Q is the set of final states

 Γ is the stack alphabet, specifying the set of symbols that can be pushed onto the stack

 Z0 is the start stack symbol (z0 Є Γ)

 δ is the transition function:

δ: Q × (Σ ∪ {λ}) × Γ → finite subsets of Q × Γ*
Example: the set of transition rules of an npda contains

δ (q1, a,b) = {(q2, cd), (q3, λ)}.

If at any time the control unit is in state q1, the input symbol read is a, and the symbol on the top
of the stack is b, then one of the two things happen:

 The control unit goes into state q2 and the string cd replaces b on the top of the stack.
 The control unit goes into state q3 and b removed from the top of the stack.

Consider an npda with

Q = {q0, q1, q2, q3}

Σ = {a,b}

Γ = {0,1}

z= 0

F = {q3} and

δ(q0,a,0) = {(q1, 10), (q3, λ)}

δ(q0, λ,0) = {(q3, λ)}

δ(q1,a,1) = {(q1, 11)}

δ(q1,b,1) = {(q2, λ)}

δ(q2,b,1) = {(q2, λ)}

δ(q2, λ,0) = {(q3, λ)}

Example:

Design a PDA for palindrome strings.

Solution:

Suppose the language consists of strings L = {aba, aa, bb, bab, bbabb, aabaa, ...}. A string can
be an odd palindrome or an even palindrome. The logic for constructing the PDA is that we will push
symbols onto the stack till half of the string; then we will read each remaining symbol and perform the
pop operation. We will compare to see whether the symbol which is popped is the same as the
symbol which is read. When we reach the end of the input, we expect the stack to be empty.

This PDA is a non-deterministic PDA because guessing the middle of the given string, and then
matching the string read from the left against the remaining string in reverse (right-to-left) direction,
leads to non-deterministic moves.

Simulation of abaaba

δ (q1, abaaba, Z) Apply rule 1

⊢ δ(q1, baaba, aZ) Apply rule 5

⊢ δ(q1, aaba, baZ) Apply rule 4

⊢ δ(q1, aba, abaZ) Apply rule 7

⊢ δ(q2, ba, baZ) Apply rule 8

⊢ δ(q2, a, aZ) Apply rule 7

⊢ δ(q2, ε, Z) Apply rule 11

⊢ δ(q2, ε) Accept

Problem: Design a non-deterministic PDA for accepting the language L = {wwR | w ∈ (a, b)+},
i.e.,
L = {aa, bb, abba, aabbaa, abaaba, ...}
Explanation: For this type of input string, one input symbol can have more than one transition from a
state, hence the machine is a non-deterministic PDA, and the input string can contain any order of
'a' and 'b'. Each input symbol has more than one possibility for the next move. Finally, when the
stack is empty, the string is accepted by the NPDA. In this NPDA we use the symbols given
below:
Γ = { a, b, z }
Where, Γ = set of all the stack alphabet
z = stack start symbol
a = input alphabet
b = input alphabet
The approach used in the construction of PDA –
As we want to design an NPDA, thus every times ‘a’ or ‘b’ comes then either push into the stack
or move into the next state. It is dependent on a string. When we see the input alphabet which is
equal to the top of the stack then that time pop operation applies on the stack and move to the
next step.
So, in the end, if the stack becomes empty then we can say that the string is accepted by the
PDA.
STACK Transition Function
δ (q0, a, z) = (q0, az), δ (q0, a, a) = (q0, aa), δ (q0, b, z) =(q0, bz), δ (q0, b, b) = (q0, bb), δ (q0, a,
b) = (q0, ab), δ (q0, b, a) = (q0, ba), δ (q0, a, a) = (q1, ∈), δ (q0, b, b) = (q1, ∈), δ (q1, a, a) = (q1,
∈), δ (q1, b, b) = (q1, ∈), δ (q1, ∈, z) = (qf, z).
Where, q0 = Initial state
qf = Final state
∈ = indicates pop operation

So, this is our required non-deterministic PDA for accepting the language L = {wwR | w ∈ (a, b)+}.
Example:
We will take one input string: “abbbba”.
 Scan string from left to right
 The first input is ‘a’ and follows the rule:
 on input 'a' and STACK alphabet Z, push 'a' into STACK as: (a, Z/aZ) and state will be q0
 on input ‘b’ and STACK alphabet ‘a’, push the ‘b’ into STACK as: (b, a/ba) and state will
be q0
 on input ‘b’ and STACK alphabet ‘b’, push the ‘b’ into STACK as: (b, b/bb) and state will
be q0
 on input ‘b’ and STACK alphabet ‘b’ (state is q1), pop one ‘b’ from STACK as: (b, b/∈)
and state will be q1
 on input ‘b’ and STACK alphabet ‘b’ (state is q1), pop one ‘b’ from STACK as: (b, b/∈)
and state will be q1
 on input ‘a’ and STACK alphabet ‘a’ and state q1, pop one ‘a’ from STACK as: (a, a/∈)
and state will remain q1
 on input ∈ and STACK alphabet Z, go to the final state(qf) as : (∈, Z/Z)
So, at the end the stack becomes empty then we can say that the string is accepted by the
PDA.
NOTE: This NPDA will not accept the empty string.

4.3 Push down automata and context free languages

A grammar generates strings in a language using rules, which are instructions, or better, licenses,
to replace some nonterminal symbol by some string. Typical rules look like this:
S → ASa, B → aB, A → SaSSbB.
In context-free grammars, rules have a single non-terminal symbol (upper-case letter) on the left,
and any string of terminal and/or non-terminal symbols on the right. So even things like A → A
and B → ε are perfectly good context-free grammar rules. What is not allowed is something with
more than one symbol to the left of the arrow: AB → a, or a single terminal symbol: a → Ba, or
no symbols at all on the left: ε → Aab. The idea is that each rule allows the replacement of the
symbol on its left by the string on its right. We call these grammars context free because every
rule has just a single nonterminal on its left. We cannot add any contextual restrictions (such as
aAa). Therefore, each replacement is done independently of all the others.
For example, suppose G = (V, Σ, R, S), where
V = {S, A, B, a, b}, Σ = {a, b}, and R = {S → AB, A → aAa, A → a, B → Bb, B → b}
Then G generates the string aaabb by the following derivation:
S→ AB → aAaB → aAaBb →aaaBb → aaabb
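The derivation above can be replayed step by step in a few lines of Python. This is an illustration only; the rules dictionary mirrors R from the example, and apply_rule is a helper we introduce here just to show that each replacement is applied independently of the others.

# Replaying the derivation S => AB => aAaB => aAaBb => aaaBb => aaabb.
rules = {'S': ['AB'], 'A': ['aAa', 'a'], 'B': ['Bb', 'b']}

def apply_rule(sentential, index, head, body):
    # Replace the non-terminal `head` at position `index` by `body`.
    assert sentential[index] == head and body in rules[head]
    return sentential[:index] + body + sentential[index + 1:]

s = 'S'
s = apply_rule(s, 0, 'S', 'AB')    # S     => AB
s = apply_rule(s, 0, 'A', 'aAa')   # AB    => aAaB
s = apply_rule(s, 3, 'B', 'Bb')    # aAaB  => aAaBb
s = apply_rule(s, 1, 'A', 'a')     # aAaBb => aaaBb
s = apply_rule(s, 3, 'B', 'b')     # aaaBb => aaabb
print(s)                           # aaabb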

4.4 Deterministic push down automata

Example: Construct a DPDA that accepts L = {anbn | n ≥ 1}.
Solution:
We will apply a very simple logic for this DPDA. When we read 'a', we simply push it onto
the stack, and as we read 'b' we simply pop the corresponding 'a' from the stack. After reading the
complete input string, the stack should be empty.

4.5 Deterministic context free languages

A Context Free Grammar can be defined as G = (N, T, P, S)


Where N is the set of non-terminals,
T is the set of terminals,
P is the set of production rules, and
S is the start symbol.
Each production rule is of the form
Non-terminal → string of non-terminals and/or terminals.

For example, a grammar G of this form is a context-free grammar, and the language L it produces is a
context-free language.
In formal language theory, Deterministic Context-Free Languages (DCFL) are a proper subset of
context-free languages.
A Language L is a deterministic context free language if it is accepted by DPDA.
Example:

The language L = {anbn | n ≥ 1} accepted by the DPDA constructed above is a DCFL: since it is
accepted by a DPDA, the language can be considered deterministic context-free.

Review Questions

1. Define push down automata

2. Discuss about acceptability of string by final state with example

3. Discuss about Non-deterministic push down automata with example

4. Discuss about Deterministic push down automata with example

5. Discuss about Push down automata and context free languages with examples

6. Discuss about Deterministic context free languages with example

Chapter Five
5.1 Turing machines

Turing Machine was invented by Alan Turing in 1936. It is used to accept Recursive Enumerable
Languages (generated by Type-0 Grammar). Turing machines are a fundamental concept in the
theory of computation and play an important role in the field of computer science. They were
first described by the mathematician and computer scientist Alan Turing in 1936 and provide a
mathematical model of a simple abstract computer. A Turing machine is a finite automaton that
can read, write, and erase symbols on an infinitely long tape. The tape is divided into squares,
and each square contains a symbol. The Turing machine can only read one symbol at a time, and
it uses a set of rules (the transition function) to determine its next action based on the current
state and the symbol it is reading.

A Turing machine consists of a tape of infinite length on which read and write operations can be
performed. The tape consists of infinitely many cells, each of which contains either an input symbol or a
special symbol called the blank. It also has a head pointer, which points to the cell currently
being read and can move in both directions. A Turing machine is a finite automaton equipped
with an infinite tape as its memory.

Recursively Enumerable and Recursive Languages

Recall that Type 0 grammars are unrestricted. A grammar G = (V, T, S, P) is called unrestricted if
all the productions are of the form α → β, where α is a string over V ∪ T containing at least one
non-terminal and β is any string over V ∪ T (here V is the set of non-terminals and T the set of
terminals). Any language generated by an unrestricted grammar is
recursively enumerable. Furthermore, for every recursively enumerable language L, there exists
an unrestricted grammar G, such that L = L(G). In other words, the family of languages
associated with unrestricted grammars is the same as the family of recursively enumerable
languages. Equivalently, a language L is recursively enumerable if there exists a Turing
machine that accepts L. For every string w ∈ L, the Turing machine will halt in a final state.
However, what happens when the machine is given a string that is not in L? Will it halt in a non-
final state, or will it never halt and enter an infinite loop? These questions lead us to the definition
of recursive languages. A formal language L is recursive if there exists a total Turing machine—
the one that halts on every given input—which given a finite sequence of symbols as input,
accepts the input if it belongs to the language L and rejects the input otherwise. Recursive
languages are also referred to as decidable languages.

The productions can be in the form of α → β where α is a string of terminals and non-terminals
with at least one non-terminal and α cannot be null. β is a string of terminals and non-terminals.

Example

S → ACaB

Bc → acB

CB → DB

aD → Db

5.2 Construction of Turing Machine

Turing machine can be formally defined as a 7-tuple M = ⟨ Q , Γ , b , Σ , δ , q 0 , F ⟩


Q is a finite, non-empty set of states.
Γ is a finite, non-empty set of the tape alphabet.
Σ is the set of input symbols with Σ ⊂ Γ.
δ is a partially defined function, the transition function - δ : (Q \ {qf}) x Γ → Q x Γ x {L,R}.
b ∈ Σ is the blank symbol.
q0 ∈ Q is the initial state.
qf ∈ Q is the set of accepting or final states.
● A Turing machine has two alphabets:
● An input alphabet Σ. All input strings are written in the input alphabet.

● A tape alphabet Γ, where Σ ⊆ Γ. The tape alphabet contains all symbols that can be written
onto the tape.
● The tape alphabet Γ can contain any number of symbols, but always contains at least one blank
symbol, denoted ☐. You are guaranteed ☐ ∉ Σ.
Example of Turing machine
Turing machine M = (Q, Γ, ∑, δ, q0, B, F) with
 Q = {q0, q1, q2, qf}
 Γ = {a, b}
 ∑ = {1}
 q0 = {q0}
 B = blank symbol
 F = {qf}

δ is given by −

Tape alphabet symbol   Present State 'q0'   Present State 'q1'   Present State 'q2'

a                      1Rq1                 1Lq0                 1Lqf

b                      1Lq2                 1Rq1                 1Rqf


Here the transition 1Rq1 implies that the write symbol is 1, the tape moves right, and the next
state is q1. Similarly, the transition 1Lq2 implies that the write symbol is 1, the tape moves left,
and the next state is q2.
A TM accepts a language if it enters into a final state for any input string w. A language is
recursively enumerable (generated by Type-0 grammar) if it is accepted by a Turing machine.
A TM decides a language if it accepts it and enters into a rejecting state for any input not in the
language. A language is recursive if it is decided by a Turing machine.
Example 1
Design a TM to recognize all strings consisting of an odd number of α’s.
Solution
The Turing machine M can be constructed by the following moves −
 Let q1 be the initial state.
 If M is in q1; on scanning α, it enters the state q2 and writes B (blank).
 If M is in q2; on scanning α, it enters the state q1 and writes B (blank).
 From the above moves, we can see that M enters the state q1 if it scans an even number
of α’s, and it enters the state q2 if it scans an odd number of α’s. Hence q2 is the only
accepting state.
Hence,
M = ({q1, q2}, {α}, {α, B}, δ, q1, B, {q2})
where δ is given by −

Tape alphabet symbol Present State ‘q1’ Present State ‘q2’

α BRq2 BRq1
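A few lines of Python are enough to run this machine. The sketch below is our own illustration; it encodes the table above, with 'a' standing for the symbol α and 'B' for the blank, and it halts as soon as δ is undefined, accepting iff the machine stopped in the accepting state q2.

# Sketch of a simulator for the machine above (illustration only).
delta = {
    ('q1', 'a'): ('q2', 'B', 'R'),
    ('q2', 'a'): ('q1', 'B', 'R'),
}

def run(word, state='q1', accepting=('q2',)):
    tape, head = list(word) + ['B'], 0
    while (state, tape[head]) in delta:           # halt when delta is undefined
        state, tape[head], move = delta[(state, tape[head])]
        head += 1 if move == 'R' else -1
    return state in accepting

print(run('aaa'))    # True: odd number of α's
print(run('aaaa'))   # False: even number of α's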

Example 2:
Construct a TM for the language L = {0n1n2n} where n≥1
Solution:
L = {0n1n2n | n≥1} represents language where we use only 3 character, i.e., 0, 1 and 2. In this,
some number of 0's followed by an equal number of 1's and then followed by an equal number of
2's. Any type of string, which falls in this category will be accepted by this language.
The simulation for 001122 can be shown as below:

Now, we will see how this Turing machine will work for 001122. Initially, state is q0 and head
points to 0 as:

The move will be δ(q0, 0) = δ(q1, A, R) which means it will go to state q1, replaced 0 by A and
head will move to the right as:

The move will be δ(q1, 0) = δ(q1, 0, R) which means it will not change any symbol, remain in
the same state and move to the right as:

The move will be δ(q1, 1) = δ(q2, B, R) which means it will go to state q2, replaced 1 by B and
head will move to right as:

The move will be δ(q2, 1) = δ(q2, 1, R) which means it will not change any symbol, remain in
the same state and move to right as:

The move will be δ(q2, 2) = δ(q3, C, R) which means it will go to state q3, replaced 2 by C and
head will move to right as:

Now move δ(q3, 2) = δ(q3, 2, L) and δ(q3, C) = δ(q3, C, L) and δ(q3, 1) = δ(q3, 1, L) and δ(q3,
B) = δ(q3, B, L) and δ(q3, 0) = δ(q3, 0, L), and then move δ(q3, A) = δ(q0, A, R), it means will
go to state q0, replaced A by A and head will move to right as:

The move will be δ(q0, 0) = δ(q1, A, R) which means it will go to state q1, replaced 0 by A, and
head will move to right as:

The move will be δ(q1, B) = δ(q1, B, R) which means it will not change any symbol, remain in
the same state and move to right as:

The move will be δ(q1, 1) = δ(q2, B, R) which means it will go to state q2, replaced 1 by B and
head will move to right as:

The move will be δ(q2, C) = δ(q2, C, R) which means it will not change any symbol, remain in
the same state and move to right as:

The move will be δ(q2, 2) = δ(q3, C, L) which means it will go to state q3, replaced 2 by C and
head will move to left until we reached A as:

The symbol immediately before B is A, which means all the 0's have been marked by A. So we will
move right to ensure that no unmarked 1 or 2 is present. The move will be δ(q2, B) = (q4, B, R),
which means it will go to state q4, not change any symbol, and move to the right as:

The move will be (q4, B) = δ(q4, B, R) and (q4, C) = δ(q4, C, R) which means it will not change
any symbol, remain in the same state and move to right as:

The move δ(q4, X) = (q5, X, R) which means it will go to state q5 which is the HALT state and
HALT state is always an accept state for any TM.

The same TM can be represented by Transition Diagram:

5.3 Turing Decidable and Turing Acceptable

L is Turing decidable (or just decidable) if there exists a Turing machine M that accepts all
strings in L and rejects all strings not in L. Note that by rejection we mean that the machine halts

after a finite number of steps and announces that the input string is not acceptable. Acceptance,
as usual, also requires a decision after a finite number of steps.
A language is called Decidable or Recursive if there is a Turing machine which accepts and
halts on every input string w. Every decidable language is Turing-Acceptable. Decidable Turing
machines always halt; they never go into an infinite loop.
● A language L is called decidable iff there is a decider M such that ℒ(M) = L.
● Given a decider M, you can learn whether or not a string w ∈ ℒ(M).
● Run M on w.
● Although it might take a staggeringly long time, M will eventually accept or reject w.
● The set R is the set of all decidable languages. L ∈ R iff L is decidable.

A decision problem P is decidable if the language L of all yes instances to P is decidable.


For a decidable language, for each input string, the TM halts either at the accept or the reject
state as depicted in the following diagram −

Example 1
Find out whether the following problem is decidable or not −

Is a number ‘m’ prime?
Solution
Prime numbers = {2, 3, 5, 7, 11, 13, …………..}
Divide the number ‘m’ by all the numbers between ‘2’ and ‘√m’ starting from ‘2’.
If any of these numbers produce a remainder zero, then it goes to the “Rejected state”, otherwise
it goes to the “Accepted state”. So, here the answer could be made by ‘Yes’ or ‘No’.
Hence, it is a decidable problem.
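The decision procedure just described is easy to write down as a program that halts on every input, which is exactly what decidability requires. The Python sketch below is our own illustration of the trial-division procedure from the solution.

# A decider for "is m prime?": trial division from 2 up to the square root of m.
def is_prime(m: int) -> bool:
    if m < 2:
        return False             # 0 and 1 go to the "Rejected state"
    d = 2
    while d * d <= m:            # try every divisor between 2 and sqrt(m)
        if m % d == 0:
            return False         # a zero remainder means "Rejected state"
        d += 1
    return True                  # otherwise the "Accepted state"

print([m for m in range(2, 20) if is_prime(m)])   # [2, 3, 5, 7, 11, 13, 17, 19]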
Turing recognizability
L is Turing recognizable if there is a Turing machine M that recognizes L, that is, M should
accept all strings in L and M should not accept any strings not in L. This is not the same as
decidability because recognizability does not require that M actually reject strings not in L. M
may reject some strings not in L but it is also possible that M will simply "remain undecided" on
some strings not in L; for such strings, M's computation never halts.
5.4 Language accepted by Turing machine

A Turing machine accepts exactly those languages that are recursively enumerable. Recursive means
repeating the same set of rules any number of times, and enumerable means a list of elements. The
TM can also compute the computable functions, such as addition,
multiplication, subtraction, division, the power function, and many more.
Example:
Construct a Turing machine, which accepts the language of aba over ∑ = {a, b}.
Solution:
We will assume that on input tape the string 'aba' is placed like this:

The tape head will read the sequence up to the Δ character. If the tape head reads the string 'aba',
then the TM will halt after reading Δ.
Now, we will see how this Turing machine will work for aba. Initially, state is q0 and head
points to a as:

The move will be δ(q0, a) = δ(q1, A, R) which means it will go to state q1, replaced a by A and
head will move to right as:

The move will be δ(q1, b) = δ(q2, B, R) which means it will go to state q2, replaced b by B and
head will move to right as:

The move will be δ(q2, a) = δ(q3, A, R) which means it will go to state q3, replaced a by A and
head will move to right as:

The move δ(q3, Δ) = (q4, Δ, S) which means it will go to state q4 which is the HALT state and
HALT state is always an accept state for any TM.
The same TM can be represented by Transition Table:

States a b Δ

q0 (q1, A, R) – –

q1 – (q2, B, R) –

q2 (q3, A, R) – –

q3 – – (q4, Δ, S)

q4 – – –

The same TM can be represented by Transition Diagram:

5.5 Undecidable problems

In order to decide a problem, the Turing machine must always halt and either accept or reject the
input word corresponding to the problem instance. The main issue encountered by the Turing
machine is in the “halt” step, since there is no guarantee that the machine will halt on every input
word it receives. Indeed, even the most basic decision problems are rendered undecidable on a
Turing machine, simply because of the fundamental limitation that the machine may get caught
in an infinite loop during its computation
There are two types of TMs (based on halting):
Recursive: TMs that always halt, no matter whether accepting or non-accepting → DECIDABLE
PROBLEMS
Recursively enumerable: TMs that are guaranteed to halt only on acceptance. If non-accepting,
they may or may not halt (i.e., could loop forever). Undecidable problems are those that are not recursive.
Any TM for a recursive language is going to look like this:

Any TM for a recursively enumerable (RE) language is going to look like this:

Example of Undecidable problems
Problem: Is a given context-free grammar G unambiguous?
[A context-free grammar G is unambiguous iff every string s in L(G) has a unique left-most
derivation]. This problem is undecidable.
The regular language { ε, a, aa, aaa, aaaa, aaaaa, … } Ambiguous grammar: A -> aA | Aa | ε
Unambiguous grammar: A -> aA | ε
The context free grammar A -> A + A | A - A | a is ambiguous, since a + a + a has two different
left-most derivations. A -> A + A -> a + A -> a + A + A -> a + a + A -> a + a + a and A -> A +
A -> A + A + A -> … -> a + a + a (replacing left-most nonterminal A by A+A)
Membership Problem: We’ve seen that the membership problems for regular languages and
context-free languages are both decidable, by virtue of the fact that the models recognizing such
classes of languages always halt and either accept or reject their input words. For Turing
machines, however, we lose this valuable decidability property. Before we continue, let’s review
the notions of decidability and semidecidability. Suppose that M is a Turing machine recognizing a
language L. Recall that:
• L is decidable if (i) whenever w ∈ L, M accepts w; and (ii) whenever w ∉ L, M rejects w; and
• L is semidecidable if (i) whenever w ∈ L, M accepts w; and (ii) whenever w ∉ L, M either
rejects w or enters a loop. It’s quite straightforward to show that the membership problem for
Turing machines, ATM, is semidecidable.

Review Questions
1. Define Turing machine?
2. Discuss about recursive and recursively enumerable languages
3. Discuss about Turing decidable and acceptance with example
4. Discuss about Turing undecidable with example
5. What does it mean for a language to be accepted by a Turing machine? Discuss by using
necessary examples

Chapter Six: Computability
6.1 Introduction

Computability is the ability to solve a problem in an effective manner. The computability of a


problem is closely linked to the existence of an algorithm to solve the problem. Computation is
usually modelled as a mapping from inputs to outputs, carried out by a formal “machine,” or
program, which processes its input in a sequence of steps.

An “effectively computable” function is one that can be computed in a finite amount of time
using finite resources. A function f is computable if some program P computes it: for any input x,
the computation P(x) halts with output f(x) after a finite number of steps.
Unsolvable: when problems are formulated as functions, we call unsolvable functions
non-computable; when problems are formulated as predicates, they are called undecidable.
Using Turing's concept of the abstract machine, we would say that a function is non-computable
if there exists no Turing machine that could compute it.

Computable problems: we are familiar with many problems (or functions) that are computable
(or decidable), meaning there exists some algorithm that computes an answer (or output) to any
instance of the problem (or for any input to the function) in a finite number of simple steps.
Examples:
 Computing the greatest common divisor of a pair of integers.
 Computing the least common multiple of a pair of integers.
 Finding the shortest path between a pair of nodes in a finite graph.

Non-Computable Problems – A non-computable problem is a problem for which there is no algorithm
that can be used to solve it. The most famous example of non-computability (or undecidability)
is the Halting Problem: given a description of a Turing machine and its initial input, determine
whether the program, when executed on this input, ever halts (completes). The alternative is that
it runs forever without halting.

6.2. Recursive languages and recursive Enumerable languages


Let us understand the concept of recursive language before learning about the recursively
enumerable language in the theory of computation (TOC).

Recursive Language
A language L is recursive (decidable) if L is the set of strings accepted by some Turing Machine
(TM) that halts on every input.
Example

When a Turing machine reaches a final state, it halts. We can also say that a Turing machine M
halts when M reaches a state q and a current symbol ‘a’ to be scanned so that δ(q, a) is
undefined.
There are TMs that never halt on some inputs in any one of these ways, so we make a distinction
between the languages accepted by a TM that halts on all input strings and a TM that never halts
on some input strings. We refer to a language L as recursive if there exists a Turing machine T
that decides it. In this case, the Turing machine accepts every string in the language L and rejects all
strings that are not in L. In other words, if a string s belongs to the language L, then the Turing
machine T will accept it; otherwise the Turing machine halts without ever
reaching an accepting state.
Recursive Enumerable Language
A language L is recursively enumerable if L is the set of strings accepted by some TM.
If L is a recursive enumerable language then −
If w ∈ L then a TM halts in a final state,
If w ∉ L then a TM halts in a non-final state or loops forever.
If L is a recursive language then −
If w ∈ L then a TM halts in a final state,
If w ∉ L then TM halts in a non-final state.
Recursive languages are also recursively enumerable.
Proof − If L is recursive, then there is a TM M which decides membership in the language, that is −
 M accepts x if x is in the language L.
 M rejects x if x is not in the language L.
Since M accepts exactly the strings in L, it in particular recognizes L, so L is recursively
enumerable.

Examples of recursively enumerable languages are;
Recursive languages, Regular languages, Context-sensitive languages, Context-free languages.
Properties of both recursive and recursively enumerable languages.
We will state theorems which are also properties of both languages.
1. If language L is recursive, its complement L' is also recursive.
Proof:
L is a language accepted by a Turing machine that halts on all inputs. We construct a
Turing machine Ts from T as shown below:
2. If the languages L1 and L2 are recursive, their union L1 U L2 is also recursive.
Proof:
We have two Turing machines T1 and T2 that recognize languages L1 and L2.
3. The union of any two recursively enumerable languages is also a recursively enumerable
language.
Proof: We have two recursively enumerable languages L1 and L2 that are accepted by
Turing machines T1 and T2.
4. If both a language L and its complement L' are recursively enumerable,
then L (and hence L') is a recursive language.

Chapter 7: Computational Complexity

In computer science, the computational complexity or simply complexity of an algorithm is the


amount of resources required to run it.
 Particular focus is given to time and memory requirements. The complexity of a
problem is the complexity of the best algorithms that allow solving the problem.
 The study of the complexity of problems is called computational complexity theory.
 Both areas are highly related, as the complexity of an algorithm is always an upper
bound on the complexity of the problem solved by this algorithm.
 For designing efficient algorithms, it is often fundamental to compare the complexity of a
specific algorithm to the complexity of the problem to be solved.
7.1 Big-O notations

 We can express algorithmic complexity using the big-O notation. For a problem of size
N:
 A constant-time function/method is “order 1” : O(1)
 A linear-time function/method is “order N” : O(N)
 A quadratic-time function/method is “order N squared” : O(N2)
 Definition: Let g and f be functions from the set of natural numbers to itself. The function

f is said to be O(g) (read big-oh of g), if there is a constant c > 0 and a natural number n0

such that f (n) ≤ cg(n) for all n >= n0.


 Abuse of notation: we often write f = O(g), which should be read as f ∈ O(g) (the "=" here is not a true equality).
 The Big-O asymptotic notation gives us the upper bound, mathematically described
below: f(n) = O(g(n)) if there exists a positive integer n0 and a positive constant c, such
that f(n) ≤ c·g(n) ∀ n ≥ n0
 The general step wise procedure for Big-O runtime analysis is as follows:
 Figure out what the input is and what n represents.
 Express the maximum number of operations, the algorithm performs in terms of n.
 Eliminate all terms except the highest-order term.
 Remove all the constant factors.
Some of the useful properties of Big-O notation analysis are as follow:

 This asymptotic notation is used to measure and compare the worst-case scenarios of
algorithms theoretically.
 For any algorithm, the Big-O analysis should be straightforward as long as we correctly
identify the operations that are dependent on n, the input size.

Algorithmic Examples of Runtime Analysis: Some of the examples of all those types of
algorithms (in worst-case scenarios) are mentioned below:
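As a small illustration of these growth rates, here is a Python sketch. The example functions are our own assumed examples (not taken from the module) showing a constant-time, a linear-time and a quadratic-time routine.

# Small illustrations of O(1), O(N) and O(N^2) behaviour.
def first_element(items):        # O(1): constant time, independent of len(items)
    return items[0]

def linear_sum(items):           # O(N): one pass over the input
    total = 0
    for x in items:
        total += x
    return total

def count_pairs(items):          # O(N^2): nested passes over the input
    pairs = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            pairs += 1
    return pairs

data = [3, 1, 4, 1, 5]
print(first_element(data), linear_sum(data), count_pairs(data))   # 3 14 10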

Polynomial-Time Algorithms
 Are some problems solvable in polynomial time?–Of course: many algorithms we’ve
studied provide polynomial-time solutions to some problems
 Are all problems solvable in polynomial time?–No: Turing’s “Halting Problem” is not
solvable by any Computer, no matter how much time is given
 Most problems that do not yield polynomial-time algorithms are either optimization or
decision problems.

P and NP

 The P versus NP problem is a major unsolved problem in computer science.


 Informally, it asks whether every problem whose solution can be quickly verified by a
computer can also be quickly solved by a computer.
 The informal term quickly used above means the existence of an algorithm for the task
that runs in polynomial time.
 The general class of questions for which some algorithm can provide an answer in
polynomial time is called "class P" or just "P".
 For some questions, there is no known way to find an answer quickly, but if one is
provided with information showing what the answer is, it may be possible to verify the
answer quickly.
 The class of questions for which an answer can be verified in polynomial time is called
NP.
 Example: Consider the subset sum problem, an example of a problem that is easy to
verify, but whose answer may be difficult to compute.
 Given a set of integers, does some nonempty subset of them sum to 0?
 For instance, does a subset of the set {−2, −3, 15, 14, 7, −10} add up to 0? The answer
"yes, because {−2, −3, −10, 15} add up to zero" can be quickly verified with three
additions.
 However, there is no known algorithm to find such a subset in polynomial time (there is
one, however, in exponential time, which consists of 2^n − 1 tries), and indeed such an
algorithm cannot exist if the two complexity classes are not the same.
 Hence this problem is in NP (quickly checkable) but not necessarily in P (quickly
solvable); a small verification sketch is given after this list.
 Note: a computation runs in polynomial time when its execution time, m(n), is no more than a
polynomial function of the problem size n; more formally, m(n) = O(n^k) where k is a constant.
 The most common resources are time (how many steps it takes to solve a problem) and
space (how much memory it takes to solve a problem).
 In such analysis, a model of the computer for which time must be analyzed is required.

 Typically such models assume that the computer is deterministic(given the computer's
present state and any inputs, there is only one possible action that the computer might
take) and sequential(it performs actions one after the other).
 In this theory, the class P consists of all those decision problems that can be solved on a
deterministic sequential machine in an amount of time that is polynomial in the size of
the input;
 The class NP consists of all those decision problems whose positive solutions can be
verified in polynomial time given the right information, or equivalently, whose solutions
can be found in polynomial time on a non-deterministic machine.
 Clearly, P ⊆NP. Arguably, the biggest open question in theoretical computer science
concerns the relationship between those two classes: Is P equal to NP?
 In a poll of 100 researchers, 61 believed the answer to be no, 9 believed the answer is
yes, and 30 were unsure;
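The subset-sum example mentioned in the list above can be made concrete with a few lines of Python. This is an illustration only: the function names are our own, the certificate check runs in polynomial time, while the naive search examines all 2^n − 1 non-empty subsets.

# Subset sum: polynomial-time verification versus exponential-time search.
from itertools import combinations

def verify(numbers, subset):
    # Polynomial-time check of a certificate: a non-empty choice of the given
    # numbers whose sum is 0.
    return len(subset) > 0 and all(x in numbers for x in subset) and sum(subset) == 0

def brute_force(numbers):
    # Exponential-time search over all 2^n - 1 non-empty subsets.
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == 0:
                return subset
    return None

nums = [-2, -3, 15, 14, 7, -10]
cert = (-2, -3, -10, 15)
print(verify(nums, cert))     # True: the certificate is quickly checkable
print(brute_force(nums))      # finds a zero-sum subset by exhaustive search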
7.2. Class P vs class NP
 P contains the problems that can be solved in polynomial time.
 An algorithm is said to run in polynomial time if the number of steps required to
complete the algorithm for a given input is O(n^k) for some constant k.
 P: the class of decision problems that have polynomial-time deterministic algorithms.–
That is, they are solvable in O(p(n)), where p(n) is a polynomial on n–A deterministic
algorithm is (essentially) one that always computes the correct answer.
 Sample Problems in P
 Fractional Knapsack
 MST,
 Sorting
The class NP
NP is the set of problems whose solutions are hard to find but easy to verify and that are
solved by a non-deterministic machine in polynomial time.
 The class NP consists of all the languages for which membership can be certified to a
polynomial-time algorithm.
 It contains many important problems not known to be in P. NP can also be defined using
non-deterministic Turing machines.

 Thus NP can also be thought of as the class of problems–whose solutions can be verified
in polynomial time
 Note that NP stands for “Nondeterministic Polynomial-time”
 If P= NP then for every search problem for which one can efficiently verify a given
solution, one can also efficiently find such a solution from scratch.
 P is subset of NP (any problem that can be solved by deterministic machine in
polynomial time can also be solved by non-deterministic machine in polynomial time).
 (A deterministic computer is what we know)
 A non-deterministic computer is one that can “guess” the right answer or solution.
 In deterministic algorithm, for a given particular input, the computer will always produce
the same output going through the same states.
 In a non-deterministic algorithm, for the same input, the computer may produce different
outputs in different runs.
 In fact, non-deterministic algorithms cannot solve the problem in polynomial time and
cannot determine what the next step is.
 The non-deterministic algorithms can show different behaviors for the same input on
different execution and there is a degree of randomness to it.

 NP is set of problems for which a solution can be verified in polynomial time–Examples:


Fractional Knapsack,…, Hamiltonian Cycle, CNF SAT(Conjunctive Normal Form
Satisfiability) , 3-CNF SAT


 Open question: Does P = NP?–

 Most suspect not
 An August 2010 claimed proof that P ≠ NP, by Vinay Deolalikar, a researcher at HP Labs,
Palo Alto, has flaws.
7.3. NP-complete problems
 To attack the P = NP question the concept of NP-completeness is very useful.
 NP-complete problems are a set of problems to which any other NP-problem can be
reduced in polynomial time, and whose solution may still be verified in polynomial time.
 Informally, an NP-complete problem is at least as "tough" as any other problem in NP.
 A decision problem L is NP-complete if:
 1) L is in NP (Any given solution for NP-complete problems can be verified quickly, but
there is no efficient known solution).
 2) Every problem in NP is reducible to L in polynomial time
 A problem is NP-Hard if it follows property 2 mentioned above, doesn’t need to follow
property 1. Therefore, NP-Complete set is also a subset of NP-Hard set.

 Informally, these are the hardest problems in the class NP


 If any NP-complete problem can be solved by a polynomial time deterministic algorithm,
then every problem in NP can be solved by a polynomial time deterministic algorithm.

However, no polynomial time deterministic algorithm is known to solve any of them.
Examples of NP-complete problems
 Traveling salesman problem
 Hamiltonian cycle problem
 Clique problem
 Subset sum problem
 Boolean satisfiability problem
 Many thousands of other important computational problems in computer science,
mathematics, economics, manufacturing, communications, etc.
Polynomial-time reduction
 The Cook–Levin theorem (see below) shows that any problem in NP can be reduced in polynomial time to an instance
of the Boolean satisfiability problem. This means that if the Boolean satisfiability
problem could be solved in polynomial time by a deterministic Turing machine, then all
problems in NP could be solved in polynomial time, and so the complexity class NP
would be equal to the complexity class P.
 Let L1 and L2 be two languages over alphabets Σ1 and Σ2, respectively. L1 is said to
be polynomial-time reducible to L2 if there is a total function f: Σ1* → Σ2* for which (1)
x ∈ L1 if and only if f(x) ∈ L2, and (2) f can be computed in polynomial time. The function f
is called a polynomial-time reduction.

Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That is, if y is an input
for L2 then algorithm A2 will answer Yes or No depending upon whether y belongs to L2 or not.
The idea is to find a transformation from L1 to L2 so that the algorithm A2 can be part of an
algorithm A1 to solve L1.

One decision problem is polynomial-time reducible to another if a polynomial time algorithm
can be developed that changes each instance of the first problem to an instance of the second
such that a yes (or no) answer to the second problem entails a yes (or no) answer to the first.

7.4. Cook’s Theorem
 In computational complexity theory, the Cook–Levin theorem, also known as Cook's theorem, states that the Boolean satisfiability problem is NP-complete.
 That is, any problem in NP can be reduced in polynomial time by a deterministic Turing machine to the problem of determining whether a Boolean formula is satisfiable.
 An important consequence of the theorem is that if there exists a deterministic polynomial-time algorithm for solving Boolean satisfiability, then there exists a deterministic polynomial-time algorithm for solving all problems in NP.
 Crucially, the same follows for any NP-complete problem.
 NOTE: In computer science, satisfiability (often written in all capitals or abbreviated SAT) is the problem of determining whether the variables of a given Boolean formula can be assigned in such a way as to make the formula evaluate to TRUE.
 Equally important is to determine whether no such assignment exists, which would imply that the function expressed by the formula is identically FALSE for all possible variable assignments.
 In that case we say that the formula is unsatisfiable; otherwise, it is satisfiable.
 For example, the formula a AND b is satisfiable because one can find the values a = TRUE and b = TRUE, which make a AND b TRUE. To emphasize the binary nature of this problem, it is frequently referred to as Boolean or propositional satisfiability. (A small evaluation sketch follows below.)
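For illustration (again, a sketch added to these notes rather than taken from them), the following Python fragment evaluates a CNF formula under a given truth assignment; the encoding (a formula is a list of clauses, a clause is a list of integer literals, and -k stands for NOT x_k) is an assumption made for this example. It shows the easy direction of the discussion above: checking a proposed assignment takes polynomial time, which is why SAT belongs to NP.

def satisfies(cnf, assignment):
    # assignment maps each variable number to True or False.
    for clause in cnf:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False          # this clause has no true literal
    return True                   # every clause contains a true literal

# The formula "a AND b" from the example above, with a = x1 and b = x2:
formula_ab = [[1], [2]]
print(satisfies(formula_ab, {1: True, 2: True}))    # True  -> satisfying assignment found
print(satisfies(formula_ab, {1: True, 2: False}))   # False -> this assignment does not work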
Required Texts:
 Introduction to Automata Theory, Languages, and Computation, by Hopcroft, Ullman, and Motwani.
Reference books:
 An Introduction to Formal Languages and Automata, Third Edition, Peter Linz, 2001.
 An Introduction to Formal Language Theory that Integrates Experimentation and Proof, Allen Stoughton, 2004.
 Computational Complexity: A Modern Approach, Sanjeev Arora and Boaz Barak.
Quiz
1. There are ________ tuples in a finite state machine. A. 4 B. 5 C. 6 D. unlimited
2. The transition function maps: A. Σ * Q -> Σ B. Q * Q -> Σ C. Σ * Σ -> Q D. Q * Σ -> Q
3. What type of grammar do Turing machines generate? A. Type 0 B. Type 2 C. Type 1 D. Type 3
4. In a Turing machine, which component is a simple memory device that remembers which instruction should be executed next? A. Finite control B. Blank symbol C. Scratch tape D. Decider
5. A finite automaton requires a minimum of _______ stacks. A. 1 B. 0 C. 3 D. unlimited
6. The regular expression for all strings that start with ab and end with bba is: A. aba*b*bba B. ab(ab)*bba C. ab(a+b)*bba D. All of the mentioned
7. The context-free grammar generates
A. Unequal number of 0’s and 1’s B. Equal number of 0’s and 1’s
C. Any number of 0’s followed by any number of 1’s D. none
8. Pushdown Automata are equivalent to Context-Free ______.
A. Spelling B. Grammar C. parse D. choices
9. Which of the given operations is valid in a PDA?
A. insert B. add C. push D. all
10. The value of n if a Turing machine is defined as an n-tuple:
A. 6 B. 7 C. 8 D. 5
11. Which of the strings listed below is accepted by the following NFA?
[NFA diagram not reproduced here]
A. 0110 B. 01001 C. 100100000 D. 000000
12. If you have a grammar G = ({A,B}, {0,1}, A, A -> 0B, B -> 0B, B -> 01), what is the language generated by the grammar?
A. 0*01 B. 00*01 C. 001 D. 00*01*
13. A context-free grammar is called ambiguous if there exists a string that can have:
A. only one parse tree B. no parse tree C. a partial parse tree D. more than one parse tree
14. The regular expression r = (a+b)*bb(a+b)* represents
A. All sets of strings from Σ = {a,b} that have exactly two b‘s
B. All sets of strings from Σ = {a,b} containing the substring bb
C. All sets of strings from Σ = {a,b} that have at most two b‘s
D. All sets of strings from Σ = {a,b} that have at least two b‘s
15. From the following list of automata, which one is the most powerful?
A. Finite Automata B. Turing Machine C. Pushdown Automata D. None
Answers
1-B, 2-D, 3-A, 4-A, 5-B, 6-C, 7-B, 8-B, 9-C, 10-B, 11-B, 12-B, 13-D, 14-B, 15-B