FLAT notes
UNIT - I
Introduction to Finite Automata: Structural Representations, Automata and Complexity, the Central Concepts of
Automata Theory – Alphabets, Strings, Languages, Problems.
Nondeterministic Finite Automata: Formal Definition, an application, Text Search, Finite Automata with Epsilon-
Transitions. Deterministic Finite Automata: Definition of DFA, How A DFA Process Strings, The language of DFA,
Conversion of NFA with ε-transitions to NFA without ε-transitions. Conversion of NFA to DFA, Moore and Mealy
machines
UNIT -1
A finite automaton (FA) is the simplest machine for recognizing patterns. A finite automaton, or finite state machine, is an abstract
machine described by five elements (a 5-tuple).
o An automaton with a finite number of states is called a finite automaton or finite state machine.
o The block diagram of a finite automaton consists of the following three major components:
  · Input tape
  · Read/write head
  · Finite control
I] Input Tape:
The R/W head examines only one square at a time and can move
one square either to the left or the right.
For further analysis, we restrict the movement of R/W head only
to the right side.
q : Initial state.
δ : Transition Function.
The Central
Concepts of Automata Theory – Alphabets, Strings, Languages, Problems.
Alphabet: A finite set of symbols. An alphabet is often denoted by sigma(∑), yet can be given any name.
Powers of an alphabet: If Σ is an alphabet, then Σ^k denotes the set of all strings of length k over Σ. Σ^0 = {ε}, no matter
what the alphabet Σ is, because ε is the only string of length 0.
If Σ = {a, b, c} then Σ^1 = {a, b, c}, Σ^2 = {aa, ab, ac, ba, bb, bc, ca, cb, cc}, Σ^3 = {aaa, aab, aac, aba, abb, abc, aca,
acb, acc, baa, bab, bac, bba, bbb, bbc, bca, bcb, bcc, caa, cab, cac, cba, cbb, cbc, cca, ccb, ccc}
The set of all strings over an alphabet Σ is conventionally denoted by Σ*. For instance {0, 1}* = {ε, 0, 1, 00, 01, 10,
11, 000, …}. Another way to write it is Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
The set of nonempty strings over an alphabet Σ is denoted by Σ+. The two relevant equivalences are: Σ+ = Σ^1
∪ Σ^2 ∪ Σ^3 ∪ … and Σ* = Σ+ ∪ {ε}
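The powers of an alphabet can be enumerated mechanically. The following is a minimal Python sketch; the function names `power` and `star_up_to` are made up for illustration, and since Σ* is infinite only its strings up to a chosen length are listed.

```python
from itertools import product

def power(alphabet, k):
    """Return Σ^k: all strings of length k over the alphabet, in sorted order."""
    return ["".join(p) for p in product(sorted(alphabet), repeat=k)]

def star_up_to(alphabet, n):
    """Return the strings of Σ* whose length is at most n (Σ* itself is infinite)."""
    return [w for k in range(n + 1) for w in power(alphabet, k)]

sigma = {"a", "b", "c"}
print(power(sigma, 0))             # Σ^0 contains only the empty string
print(len(power(sigma, 3)))        # |Σ^3| = 3^3 = 27
print(star_up_to({"0", "1"}, 2))   # ε, 0, 1, 00, 01, 10, 11
```

Note that `power(sigma, 0)` returns a list holding one empty string, which mirrors Σ^0 = {ε} rather than the empty set.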
String over an alphabet is a finite sequence of symbols from the alphabet, usually written
If w is a string over Σ then the length of w is written as |w|, which is the number of symbols in w.
The string of length zero is called the empty string. It is written as ε(Epsilon).
The empty string plays the role of 0 in a number system. If w has length n then we
Operations on strings:
Concatenation of strings
The concatenation of two strings is the string formed by writing the first, followed by the second.
Downloaded by Ithvika Ch ([email protected])
lOMoARcPSD|44040870
That is, if w and x are strings, then wx is the concatenation of these two strings.
Example:
String Reversal
The reversal of a string w is denoted by w^R.
Example:
Substring
Example:
Example:
String abc has prefixes ε, a, ab, and abc; its suffixes are ε, c, bc, and abc.
A prefix or suffix of a string, other than the string itself, is called a proper prefix or
suffix.
o A set of strings all of which are chosen from Σ*, where Σ is a particular alphabet, is called a
language over Σ.
o Examples:
The language of all strings consisting of n 0’s followed by n 1’s for some n≥0: {ε, 01, 0011, 000111, . . .}
The set of strings of 0’s and 1’s with an equal number of each : {ε, 01, 10, 0011,
0101, 1001, . . .}
The set of binary numbers whose value is prime: {10, 11, 101, 111, 1011, . . .}
Σ* is a language for any alphabet Σ.
∅, the empty language, is a language over any alphabet.
{ε} the language consisting of only the empty string, is also a language over
any alphabet.
Operations on languages:
Union
If L1 and L2 are two languages over an alphabet ∑.Then the union of L1 and L2
is denoted by L1 U L2.
Example:
Intersection
Example:
Complementation
The complement of L is the language consisting of those strings over the alphabet that are not in L.
Example: over ∑ = {a, b}, the complement of {a, b, aa} is {ε, ab, ba, bb, aaa, ...}
Concatenation
L2.
Example:
Reversal
If L is a language, then its reversal L^R is defined as
L^R = {w^R | w ∈ L}
Example:
Kleene Closure
The Kleene closure (or just closure) of L, denoted L*, and the positive closure of L, denoted L+, are the sets
L* = ⋃_{i=0}^{∞} L^i
L+ = ⋃_{i=1}^{∞} L^i
That is, L* denotes words constructed by concatenating any number of words from L.
L+ is the same, but the case of zero words, whose "concatenation" is defined to be ε, is excluded.
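The language operations above can be tried out on small finite languages. This is a minimal sketch using Python sets of strings; the function names are illustrative, and because L* and L+ are infinite whenever L contains a nonempty string, the closures are truncated at a chosen maximum power `max_power` (an assumption of this sketch).

```python
def concat(l1, l2):
    """Concatenation L1 L2 = {xy | x in L1, y in L2}."""
    return {x + y for x in l1 for y in l2}

def lang_power(l, i):
    """L^i, with L^0 = {ε}."""
    result = {""}
    for _ in range(i):
        result = concat(result, l)
    return result

def kleene_star(l, max_power):
    """Union of L^0 .. L^max_power, a finite approximation of L*."""
    out = set()
    for i in range(max_power + 1):
        out |= lang_power(l, i)
    return out

def kleene_plus(l, max_power):
    """Union of L^1 .. L^max_power, a finite approximation of L+."""
    out = set()
    for i in range(1, max_power + 1):
        out |= lang_power(l, i)
    return out

def reverse(l):
    """L^R = {w^R | w in L}."""
    return {w[::-1] for w in l}

L = {"0", "11"}
print(sorted(kleene_star(L, 2), key=lambda w: (len(w), w)))
print(sorted(reverse({"ab", "abc"})))
```

The only difference between the two closures is whether the power 0, which contributes ε, is included.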
Def: A string x is accepted by a finite automaton M = (Q, Σ, δ, q0, F) if δ(q0, x) = q for some q ∈ F. In other words,
x is accepted if processing it from the start state ends in a final state.
Note: A final state is also called an accepting state.
If A is the set of all strings that machine M accepts, then we say that A is the language of machine M, written
L(M) = A. We say that M recognizes A or M accepts A.
Representation of FA:
1. Transition Table
2. Transition Diagram
Transition Tables:
A transition table is a conventional tabular representation of a function like δ that takes two arguments and returns a value.
The rows of the table correspond to the states and the columns correspond to the inputs. The entry for the row
corresponding to state q and the column corresponding to the input a is the state δ(q, a).
State   0    1
→q0     q2   q0
*q1     q1   q1
q2      q2   q1
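A transition table translates directly into a lookup structure. The sketch below encodes the table above as a Python dictionary and runs the DFA on an input string; it assumes q0 is the start state and q1 (the starred row) is the only accepting state, and the names `DELTA`, `run_dfa` and `accepts` are illustrative.

```python
# δ as a dictionary: (state, input symbol) -> next state
DELTA = {
    ("q0", "0"): "q2", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q1",
}
START = "q0"     # assumed start state
FINAL = {"q1"}   # starred state in the table

def run_dfa(w):
    """Return the state reached after reading w from the start state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state

def accepts(w):
    """A string is accepted if the run ends in a final state."""
    return run_dfa(w) in FINAL

for w in ["", "110", "1101", "0011"]:
    print(repr(w), accepts(w))
```

Tracing a few inputs suggests that this particular table accepts exactly the strings containing 01 as a substring: q2 remembers that a 0 has been seen, and q1 is entered once a 1 follows it.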
Transition Diagram:
b) For each state q in Q and each input symbol a in Σ, let δ(q, a) = p. then the transition diagram
has an arc from node q to node p labeled a. If there are several input symbols that cause
transitions from q to p then the transition diagram can have one arc labeled by the list of these
symbols
c) There is an arrow into the start state q0, labeled start. This arrow does not originate at any
node.
d) Nodes corresponding to accepting states (those in F) are marked by a double circle. States not in F have a single circle.
Example:
The following figure shows the transition diagram for the DFA that we designed in three states. There is a start arrow
entering the start state q0, and the one accepting state q1, is represented by double circle.
Out of each state there is one arc labeled 0 and one arc labeled 1, although the two arcs are combined into one arc
with a double label in the case of q1.
In DFA, for each input symbol, one can determine the state to which the machine will move.
Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is called Deterministic Finite
Machine or Deterministic Finite Automaton.
2. NFA
o NFA stands for non-deterministic finite automaton. It is usually easier to construct an NFA than a DFA for a given regular
language.
o A finite automaton is called an NFA when there may exist several paths for a specific input from the current state to the next
state.
o Not every NFA is a DFA, but every NFA can be translated into an equivalent DFA.
o An NFA is defined in the same way as a DFA but with the following two exceptions: it may have multiple next states, and
it may contain ε-transitions.
Example:
In the following image, we can see that from state q0 for input a, there are two next states q1 and q2, similarly, from q0 for
input b, the next states are q0 and q1. Thus it is not fixed or determined that with a particular input where to go next. Hence
this FA is called non-deterministic finite automata.
δ: Q × ∑ → 2^Q
where,
Example 1:
1. Q = {q0, q1, q2}
2. ∑ = {0, 1}
3. q0 = {q0}
4. F = {q2}
Solution:
Transition diagram:
Transition Table:
State   0        1
→q0     q0, q1   q1
q1      q2       q0
*q2     q2       q1, q2
In the above diagram, we can see that when the current state is q0, on input 0, the next state will be q0 or q1, and on 1 input the
next state will be q1. When the current state is q1, on input 0 the next state will be q2 and on 1 input, the next state will be
q0. When the current state is q2, on 0 input the next state is q2, and on 1 input the next state will be q1 or q2.
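One way to run an NFA is to track the set of all states it could currently be in. The sketch below does this for the transition table of Example 1, with start state q0 and final state q2; the names `NFA`, `step` and `nfa_accepts` are illustrative.

```python
# δ: (state, input symbol) -> set of possible next states
NFA = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "0"): {"q2"},       ("q1", "1"): {"q0"},
    ("q2", "0"): {"q2"},       ("q2", "1"): {"q1", "q2"},
}
START = "q0"
FINAL = {"q2"}

def step(states, symbol):
    """All states reachable from any state in `states` on one input symbol."""
    result = set()
    for q in states:
        result |= NFA.get((q, symbol), set())
    return result

def nfa_accepts(w):
    """Accept if at least one possible run ends in a final state."""
    current = {START}
    for symbol in w:
        current = step(current, symbol)
    return bool(current & FINAL)

for w in ["", "1", "10", "00", "11"]:
    print(repr(w), nfa_accepts(w))
```

For the input 00 the set of possible states grows from {q0} to {q0, q1} and then to {q0, q1, q2}, so the string is accepted because q2 is among the possibilities.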
Example 2:
NFA with ∑ = {0, 1} accepts all strings with 01.
Solution:
Transition Table:
Present State   0    1
→q0             q1   ε
q1              ε    q2
*q2             q2   q2
Example 3:
NFA with ∑ = {0, 1} and accept all string of length atleast 2.
Solution:
Transition Table:
Present State   0    1
→q0             q1   q1
q1              q2   q2
*q2             ε    ε
Present State 0 1
q1 q3 ε
q2 q2, q3 q3
→q3 q3 q3
Solution:
The transition diagram can be drawn by using the mapping function as given in the table.
δ(q2, 1) = {q3}
δ(q3, 1) = {q3}
ε-NFAs add a convenient feature but (in a sense) they bring us nothing new. They do not extend the
class of languages that can be represented.
Both NFAs and ε-NFAs recognize exactly the same languages.
Epsilon (ε) - closure
The epsilon closure of a given state X is the set of states that can be reached from X using only ε (null) moves,
including the state X itself.
Example
State   0      1    epsilon
A       B, C   A    B
B       -      B    C
C       C      C    -
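The ε-closure can be computed with a simple graph search over the epsilon column only. This is a minimal sketch for the table above; the names `EPS` and `epsilon_closure` are illustrative.

```python
# Epsilon column of the table: state -> states reachable by one ε-move
EPS = {"A": {"B"}, "B": {"C"}, "C": set()}

def epsilon_closure(state, eps=EPS):
    """Return every state reachable from `state` using only ε-moves (including itself)."""
    closure, stack = {state}, [state]
    while stack:
        q = stack.pop()
        for r in eps.get(q, set()):
            if r not in closure:   # visit each state once, so ε-cycles are harmless
                closure.add(r)
                stack.append(r)
    return closure

for s in ["A", "B", "C"]:
    print(s, sorted(epsilon_closure(s)))
```

For this table the closure of A is {A, B, C}, because A reaches B by one ε-move and B reaches C by another; the closure of B is {B, C} and the closure of C is {C}.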
In DFA, for each input symbol, one can determine the state to which the machine will move.
Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is
called Deterministic Finite Machine or Deterministic Finite Automaton.
q0 is the initial state from where any input is processed (q0 ∈ Q).
Examples:
Step 3 : Obtain the possible transitions to be made for each state on each input symbol.
2. Construct a transition system which can accept strings over the alphabet {0,1} ending with 101(DFA)
3. Construct DFA which can accept strings over the alphabet {a,b} ending with b
4. Construct DFA which can accept strings over the alphabet {a,b} in which the number of a’s is divisible by 3
5. Construct DFA which can accept strings over the alphabet {a,b} starting with a.
6. Construct a transition system which can accept strings over the alphabet {a,b} starting with a ending with b
7. Design DFA that accepts all strings which starts with ‘a’ over the alphabet {a,b}
8. Design DFA that accepts all strings which contains ‘00’ as substring over the alphabet {0,1}
Tuple Representation:
M = (Q, ∑, δ, q0, F) where
Q= finite set of states={ q0,q1,q2}
∑=finite set of inputs={0,1}
δ=transition function maps
q0 =initial state=q0
F=set of final states={q2}
Language recognizers:
A language recognizer is a device that accepts valid strings produced in a given language.
Finite state automata are formalized types of language recognizers.
The language accepted by Finite Automata M designated L(M) is the set {x | δ(q0,x) is in F}.
Applications of FA:
Used in Lexical analysis phase of a compiler to recognize tokens.
Used in text editors for string matching
Example:
Convert the following NFA with ε to NFA without ε.
1. ε-closure(q0) = {q0}
2. ε-closure(q1) = {q1, q2}
3. ε-closure(q2) = {q2}
δ'(q0, a) = ε-closure(δ(ε-closure(q0), a))
          = ε-closure(δ(q0, a))
          = ε-closure(q1)
          = {q1, q2}
δ'(q0, b) = ε-closure(δ(ε-closure(q0), b))
          = ε-closure(δ(q0, b))
          = Ф
δ'(q1, a) = ε-closure(δ(ε-closure(q1), a))
          = ε-closure(δ({q1, q2}, a))
          = ε-closure(Ф ∪ Ф)
          = Ф
δ'(q1, b) = ε-closure(δ(ε-closure(q1), b))
          = ε-closure(δ({q1, q2}, b))
          = ε-closure(Ф ∪ {q2})
          = {q2}
δ'(q2, a) = ε-closure(δ(ε-closure(q2), a))
          = ε-closure(δ(q2, a))
          = ε-closure(Ф)
          = Ф
δ'(q2, b) = ε-closure(δ(ε-closure(q2), b))
          = ε-closure(δ(q2, b))
          = ε-closure(q2)
          = {q2}
δ'(q0, a) = {q1, q2}
δ'(q0, b) = Ф
δ'(q1, a) = Ф
δ'(q1, b) = {q2}
δ'(q2, a) = Ф
δ'(q2, b) = {q2}
States   a          b
→q0      {q1, q2}   Ф
*q1      Ф          {q2}
*q2      Ф          {q2}
States q1 and q2 become final states because the ε-closures of q1 and q2 contain the final state q2. The NFA can be shown by the
following transition diagram:
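The rule δ'(q, x) = ε-closure(δ(ε-closure(q), x)) can be applied mechanically. The sketch below is a hedged reconstruction: the original diagram is not reproduced in these notes, so the transitions (q0 on a goes to q1, q2 on b goes to q2, and an ε-move from q1 to q2) are inferred from the worked computation above, and all names are illustrative.

```python
STATES = ["q0", "q1", "q2"]
ALPHABET = ["a", "b"]
DELTA = {("q0", "a"): {"q1"}, ("q2", "b"): {"q2"}}  # inferred from the worked steps
EPS = {"q1": {"q2"}}                                # inferred ε-move q1 -> q2
FINAL = {"q2"}

def eclose(states):
    """ε-closure of a set of states."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in EPS.get(q, set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def remove_epsilon():
    """Build δ' and the new final states of the ε-free NFA."""
    new_delta = {}
    for q in STATES:
        for x in ALPHABET:
            moved = set()
            for p in eclose({q}):
                moved |= DELTA.get((p, x), set())
            new_delta[(q, x)] = eclose(moved)
    # A state becomes final if its ε-closure contains an original final state.
    new_final = {q for q in STATES if eclose({q}) & FINAL}
    return new_delta, new_final

new_delta, new_final = remove_epsilon()
for key in sorted(new_delta):
    print(key, sorted(new_delta[key]))
print("final:", sorted(new_final))
```

Under these assumptions the computed δ' and the final states q1 and q2 agree with the hand computation.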
In this section, we will discuss the method of converting an NFA to its equivalent DFA. In an NFA, when a specific input is given
to the current state, the machine may go to multiple states. It can have zero, one or more than one move on a given input
symbol. On the other hand, in a DFA, when a specific input is given to the current state, the machine goes to exactly one state.
A DFA has exactly one move on a given input symbol.
Let M = (Q, ∑, δ, q0, F) be an NFA which accepts the language L(M). Then there exists an equivalent DFA M' =
(Q', ∑, δ', q0', F') such that L(M) = L(M').
Step 2: Add q0 of NFA to Q'. Then find the transitions from this start state.
Step 3: In Q', find the possible set of states for each input symbol. If this set of states is not in Q', then add it to Q'.
Step 4: In the DFA, the final states are all the states that contain at least one final state of the NFA.
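The steps above amount to the subset construction, which can be sketched as follows. The NFA used here is a hypothetical one (strings over {0, 1} ending in 01), not one of the examples in these notes, and the function name `nfa_to_dfa` is illustrative. Each DFA state is a frozenset of NFA states.

```python
from collections import deque

def nfa_to_dfa(nfa, start, finals, alphabet):
    """Subset construction: explore only the subsets reachable from {start}."""
    start_set = frozenset({start})
    dfa, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        current = queue.popleft()
        for a in alphabet:
            # Union of the NFA moves of every state in the current subset
            target = frozenset(r for q in current for r in nfa.get((q, a), ()))
            dfa[(current, a)] = target
            if target not in seen:      # Step 3: add new subsets to Q'
                seen.add(target)
                queue.append(target)
    # Step 4: a subset is final if it contains a final state of the NFA
    dfa_finals = {s for s in seen if s & finals}
    return dfa, start_set, dfa_finals

# Hypothetical NFA accepting strings that end in 01
NFA = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa, dfa_start, dfa_finals = nfa_to_dfa(NFA, "q0", {"q2"}, ["0", "1"])
for (state, a), target in dfa.items():
    print(sorted(state), a, "->", sorted(target))
```

If some subset has no move on a symbol, the construction produces the empty subset, which plays the role of a dead state; that case does not arise in this particular example.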
Example 1:
Convert the given NFA to DFA.
Solution: For the given transition diagram we will first construct the transition table.
State 0 1
→q0 q0 q1
q1 {q1, q2} q1
1. δ'([q0], 0) = [q0]
2. δ'([q0], 1) = [q1]
δ'([q1], 1) = [q1]
δ'([q2], 0) = [q2]
= [q1, q2]
= {q1, q2}
= [q1, q2]
The state [q1, q2] is the final state as well because it contains a final state q2. The transition table for the constructed DFA
will be:
State 0 1
Example 2:
Convert the given NFA to DFA.
Solution: For the given transition diagram we will first construct the transition table.
State 0 1
δ'([q1], 0) = ϕ
= {q0, q1} ∪ ϕ
= {q0, q1}
= [q0, q1]
Similarly,
= {q0, q1}
= [q0, q1]
Since q1 is a final state in the given NFA, every DFA state that contains q1 becomes a final state. Hence in the DFA,
the final states are [q1] and [q0, q1]. Therefore the set of final states is F = {[q1], [q0, q1]}.
State 0 1
Suppose
A = [q0]
B = [q1]
C = [q0, q1]
Moore Machine
Moore machine is a finite state machine in which the next state is decided by the current state and current input symbol.
The output symbol at a given time depends only on the present state of the machine. Moore machine can be described by 6
tuples (Q, q0, ∑, O, δ, λ) where,
Example 1:
The state diagram for Moore Machine is
In the above Moore machine, the output is written with each state, separated from the state name by /. The output of a Moore
machine is one symbol longer than the input, because the start state also produces an output.
Input: 010
Output: 1110(1 for q0, 1 for q1, again 1 for q1, 0 for q2)
Example 2:
Design a Moore machine to generate 1's complement of a given binary number.
Solution: To generate 1's complement of a given binary number the simple logic is that if the input is 0 then the output will
be 1 and if the input is 1 then the output will be 0. That means there are three states. One state is start state. The second
state is for taking 0's as input and produces output as 1. The third state is for taking 1's as input and producing output as 0.
Input         1    0    1    1
State    q0   q2   q1   q2   q2
Output   0    0    1    0    0
Thus we get 00100 as the output for the input 1011. Neglecting the initial 0 (the output of the start state), we get 0100, which is the 1's
complement of 1011. The transition table is as follows:
Thus Moore machine M = (Q, q0, ∑, O, δ, λ), where Q = {q0, q1, q2}, ∑ = {0, 1}, O = {0, 1}. The transition table shows
the δ and λ functions.
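The 1's complement machine can be simulated directly. In this sketch the δ and λ tables are reconstructed from the trace above (every state goes to q1 on 0 and to q2 on 1; q0 and q2 output 0, q1 outputs 1), so they are an assumption about the missing transition table, and the names are illustrative.

```python
# δ: every state moves to q1 on input 0 and to q2 on input 1
DELTA = {(q, "0"): "q1" for q in ("q0", "q1", "q2")}
DELTA.update({(q, "1"): "q2" for q in ("q0", "q1", "q2")})
# λ: the output depends only on the state
LAMBDA = {"q0": "0", "q1": "1", "q2": "0"}

def run_moore(w, start="q0"):
    """Return the output string, which is one symbol longer than the input."""
    state = start
    output = [LAMBDA[state]]        # the start state emits before any input is read
    for symbol in w:
        state = DELTA[(state, symbol)]
        output.append(LAMBDA[state])
    return "".join(output)

out = run_moore("1011")
print(out)       # full Moore output, including the start-state symbol
print(out[1:])   # drop the first symbol to get the 1's complement
```

Dropping the first output symbol is what makes the result line up with the input bit for bit.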
Example 3:
Design a Moore machine for a binary input sequence such that if it has a substring 101, the machine output A, if the input
has substring 110, it outputs B otherwise it outputs C.
Solution: For designing such a machine, we will check two conditions, and those are 101 and 110. If we get 101, the output
will be A, and if we recognize 110, the output will be B. For other strings, the output will be C.
Now we will insert the possibilities of 0's and 1's for each state. Thus the Moore machine becomes:
Mealy Machine
A Mealy machine is a machine in which output symbol depends upon the present input symbol and present state of the
machine. In the Mealy machine, the output is represented with each input symbol for each state separated by /. The Mealy
machine can be described by 6 tuples (Q, q0, ∑, O, δ, λ') where
Solution: For designing such a machine, we will check two conditions, and those are 101 and 110. If we get 101, the output
will be A. If we recognize 110, the output will be B. For other strings the output will be C.
Now we will insert the possibilities of 0's and 1's for each state. Thus the Mealy machine becomes:
Example 2:
Design a Mealy machine that scans a sequence of input symbols 0 and 1 and generates output 'A' if the input string terminates in 00,
output 'B' if the string terminates in 11, and output 'C' otherwise.
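One possible design for this Mealy machine uses three states: q0 (start), q1 (the last symbol read was 0) and q2 (the last symbol read was 1). The sketch below is only one such design, not necessarily the one intended in the original figure, and the names are illustrative. Each table entry gives the next state and the output produced on that transition.

```python
# (state, input) -> (next state, output)
MEALY = {
    ("q0", "0"): ("q1", "C"), ("q0", "1"): ("q2", "C"),
    ("q1", "0"): ("q1", "A"), ("q1", "1"): ("q2", "C"),   # 0 after 0 -> ends in 00
    ("q2", "0"): ("q1", "C"), ("q2", "1"): ("q2", "B"),   # 1 after 1 -> ends in 11
}

def run_mealy(w, start="q0"):
    """Return one output symbol per input symbol."""
    state, output = start, []
    for symbol in w:
        state, out = MEALY[(state, symbol)]
        output.append(out)
    return "".join(output)

print(run_mealy("0011"))
print(run_mealy("100"))
```

The last output symbol tells whether the input read so far ends in 00 (A), in 11 (B) or in neither (C). Unlike a Moore machine, the output has the same length as the input.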
UNIT -2
o The language accepted by a finite automaton can be described by simple expressions called Regular
Expressions. They are a compact way to represent regular languages.
o The languages described by regular expressions are referred to as Regular languages.
o A regular expression can also be described as a pattern that defines a set of strings.
Union: If L and M are two regular languages then their union L U M is also a regular language.
1. L U M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages then their intersection is also a regular language.
1. L ⋂ M = {s | s is in L and s is in M}
Kleene closure: If L is a regular language then its Kleene closure L* will also be a regular language.
Example 1:
Write the regular expression for the language accepting all combinations of a's, over the set ∑ = {a}
Solution:
All combinations of a's means a may be zero, single, double and so on. If a is appearing zero times, that means a null string.
That is we expect the set of {ε, a, aa, aaa, ....}. So we give a regular expression for this as:
1. R = a*
That is the Kleene closure of a.
Example 2:
Write the regular expression for the language accepting all combinations of a's except the null string, over the set ∑ = {a}
Solution:
This set indicates that there is no null string. So we can denote regular expression as:
R = a+
Example 3:
Write the regular expression for the language accepting all the string containing any number of a's and b's.
1. r.e. = (a + b)*
This will give the set as L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, any combination of a and b.
The (a + b)* shows any combination with a and b even a null string.
Example 1:
Write the regular expression for the language accepting all the string which are starting with 1 and ending with 0, over ∑ =
{0, 1}.
Solution:
In a regular expression, the first symbol should be 1, and the last symbol should be 0. The r.e. is as follows:
1. R = 1 (0+1)* 0
Example 2:
Write the regular expression for the language starting and ending with a and having any combination of b's in
between.
Solution:
1. R = a b* a
Example 3:
Write the regular expression for the language starting with a but not having consecutive b's.
1. R = (a + ab)(a + ab)*
Example 4:
Write the regular expression for the language accepting all the string in which any number of a's is followed by any number
of b's is followed by any number of c's.
Solution: As we know, any number of a's means a* any number of b's means b*, any number of c's means c*. Since as
given in problem statement, b's appear after a's and c's appear after b's. So the regular expression could be:
1. R = a* b* c*
Example 5:
Write the regular expression for the language over ∑ = {0} having even length of the string.
Solution:
1. R = (00)*
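Regular expressions like the ones in these examples can be checked with Python's standard `re` module. One caveat: in the textbook notation used here, `+` between two expressions means union, which Python writes as `|` (in Python, a postfix `+` means "one or more"). The sketch below uses `re.fullmatch` so that the whole string must match; the helper name `matches` is illustrative.

```python
import re

def matches(pattern, w):
    """True if the entire string w is described by the pattern."""
    return re.fullmatch(pattern, w) is not None

# Textbook r.e.        Python pattern
# 1 (0+1)* 0      ->   1(0|1)*0
# a* b* c*        ->   a*b*c*
# (00)*           ->   (00)*
print(matches(r"1(0|1)*0", "10110"))   # starts with 1 and ends with 0
print(matches(r"1(0|1)*0", "0110"))    # does not start with 1
print(matches(r"a*b*c*", "aabccc"))    # a's, then b's, then c's
print(matches(r"a*b*c*", "ba"))        # an a after a b is not allowed
print(matches(r"(00)*", "0000"))       # even length over {0}
print(matches(r"(00)*", "000"))        # odd length
```

Using `re.match` or `re.search` instead would accept strings that merely contain a matching part, which is not the same as membership in the language.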
Example 6:
Write the regular expression for the language of strings that have at least one 0 and at least one 1.
Solution:
The language can be predicted from the regular expression by finding the meaning of it. We will first split the regular
expression as:
L = {The language consists of the strings in which a's appear in triples; there is no restriction on the number of b's}
Example 8:
Write the regular expression for the language L over ∑ = {0, 1} such that all the strings do not contain the substring 01.
The language is as follows:
R = (1* 0*)
Example 9:
Write the regular expression for the language containing the strings over {0, 1} in which there are at least two
occurrences of 1's between any two occurrences of 0's.
At least two 1's between two occurrences of 0's can be denoted by (0111*0)*.
If there is no occurrence of 0's, then any number of 1's are also allowed. Hence the r.e. for the required
language is:
R = (1 + (0111*0))*
Example 10:
Write the regular expression for the language containing the strings in which every 0 is immediately followed by 11.
R = (011 + 1)*
∅.A = A.∅ = ∅
Let us see the DeMorgan-type law for regular expressions.
(L + B)* = (L*B*)*
The conversion of a finite automaton to a regular expression is done through the state elimination method.
The State elimination method follows the following general set of rules:
1. Add a new initial state (I). Make a null transition from the new initial state to the old initial state.
2. Add a new final state (F). Make null transition(s) from the old final state(s) to the new final state.
3. Eliminate all states, except I and F, in the given finite automaton. Perform the elimination of states by checking
the in-degrees and out-degrees of states and taking a cross product.
After steps 1 and 2, state I will not have any inward transitions, and state F will not have any outward
transitions.
Example
Finite automaton
Add a new initial state, I. Make a null transition from state I to state q0.
Adding state I
Add a new final state, F. Make a null transition from state q3 to state F.
Step 3.1
Step 3.2
Eliminate state q3. Concatenate transitions from state q3 to state F as per the basic rules of writing regular
expressions.
Eliminating q3
Step 3.3
Eliminate q1. Check for the in-degree and out-degree of state q1. Write the regular expressions for the new
transitions acquired after removing state q1.
Step 3.4
Step 3.5
Theorem
Let L be a regular language. Then there exists a constant ‘c’ such that for every string w in L with
|w| ≥ c
we can break w into three strings, w = xyz, such that:
|y| > 0
|xy| ≤ c
For all k ≥ 0, the string xy^kz is also in L.
Applications of Pumping Lemma
Pumping Lemma is to be applied to show that certain languages are not regular. It should never be used to show a language
is regular.
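As an illustration of how the lemma is used, take the standard non-regular language L = {0^n 1^n | n ≥ 0}. For a candidate constant c, choose w = 0^c 1^c. The sketch below is a brute-force check for small c, not a proof, and its names are illustrative: it tries every split w = xyz with |xy| ≤ c and |y| > 0 and looks for one that survives pumping with k = 0 and k = 2.

```python
def in_lang(w):
    """Membership test for L = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def some_split_survives(w, c, ks=(0, 2)):
    """True if some split w = xyz (|xy| <= c, |y| > 0) keeps x y^k z in L for every tested k."""
    for i in range(c):                 # i = |x|
        for j in range(i + 1, c + 1):  # j = |xy|, so |y| = j - i > 0
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_lang(x + y * k + z) for k in ks):
                return True
    return False

for c in range(1, 6):
    w = "0" * c + "1" * c
    print(c, some_split_survives(w, c))
```

Since |xy| ≤ c forces y to consist only of 0's, pumping changes the number of 0's but not the number of 1's, so every split fails. The lemma demands that a split work for all k, so a failure at a single k is already enough to rule that split out.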
Problem
Solution −
Closure properties on regular languages are defined as certain operations on regular language which are guaranteed to
produce regular language. Closure refers to some operation on a language, resulting in a new language that is of same
“type” as originally operated on i.e., regular.
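1. Kleene Closure
If R and S are regular expressions whose languages are L and M, then RS is a regular expression whose language is LM
(the concatenation of L and M), and R* is a regular expression whose language is L*.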
2. Positive closure:
3. Complement:
The complement of a language L (with respect to an alphabet Σ such that Σ* contains L) is Σ* – L. Since Σ* is
surely regular, the complement of a regular language is always regular.
4. Reversal:
Proof: Let E be a regular expression for L. We show how to reverse E, to provide a regular expression for L^R.
5. Union:
Let L and M be the languages of regular expressions R and S, respectively.Then R+S is a regular expression whose
language is(L U M).
6. Intersection:
Let L and M be the languages of regular expressions R and S, respectively. Then there is a regular expression whose language is
L ∩ M.
Proof: Let A and B be DFA’s whose languages are L and M, respectively. Construct C, the product automaton of A and B, and
make the final states of C be the pairs consisting of final states of both A and B.
7. Set Difference operator:
If L and M are regular languages, then so is L – M = strings in L but not M.
Proof: Let A and B be DFA’s whose languages are L and M, respectively. Construct C, the product automaton of A and B, and
make the final states of C be the pairs where the A-state is final but the B-state is not.
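The product automaton used in the intersection and set-difference proofs can be built explicitly. The sketch below uses two hypothetical DFAs over {0, 1}: A accepts strings with an even number of 0's and B accepts strings ending in 1. All names are illustrative. The only difference between the two constructions is the rule that decides which pairs are final.

```python
def product(dfa_a, dfa_b, alphabet, is_final):
    """Build the product DFA; is_final(a_final, b_final) selects the final pairs."""
    (da, sa, fa), (db, sb, fb) = dfa_a, dfa_b
    states_a = {p for (p, _) in da}
    states_b = {q for (q, _) in db}
    delta = {((p, q), x): (da[(p, x)], db[(q, x)])
             for p in states_a for q in states_b for x in alphabet}
    finals = {(p, q) for p in states_a for q in states_b
              if is_final(p in fa, q in fb)}
    return delta, (sa, sb), finals

def accepts(dfa, w):
    delta, state, finals = dfa
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

# Hypothetical DFA A: even number of 0's (states "e" = even, "o" = odd)
A = ({("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"}, "e", {"e"})
# Hypothetical DFA B: string ends in 1 (states "n" = no, "y" = yes)
B = ({("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"}, "n", {"y"})

intersection = product(A, B, "01", lambda a, b: a and b)
difference = product(A, B, "01", lambda a, b: a and not b)
for w in ["001", "00", "01", ""]:
    print(repr(w), accepts(intersection, w), accepts(difference, w))
```

The same function with `lambda a, b: a or b` would give a DFA for the union.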
8. Homomorphism:
A homomorphism on an alphabet is a function that gives a string for each symbol in that alphabet. Example: h(0) = ab;
If L is a regular language, and h is a homomorphism on its alphabet, then h(L)= {h(w) | w is in L} is also a
regular language.
Proof: Let E be a regular expression for L. Apply h to each symbol in E. The language of the resulting regular expression is h(L).
1. Inverse Homomorphism : Let h be a homomorphism and L a language whose alphabet is the output
Note: There are a few more operations, such as the symmetric difference operator, the prefix operator and substitution, under which
regular languages are also closed.
Decision Properties:
Almost all the properties are decidable in the case of finite automata.
(i) Emptiness
(ii) Non-emptiness
(iii) Finiteness
(iv) Infiniteness
(v) Membership
(vi) Equality
These are explained as following below.
Step-2: select the state from which we cannot reach the final state & delete them (remove dead states).
Step-3: if the resulting machine contains loops or cycles then the finite automata accepts infinite language.
Step-4: if the resulting machine does not contain loops or cycles then the finite automaton accepts a finite language.
(iii) Membership:
Membership is a property to verify an arbitrary string is accepted by a finite automaton or not i.e. it is a member of the
language or not.
Let M be a finite automaton that accepts some strings over an alphabet, and let ‘w’ be any string defined over the
alphabet. If there exists a transition path in M labeled by ‘w’ that starts at the initial state and ends in any one of the final states, then
string ‘w’ is a member of L(M); otherwise ‘w’ is not a member of L(M).
(iv)Equality
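Two finite state automata M1 and M2 are said to be equal if and only if they accept the same language. To test this, minimise both
automata: the minimal DFA for a language is unique (up to renaming of states), so M1 and M2 are equal exactly when their minimal DFAs are the same.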
Minimization of DFA means reducing the number of states of the given FA. Thus, we get an FSM (finite state machine) without
redundant states after minimizing it.
We have to follow the various steps to minimize the DFA. These are as follows:
Step 1: Remove all the states that are unreachable from the initial state via any set of the transition of DFA.
Step 3: Now split the transition table into two tables T1 and T2. T1 contains all final states, and T2 contains non-final
states.
1. δ (q, a) = p
2. δ (r, a) = p
That means, find two states which have the same next state for every input symbol and remove one of them.
Step 5: Repeat step 3 until we find no similar rows available in the transition table T1.
Step 7: Now combine the reduced T1 and T2 tables. The combined transition table is the transition table of
minimized DFA.
Example:
Solution:
Step 1: In the given DFA, q2 and q4 are the unreachable states so remove them.
Step 2: Draw the transition table for the rest of the states.
State 0 1
→q0 q1 q3
q1 q0 q3
*q3 q5 q5
*q5 q5 q5
Step 3: Now divide rows of transition table into two sets as:
1. One set contains those rows, which start from non-final states:
State 0 1
q0 q1 q3
q1 q0 q3
2. Another set contains those rows, which starts from final states.
State 0 1
q3 q5 q5
q5 q5 q5
Step 5: In set 2, row 1 and row 2 are similar since q3 and q5 transit to the same state on 0 and 1. So skip q5 and then
replace q5 by q3 in the rest.
State 0 1
q3 q3 q3
State 0 1
→q0 q1 q3
q1 q0 q3
*q3 q3 q3
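A more general way to find equivalent states is partition refinement: start with the two blocks {non-final states} and {final states}, and keep splitting a block whenever two of its states go to different blocks on some input. The sketch below applies this to the transition table from Step 2 of this example (after removing the unreachable states); it is a standard alternative to the row-comparison steps above, not the procedure of these notes, and the names are illustrative.

```python
def minimize(states, alphabet, delta, finals):
    """Return the blocks of equivalent states of a complete DFA."""
    partition = [b for b in (set(states) - set(finals), set(finals)) if b]
    while True:
        def block_of(q):
            return next(i for i, b in enumerate(partition) if q in b)
        new_partition = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: which block each input symbol leads to
                sig = tuple(block_of(delta[(q, a)]) for a in alphabet)
                groups.setdefault(sig, set()).add(q)
            new_partition.extend(groups.values())
        if len(new_partition) == len(partition):   # nothing was split
            return new_partition
        partition = new_partition

DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q0", ("q1", "1"): "q3",
    ("q3", "0"): "q5", ("q3", "1"): "q5",
    ("q5", "0"): "q5", ("q5", "1"): "q5",
}
blocks = minimize(["q0", "q1", "q3", "q5"], ["0", "1"], DELTA, {"q3", "q5"})
print(sorted(sorted(b) for b in blocks))
```

Besides merging q3 and q5, this refinement also places q0 and q1 in one block, since both go to a non-final state on 0 and to a final state on 1. The row comparison above does not detect this because the two rows are not literally identical.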
Example
Let us use Algorithm 2 to minimize the DFA shown below.
     a    b    c    d    e    f
a
b
c    ✔    ✔
d    ✔    ✔
e    ✔    ✔
f              ✔    ✔    ✔
Step 3 − We will try to mark the state pairs, with green colored check mark, transitively. If we
input 1 to state ‘a’ and ‘f’, it will go to state ‘c’ and ‘f’ respectively. (c, f) is already marked, hence
we will mark pair (a, f). Now, we input 1 to state ‘b’ and ‘f’; it will go to state ‘d’ and ‘f’
respectively. (d, f) is already marked, hence we will mark pair (b, f).
     a    b    c    d    e    f
a
b
c    ✔    ✔
d    ✔    ✔
e    ✔    ✔
f    ✔    ✔    ✔    ✔    ✔
After step 3, we have got state combinations {a, b} {c, d} {c, e} {d, e} that are unmarked.
We can recombine {c, d} {c, e} {d, e} into {c, d, e}
Hence we got two combined states as − {a, b} and {c, d, e}
So the final minimized DFA will contain three states {f}, {a, b} and {c, d, e}
UNIT – 3
CFG stands for context-free grammar. It is a formal grammar which is used to generate all possible patterns of strings in
a given formal language. A context-free grammar G can be defined by a four-tuple as
G = (V, T, P, S)
Where,
G is the grammar, which consists of a set of the production rule. It is used to generate the string of a language.
P is a set of production rules, which are used for replacing non-terminal symbols (on the left side of the production) in a
string with other terminal or non-terminal symbols (on the right side of the production).
S is the start symbol which is used to derive the string. We can derive the string by repeatedly replacing a non-terminal by
the right-hand side of the production until all non-terminal have been replaced by terminal symbols.
Example 1:
Construct the CFG for the language having any number of a's over the set ∑= {a}.
Solution:
As we know the regular expression for the above language is
1. r.e. = a*
Production rule for the Regular expression is as follows:
1. S → aS rule 1
2. S → ε rule 2
Now if we want to derive a string "aaaaaa", we can start with start symbols.
1. S
2. aS
3. aaS rule 1
4. aaaS rule 1
5. aaaaS rule 1
6. aaaaaS rule 1
7. aaaaaaS rule 1
8. aaaaaaε rule 2
9. aaaaaa
The r.e. = a* can generate a set of string {ε, a, aa, aaa,.....}. We can have a null string because S is a start symbol and rule 2 gives S →
ε.
Example 2:
Construct a CFG for the regular expression (0+1)*
Solution:
The CFG can be given by,
Example 3:
Construct a CFG for the language L = {wcw^R | w ∈ {a, b}*}.
Solution:
The string that can be generated for a given language is {aacaa, bcb, abcba, bacab, abbcbba, ....}
1. S → aSa rule 1
2. S → bSb rule 2
3. S → c rule 3
Now if we want to derive a string "abbcbba", we can start with start symbols.
1. S → aSa
2. S → abSba from rule 2
3. S → abbSbba from rule 2
4. S → abbcbba from rule 3
Thus any of this kind of string can be derived from the given production rules.
Example 4:
Construct a CFG for the language L = {a^n b^2n | n ≥ 1}.
Solution:
The string that can be generated for a given language is {abb, aabbbb, aaabbbbbb....}.
The grammar could be:
1. S → aSbb | abb
Now if we want to derive a string "aabbbb", we can start with start symbols.
1. S → aSbb
2. S → aabbbb
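The strings generated by a small grammar can be enumerated by expanding sentential forms breadth-first. The sketch below does this for S → aSbb | abb; the function name `generate` is illustrative. It assumes every symbol is a single character and that the grammar cannot grow a sentential form without adding terminals (true here), because forms are pruned by their number of terminals.

```python
from collections import deque

def generate(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    results, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            results.add(form)          # only terminals left: a generated string
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in productions)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=lambda w: (len(w), w))

GRAMMAR = {"S": ["aSbb", "abb"]}
print(generate(GRAMMAR, "S", 9))
```

Every generated string has twice as many b's as a's, matching a^n b^2n with n ≥ 1.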
Derivation is a sequence of production rules. It is used to get the input string through these
production rules. During parsing, we have to take two decisions. These are as follows:
We have two options to decide which non-terminal is to be replaced with a production rule.
1. Leftmost Derivation:
In the leftmost derivation, the leftmost non-terminal of the current sentential form is replaced by a production rule at
each step. So in leftmost derivation, the string is built up from left to right.
Example:
Production rules:
1. E = E + E
2. E = E - E
3. E = a | b
Input
1. a - b + a
The leftmost derivation is:
1. E = E + E
2. E = E - E + E
3. E = a - E + E
4. E = a - b + E
5. E = a - b + a
2. Rightmost Derivation:
In rightmost derivation, the rightmost non-terminal of the current sentential form is replaced by a production rule at
each step. So in rightmost derivation, the string is built up from right to left.
Example
Production rules:
1. E = E + E
2. E = E - E
3. E = a | b
Input
1. a - b + a
The rightmost derivation is:
1. E = E - E
2. E = E - E + E
3. E = E - E + a
4. E = E - b + a
5. E = a - b + a
Whether we use the leftmost derivation or the rightmost derivation, we may get the same string. The
choice of derivation order does not affect the string obtained.
Examples of Derivation:
Example 1:
Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG given by,
1. S → AB | ε
2. A → aB
3. B → Sb
Rightmost derivation:
Example 2:
Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given
by,
1. S → aB | bA
2. A → a | aS | bAA
3. B → b | bS | aBB
Solution:
Leftmost derivation
1. S
2. aB S → aB
3. aaBB B → aBB
4. aabB B→b
5. aabbS B → bS
6. aabbaB S → aB
7. aabbabS B → bS
8. aabbabbA S → bA
9. aabbabba A→a
Rightmost derivation:
1. S
2. aB S → aB
3. aaBB B → aBB
4. aaBbS B → bS
5. aaBbbA S → bA
6. aaBbba A→a
7. aabSbba B → bS
8. aabbAbba S → bA
9. aabbabba A→a
Example 3:
Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by,
1. S → A1B
2. A → 0A | ε
3. B → 0B | 1B | ε
Leftmost derivation:
1. S
2. A1B
3. 0A1B
4. 00A1B
5. 001B
6. 0010B
7. 00101B
8. 00101
Rightmost derivation:
1. S
2. A1B
3. A10B
4. A101B
5. A101
6. 0A101
7. 00A101
8. 00101
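A leftmost derivation can be replayed mechanically: at each step the leftmost non-terminal is replaced by the chosen right-hand side. The sketch below does this for the grammar of Example 3 (S → A1B, A → 0A | ε, B → 0B | 1B | ε); the function name `leftmost_derivation` is illustrative, and ε is written as the empty string.

```python
GRAMMAR = {"S": ["A1B"], "A": ["0A", ""], "B": ["0B", "1B", ""]}

def leftmost_derivation(start, choices):
    """Apply each chosen right-hand side to the leftmost non-terminal; return all forms."""
    form, forms = start, [start]
    for rhs in choices:
        idx = next(i for i, s in enumerate(form) if s in GRAMMAR)
        if rhs not in GRAMMAR[form[idx]]:
            raise ValueError(f"{form[idx]} -> {rhs!r} is not a production")
        form = form[:idx] + rhs + form[idx + 1:]
        forms.append(form)
    return forms

# S => A1B => 0A1B => 00A1B => 001B => 0010B => 00101B => 00101
steps = ["A1B", "0A", "0A", "", "0B", "1B", ""]
for form in leftmost_derivation("S", steps):
    print(form)
```

A rightmost derivation would differ only in picking the last non-terminal instead of the first; both orders end in the same string 00101.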
Example-01:
V = { S }
T = { a , b }
P = { S → aSbS , S → bSaS , S → ∈ }
S = { S }
This grammar generates the strings having equal number of a’s and b’s.
Example-02:
Consider a grammar G = (V , T , P , S) where-
V = { S , A, B , C }
T = { a , b , c }
P = { S → ABC , A → a , B → b , C → c }
S = { S }
L(G) = { abc }
Sentential Forms
Parse tree
o A parse tree is the graphical representation of a derivation. Each node is labelled with a symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived from the start symbol. The root of the parse tree is that start symbol.
o The interior nodes are non-terminals, and the leaves, read from left to right, give the derived string.
o A parse tree reflects the precedence of operators. The deepest sub-tree is evaluated first, so the operator in a parent node has lower precedence than the operator in its sub-tree.
Example1 :
Production rules:
1. S = S + S | S * S
2. S = a|b|c
Input:
a * b + c
[Figures: Steps 1 to 5 build the parse tree for a * b + c one production at a time.]
Example 2 :
S -> sAB
A -> a
B -> b
The input string is “sab”, then the Parse Tree is:
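A parse tree is a nested structure, so it can be written down directly as data. The sketch below encodes the tree for "sab" as (symbol, children) pairs; this encoding and the helper names leaves and render are assumptions made for illustration only.

```python
# Each node is (symbol, children); a leaf has an empty list of children.
tree = ("S", [
    ("s", []),
    ("A", [("a", [])]),
    ("B", [("b", [])]),
])


def leaves(node):
    """Read the leaves from left to right: this is the derived string."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(leaves(child) for child in children)


def render(node, depth=0):
    """Return the tree as indented lines, with the root first."""
    symbol, children = node
    lines = ["  " * depth + symbol]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(tree)))
print("yield:", leaves(tree))
```

The root is the start symbol S, and reading the leaves from left to right gives back the input string.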
Parsers
Yacc Parsers – Generator
Markup Languages
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree for some input string. If the grammar is not ambiguous, then it is called unambiguous.
If the grammar has ambiguity, then it is not good for compiler construction. No general method can automatically detect and remove ambiguity, but in many cases we can remove it by re-writing the whole grammar without ambiguity.
Example 1:
Let us consider a grammar G with the production rule
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → ε | 0 | 1 | 2 | ... | 9
Solution:
For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost
derivation:
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.
Example 2:
Check whether the given grammar G is ambiguous or not.
1. E → E + E
2. E → E - E
3. E → id
Solution:
From the above grammar, the string "id + id - id" can be derived in 2 ways:
First leftmost derivation
1. E ⇒ E + E
2. ⇒ id + E
3. ⇒ id + E - E
4. ⇒ id + id - E
5. ⇒ id + id - id
Second leftmost derivation
1. E ⇒ E - E
2. ⇒ E + E - E
3. ⇒ id + E - E
4. ⇒ id + id - E
5. ⇒ id + id - id
Since there are two leftmost derivations for the single string "id + id - id", the grammar G is ambiguous.
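For this particular grammar, ambiguity of a given string can be checked by counting its parse trees: a string with more than one tree is a witness. The sketch below is specific to E → E + E | E - E | id and takes a pre-split token list; the function name count_parse_trees is an illustrative assumption. It does not decide ambiguity of grammars in general.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of a token list under E -> E + E | E - E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator as the root of the tree for this span.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "-", "id"]))
```

Each parse tree corresponds to exactly one leftmost derivation, so a count above 1 matches the two derivations listed above.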
Example 3:
Check whether the given grammar G is ambiguous or not.
1. S → aSb | SS
2. S → ε
Solution:
For the string "aabb" the above grammar can generate two parse trees
Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.
Example 4:
Check whether the given grammar G is ambiguous or not.
1. A → AA
2. A → (A)
3. A → a
Solution:
For the string "a(a)aa" the above grammar can generate two parse trees:
Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.
Unambiguous Grammar
A grammar is unambiguous if it does not contain ambiguity, that is, if no input string has more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree.
To convert an ambiguous grammar to an unambiguous grammar, we apply the following rules:
1. If left associative operators (+, -, *, /) are used in the production rule, then apply left recursion in the production rule. Left recursion means that the leftmost symbol on the right side is the same as the non-terminal on the left side. For example,
1. X → Xa
2. If a right associative operator (^) is used in the production rule, then apply right recursion in the production rule. Right recursion means that the rightmost symbol on the right side is the same as the non-terminal on the left side. For example,
1. X → aX
Example 1:
Consider a grammar G is given as follows:
1. S → AB | aaB
2. A → a | Aa
3. B → b
Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an unambiguous grammar equivalent
to G.
Solution:
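Let us derive the string "aab". It has two parse trees: one starts with S → AB and derives aa from A (A ⇒ Aa ⇒ aa), the other starts with S → aaB.
As there are two different parse trees for deriving the same string, the given grammar is ambiguous.
Unambiguous grammar will be: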
1. S → AB
2. A → Aa | a
3. B → b
Example 2:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. S → ABA
2. A → aA | ε
3. B → bB | ε
Solution:
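The given grammar is ambiguous because we can derive two different parse trees for the string aa: with B → ε, both a's can come from the first A or both from the second A.
The grammar generates the language a*b*a*. An equivalent unambiguous grammar is:
1. S → aS | bT | ε
2. T → bT | U
3. U → aU | ε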
Example 3:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. E → E + E
2. E → E * E
3. E → id
Solution:
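Let us derive the string "id + id * id". One parse tree starts with E → E + E and expands the right E with E → E * E; the other starts with E → E * E and expands the left E with E → E + E.
As there are two different parse trees for deriving the same string, the given grammar is ambiguous.
Unambiguous grammar will be: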
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → id
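The effect of the layered grammar can be seen in a small parser. The following is a sketch, not a general parsing method: it is hand-written for E → E + T | T, T → T * F | F, F → id, and it replaces each left-recursive rule by a loop, which builds the same left-associative trees. The nested-tuple tree format and the name parse are assumptions for illustration.

```python
def parse(tokens):
    """Parse a token list and return a nested tuple (operator, left, right)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():                      # F -> id
        nonlocal pos
        if peek() != "id":
            raise SyntaxError(f"expected id at position {pos}")
        pos += 1
        return "id"

    def term():                        # T -> T * F | F, written as a loop
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():                        # E -> E + T | T, written as a loop
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


print(parse("id + id * id".split()))
```

Because * is only introduced below T, it always ends up deeper in the tree than +, so each string has a single parse tree.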
Example 4:
Check that the given grammar is ambiguous or not. Also, find an equivalent unambiguous
grammar.
1. S → S + S
2. S → S * S
3. S → S ^ S
4. S → a
Solution:
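The given grammar is ambiguous because a string such as a + a * a has more than one parse tree. Using left recursion for the left associative operators + and *, right recursion for the right associative operator ^, and one non-terminal per precedence level, the unambiguous grammar will be:
1. S → S + A | A
2. A → A * B | B
3. B → C ^ B | C
4. C → a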
Pushdown Automata(PDA)
o A pushdown automaton is a way to implement a CFG in the same way we design a DFA for a regular grammar. A DFA can remember only a finite amount of information, but a PDA can remember an unbounded amount of information.
o A pushdown automaton is simply an NFA augmented with an "external stack memory". The stack gives the pushdown automaton a last-in-first-out memory. A pushdown automaton can store an unbounded amount of information on the stack, but it can access only a limited amount of it: a PDA can push an element onto the top of the stack and pop an element off the top of the stack. To read an element deeper in the stack, the elements above it must be popped off and are lost.
o A PDA is more powerful than an FA. Any language which is accepted by an FA can also be accepted by a PDA. A PDA also accepts a class of languages which cannot be accepted by any FA. Thus the PDA is strictly more powerful than the FA.
PDA Components:
Input tape: The input tape is divided into many cells, each holding one symbol. The input head is read-only and may only move from left to right, one symbol at a time.
Finite control: The finite control holds the current state and a pointer to the current symbol which is to be read.
Stack: The stack is a structure in which we can push and remove items from one end only. It has an unbounded size. In a PDA, the stack is used to store items temporarily.
Formal definition of PDA: a PDA is a collection of 7 components (Q, ∑, Γ, δ, q0, Z, F):
Q: the finite set of states
∑: the input alphabet
Γ: the set of stack symbols which can be pushed onto and popped from the stack
q0: the initial state
Z: the start stack symbol, which is in Γ
F: the set of final states
δ: the mapping (transition) function, which is used for moving from the current state to the next state.
Turnstile Notation:
The ⊢ sign is the turnstile notation and represents one move. The ⊢* sign represents a sequence of moves.
For example,
(p, bw, Tβ) ⊢ (q, w, αβ)
In the above example, while taking a transition from state p to q, the input symbol 'b' is consumed, and the top of the stack 'T' is replaced by a new string α. Here w is the remaining input and β is the rest of the stack.
Example 1:
Design a PDA for accepting the language {a^n b^2n | n >= 1}.
Solution: In this language, n a's are followed by 2n b's. Hence we apply a very simple logic: for every single 'a' we read, we push two a's onto the stack. Then, for every single 'b' we read, exactly one 'a' is popped from the stack.
The transitions can be constructed as follows:
δ(q0, a, Z) = (q0, aaZ)
δ(q0, a, a) = (q0, aaa)
When we read the first 'b', we change the state from q0 to q1 and start popping the corresponding a's:
δ(q0, b, a) = (q1, ε)
This popping of one 'a' per 'b' is repeated until all the symbols are read. Note that the popping action continues in state q1 only.
δ(q1, b, a) = (q1, ε)
After reading all b's, all the corresponding a's should have been popped. Hence, when the input is exhausted (ε as input symbol), only Z should remain on the stack. The move will be:
δ(q1, ε, Z) = (q2, ε)
Where
PDA = ({q0, q1, q2}, {a, b}, {a, Z}, δ, q0, Z, {q2})
We can summarize the ID for the input aabbbb as:
(q0, aabbbb, Z) ⊢ (q0, abbbb, aaZ) ⊢ (q0, bbbb, aaaaZ) ⊢ (q1, bbb, aaaZ) ⊢ (q1, bb, aaZ) ⊢ (q1, b, aZ) ⊢ (q1, ε, Z) ⊢ (q2, ε, ε): ACCEPT
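The machine can also be traced in code. This is a minimal sketch of a deterministic simulation, assuming the two push moves δ(q0, a, Z) = (q0, aaZ) and δ(q0, a, a) = (q0, aaa) that follow from the "push two a's" rule in the solution; the table name DELTA and the function name accepts are illustrative.

```python
# Transition table: (state, input symbol, stack top) -> (next state, pushed string).
# The empty string stands for epsilon. The two push moves are an assumption
# derived from the rule "push two a's for every a read".
DELTA = {
    ("q0", "a", "Z"): ("q0", "aaZ"),
    ("q0", "a", "a"): ("q0", "aaa"),
    ("q0", "b", "a"): ("q1", ""),
    ("q1", "b", "a"): ("q1", ""),
    ("q1", "", "Z"): ("q2", ""),
}


def accepts(word):
    """Accept by final state q2. The stack top is the first character."""
    state, stack = "q0", "Z"
    for ch in word:
        key = (state, ch, stack[0]) if stack else None
        if key not in DELTA:
            return False               # no move defined: the PDA is stuck
        state, pushed = DELTA[key]
        stack = pushed + stack[1:]     # replace the top by the pushed string
    # After the input is consumed, take the epsilon move if one applies.
    if stack and (state, "", stack[0]) in DELTA:
        state, pushed = DELTA[(state, "", stack[0])]
        stack = pushed + stack[1:]
    return state == "q2"


for w in ["abb", "aabbbb", "ab", "abbb", ""]:
    print(repr(w), accepts(w))
```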
Example 2:
Design a PDA for accepting the language {0^n 1^m 0^n | m, n >= 1}.
Solution: In this PDA, n 0's are followed by any number of 1's, followed by n 0's. Hence the logic for the design of such a PDA will be as follows:
Push all 0's onto the stack on encountering the first block of 0's. Then, on reading a 1, do nothing. Then, on each read of a 0 in the second block, pop one 0 from the stack.
For instance:
3. The languages that are accepted by final state by some PDA.
[Figure: the three equivalent formalisms: Grammar, PDA by empty stack, PDA by final state.]
2. Acceptance by Empty Stack: On reading the input string from the initial configuration for some PDA, the stack of the PDA becomes empty.
Let P = (Q, ∑, Γ, δ, q0, Z, F) be a PDA. The language accepted by empty stack can be defined as:
N(PDA) = {w | (q0, w, Z) ⊢* (p, ε, ε), p ∈ Q}
Example:
Construct a PDA that accepts, by empty stack, the language L over {0, 1} of all strings of 0's and 1's in which the number of 0's is twice the number of 1's.
Solution:
There are two parts for designing this PDA:
We are going to design the first part i.e. 1 comes before 0's. The logic is that read single 1 and
push two 1's onto the stack. Thereafter on reading two 0's, POP two 1's from the stack. The δ can
be
So, for a deterministic PDA, there is at most one transition possible in any combination of state,
input symbol and stack top.
M = (Σ,Γ,Q,δ,q)
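Step 3: The initial symbol of the CFG will be the initial stack symbol in the PDA.
Step 4: For each non-terminal symbol, add the following rule:
δ(q, ε, A) = (q, α)
Where the production rule is A → α
Step 5: For each terminal symbol, add the following rule:
δ(q, a, a) = (q, ε) for every terminal symbol a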
Example 1:
Convert the following grammar to a PDA that accepts the same language.
1. S → 0S1 | A
2. A → 1A0 | S | ε
Solution:
The CFG can be first simplified by eliminating unit productions:
1. S → 0S1 | 1S0 | ε
Now we will convert this CFG to GNF:
1. S → 0SX | 1SY | ε
2. X → 1
3. Y → 0
The PDA can be:
R1: δ(q, ε, S) = {(q, 0SX), (q, 1SY), (q, ε)}
R2: δ(q, ε, X) = {(q, 1)}
R3: δ(q, ε, Y) = {(q, 0)}
R4: δ(q, 0, 0) = {(q, ε)}
R5: δ(q, 1, 1) = {(q, ε)}
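Rules R1 to R5 describe a one-state PDA that accepts by empty stack, and its nondeterminism can be simulated by a search over configurations. The sketch below is one possible simulation under stated assumptions: the dictionary RULES holds R1 to R3, the terminal-matching branch plays the role of R4 and R5, and the names RULES and accepts are illustrative.

```python
# R1-R3: expanding a non-terminal on top of the stack (the empty string is epsilon).
RULES = {"S": ["0SX", "1SY", ""], "X": ["1"], "Y": ["0"]}


def accepts(word):
    """Search all configurations (input position, stack) of the one-state PDA.
    The stack top is the first character; acceptance is by empty stack."""
    frontier = [(0, "S")]
    seen = set()
    while frontier:
        pos, stack = frontier.pop()
        if (pos, stack) in seen:
            continue
        seen.add((pos, stack))
        if not stack:
            if pos == len(word):
                return True            # input consumed and stack empty
            continue
        top, rest = stack[0], stack[1:]
        if top in RULES:
            # Epsilon move: replace the non-terminal by each rule body.
            for body in RULES[top]:
                frontier.append((pos, body + rest))
        elif pos < len(word) and word[pos] == top:
            # R4/R5: a terminal on the stack matches the next input symbol.
            frontier.append((pos + 1, rest))
    return False


for w in ["", "01", "0011", "0101", "0110", "1"]:
    print(repr(w), accepts(w))
```

The search terminates because every non-empty rule body starts with a terminal, so the stack can only grow after an input symbol has been matched.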
Example 2:
Construct a PDA for the given CFG, and test whether 010^4 (that is, 010000) is acceptable by this PDA.
1. S → 0BB
2. B → 0S | 1S | 0
Solution:
The PDA can be given as:
Example 3:
Draw a PDA for the CFG given below:
1. S → aSb
2. S → a | b | ε
Solution:
7. u
8. ⊢ δ(q, bb, bb) R4
9. ⊢ δ(q, b, b) R4
10. ⊢ δ(q, ε, z0) R5
11. ⊢ δ(q, ε)
12. ACCEPT
Unit- 4
UNIT – V
Types of Turing machine: Turing machines and halting
Undecidability: Undecidability, A Language that is Not Recursively Enumerable, An Undecidable Problem That is RE, Undecidable Problems about Turing Machines, Recursive languages, Properties of recursive languages, Post's Correspondence Problem, Modified Post Correspondence problem, Other Undecidable Problems, Counter machines.
Undecidability
Example
The halting problem of Turing machine
The mortality problem
The mortal matrix problem
The Post correspondence problem, etc.
Undecidable Problems –
The problems for which we can’t construct an algorithm that answers the problem correctly in finite time are termed Undecidable Problems. These problems may be partially decidable, but they will never be decidable. That is, there will always be some input that leads the Turing Machine into an infinite loop without providing an answer at all.
Examples
Ambiguity of context-free grammars: Given a context-free grammar, there is no Turing machine which will always halt in a finite amount of time and answer whether the grammar is ambiguous or not.
Equivalence of two context-free languages: Given two context-free languages, there is no Turing machine which will always halt in a finite amount of time and answer whether the two context-free languages are equal or not.
Everything or completeness of CFG: Given a CFG and an input alphabet, whether the CFG will generate all possible strings over the input alphabet (∑*) is undecidable.
Regularity of CFL, CSL, REC and RE: Given a CFL, CSL, REC or RE language, determining whether this language is regular is undecidable.
Note: Two popular undecidable problems are halting problem of TM
and PCP (Post Correspondence Problem).
Undecidable Problem about Turing Machine
In this section, we will discuss the undecidable problems regarding Turing machines. Reduction is used to prove whether a given language is decidable or not. We will understand the concept of reduction first and then see an important theorem in this regard.
Reduction
Reduction is a technique in which, if a problem P1 is reduced to a problem P2, then any solution of P2 solves P1. In general, if we have an algorithm that converts an instance of a problem P1 to an instance of a problem P2 that has the same answer, then we say that P1 reduces to P2. Hence, if P1 is not recursive then P2 is also not recursive. Similarly, if P1 is not recursively enumerable then P2 is also not recursively enumerable.
Proof:
'yes' if the input is in P2 but may or may not halt for an input which is not in P2. As we know, one can convert an instance w of P1 into an instance x of P2. Then apply the TM to check whether x is in P2. If x is accepted, that also means w is accepted. This procedure describes a TM whose language is P1: if w is in P1 then x is also in P2, and if w is not in P1 then x is also not in P2. So P1 would be RE, which contradicts the assumption. This proves that if P1 is non-RE then P2 is also non-RE.
L1 = {a^n b^n c^n | n >= 0}
L2 = {d^m e^m f^m | m >= 0}
L3 = L1.L2
= {a^n b^n c^n d^m e^m f^m | m >= 0 and n >= 0} is also recursive.
L1 says n a’s followed by n b’s followed by n c’s. L2 says m d’s followed by m e’s followed by m f’s. Their concatenation first matches the numbers of a’s, b’s and c’s and then matches the numbers of d’s, e’s and f’s. So it can be decided by a TM.
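The closure argument can be illustrated with ordinary programs standing in for the deciding TMs. The sketch below assumes two always-halting membership tests, in_l1 and in_l2, and decides the concatenation by trying every split point; all function names are illustrative.

```python
def equal_blocks(word, letters):
    """Decide whether word = x^n y^n z^n for the three given letters."""
    n, rem = divmod(len(word), 3)
    x, y, z = letters
    return rem == 0 and word == x * n + y * n + z * n


def in_l1(word):
    return equal_blocks(word, "abc")   # L1 = {a^n b^n c^n | n >= 0}


def in_l2(word):
    return equal_blocks(word, "def")   # L2 = {d^m e^m f^m | m >= 0}


def in_concatenation(word, decide_first, decide_second):
    """Decide L1.L2: some split word = u v with u in L1 and v in L2.
    There are only len(word) + 1 splits, so this always halts."""
    return any(decide_first(word[:i]) and decide_second(word[i:])
               for i in range(len(word) + 1))


for w in ["aabbccdef", "abc", "def", "", "abdef", "abcdeff"]:
    print(repr(w), in_concatenation(w, in_l1, in_l2))
```

Since both membership tests halt on every input and the number of splits is finite, the combined test also halts on every input, which is the reason the concatenation is recursive.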
Kleene Closure: If L1 is recursive, its Kleene closure L1* will also be recursive. For example:
L1 = {a^n b^n c^n | n >= 0}
L1* = {a^n b^n c^n | n >= 0}* is also recursive.
Intersection and complement: If L1 and L2 are two recursive languages, their intersection L1 ∩ L2 will also be recursive. For example:
L1 = {a^n b^n c^n d^m | n >= 0 and m >= 0}
L2 = {a^m b^n c^n d^n | n >= 0 and m >= 0}
L3 = L1 ∩ L2
= {a^n b^n c^n d^n | n >= 0} will be recursive.
The Post Correspondence Problem (PCP), introduced by Emil Post in 1946, is an undecidable decision problem. The PCP problem over an alphabet ∑ is stated as follows −
Given two lists M = (x1, x2, ..., xn) and N = (y1, y2, ..., yn) of non-empty strings over ∑, we say that there is a Post Correspondence Solution if, for some i1, i2, ..., ik, where 1 ≤ ij ≤ n, the condition x_i1 ... x_ik = y_i1 ... y_ik is satisfied.
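Example 1
Find whether the lists M = (abb, aa, aaa) and N = (bba, aaa, aa) have a Post Correspondence Solution?
Solution
i      1     2     3
M (x)  abb   aa    aaa
N (y)  bba   aaa   aa
Here,
x2x1x3 = ‘aaabbaaa’
and y2y1y3 = ‘aaabbaaa’
We can see that x2x1x3 = y2y1y3
Hence, the solution is i1 = 2, i2 = 1, i3 = 3.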
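Example 2
Find whether the lists M = (ab, bab, bbaaa) and N = (a, ba, bab) have a Post Correspondence Solution?
Solution
i      1     2     3
M (x)  ab    bab   bbaaa
N (y)  a     ba    bab
Here, every string in M is longer than the string at the same position in N (|ab| > |a|, |bab| > |ba|, |bbaaa| > |bab|). So for any index sequence, the concatenation of the x's is strictly longer than the concatenation of the y's. Hence this instance has no Post Correspondence Solution.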
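Because PCP is undecidable, no program can settle every instance, but a bounded search can look for short solutions. The sketch below tries all index sequences up to an assumed length limit max_len; the function names pcp_solution and is_solution are illustrative, and a result of None only means that no solution exists within the limit.

```python
from itertools import product


def is_solution(xs, ys, indices):
    """Check a 1-based index sequence against the two lists."""
    top = "".join(xs[i - 1] for i in indices)
    bottom = "".join(ys[i - 1] for i in indices)
    return len(indices) > 0 and top == bottom


def pcp_solution(xs, ys, max_len):
    """Return the first index sequence of length <= max_len that is a
    solution, or None if there is none within that bound."""
    for k in range(1, max_len + 1):
        for seq in product(range(1, len(xs) + 1), repeat=k):
            if is_solution(xs, ys, seq):
                return list(seq)
    return None


M1, N1 = ["abb", "aa", "aaa"], ["bba", "aaa", "aa"]
M2, N2 = ["ab", "bab", "bbaaa"], ["a", "ba", "bab"]

print(is_solution(M1, N1, [2, 1, 3]))   # the sequence used in Example 1
print(pcp_solution(M1, N1, 4))
print(pcp_solution(M2, N2, 5))
```

For Example 2 the bounded search finds nothing, which agrees with the length argument: every x is longer than the matching y.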
Chomsky Hierarchy
Chomsky Hierarchy represents the classes of languages that are accepted by different machines. The categories of language in Chomsky's Hierarchy are as given below:
For example:
1. bAa → aa
2. S → s
Type 1 Grammar:
Type 1 grammar is known as Context Sensitive Grammar. The context sensitive grammar is
used to represent context sensitive language. The context sensitive grammar follows the
following rules:
o The context sensitive grammar may have more than one symbol on the left hand side
of their production rules.
o The number of symbols on the left-hand side must not exceed the number of symbols
on the right-hand side.
o The rule of the form A → ε is not allowed unless A is a start symbol. It does not occur
on the right-hand side of any rule.
o The Type 1 grammar should also be Type 0. In Type 1, a production is of the form α → β, where α contains at least one non-terminal and |α| ≤ |β|.
For example:
1. S → AT
2. T → xy
3. A → a
Type 2 Grammar:
Type 2 Grammar is known as Context Free Grammar. Context free languages are the languages which can be represented by a context free grammar (CFG). Type 2 should also be Type 1. The production rule is of the form
1. A → α
Where A is any single non-terminal and α is any combination of terminals and non-terminals.
For example:
1. A → aBb
2. A → b
3. B → a
Type 3 Grammar:
Type 3 Grammar is known as Regular Grammar. Regular languages are those languages which can be described using regular expressions. These languages can be modeled by an NFA or a DFA.
Type 3 is the most restricted form of grammar. A Type 3 grammar should also be Type 2 and Type 1. Type 3 productions should be in the form of
1. V → T*V / T*
where V is a single non-terminal and T* is a string of zero or more terminals.
For example:
1. A → xy