Lecture21 PDF
V. Adamchik, CS 15-251, Carnegie Mellon University
Lecture 21: Finite Automata
Outline
DFAs
Regular Languages
0^n 1^n is not regular
Union Theorem
Kleene’s Theorem
NFAs
Application: KMP
The machine processes a string and accepts
it if the process ends in a double circle
The unique string of length 0 will be denoted
by ε and will be called the empty or null string
The Language L(M) of Machine M
[Figure: two example DFAs over {0, 1}, one drawn with states q0 and q1 and one with the single state q0]
What language does this DFA decide/accept?
L(M) = All strings of 0s and 1s
EXAMPLE
An automaton that accepts all and only those strings that contain 001
[Figure: DFA whose states track the matched prefixes {0}, {00}, {001}]

Determine the language recognized by
[Figure: DFA diagram]
L(M) = {1, 11, 111, …}
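The "contains 001" automaton can be simulated directly. The following is a minimal sketch, assuming hypothetical state names (`start`, `0`, `00`, `001`) that record the longest prefix of 001 seen so far; the slide's own diagram did not survive extraction, so the transition table is a reconstruction rather than a copy.

```python
# Hypothetical state names: each state is the longest prefix of "001"
# that is a suffix of the input read so far ("start" = none).
DELTA = {
    ("start", "0"): "0",   ("start", "1"): "start",
    ("0", "0"): "00",      ("0", "1"): "start",
    ("00", "0"): "00",     ("00", "1"): "001",
    ("001", "0"): "001",   ("001", "1"): "001",   # accepting state is absorbing
}
ACCEPT = {"001"}  # the "double circle" state


def accepts(w: str) -> bool:
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = "start"
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPT
```

Calling `accepts(w)` answers the membership question for a single word: the machine reads each symbol once and never backs up.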
Membership problem
Determine whether some word belongs to the language.

Determine the language decided by
[Figure: DFA diagram with edges labeled 0, 1, and 0,1]
L(M) = {1, 01}
Theorem: L = {0^n 1^n : n ∈ ℕ} is not regular
Wrong Intuition:
For a DFA to decide L, it seems like it needs to “remember” how many 0’s it sees at the beginning of the string, so that it can “check” there are equally many 1’s. But a DFA has only finitely many states, so it shouldn’t be able to handle arbitrary n.

L = strings where the number of occurrences of 01 is equal to the number of occurrences of 10
[Figure: DFA M]
M accepts only the strings with an equal number of 01’s and 10’s!
For example, 010110

Argue (usually by Pigeonhole) there are two strings x and y which reach the same state in M.
Show there is a string z such that xz ∈ L but yz ∉ L.
Contradiction, since M accepts either both or neither.

Let r_i be the state M reaches after processing 0^i.
By Pigeonhole, there is a repeat among r_0, r_1, r_2, …, r_k (where M has k states). So say that r_s = r_t for some s ≠ t.
Since 0^s 1^s ∈ L, starting from r_s and processing 1^s causes M to reach an accepting state.
Since r_t = r_s, M also accepts 0^t 1^s, which is not in L because s ≠ t. Contradiction.
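The 01-versus-10 language shows why the "it has to count" intuition is not a proof: counting looks necessary, yet a small DFA suffices. The sketch below assumes a five-state machine of our own design (a state remembers only the first and last symbol read); it is not necessarily the machine drawn on the slide. It is checked against a direct count.

```python
def count_pairs(w: str, pair: str) -> int:
    """Count (possibly overlapping) occurrences of a two-symbol pattern in w."""
    return sum(1 for i in range(len(w) - 1) if w[i:i + 2] == pair)


def in_language(w: str) -> bool:
    """Reference definition: as many occurrences of 01 as of 10."""
    return count_pairs(w, "01") == count_pairs(w, "10")


def dfa_accepts(w: str) -> bool:
    """Hypothetical 5-state DFA: the state is None (nothing read yet)
    or the pair (first symbol, last symbol)."""
    state = None
    for symbol in w:
        first = symbol if state is None else state[0]
        state = (first, symbol)
    # Accept the empty string and every string whose first and last symbols agree.
    return state is None or state[0] == state[1]
```

The design choice rests on one observation: in a binary string the occurrences of 01 and 10 alternate, so the counts are equal exactly when the string starts and ends with the same symbol. No such shortcut exists for 0^n 1^n, which is what the pigeonhole argument establishes.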
Equivalence of two DFAs
Given a few equivalent machines, we are naturally interested in the one with the fewest states.

Union Theorem
Theorem: The union of two regular languages is also a regular language.
[Figure: machine M1 with states qeven and qodd; machine M2 with states p0, p1, p2]
Union Theorem
[Figure, repeated over several animation steps: M1 (states qeven, qodd) and M2 (states p0, p1, p2) run side by side on the same input]
Input: 101001
Accept.
Union Theorem
Q = pairs of states, one from M1 and one from M2
  = { (q1, q2) | q1 ∈ Q1 and q2 ∈ Q2 }
  = Q1 × Q2
[Figure: product automaton with states (qeven, p0), (qeven, p1), (qeven, p2), …]

The Regular Operations
Union: A ∪ B = { w | w ∈ A or w ∈ B }
Intersection: A ∩ B = { w | w ∈ A and w ∈ B }
Negation: ¬A = { w | w ∉ A }
Reverse: A^R = { w1 … wk | wk … w1 ∈ A }
Concatenation: A ⋅ B = { vw | v ∈ A and w ∈ B }
Star: A* = { w1 … wk | k ≥ 0 and each wi ∈ A }
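The pair-of-states construction can be sketched in a few lines. The two component machines below are assumptions made for illustration, since the slide diagrams did not survive extraction: M1 is taken to track the parity of the number of 1s (accepting when odd), and M2 the input length mod 3 (accepting at p0). Only the state names come from the slides.

```python
from itertools import product

# Assumed M1: parity of the number of 1s, accepting in qodd.
Q1, START1, F1 = {"qeven", "qodd"}, "qeven", {"qodd"}

def delta1(q, a):
    if a == "1":
        return "qodd" if q == "qeven" else "qeven"
    return q  # reading 0 does not change the parity

# Assumed M2: input length mod 3, accepting in p0.
Q2, START2, F2 = {"p0", "p1", "p2"}, "p0", {"p0"}

def delta2(p, a):
    return {"p0": "p1", "p1": "p2", "p2": "p0"}[p]

# Product machine: Q = Q1 x Q2, both components advance in lockstep.
Q = set(product(Q1, Q2))
START = (START1, START2)
# For the union, a pair is accepting if either component is accepting.
F_UNION = {(q, p) for (q, p) in Q if q in F1 or p in F2}

def delta(state, a):
    q, p = state
    return (delta1(q, a), delta2(p, a))

def accepts_union(w: str) -> bool:
    state = START
    for a in w:
        state = delta(state, a)
    return state in F_UNION
```

Replacing `or` with `and` in `F_UNION` gives the accepting set for the intersection of the two languages, using the same states and transitions.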
An axiomatic system for regular languages
Vocabulary: Languages over alphabet Σ
Axioms: ∅, {a} for each a ∈ Σ
Deduction rules:
Given L1, L2, can obtain L1 ⋃ L2
Given L1, L2, can obtain L1 ⋅ L2
Given L, can obtain L*

Every regular language over Σ can be constructed from ∅ and {a}, a ∈ Σ, using only the operations union, concatenation and Kleene star.
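The three deduction rules can be tried out on small finite approximations of languages. This is a sketch under one loud assumption: every language is truncated to strings of length at most `MAX_LEN` (a hypothetical bound chosen here), because L* is infinite in general.

```python
MAX_LEN = 4  # assumption: keep only strings up to this length so sets stay finite


def union(A, B):
    return A | B


def concat(A, B):
    return {v + w for v in A for w in B if len(v + w) <= MAX_LEN}


def star(A):
    """All concatenations of k >= 0 strings from A (up to MAX_LEN)."""
    result = {""}        # k = 0 contributes the empty string
    frontier = {""}
    while frontier:
        frontier = concat(frontier, A) - result  # only genuinely new strings
        result |= frontier
    return result


# Start from the axioms {0} and {1} and apply the rules.
zero, one = {"0"}, {"1"}
sigma_star = star(union(zero, one))
contains_001 = concat(concat(sigma_star, concat(zero, concat(zero, one))), sigma_star)
```

Here `contains_001` is the truncated version of the language Σ*·{0}·{0}·{1}·Σ*, built from the axioms with union, concatenation, and star only.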
Reverse
If we flip transitions around we might not get a DFA.
[Figure: DFA with its transitions reversed]

Nondeterministic finite automaton
Allows transitions from qk on the same symbol to many states
[Figure: NFA with states q0 and q1]
L = {0^n, 0^n 01, 0^n 11 | n = 0, 1, 2…}
What does it mean for an NFA to recognize a string?
[Figure: NFA with states s0, s1, s2, s3, s4 and edges labeled 0, 1, and 0,1]
Since each input symbol xj (for j>1) takes the previous state to a set of states, we shall use a union of these states.

What does it mean for an NFA to recognize a string?
Here we are going to formally define this.
For a state q and string w, δ*(q, w) is the set of states that the NFA can reach when it reads the string w starting at the state q.
Thus for NFA = (Q, Σ, δ, q0, F), the function
δ*: Q × Σ* → 2^Q
is defined by δ*(q, ε) = {q} and
δ*(q, y xk) = ⋃_{p ∈ δ*(q, y)} δ(p, xk)
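The extended transition function translates almost line for line into a recursive procedure. The sketch below assumes a small hypothetical NFA (states `q0`, `q1`, `q2`, chosen here so that it accepts 0^n, 0^n 01 and 0^n 11); it has no ε-transitions, and a missing table entry means "no transition".

```python
# Hypothetical NFA: from q0, reading 0 may stay in q0 or move to q1.
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q1"},
    ("q1", "1"): {"q2"},
}
START, ACCEPT = "q0", {"q0", "q2"}


def delta_star(q, w):
    """Set of states the NFA can reach from q after reading w."""
    if w == "":
        return {q}                          # base case: no input, stay put
    y, x_k = w[:-1], w[-1]                  # split w into y followed by x_k
    reached = set()
    for p in delta_star(q, y):              # union over every p reachable on y
        reached |= DELTA.get((p, x_k), set())
    return reached


def nfa_accepts(w):
    """Accept if at least one reachable state is accepting."""
    return bool(delta_star(START, w) & ACCEPT)
```

For example, `delta_star("q0", "0")` returns the two-element set containing `q0` and `q1`, reflecting the nondeterministic choice on the first symbol.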
In other words, if we ask whether there is an NFA that is not equivalent to any DFA, the answer is no.
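One standard way to see why every NFA has an equivalent DFA is the subset construction, in which each DFA state is a set of NFA states. The slides that survive here do not spell the construction out, so the following is a sketch rather than the lecture's own presentation; the NFA table is a hypothetical example without ε-transitions.

```python
# Hypothetical NFA (no epsilon-transitions); missing entries mean "no transition".
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q1"},
    ("q1", "1"): {"q2"},
}
NFA_START, NFA_ACCEPT = "q0", {"q0", "q2"}


def subset_construction(delta, start, accept, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start_set = frozenset({start})
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            # Union of the NFA moves from every state in S.
            T = frozenset(r for p in S for r in delta.get((p, a), set()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accept = {S for S in seen if S & accept}
    return dfa_delta, start_set, dfa_accept


def dfa_accepts(dfa, w):
    dfa_delta, state, dfa_accept = dfa
    for a in w:
        state = dfa_delta[(state, a)]
    return state in dfa_accept


DFA = subset_construction(NFA_DELTA, NFA_START, NFA_ACCEPT, "01")
```

Only subsets reachable from the start set are generated, so the DFA is often far smaller than the 2^|Q| worst case, though it can still be exponentially larger than the NFA.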
L = 1* (01, 1, 10) (00)*
Drawbacks
Acceptance testing slower.
Sometimes algorithms more complicated.
[Photos: Rabin; Scott, CMU prof. emeritus]
Pattern Matching
Input: Text T of length n, string/pattern P of length k
Problem: Does pattern P appear inside text T?
Naïve method:
a1, a2, a3, a4, a5, …, an
Cost: Roughly O(n k) comparisons
The worst case may occur in images and DNA sequences; it is unlikely in English text.

Pattern Matching
Input: Text T, length n. Pattern P, length k.
Output: Does P occur in T?
Automata solution:
The language of all strings containing P is regular!
There is some DFA MP which decides it.
Once you build MP, feed in T: takes time O(n).
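The naïve method and its roughly n k cost can be made concrete by counting character comparisons. This is a minimal sketch; the function name `naive_search` and the returned comparison counter are our own additions for illustration.

```python
def naive_search(text: str, pattern: str):
    """Brute-force matching. Returns (found, number of character comparisons)."""
    n, k = len(text), len(pattern)
    comparisons = 0
    for i in range(n - k + 1):           # try every alignment of P against T
        j = 0
        while j < k:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break                    # mismatch: slide the pattern by one
            j += 1
        if j == k:
            return True, comparisons     # all k characters matched
    return False, comparisons
```

A highly repetitive input such as `naive_search("a" * 20, "aaaab")` forces a full pass over the pattern at every alignment, which is the kind of input that is more likely in images or DNA sequences than in English text.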
DFA Construction
Build DFA from pattern aabaaabb
[Figure: the DFA built incrementally over the alphabet {a, b}, first with states 0 to 2, then with states 0 to 3]
DFA Construction
aabaaabb
[Figure, repeated over several slides: the DFA extended one state at a time, from states 0 to 4 up to states 0 to 7, with edges labeled a and b]
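The incremental construction for aabaaabb can be sketched with the usual "restart state" technique: each new state copies the transitions of the state the machine would be in after dropping the first matched character. The slides do not show which construction the lecture used, so treat this as one common approach rather than the lecture's own; the function names are hypothetical.

```python
def build_dfa(pattern, alphabet):
    """dfa[state][symbol] -> next state; state j means 'j characters matched'."""
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    if m == 0:
        return dfa
    dfa[0][pattern[0]] = 1
    x = 0                                  # restart state
    for j in range(1, m):
        for c in alphabet:
            dfa[j][c] = dfa[x][c]          # on a mismatch, behave like state x
        dfa[j][pattern[j]] = j + 1         # on a match, advance
        x = dfa[x][pattern[j]]             # update the restart state
    for c in alphabet:
        dfa[m][c] = m                      # accepting state is absorbing
    return dfa


def occurs(pattern, text):
    """Decide whether pattern occurs in text with one left-to-right pass."""
    alphabet = set(pattern) | set(text)
    dfa = build_dfa(pattern, alphabet)
    state = 0
    for ch in text:
        state = dfa[state][ch]
    return state == len(pattern)
```

Once the table is built, scanning the text costs one table lookup per character, which is the O(n) running time claimed for the automata solution.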
Pittsburgh native,
CMU professor.
The KMP Algorithm - Motivation
Algorithm compares the pattern to the text from left to right, but shifts the pattern more intelligently than the brute-force algorithm.
When a mismatch occurs, we compute the length of the longest prefix of P that is a proper suffix of the part of P matched so far.
[Figure: text "a b a a b x" aligned with pattern "a b a a b a", mismatch at position j; after the shift, the labels read "No need to repeat these comparisons" and "Resume comparing here"]

Here’s What You Need to Know…
Languages
DFAs
The regular operations
0^n 1^n is not regular
Union Theorem
Kleene’s Theorem
NFAs
Application: KMP
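The prefix/suffix idea behind KMP can be sketched as follows. This is a minimal version of the standard failure-function formulation, with hypothetical function names; it is offered as an illustration of the shifting rule, not as the lecture's exact pseudocode.

```python
def failure_function(pattern):
    """fail[j] = length of the longest proper prefix of pattern[:j+1]
    that is also a suffix of pattern[:j+1]."""
    fail = [0] * len(pattern)
    k = 0                                   # length of the current border
    for j in range(1, len(pattern)):
        while k > 0 and pattern[j] != pattern[k]:
            k = fail[k - 1]                 # fall back to a shorter border
        if pattern[j] == pattern[k]:
            k += 1
        fail[j] = k
    return fail


def kmp_search(text, pattern):
    """Index of the first occurrence of pattern in text, or -1 if none."""
    if not pattern:
        return 0
    fail = failure_function(pattern)
    k = 0                                   # characters of pattern matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]                 # shift the pattern, keep the overlap
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

For the pattern abaaba, after matching abaab and then hitting a mismatch, the table entry for that prefix is 2, so the search keeps 2 matched characters (ab) and resumes comparing at the third pattern position without re-reading the text.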