
Lecture21 PDF

This document discusses finite automata and regular languages. It begins by defining deterministic finite automata (DFAs) and how they can be used to recognize regular languages. It then provides examples of DFAs and the regular languages they recognize. The document also discusses the membership problem for DFAs and proves that the language {0^n 1^n : n ∈ ℕ} is not regular.

Uploaded by

SURAJ JAISWAL

Great Theoretical Ideas in CS

V. Adamchik, CS 15-251, Carnegie Mellon University
Lecture 21: Finite Automata

Outline

DFAs
Regular Languages
0^n 1^n is not regular
Union Theorem
Kleene’s Theorem
NFAs
Application: KMP

Deterministic Finite Automata

A machine so simple that you can understand it in just one minute

[Figure: state diagram of a DFA over {0,1}, shown with the sample inputs 0111, 111, 11, 1, ε]

The machine processes a string and accepts it if the process ends in a double circle.
The unique string of length 0 will be denoted by ε and will be called the empty or null string.

Anatomy of a Deterministic Finite Automaton

[Figure: the same state diagram, labeled with its states, transitions, start state (q0) and accept states (F)]

The machine accepts a string if the process ends in an accept state (double circle).

The singular of automata is automaton.
The alphabet Σ of a finite automaton is the set where the symbols come from, for example {0,1}.
The language L(M) of a finite automaton is the set of strings that it accepts:
L(M) = {x ∈ Σ* : M accepts x}
It’s also called the “language decided/accepted by M”.

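The Language L(M) of Machine M

[Figure: a DFA with a single accepting state q0 that loops on 0,1]

L(M) = All strings of 0s and 1s

The language of a finite automaton is the set of strings that it accepts.

The Language L(M) of Machine M

[Figure: a DFA with states q0 and q1; each state loops on 0, and a 1 moves between q0 and q1]

What language does this DFA decide/accept?
L(M) = { w | w has an even number of 1s }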

Formal definition of DFAs

A finite automaton is a 5-tuple M = (Q, Σ, δ, q0, F), where
Q is the finite set of states
Σ is the alphabet
δ : Q × Σ → Q is the transition function
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states

L(M) = the language of machine M
     = set of all strings machine M accepts

Example: M = (Q, Σ, δ, q0, F), where
Q = {q0, q1, q2, q3}
Σ = {0,1}
q0 ∈ Q is the start state
F = {q1, q2} ⊆ Q are the accept states
δ : Q × Σ → Q is the transition function, given by the table

δ     0     1
q0    q0    q1
q1    q2    q2
q2    q3    q2
q3    q0    q2

[Figure: state diagram of M]
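The 5-tuple can be written down directly as data. The sketch below is one possible Python encoding, not part of the lecture: the class name `DFA`, its field names and the `is_well_formed` check are illustrative assumptions. It encodes the example machine with Q = {q0, q1, q2, q3} and F = {q1, q2} from the transition table.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """Hypothetical container for M = (Q, Sigma, delta, q0, F)."""
    states: set      # Q
    alphabet: set    # Sigma
    delta: dict      # maps (state, symbol) -> state
    start: str       # q0
    accept: set      # F

    def is_well_formed(self) -> bool:
        # delta must be a total function Q x Sigma -> Q
        total = all((q, a) in self.delta
                    for q in self.states for a in self.alphabet)
        closed = all(r in self.states for r in self.delta.values())
        return (total and closed
                and self.start in self.states
                and self.accept <= self.states)


# The example machine from the transition table.
M = DFA(
    states={"q0", "q1", "q2", "q3"},
    alphabet={"0", "1"},
    delta={
        ("q0", "0"): "q0", ("q0", "1"): "q1",
        ("q1", "0"): "q2", ("q1", "1"): "q2",
        ("q2", "0"): "q3", ("q2", "1"): "q2",
        ("q3", "0"): "q0", ("q3", "1"): "q2",
    },
    start="q0",
    accept={"q1", "q2"},
)
```

Storing δ as a dictionary keyed by (state, symbol) keeps the code close to the table; `is_well_formed` only checks that every pair has exactly one next state inside Q.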

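EXAMPLE

An automaton that accepts all and only those strings that contain 001

[Figure: DFA over {0,1} whose states after the start state are labeled {0}, {00}, {001}, recording how much of 001 has been matched]

Determine the language recognized by

[Figure: state diagram of a DFA over {0,1}]

L(M) = {1, 11, 111, …}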

Membership problem

Determine whether some word belongs to the language.

Determine the language decided by

[Figure: state diagram of a DFA over {0,1}]

L(M) = {1, 01}

Regular Languages

A language over Σ is a set of strings over Σ.
A language L ⊆ Σ* is regular if it is recognized by a deterministic finite automaton, that is, if there is a DFA which decides it.

L = { w | w contains 001 } is regular
L = { w | w has an even number of 1s } is regular

DFA Membership problem

Determine whether some word belongs to the language.

Theorem: The DFA Membership Problem is solvable in linear time.

Let M = (Q, Σ, δ, q0, F) and w = w1...wm.
Algorithm for DFA M:
p := q0;
for i := 1 to m do p := δ(p, wi);
if p ∈ F then return Yes else return No.

Are all languages regular?

Theorem: Any finite language is regular

Theorem: L = {0^n 1^n : n ∈ ℕ} is not regular

Notation:
If a ∈ Σ is a symbol and n ∈ ℕ then a^n denotes the string aaa∙∙∙a (n times).
E.g., a^3 means aaa, a^5 means aaaaa, a^1 means a, a^0 means ε, etc.

Thus L = {ε, 01, 0011, 000111, 00001111, …}.

Theorem: L = {0^n 1^n : n ∈ ℕ} is not regular

Wrong Intuition:
For a DFA to decide L, it seems like it needs to “remember” how many 0’s it sees at the beginning of the string, so that it can “check” there are equally many 1’s.
But a DFA has only finitely many states, so it shouldn’t be able to handle arbitrary n.

This intuition is not a proof: the same counting argument seems to apply to the following language, which is nevertheless regular.

L = strings where the number of occurrences of 01 is equal to the number of occurrences of 10

[Figure: state diagram of a DFA M over {0,1}]

M accepts only the strings with an equal number of 01’s and 10’s!
For example, 010110

How to prove a language is not regular…

Assume for contradiction there is a DFA M with L(M) = L.
Argue (usually by Pigeonhole) there are two strings x and y which reach the same state in M.
Show there is a string z such that xz ∈ L but yz ∉ L.
Contradiction, since M accepts either both or neither.

Theorem: L = {0^n 1^n : n ∈ ℕ} is not regular

Full proof:
Suppose M is a DFA deciding L with, say, k states.
Let r_i be the state M reaches after processing 0^i.
By Pigeonhole, there is a repeat among r_0, r_1, r_2, …, r_k. So say that r_s = r_t for some s ≠ t.
Since 0^s 1^s ∈ L, starting from r_s and processing 1^s causes M to reach an accepting state.

Regular Languages

Definition: A language L ⊆ Σ* is regular if there is a DFA which decides it.

Questions:
1. Are all languages regular?
2. Are there other ways to tell if L is regular?

Theorem: L = {0^n 1^n : n ∈ ℕ} is not regular

Full proof:
So on input 0^s 1^s ∈ L, M will reach an accepting state.
Consider input 0^t 1^s ∉ L, s ≠ t.
M will process 0^t, reach state r_t = r_s,
then M will process 1^s, and reach an accepting state.
Contradiction!
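The pigeonhole argument is constructive: given any concrete DFA with k states, it produces a string on which that DFA disagrees with L = {0^n 1^n}. The sketch below follows the proof step by step; the function names `run`, `in_L` and `fooling_input` are illustrative, and the two-state even-number-of-1s machine is used only as a sample candidate.

```python
def run(delta, start, w):
    """Return the state the DFA reaches after reading w."""
    p = start
    for symbol in w:
        p = delta[(p, symbol)]
    return p


def in_L(w):
    """Membership in L = {0^n 1^n}, decided directly (not by a DFA)."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def fooling_input(delta, start, accept, k):
    """For a DFA with at most k states, return a string it gets wrong."""
    seen = {}                                # state r_i -> index i
    for t in range(k + 1):                   # r_0, ..., r_k: k+1 pigeons
        r = run(delta, start, "0" * t)
        if r in seen:                        # r_s = r_t with s != t
            s = seen[r]
            x = "0" * s + "1" * s            # x is in L
            y = "0" * t + "1" * s            # y is not in L
            # The DFA ends in the same state on x and y, so it answers
            # the same on both and must be wrong on one of them.
            return y if run(delta, start, x) in accept else x
        seen[r] = t
    raise ValueError("the DFA has more than k states")


EVEN_ONES = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
bad = fooling_input(EVEN_ONES, "q0", {"q0"}, k=2)
```

For this sample machine the repeat is r_0 = r_1, so the returned string is 0, which the machine accepts although it is not of the form 0^n 1^n.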

Equivalence of two DFAs

Definition: Two DFAs M1 and M2 over the same alphabet are equivalent if they accept the same language: L(M1) = L(M2).

Given a few equivalent machines, we are naturally interested in the smallest one, the one with the least number of states.

Union Theorem

Given two languages, L1 and L2, define the union of L1 and L2 as
L1 ∪ L2 = { w | w ∈ L1 or w ∈ L2 }

Theorem: The union of two regular languages is also a regular language.

Theorem: The union of two regular languages is also a regular language

Proof (Sketch): Let
M1 = (Q1, Σ, δ1, q0^1, F1) be a finite automaton for L1
and
M2 = (Q2, Σ, δ2, q0^2, F2) be a finite automaton for L2.
We want to construct a finite automaton M = (Q, Σ, δ, q0, F) that recognizes L = L1 ∪ L2.

Idea: Run both M1 and M2 at the same time.

Union Theorem

L1 = strings with an even # of 1’s
L2 = strings x with |x| divisible by 3

[Figure: M1 has states qeven and qodd; each loops on 0 and moves to the other on 1. M2 has states p0, p1, p2; every symbol 0,1 moves p0 to p1, p1 to p2 and p2 to p0.]
Union Theorem

[Figure, repeated over several animation frames: M1 (states qeven, qodd) and M2 (states p0, p1, p2) process the input 101001 in parallel, one symbol per frame.]

Input: 101001
Accept.

Make a DFA keeping track of both at once.
Union Theorem

Q = pairs of states, one from M1 and one from M2
  = { (q1, q2) | q1 ∈ Q1 and q2 ∈ Q2 }
  = Q1 × Q2

[Figure: the product DFA with states (qeven, p0), (qeven, p1), (qeven, p2), (qodd, p0), (qodd, p1), (qodd, p2)]

The Regular Operations

Union: A ∪ B = { w | w ∈ A or w ∈ B }
Intersection: A ∩ B = { w | w ∈ A and w ∈ B }
Negation: ¬A = { w | w ∉ A }
Reverse: A^R = { w1 …wk | wk …w1 ∈ A }
Concatenation: A ⋅ B = { vw | v ∈ A and w ∈ B }
Star: A* = { w1 …wk | k ≥ 0 and each wi ∈ A }
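The pair construction for the Union Theorem, Q = Q1 × Q2, can be sketched in a few lines. The code below assumes each DFA is given as a dictionary δ keyed by (state, symbol); the function names `union_dfa` and `accepts` are illustrative. A pair is accepting when either component is accepting; replacing `or` with `and` would give the intersection instead.

```python
from itertools import product


def union_dfa(delta1, start1, accept1, delta2, start2, accept2, alphabet):
    """Product construction: run M1 and M2 at the same time."""
    states1 = {q for (q, _) in delta1}
    states2 = {q for (q, _) in delta2}
    pairs = list(product(states1, states2))          # Q = Q1 x Q2
    delta = {
        ((q1, q2), a): (delta1[(q1, a)], delta2[(q2, a)])
        for (q1, q2) in pairs for a in alphabet
    }
    accept = {(q1, q2) for (q1, q2) in pairs
              if q1 in accept1 or q2 in accept2}
    return delta, (start1, start2), accept


def accepts(delta, start, accept, w):
    p = start
    for symbol in w:
        p = delta[(p, symbol)]
    return p in accept


# M1: even number of 1s.  M2: length divisible by 3.
M1 = {("qeven", "0"): "qeven", ("qeven", "1"): "qodd",
      ("qodd", "0"): "qodd", ("qodd", "1"): "qeven"}
M2 = {(f"p{i}", a): f"p{(i + 1) % 3}" for i in range(3) for a in "01"}

delta, start, accept = union_dfa(M1, "qeven", {"qeven"},
                                 M2, "p0", {"p0"}, "01")
print(accepts(delta, start, accept, "101001"))
```

On the input 101001 the first component ends in qodd (three 1s) and the second in p0 (length 6), so the pair is accepting, matching the "Accept." in the slides.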

The Kleene closure: A*

Star: A* = { w1 …wk | k ≥ 0 and each wi ∈ A }

From the definition of the concatenation, we define A^n, n = 0, 1, 2, …, recursively:
A^0 = {ε}
A^(n+1) = A^n A

A* is a set consisting of concatenations of arbitrarily many strings from A:
A* = ⋃_{k ≥ 0} A^k

The Kleene closure: A*

What is A* of A = {0,1}?
All binary strings

What is A* of A = {11}?
All strings consisting of an even number of 1s: ε, 11, 1111, …
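The recursive definition of A^n can be checked on small cases. A* itself is infinite whenever A contains a nonempty string, so the sketch below only enumerates the members of A* up to a chosen length; the function names `concat`, `power` and `star_up_to` are illustrative.

```python
def concat(A, B):
    """A . B = { vw | v in A and w in B }"""
    return {v + w for v in A for w in B}


def power(A, n):
    """A^0 = {epsilon}, A^(n+1) = A^n A."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result


def star_up_to(A, max_len):
    """All strings of A* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # extend the newest strings by one more factor from A
        frontier = {w for w in concat(frontier, A)
                    if len(w) <= max_len} - result
        result |= frontier
    return result


print(sorted(power({"0", "1"}, 2)))
print(sorted(star_up_to({"11"}, 6), key=len))
```

For A = {11} the enumeration contains only ε, 11, 1111 and 111111: strings of 1s of even length, with no 0s.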

Regular Languages Are Closed Under The Regular Operations

An axiomatic system for regular languages
Vocabulary: Languages over alphabet Σ
Axioms: ∅, {a} for each a ∈ Σ
Deduction rules:
Given L1, L2, can obtain L1 ⋃ L2
Given L1, L2, can obtain L1 ⋅ L2
Given L, can obtain L*

The Kleene Theorem (1956)

Every regular language over Σ can be constructed from ∅ and {a}, a ∈ Σ, using only the operations union, concatenation and Kleene star.
Reverse

Reverse: A^R = { w1 …wk | wk …w1 ∈ A }

How to construct a DFA for the reversal of a language?
The direction in which we read a string should be irrelevant.
If we flip transitions around we might not get a DFA.

[Figure: a two-state automaton with states q0 and q1 over {0,1}, and the same diagram with its transitions flipped]

Nondeterministic finite automaton

There is another type of machine in which there may be several possible next states. Such machines are called nondeterministic.

[Figure: state qk with several outgoing transitions, all labeled a]

Allows transitions from qk on the same symbol to many states
Nondeterministic finite automaton (NFA)

Nondeterminism can arise from two different sources:
- Transition nondeterminism
- Initial state nondeterminism

Nondeterministic finite automaton (NFA)

An NFA is defined using the same notation M = (Q, Σ, δ, I, F) as a DFA, except that there is a set I of initial states and the transition function δ assigns a set of states to each pair in Q × Σ of state and input.

Note, every DFA is automatically also an NFA.

NFA for {0^k | k is a multiple of 2 or 3}

[Figure: NFA with two ε-transitions and transitions labeled 0]

Find the language recognized by this NFA

[Figure: NFA with states s0, s1, s2, s3, s4 over {0,1}]

L = {0^n, 0^n 01, 0^n 11 | n = 0, 1, 2…}

What does it mean for an NFA to recognize a string?

[Figure: the NFA with states s0, s1, s2, s3, s4 over {0,1}]

Since each input symbol xj (for j>1) takes the previous state to a set of states, we shall use a union of these states.

What does it mean for an NFA to recognize a string?

Here we are going to define this formally.

For a state q and string w, δ*(q, w) is the set of states that the NFA can reach when it reads the string w starting at the state q.

Thus for NFA = (Q, Σ, δ, q0, F), the function
δ* : Q × Σ* → 2^Q
is defined by
δ*(q, y xk) = ⋃_{p ∈ δ*(q, y)} δ(p, xk)
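The union in the definition of δ* is easy to compute one symbol at a time. The sketch below makes several assumptions that are not in the slides: the NFA has no ε-transitions, δ*(q, ε) is taken to be {q}, a string is accepted when some reachable state is in F, and the three-state NFA for "strings ending in 01" is a hypothetical example (the diagrams in the slides are not recoverable from the text).

```python
def delta_star(delta, q, w):
    """Set of states reachable from q by reading w (no epsilon moves)."""
    current = {q}                                   # assumed base case
    for symbol in w:
        # union of delta(p, symbol) over all p in the current set
        current = set().union(
            *(delta.get((p, symbol), set()) for p in current)
        )
    return current


def nfa_accepts(delta, initial, accept, w):
    """Accept if some initial state can reach some accept state on w."""
    return any(delta_star(delta, q, w) & accept for q in initial)


# Hypothetical NFA: strings over {0,1} that end in 01.
ENDS_IN_01 = {
    ("a", "0"): {"a", "b"},    # guess that this 0 starts the final 01
    ("a", "1"): {"a"},
    ("b", "1"): {"c"},
}

print(delta_star(ENDS_IN_01, "a", "001"))
print(nfa_accepts(ENDS_IN_01, {"a"}, {"c"}, "1101"))
```

Missing dictionary entries stand for the empty set of next states, which is how an NFA "gets stuck" on a branch.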

Find the language recognized by this NFA

[Figure: NFA with start state s0 over {0,1}]

L = 1* (01, 1, 10) (00)*

Nondeterministic finite automaton

Theorem.
If the language L is recognized by an NFA, then L is also recognized by a DFA.

In other words, if we ask whether there is an NFA that is not equivalent to any DFA, the answer is no.

Nondeterministic finite automaton

Theorem (Rabin, Scott 1959).
For every NFA there is an equivalent DFA.

For this they won the Turing Award.

[Photos: Rabin and Scott; caption: CMU prof. emeritus]

NFA vs. DFA

Advantages.
Easier to construct and manipulate.
Sometimes exponentially smaller.
Sometimes algorithms much easier.

Drawbacks.
Acceptance testing slower.
Sometimes algorithms more complicated.
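The standard way to obtain an equivalent DFA from an NFA is the subset construction: each DFA state is a set of NFA states. The slides do not spell out the construction, so the sketch below is an assumption about how it would be carried out, restricted to NFAs without ε-transitions; the function name `nfa_to_dfa` and the sample NFA for "strings ending in 01" are hypothetical.

```python
from collections import deque


def nfa_to_dfa(nfa_delta, initial, accept, alphabet):
    """Subset construction, building only the reachable subsets."""
    start = frozenset(initial)
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # T = union of delta(p, a) over p in S
            T = frozenset(r for p in S for r in nfa_delta.get((p, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    # a subset is accepting if it contains some NFA accept state
    dfa_accept = {S for S in seen if S & frozenset(accept)}
    return dfa_delta, start, dfa_accept


ENDS_IN_01 = {
    ("a", "0"): {"a", "b"},
    ("a", "1"): {"a"},
    ("b", "1"): {"c"},
}
dfa_delta, dfa_start, dfa_accept = nfa_to_dfa(ENDS_IN_01, {"a"}, {"c"}, "01")


def dfa_accepts(w):
    S = dfa_start
    for symbol in w:
        S = dfa_delta[(S, symbol)]
    return S in dfa_accept
```

Here only three subsets are reachable, but in general an NFA with n states can need up to 2^n subsets, which is the sense in which NFAs are "sometimes exponentially smaller".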

Pattern Matching

Input: Text T of length n, string/pattern P of length k
Problem: Does pattern P appear inside text T?

Naïve method:
a1, a2, a3, a4, a5, …, an

Cost: Roughly O(n k) comparisons
This worst case may occur in images and DNA sequences; it is unlikely in English text.

Pattern Matching

Input: Text T, length n. Pattern P, length k.
Output: Does P occur in T?

Automata solution:
The language of strings that contain P is regular!
There is some DFA MP which decides it.
Once you build MP, feed in T: takes time O(n).
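For comparison with the automata solution, the naïve method can be sketched as follows, assuming it tries every alignment of P against T and compares symbol by symbol. The function name `naive_contains` is illustrative.

```python
def naive_contains(T, P):
    """Try each of the n - k + 1 alignments; up to k comparisons each."""
    n, k = len(T), len(P)
    for i in range(n - k + 1):
        j = 0
        while j < k and T[i + j] == P[j]:
            j += 1
        if j == k:            # all k symbols matched at offset i
            return True
    return False


print(naive_contains("aabaabaaabb", "aabaaabb"))
print(naive_contains("aaaa", "ab"))
```

The roughly n·k worst case shows up on highly repetitive inputs, for example a text of many a's and a pattern of a's followed by one b, where each alignment matches almost all of P before failing.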

DFA Construction

Build DFA from pattern aabaaabb

The alphabet is {a, b}.
The pattern is a a b a a a b b.

To create a DFA we consider all prefixes
ε, a, aa, aab, aaba, aabaa, aabaaa, aabaaab, aabaaabb

These prefixes are states. The initial state is ε. The pattern is the accepting state.

DFA Construction

[Figure: first step of the construction, showing states 0 and 1 with transitions labeled a and b]
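One way to fill in the transitions between the prefix states is the following: from the prefix of length j, on symbol c, move to the longest prefix of P that is a suffix of P[:j] + c. The sketch below implements that rule directly and unoptimized. The function names are illustrative, states are numbered by prefix length, and making the accepting state absorbing is an assumption chosen so that the DFA answers "does P occur in T?".

```python
def build_pattern_dfa(P, alphabet):
    """States 0..len(P); state j stands for the prefix P[:j]."""
    k = len(P)
    delta = {}
    for j in range(k + 1):
        for c in alphabet:
            if j == k:
                delta[(j, c)] = k        # assumption: once found, stay
                continue
            s = P[:j] + c
            m = len(s)                   # longest candidate length
            # shrink until the length-m suffix of s is a prefix of P
            while m > 0 and s[-m:] != P[:m]:
                m -= 1
            delta[(j, c)] = m
    return delta


def dfa_contains(T, P, alphabet):
    delta = build_pattern_dfa(P, alphabet)
    state = 0
    for c in T:                          # one lookup per text symbol: O(n)
        state = delta[(state, c)]
    return state == len(P)


delta = build_pattern_dfa("aabaaabb", "ab")
print(dfa_contains("aabaabaaabb", "aabaaabb", "ab"))
```

Building the table this way is slow in the pattern length, but it is done once; after that, scanning the text takes one transition per symbol.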

DFA Construction

[Figure, repeated over several frames: the DFA for aabaaabb is extended one state at a time, from states 0, 1, 2 up to states 0 through 7, with transitions labeled a and b.]
DFA Construction

aabaaabb

[Figure: the completed DFA with states 0 through 8 over {a, b}]

The Knuth-Morris-Pratt Algorithm (1976)

In 1970, Cook published a paper about the possible existence of a linear-time algorithm.
Knuth and Pratt developed an algorithm.
Morris discovered the same algorithm.

[Photo caption: Pittsburgh native, CMU professor.]

The KMP Algorithm - Motivation

The algorithm compares the pattern to the text left to right, but shifts the pattern more intelligently than the brute-force algorithm.
When a mismatch occurs, we compute the length of the longest prefix of P that is a proper suffix of the part of P matched so far.

[Figure: the text reads a b a a b x; the pattern a b a a b a mismatches at position j and is shifted to the right. Labels: "No need to repeat these comparisons", "Resume comparing here".]

Here’s What You Need to Know…

Languages
DFAs
The regular operations
0^n 1^n is not regular
Union Theorem
Kleene’s Theorem
NFAs
Application: KMP
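The shift rule from the KMP motivation slide, the longest prefix of P that is also a proper suffix of the matched part, is usually precomputed as a table. The sketch below is a common textbook formulation rather than code from the lecture; the names `failure_function` and `kmp_find` are illustrative.

```python
def failure_function(P):
    """fail[i] = length of the longest proper prefix of P[:i+1]
    that is also a suffix of P[:i+1]."""
    fail = [0] * len(P)
    j = 0
    for i in range(1, len(P)):
        while j > 0 and P[i] != P[j]:
            j = fail[j - 1]          # fall back to a shorter prefix
        if P[i] == P[j]:
            j += 1
        fail[i] = j
    return fail


def kmp_find(T, P):
    """Index of the first occurrence of P in T, or -1 if there is none."""
    if not P:
        return 0
    fail = failure_function(P)
    j = 0                            # number of pattern symbols matched
    for i, c in enumerate(T):
        while j > 0 and c != P[j]:
            j = fail[j - 1]          # shift the pattern, keep the overlap
        if c == P[j]:
            j += 1
        if j == len(P):
            return i - j + 1
    return -1


print(failure_function("abaaba"))
print(kmp_find("aabaabaaabb", "aabaaabb"))
```

The text index never moves backwards, so no text symbol is re-read after a mismatch; only the pattern position j falls back.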
