Main
Motivation
1
Finite Automata
2
• Example: Finite Automaton modelling an
on/off switch
[Transition diagram omitted: states off and on; each Push moves to the other state; Start points at off.]
• Example: Finite Automaton recognizing the keyword then
[Transition diagram omitted: states ε, t, th, the, then, advancing on the letters t, h, e, n.]
3
Structural Representations
• Grammars, e.g.
Lineup ⇒ Person.Lineup
Lineup ⇒ Person
• Patterns (regular expressions), e.g.
’[A-Z][a-z]*[ ][A-Z][A-Z]’
matches Ithaca NY
5
Length of String: Number of positions for
symbols in the string.
|0110| = 4, |ε| = 0
Example: Σ = {0, 1}
Σ1 = {0, 1}
Σ0 = {ε}
6
The set of all strings over Σ is denoted Σ∗
Σ∗ = Σ0 ∪ Σ1 ∪ Σ2 ∪ · · ·
Also:
Σ+ = Σ1 ∪ Σ2 ∪ Σ3 ∪ · · ·
Σ∗ = Σ+ ∪ {ε}
Concatenation: if
x = a1a2 . . . ai
y = b1b2 . . . bj
then
xy = a1a2 . . . aib1b2 . . . bj
If Σ is an alphabet, and L ⊆ Σ∗
then L is a language
Examples of languages:
8
• The set of strings with equal number of 0’s
and 1’s
{ε, 01, 10, 0011, 0101, 1001, . . .}
Note: ∅ ≠ {ε}
9
Problem: Is a given string w a member of a
language L?
Question: Why?
10
Finite Automata Informally
Allowed events:
11
e-commerce
[Transition diagrams omitted: the protocols, with events pay, cancel, redeem, transfer over states 1–4.]
12
Completed protocols:
[Transition diagrams omitted: the protocols above, completed with self-loops for the ignored events pay, cancel, ship, redeem, transfer.]
13
The entire system as an Automaton:
[Product-automaton diagram omitted: transitions P (pay), S (ship), C (cancel), R (redeem), T (transfer).]
14
Deterministic Finite Automata
A DFA is a quintuple
A = (Q, Σ, δ, q0, F )
where
• Q is a finite set of states
• Σ is a finite alphabet (the input symbols)
• δ : Q × Σ → Q is a transition function
• q0 ∈ Q is the start state
• F ⊆ Q is a set of final states
15
Example: An automaton A that accepts the strings containing the substring 01, given by the transition table
δ 0 1
→ q0 q2 q0
⋆q1 q1 q1
q2 q2 q1
(→ marks the start state, ⋆ the accepting states).
[Transition diagram omitted.]
16
An FA accepts a string w = a1a2 . . . an if there
is a path in the transition diagram that
1. begins at the start state,
2. ends in an accepting state, and
3. has sequence of labels a1a2 . . . an.
Example: The FA from the previous slide.
[Transition diagram omitted.]
17
• The transition function δ can be extended
to δ̂ that operates on states and strings (as
opposed to states and symbols)
Basis: δ̂(q, ε) = q
Induction: δ̂(q, xa) = δ(δ̂(q, x), a)
• Now, formally, the language accepted by A is
L(A) = {w : δ̂(q0, w) ∈ F }
18
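The inductive definition of δ̂ translates directly into code. A minimal sketch in Python, using the transition table of the automaton A from slide 16 (the dictionary encoding and function names are ours):

```python
# Transition table of the DFA A from slide 16:
# start state q0, accepting state q1.
delta = {
    ('q0', '0'): 'q2', ('q0', '1'): 'q0',
    ('q1', '0'): 'q1', ('q1', '1'): 'q1',
    ('q2', '0'): 'q2', ('q2', '1'): 'q1',
}

def delta_hat(q, w):
    # Basis: delta_hat(q, eps) = q.
    # Induction: delta_hat(q, xa) = delta(delta_hat(q, x), a).
    for a in w:
        q = delta[(q, a)]
    return q

def accepts(w):
    # w is in L(A) iff delta_hat(q0, w) is a final state.
    return delta_hat('q0', w) in {'q1'}
```

For this table, `accepts` returns True exactly on the strings containing the substring 01.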
Example: DFA accepting all and only strings
with an even number of 0’s and an even num-
ber of 1’s
[Transition diagram omitted: a 1 swaps q0 ↔ q1 and q2 ↔ q3; a 0 swaps q0 ↔ q2 and q1 ↔ q3.]
As a transition table:
δ 0 1
⋆ → q0 q2 q1
q1 q3 q0
q2 q0 q3
q3 q1 q2
19
Example: a marble-rolling toy. A marble is dropped in at A or B; levers x1, x2, x3 direct it to exit at C or D.
[Diagram omitted.]
20
A state is represented as a sequence of three bits
(the positions of the levers x1, x2, x3) followed
by r or a (previous input rejected or accepted)
A B
→ 000r 100r 011r
?000a 100r 011r
?001a 101r 000a
010r 110r 001a
?010a 110r 001a
011r 111r 010a
100r 010r 111r
?100a 010r 111r
101r 011r 100a
?101a 011r 100a
110r 000a 101a
?110a 000a 101a
111r 001a 110a
21
Nondeterministic Finite Automata
An NFA can be in several states at once.
Example: An NFA accepting the strings ending in 01:
[Transition diagram omitted: q0 loops on 0, 1; q0 → q1 on 0; q1 → q2 on 1.]
[Trace omitted: the sets of states reached on input 00101; threads in q1 or q2 with no matching transition get stuck.]
22
Formally, an NFA is a quintuple
A = (Q, Σ, δ, q0, F )
where everything is as in a DFA, except that
• Σ is a finite alphabet
• δ is a function from Q × Σ to the powerset of Q
23
Example: The NFA from the previous slide is
δ 0 1
→ q0 {q0, q1} {q0}
q1 ∅ {q2}
⋆q2 ∅ ∅
24
Extended transition function δ̂.
Basis: δ̂(q, ε) = {q}
Induction:
δ̂(q, xa) = ⋃_{p ∈ δ̂(q,x)} δ(p, a)
The language accepted by an NFA A is
L(A) = {w : δ̂(q0, w) ∩ F ≠ ∅}
25
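The set-valued δ̂ and the acceptance condition above can be sketched as follows, for the "ends in 01" NFA of slide 22 (the dictionary encoding is ours; missing entries stand for ∅):

```python
delta = {
    ('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0'},
    ('q1', '1'): {'q2'},
}

def delta_hat(S, w):
    # Track the whole set of states the NFA may be in.
    # Induction: union of delta(p, a) over all p in the current set.
    for a in w:
        S = set().union(*(delta.get((p, a), set()) for p in S))
    return S

def accepts(w, final={'q2'}):
    # L(A) = { w : delta_hat(q0, w) intersects F }
    return bool(delta_hat({'q0'}, w) & final)
```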
Let’s prove formally that the NFA from the previous slide accepts the language {x01 : x ∈ Σ∗}. We do a mutual induction on |w| for the three statements:
0. w ∈ Σ∗ ⇒ q0 ∈ δ̂(q0, w)
1. q1 ∈ δ̂(q0, w) ⇔ w = x0
2. q2 ∈ δ̂(q0, w) ⇔ w = x01
26
Basis: If |w| = 0 then w = ε. Then statement
(0) follows from the definition. For (1) and (2) both
sides are false for w = ε.
27
Equivalence of DFA and NFA
• Given an NFA
N = (QN , Σ, δN , q0, FN )
we will construct a DFA
D = (QD , Σ, δD , {q0}, FD )
such that
L(D) = L(N ).
28
The details of the subset construction:
• QD = {S : S ⊆ QN }.
• FD = {S ⊆ QN : S ∩ FN ≠ ∅}
• For every S ⊆ QN and a ∈ Σ,
δD (S, a) = ⋃_{p∈S} δN (p, a)
29
Let’s construct δD from the NFA on slide 26
δD 0 1
∅ ∅ ∅
→ {q0} {q0, q1} {q0}
{q1} ∅ {q2}
⋆{q2} ∅ ∅
{q0, q1} {q0, q1} {q0, q2}
⋆{q0, q2} {q0, q1} {q0}
⋆{q1, q2} ∅ {q2}
⋆{q0, q1, q2} {q0, q1} {q0, q2}
30
Note: The states of D correspond to subsets
of states of N , but we could have denoted the
states of D by, say, A − H just as well.
δD 0 1
A A A
→ B E B
C A D
⋆D A A
E E F
⋆F E B
⋆G A D
⋆H E F
31
We can often avoid the exponential blow-up
by constructing the transition table for D only
for accessible states S as follows:
Basis: S = {q0} is accessible in D.
Induction: If S is accessible, so is δD (S, a), for every a ∈ Σ.
[Transition diagram omitted: the accessible part of D has only the three states {q0}, {q0, q1}, {q0, q2}.]
32
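The accessible-states idea is naturally coded as a worklist algorithm. A sketch, reusing the "ends in 01" NFA from slide 24 (the encoding and names are ours):

```python
from collections import deque

nfa = {
    ('q0', '0'): {'q0', 'q1'}, ('q0', '1'): {'q0'},
    ('q1', '1'): {'q2'},
}

def subset_construction(start, alphabet):
    # Explore only the subsets reachable from {start}.
    q0 = frozenset({start})
    delta_d, work = {}, deque([q0])
    while work:
        S = work.popleft()
        if S in delta_d:
            continue
        delta_d[S] = {}
        for a in alphabet:
            T = frozenset().union(*(nfa.get((p, a), set()) for p in S))
            delta_d[S][a] = T
            work.append(T)
    return delta_d

dfa = subset_construction('q0', '01')
```

For this NFA only 3 of the 2^3 = 8 subsets are ever constructed.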
Theorem 2.11: Let D be the “subset” DFA
of an NFA N . Then L(D) = L(N ).
33
Induction:
δ̂D ({q0}, xa) = δD (δ̂D ({q0}, x), a)          (def.)
            = δD (δ̂N (q0, x), a)              (i.h.)
            = ⋃_{p ∈ δ̂N (q0,x)} δN (p, a)     (construction)
            = δ̂N (q0, xa)                     (def.)
34
Theorem 2.12: A language L is accepted by
some DFA if and only if L is accepted by some
NFA.
35
Exponential Blow-Up
[Transition diagram omitted: an NFA with states q0, q1, . . . , qn; q0 loops on 0, 1, moves to q1 on 1, and each qi moves to qi+1 on 0, 1. It accepts the strings whose n-th symbol from the end is 1.]
An equivalent DFA with fewer than 2^n states would enter the same state after two distinct strings
a1a2 . . . an ≠ b1b2 . . . bn
Case 1: the strings differ in the first position, say
1a2 . . . an
0b2 . . . bn
Case 2: the strings differ in position i > 1, say
a1 . . . ai−1 1 ai+1 . . . an
b1 . . . bi−1 0 bi+1 . . . bn
Example: An ε-NFA accepting decimal numbers consisting of:
1. An optional + or - sign
2. A string of digits
3. a decimal point
4. another string of digits
[Transition diagram omitted: states q0 . . . q5; q0 → q1 on ε, +, -; digits loop at q1 and q3; the decimal point leads from q1 to q2, or from q4 to q3; q3 → q5 on ε.]
38
An ε-NFA is a quintuple (Q, Σ, δ, q0, F ) where δ
is a function from Q × (Σ ∪ {ε}) to the powerset
of Q.
Example: the ε-NFA from the previous slide, with transition table
δ     ε     +,-   .     0, . . . , 9
→ q0  {q1}  {q1}  ∅     ∅
q1    ∅     ∅     {q2}  {q1, q4}
q2    ∅     ∅     ∅     {q3}
q3    {q5}  ∅     ∅     {q3}
q4    ∅     ∅     {q3}  ∅
⋆q5   ∅     ∅     ∅     ∅
39
ECLOSE
Basis:
q ∈ ECLOSE(q)
Induction:
p ∈ ECLOSE(q) and r ∈ δ(p, ε) ⇒ r ∈ ECLOSE(q)
40
Example of ε-closure
[Transition diagram omitted: states 1–7 with ε-transitions and transitions on a and b.]
For instance,
ECLOSE(1) = {1, 2, 3, 4, 6}
41
• Inductive definition of δ̂ for ε-NFA’s
Basis:
δ̂(q, ε) = ECLOSE(q)
Induction:
δ̂(q, xa) = ⋃_{p ∈ δ(δ̂(q,x),a)} ECLOSE(p)
42
Given an ε-NFA
E = (QE , Σ, δE , q0, FE )
we will construct a DFA
D = (QD , Σ, δD , qD , FD )
such that
L(D) = L(E)
• QD = {S : S ⊆ QE and S = ECLOSE(S)}
• qD = ECLOSE(q0)
• FD = {S : S ∈ QD and S ∩ FE ≠ ∅}
• δD (S, a) = ⋃ {ECLOSE(p) : p ∈ δ(t, a) for some t ∈ S}
43
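ECLOSE is a plain reachability computation over ε-transitions, and δD then closes each result set under it. A sketch (the edge set below is a hypothetical example of ours, chosen so that ECLOSE(1) matches the answer on slide 41):

```python
def eclose(q, eps):
    # eps maps a state to the states reachable by ONE epsilon-move;
    # we take the reflexive-transitive closure by depth-first search.
    closure, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in eps.get(p, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def eclose_set(S, eps):
    # ECLOSE of a whole set of states, as needed for delta_D(S, a).
    return set().union(*(eclose(q, eps) for q in S))

# Hypothetical epsilon-edges (our own example):
eps = {1: {2, 4}, 2: {3}, 3: {6}}
```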
Example: ε-NFA E
[Transition diagram omitted: the decimal-number ε-NFA from slide 38.]
DFA D corresponding to E
[Transition diagram omitted: accessible states {q0, q1}, {q1}, {q1, q4}, {q2}, {q2, q3, q5}, {q3, q5}, with transitions on +,-, the digits 0,1,. . . ,9, and the decimal point.]
44
Theorem 2.22: A language L is accepted by
some ε-NFA E if and only if L is accepted by
some DFA.
45
Induction:
δ̂E (q0, xa) = ⋃_{p ∈ δE (δ̂E (q0,x),a)} ECLOSE(p)
           = ⋃_{p ∈ δD (δ̂D (qD ,x),a)} ECLOSE(p)
           = δ̂D (qD , xa)
46
Regular expressions
47
Operations on languages
Union:
L ∪ M = {w : w ∈ L or w ∈ M }
Concatenation:
L.M = {w : w = xy, x ∈ L, y ∈ M }
Powers:
L0 = {ε}, Lk = L.Lk−1
Kleene Closure:
L∗ = ⋃_{i=0}^∞ Li
If a ∈ Σ, then a is a regex.
L(a) = {a}.
Induction: If E and F are regex’s, then E + F , EF , and E∗ are regex’s, with
L(E + F ) = L(E) ∪ L(F ), L(EF ) = L(E).L(F ), L(E∗) = (L(E))∗
Example: the strings of alternating 0’s and 1’s are denoted by
(01)∗ + 1(01)∗ + (01)∗0 + 1(01)∗0
or, equivalently,
(ε + 1)(01)∗(ε + 0)
Precedence of operators, from highest:
1. Star
2. Dot (concatenation)
3. Plus (union)
50
Equivalence of FA’s and regex’s
ε-NFA NFA
RE DFA
51
Theorem 3.4: For every DFA A = (Q, Σ, δ, q0, F )
there is a regex R, s.t. L(R) = L(A).
Proof: Number the states 1, 2, . . . , n.
• Let Rij^(k) be a regex describing the set of
labels of all paths in A from state i to state
j going through intermediate states {1, . . . , k}
only.
[Figure omitted.]
52
Rij^(k) will be defined inductively.
Basis (k = 0, no intermediate states allowed):
• Case 1: i ≠ j
Rij^(0) = Σ {a ∈ Σ : δ(i, a) = j}
• Case 2: i = j
Rii^(0) = ε + Σ {a ∈ Σ : δ(i, a) = i}
53
Induction:
Rij^(k) = Rij^(k−1) + Rik^(k−1) ( Rkk^(k−1) )∗ Rkj^(k−1)
[Figure omitted: a path from i to j either avoids state k (first term), or goes from i to k, loops at k zero or more times, and continues from k to j (second term).]
54
Example: Let’s find R for A, where
L(A) = {x0y : x ∈ {1}∗ and y ∈ {0, 1}∗}
[Transition diagram omitted: state 1 loops on 1 and moves to state 2 on 0; state 2 loops on 0, 1.]
R11^(0)   ε + 1
R12^(0)   0
R21^(0)   ∅
R22^(0)   ε + 0 + 1
55
We will need the following simplification rules:
• (ε + R)∗ = R∗
• R + RS∗ = RS∗
• ∅R = R∅ = ∅ (Annihilation)
• ∅ + R = R + ∅ = R (Identity)
56
R11^(0)   ε + 1
R12^(0)   0
R21^(0)   ∅
R22^(0)   ε + 0 + 1
57
Simplified
R11^(1)   1∗
R12^(1)   1∗0
R21^(1)   ∅
R22^(1)   ε + 0 + 1
By direct substitution
R11^(2)   1∗ + 1∗0(ε + 0 + 1)∗∅
R12^(2)   1∗0 + 1∗0(ε + 0 + 1)∗(ε + 0 + 1)
R21^(2)   ∅ + (ε + 0 + 1)(ε + 0 + 1)∗∅
R22^(2)   ε + 0 + 1 + (ε + 0 + 1)(ε + 0 + 1)∗(ε + 0 + 1)
58
By direct substitution
R11^(2)   1∗ + 1∗0(ε + 0 + 1)∗∅
R12^(2)   1∗0 + 1∗0(ε + 0 + 1)∗(ε + 0 + 1)
R21^(2)   ∅ + (ε + 0 + 1)(ε + 0 + 1)∗∅
R22^(2)   ε + 0 + 1 + (ε + 0 + 1)(ε + 0 + 1)∗(ε + 0 + 1)
Simplified
R11^(2)   1∗
R12^(2)   1∗0(0 + 1)∗
R21^(2)   ∅
R22^(2)   (0 + 1)∗
The regex for A is then R12^(2) = 1∗0(0 + 1)∗
59
Observations
There are n³ expressions Rij^(k)
Each inductive step grows the expression 4-fold, so Rij^(n) could have size 4^n
For all {i, j} ⊆ {1, . . . , n}, Rij^(k) uses Rkk^(k−1),
so we have to write n² times the regex Rkk^(k−1)
60
The state elimination technique
[Figure omitted: a state s to be eliminated, with predecessors q1, . . . , qk (arcs Qi into s), successors p1, . . . , pm (arcs Pj out of s), a self-loop S on s, and direct arcs Rij from qi to pj .]
61
Now, let’s eliminate state s.
[Figure omitted: afterwards the arc from qi to pj is labelled Rij + Qi S∗ Pj .]
62
For each q ∈ F we’ll be left with an Aq that
looks like
[Figure omitted: either a two-state automaton, with a self-loop R on the start state, an arc U to the accepting state, and a self-loop S on it, or a one-state automaton consisting of the start state with a self-loop.]
63
Example: A, where L(A) = {w : w = x1b, or w =
x1bc, x ∈ {0, 1}∗, {b, c} ⊆ {0, 1}},
i.e. the strings whose next-to-last or third-to-last symbol is 1.
[Transition diagrams omitted: the automaton, and the result of eliminating one state, with arcs labelled 0 + 1 and 1(0 + 1).]
65
From the automaton above we can eliminate further states to obtain the automaton Aq for each accepting state q.
[Transition diagrams omitted.]
66
From regex’s to ε-NFA’s
Proof by structural induction.
Basis: Automata for (a) ε, (b) ∅, and (c) a.
[Diagrams omitted.]
67
Induction: Automata for R + S, RS, and R∗
[Diagrams omitted: (a) the automata for R and S in parallel between a new start and a new final state, linked by ε-transitions; (b) the automata for R and S chained with an ε-transition; (c) the automaton for R wrapped with ε-transitions allowing zero or more passes.]
68
Example: We convert (0 + 1)∗1(0 + 1)
[Diagrams omitted: (a) the ε-NFA for 0 + 1, (b) the ε-NFA for (0 + 1)∗, (c) the ε-NFA for the whole expression.]
69
Algebraic Laws for languages
• L ∪ M = M ∪ L.
Union is commutative.
• (L ∪ M ) ∪ N = L ∪ (M ∪ N ).
Union is associative.
• (LM )N = L(M N ).
Concatenation is associative
70
• ∅ ∪ L = L ∪ ∅ = L.
• {ε}L = L{ε} = L.
• ∅L = L∅ = ∅.
71
• L(M ∪ N ) = LM ∪ LN .
• (M ∪ N )L = M L ∪ N L.
• L ∪ L = L.
Union is idempotent.
72
• (L∗)∗ = L∗. Closure is idempotent
Proof:
w ∈ (L∗)∗ ⟺ w ∈ ⋃_{i=0}^∞ ( ⋃_{j=0}^∞ Lj )^i
⟺ ∃k, m ∈ N : w ∈ (Lm)k
⟺ ∃p ∈ N : w ∈ Lp
⟺ w ∈ ⋃_{i=0}^∞ Li
⟺ w ∈ L∗
73
Algebraic Laws for regex’s
More generally
1. Prove it by hand.
74
In Chapter 4 we will learn how to test automatically if E = F , for any concrete regex’s E and F .
Method:
75
Answer: Yes, as long as the identities use only
plus, dot, and star.
S (E(E, F )) = (0 + 11)0
76
Theorem 3.13: Fix a “freezing” substitution
♠ = {E1/a1, E2/a2, . . . , Em/am}.
77
For example: Suppose the alphabet is {1, 2}.
Let E(E1, E2) be (E1 + E2)E1, and let E1 be 1,
and E2 be 2. Then
78
Proof of Theorem 3.13: We do a structural
induction of E.
79
Induction:
Case 1: E = F + G.
Case 2: E = F.G.
Case 3: E = F∗.
81
Theorem 3.14: E(E1, . . . , Em) = F(E1, . . . , Em) ⇔
L(♠(E)) = L(♠(F))
Proof:
w ∈ L(E(E1, . . . Em)) ⇔
w ∈ L(F(E1, . . . Em))
82
Examples:
83
Theorem 3.14: E(E1, . . . , Em) = F(E1, . . . , Em) ⇔
L(♠(E)) = L(♠(F))
Proof:
w ∈ L(E(E1, . . . Em)) ⇔
w ∈ L(F(E1, . . . Em))
84
Properties of Regular Languages
85
The Pumping Lemma Informally
Suppose the language of strings with an equal number of 0’s and 1’s were regular, accepted by a DFA A. On the inputs ε, 0, 00, . . . , A visits the states
ε    p0
0    p1
00   p2
...  ...
0^k  pk
86
Now you can fool A: since A has finitely many states, pi = pj =: q for some i < j.
If δ̂(q, 1^i) ∈ F the machine will foolishly accept 0^j 1^i.
If δ̂(q, 1^i) ∉ F the machine will foolishly reject 0^i 1^i.
87
Theorem 4.1.
Let L be regular. Then there is a constant n such that every w ∈ L with |w| ≥ n can be written as w = xyz, where
1. y ≠ ε
2. |xy| ≤ n
3. ∀k ≥ 0, xy^k z ∈ L
88
Proof: Suppose L is regular, accepted by a DFA A with n states. Let w = a1a2 . . . am ∈ L with m ≥ n, and let pi = δ̂(q0, a1 . . . ai). By the pigeonhole principle
⇒ ∃ i < j : pi = pj
89
Now w = xyz, where
1. x = a1a2 . . . ai
2. y = ai+1ai+2 . . . aj
3. z = aj+1aj+2 . . . am
[Figure omitted: the path of w through A, with x leading from p0 to pi, the loop y = ai+1 . . . aj at pi = pj , and z leading on to an accepting state.]
90
Example: Let Leq be the language of strings
with equal number of zero’s and one’s. Suppose Leq were regular, with pumping-lemma constant n, and take
w = 0^n 1^n = xyz
Since |xy| ≤ n and y ≠ ε, the string y consists of 0’s only, so pumping y changes the number of 0’s but not the number of 1’s. Hence xz ∉ Leq , a contradiction.
91
Suppose Lpr = {1^p : p is prime } were regular, accepted by a DFA A with n states.
Choose a prime p ≥ n + 2 and write w = 1^p = xyz as in the pumping lemma, with |y| = m.
• y ≠ ε ⇒ 1 + m > 1
• m = |y| ≤ |xy| ≤ n, p ≥ n + 2
⇒ p − m ≥ n + 2 − n = 2.
Now pump with k = p − m:
|xy^(p−m)z| = (p − m) + (p − m)m = (1 + m)(p − m)
Both factors are at least 2, so the length is not prime, and xy^(p−m)z ∉ Lpr .
⇒ L(A) ≠ Lpr .
92
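The arithmetic in this contradiction can be checked mechanically: pumping with k = p − m always yields a length that factors as (1 + m)(p − m). A small sanity check (the concrete values of n and p are our own choices):

```python
def is_prime(n):
    # Trial division is enough for these small sanity checks.
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

n = 5            # pretended number of DFA states
p = 13           # a prime with p >= n + 2
for m in range(1, n + 1):        # every possible |y| = m <= |xy| <= n
    k = p - m
    pumped = (p - m) + k * m     # |x y^k z| = |xz| + k|y|
    assert pumped == (1 + m) * (p - m)
    assert not is_prime(pumped)  # so xy^k z is not in Lpr
```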
Closure Properties of Regular Languages
Let L and M be regular. Then the following languages are all regular:
• Union: L ∪ M
• Intersection: L ∩ M
• Complement: L̄
• Difference: L \ M
• Reversal: L^R = {w^R : w ∈ L}
• Closure: L∗
• Concatenation: L.M
• Homomorphism:
h(L) = {h(w) : w ∈ L, h is a homom. }
• Inverse homomorphism:
h−1(L) = {w ∈ Σ∗ : h(w) ∈ L, h : Σ → ∆∗ is a homom.}
93
Theorem 4.4. For any regular L and M , L ∪ M
is regular.
Complement: Let L be recognized by a DFA
A = (Q, Σ, δ, q0, F ).
Let B = (Q, Σ, δ, q0, Q \ F ). Now L(B) = L̄.
94
Example:
Let L be recognized by the DFA
[Transition diagram omitted: states {q0}, {q0, q1}, {q0, q2}, with {q0, q2} accepting.]
Then L̄ is recognized by
[Transition diagram omitted: the same DFA, with {q0} and {q0, q1} accepting instead.]
96
Theorem 4.8. If L and M are regular, then
so is L ∩ M .
Proof (product construction): let L and M be the languages of
AL = (QL, Σ, δL, qL, FL)
AM = (QM , Σ, δM , qM , FM )
97
If AL goes from state p to state s on reading a,
and AM goes from state q to state t on reading
a, then AL∩M will go from state (p, q) to state
(s, t) on reading a.
[Figure omitted: the product automaton runs AL and AM in parallel on the same input.]
98
Formally
A_{L∩M} = (QL × QM , Σ, δ, (qL, qM ), FL × FM )
where δ((p, q), a) = (δL(p, a), δM (q, a))
Question: Why?
99
Example: (c) = (a) × (b)
[Transition diagrams omitted: (a) a DFA with states p, q; (b) a DFA with states r, s; (c) their product with states pr, ps, qr, qs.]
100
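The product construction is easy to code generically. A sketch (the two component DFAs below are our own stand-ins: one accepts strings containing a 0, the other strings containing a 1, so the product accepts strings containing both):

```python
def product_dfa(d1, s1, f1, d2, s2, f2, alphabet):
    # States are pairs (p, q); accepting iff both components accept.
    start = (s1, s2)
    delta, seen = {}, {start}
    stack = [start]
    while stack:
        p, q = stack.pop()
        for a in alphabet:
            t = (d1[(p, a)], d2[(q, a)])
            delta[((p, q), a)] = t
            if t not in seen:
                seen.add(t)
                stack.append(t)
    final = {s for s in seen if s[0] in f1 and s[1] in f2}
    return delta, start, final

def run(delta, start, final, w):
    state = start
    for a in w:
        state = delta[(state, a)]
    return state in final

d1 = {('p','0'):'q', ('p','1'):'p', ('q','0'):'q', ('q','1'):'q'}  # has a 0
d2 = {('r','1'):'s', ('r','0'):'r', ('s','0'):'s', ('s','1'):'s'}  # has a 1
dfa = product_dfa(d1, 'p', {'q'}, d2, 'r', {'s'}, '01')
```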
Theorem 4.10. If L and M are regular lan-
guages, then so is L \ M .
101
Theorem 4.11. If L is a regular language,
then so is LR .
102
Theorem 4.11. If L is a regular language,
then so is LR .
Basis: If E is ε, ∅, or a, then E R = E.
Induction:
1. E = F + G. Then E R = F R + GR
2. E = F.G. Then E R = GR .F R
3. E = F ∗. Then E R = (F R )∗
L(E R ) = (L(E))R
103
Homomorphisms
A homomorphism is a function h : Σ → Θ∗, extended to strings by h(a1a2 · · · an) = h(a1)h(a2) · · · h(an).
104
Theorem 4.14: h(L) is regular, whenever L
is.
Proof:
Induction:
h−1(L) = {w ∈ Σ∗ : h(w) ∈ L}
[Figures omitted: (a) h maps L to h(L); (b) h maps h−1(L) into L.]
106
Example: Let h : {a, b} → {0, 1}∗ be defined by
h(a) = 01, and h(b) = 10. If L = L((00 + 1)∗),
then h−1(L) = L((ba)∗).
Let h(w) ∈ L, and suppose w ∉ L((ba)∗). There
are four cases to consider.
107
Theorem 4.16: Let h : Σ → Θ∗ be a homom.,
and L ⊆ Θ∗ regular. Then h−1(L) is regular.
[Figure omitted: a machine for h−1(L): on reading input symbol a it feeds the string h(a) to a DFA A for L, and copies A’s accept/reject verdict.]
108
Decision Properties
2. Is L = ∅?
3. Is w ∈ L?
109
From NFA’s to DFA’s
110
From DFA to NFA
From FA to regex
113
Example 4.17
Inverse homomorphism
Homomorphism
114
• M L1
h([paq]) = a
115
For example h−1(101) =
{ [p1p][p0q][p1p],
[p1p][p0q][q1q],
[p1p][q0q][p1p],
[p1p][q0q][q1q],
[q1q][p0q][p1p],
[q1q][p0q][q1q],
[q1q][q0q][p1p],
[q1q][q0q][q1q] }
116
• L1 → L2
Define
E1 = Σ {[q0ap] : a ∈ Σ, δ(q0, a) = p}
Let
L2 = L1 ∩ L(E1).T ∗
117
• L2 → L3
Define
E2 = Σ {[paq][rbs] : [paq], [rbs] ∈ T, q ≠ r}
Let
L3 = L2 \ T ∗.L(E2).T ∗
L3 consists of the strings
[q0a1p1][p1a2p2] . . . [pn−1anpn]
in T ∗ such that
a1a2 . . . an ∈ L(A)
δ(q0, a1) = p1
pn ∈ F
118
• L3 → L4
Define
Eq = Σ {[ras] : [ras] ∈ T, r ≠ q, s ≠ q}
Let
L4 = L3 \ Σ_{q∈Q} L(Eq∗)
119
• L4 L
L = h(L4)
Now L =
120
Equivalence and Minimization of Automata
121
Example:
[Transition diagram omitted: a DFA with states A–H.]
δ̂(C, ε) ∈ F, δ̂(G, ε) ∉ F ⇒ C ≢ G
122
What about A and E?
[Transition diagram omitted: the same DFA.]
δ̂(A, ε) = A ∉ F, δ̂(E, ε) = E ∉ F
δ̂(A, 1) = F = δ̂(E, 1)
Conclusion: A ≡ E.
123
We can compute distinguishable pairs with the
following inductive table filling algorithm:
Example:
Applying the table filling algo to DFA A:
B x
C x x
D x x x
E x x x
F x x x x
G x x x x x x
H x x x x x x
A B C D E F G
124
Theorem 4.20: If p and q are not distin-
guished by the TF-algo, then p ≡ q.
Proof: Suppose to the contrary that there is a bad pair {p, q}, i.e.
1. ∃w : δ̂(p, w) ∈ F, δ̂(q, w) ∉ F , or vice versa.
2. The TF-algo does not distinguish p and q.
Let w = a1a2 · · · an be the shortest string identifying a bad pair {p, q}.
125
Consider states r = δ(p, a1) and s = δ(q, a1).
Now {r, s} cannot be a bad pair, since {r, s}
would be identified by a string shorter than w.
Therefore, the TF-algo must have discovered
that r and s are distinguishable.
126
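The table-filling algorithm is a straightforward fixpoint computation. A generic sketch (the three-state DFA at the bottom is our own toy example, built so that s0 and s2 come out equivalent):

```python
from itertools import combinations

def distinguishable_pairs(states, delta, final, alphabet):
    # Basis: mark {p, q} when exactly one of p, q is accepting.
    marked = {frozenset(pq) for pq in combinations(states, 2)
              if (pq[0] in final) != (pq[1] in final)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            # Induction: mark {p, q} if some input leads to a marked pair.
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

delta = {('s0', '0'): 's1', ('s0', '1'): 's0',
         ('s1', '0'): 's1', ('s1', '1'): 's0',
         ('s2', '0'): 's1', ('s2', '1'): 's0'}
marked = distinguishable_pairs(['s0', 's1', 's2'], delta, {'s1'}, '01')
```

Unmarked pairs are exactly the equivalent ones; here {s0, s2} stays unmarked.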
Testing Equivalence of Regular Languages
To test if L = M : convert both to DFA’s, imagine the DFA that is the union of the two (never mind that there are two start states), and run the TF-algo on it; L = M if and only if the two start states are equivalent.
127
Example:
[Transition diagrams omitted: two DFA’s, with states A, B and C, D, E.]
The TF-algo on their union gives the table
B x
C x
D x
E x x x
A B C D
129
Theorem 4.23: If p ≡ q and q ≡ r, then p ≡ r.
130
To minimize a DFA A = (Q, Σ, δ, q0, F ) con-
struct a DFA B = (Q/≡ , Σ, γ, q0/≡ , F/≡ ), where
γ(p/≡ , a) = δ(p, a)/≡
131
Example: We can minimize
[Transition diagram omitted: the DFA with states A–H from slide 122.]
to obtain
[Transition diagram omitted: the minimized DFA with states {A, E}, {B, H}, {C}, {D, F }, {G}.]
132
NOTE: We cannot apply the TF-algo to NFA’s.
[Transition diagram omitted: an NFA with states A, B, C.]
However, A ≢ C.
133
Why the Minimized DFA Can’t Be Beaten
134
Claim: For each state p in B there is at least
one state q in C, s.t. p ≡ q.
135
Context-Free Grammars and Languages
Consider Lpal = {w ∈ Σ∗ : w = w^R }, the language of palindromes over Σ = {0, 1}. It is not regular, but it is generated by the productions
1. P → ε
2. P → 0
3. P → 1
4. P → 0P 0
5. P → 1P 1
138
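Membership in Lpal can be decided by recursing directly on the five productions. A minimal sketch (for strings over {0, 1}):

```python
def in_Lpal(w):
    # Basis productions: P -> eps | 0 | 1
    if len(w) <= 1:
        return w in ('', '0', '1')
    # Inductive productions: P -> 0P0 | 1P1
    return w[0] == w[-1] and in_Lpal(w[1:-1])
```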
Formal definition of CFG’s
A context-free grammar is a quadruple
G = (V, T, P, S)
where
• V is a finite set of variables,
• T is a finite set of terminals,
• P is a finite set of productions A → α, with A ∈ V and α ∈ (V ∪ T )∗,
• S ∈ V is the start symbol.
139
Example: Gpal = ({P }, {0, 1}, A, P ), where A =
{P → , P → 0, P → 1, P → 0P 0, P → 1P 1}.
140
Example: (simple) expressions in a typical prog
lang. Operators are + and *, and arguments
are identifiers, i.e. strings in
L((a + b)(a + b + 0 + 1)∗)
1. E →I
2. E →E+E
3. E →E∗E
4. E → (E)
5. I →a
6. I →b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
141
Derivations using grammars
142
• Derivations: Let A → γ ∈ P and {α, β} ⊆ (V ∪ T )∗.
Then we write
αAβ ⇒_G αγβ
or, if G is understood,
αAβ ⇒ αγβ
and say that αAβ derives αγβ.
We define ⇒∗ to be the reflexive and transitive
closure of ⇒, IOW:
Basis: Let α ∈ (V ∪ T )∗. Then α ⇒∗ α.
Induction: If α ⇒∗ β, and β ⇒ γ, then α ⇒∗ γ.
143
Example: Derivation of a ∗ (a + b00) from E in
the grammar of slide 138:
E ⇒ E ∗ E ⇒ I ∗ E ⇒ a ∗ E ⇒ a ∗ (E) ⇒ · · · ⇒ a ∗ (a + b00)
Note: not every choice works; starting
E ⇒ E + E
won’t lead to a derivation of a ∗ (a + b00).
144
Leftmost and Rightmost Derivations
Leftmost ⇒lm : always replace the leftmost variable.
Rightmost ⇒rm : always replace the rightmost variable. For example:
E ⇒rm E ∗ E ⇒rm · · · ⇒rm a ∗ (a + b00)
We can conclude that E ⇒∗rm a ∗ (a + b00)
145
The Language of a Grammar
If G = (V, T, P, S) is a CFG, then
L(G) = {w ∈ T ∗ : S ⇒∗_G w}
Theorem 5.7: L(Gpal ) = {w ∈ {0, 1}∗ : w = w^R }
Proof: (⊇-direction.) Suppose w = w^R . We show by induction on |w| that w ∈ L(Gpal ).
146
Basis: |w| = 0, or |w| = 1. Then w is ε, 0,
or 1. Since P → ε, P → 0, and P → 1 are
productions, we conclude that P ⇒∗_G w in all
base cases.
Induction: Suppose |w| ≥ 2. Since w = w^R , we have w = 0x0 or w = 1x1, with x = x^R .
If w = 0x0 we know from the IH that P ⇒∗ x.
Then
P ⇒ 0P 0 ⇒∗ 0x0 = w
Thus w ∈ L(Gpal ). The case w = 1x1 is similar.
147
(⊆-direction.) We assume that w ∈ L(Gpal )
and must show that w = w^R .
Since w ∈ L(Gpal ), we have P ⇒∗ w.
We do an induction on the length of the derivation ⇒∗.
Basis: The derivation P ⇒∗ w is done in one
step. Then w is ε, 0, or 1, all palindromes.
Induction: The derivation takes n > 1 steps and must begin P ⇒ 0P 0 ⇒∗ 0x0 = w or P ⇒ 1P 1 ⇒∗ 1x1 = w. By the IH x = x^R , and hence w = w^R .
149
Example: Take G from slide 138. Then E ∗ (I + E)
is a sentential form; a ∗ E is a left-sentential form, and E ∗ (E + E) is a right-sentential form:
E ⇒lm E ∗ E ⇒lm I ∗ E ⇒lm a ∗ E
E ⇒rm E ∗ E ⇒rm E ∗ (E) ⇒rm E ∗ (E + E)
150
Parse Trees
A parse tree has interior nodes labelled by variables and leaves labelled by variables, terminals, or ε. If an interior node is labelled A and its children (from left to right) are labelled
X1, X2, . . . , Xk ,
then A → X1X2 . . . Xk ∈ P .
152
Example: In the grammar
1. E → I
2. E → E + E
3. E → E ∗ E
4. E → (E)
[Parse tree omitted: root E with children E, +, E; the left child E has child I.]
This parse tree shows the derivation E ⇒∗ I + E
153
Example: In the grammar
1. P →
2. P → 0
3. P → 1
4. P → 0P 0
5. P → 1P 1
[Parse tree omitted: root P with children 0, P, 0; the inner P has children 1, P, 1; the innermost P derives ε.]
It shows the derivation of P ⇒∗ 0110.
154
The Yield of a Parse Tree
155
Example: Below is an important parse tree
[Parse tree omitted: root E, yield a ∗ (a + b00).]
We shall see that parse trees, leftmost derivations, rightmost derivations, and recursive inference are equivalent representations.
157
From Inferences to Trees
158
Induction: w is inferred in n + 1 steps. Sup-
pose the last step was based on a production
A → X1X2 · · · Xk ,
where Xi ∈ V ∪ T . We break w up as
w1w2 · · · wk ,
where wi = Xi when Xi ∈ T , and when Xi ∈ V ,
then wi was previously inferred to be in the language of Xi, in
at most n steps.
[Parse tree omitted: root A with children X1, X2, . . . , Xk yielding w1, w2, . . . , wk .]
159
From trees to derivations
Example: In the expression grammar there is a derivation
E ⇒ I ⇒ Ib ⇒ ab.
Then, for any α and β there is a derivation
αEβ ⇒ αIβ ⇒ αIbβ ⇒ αabβ.
For example, suppose we have a derivation
E ⇒ E + E ⇒ E + (E).
Then we can choose α = E + ( and β = ) and
continue the derivation as above.
Consequently A → w ∈ P , and A ⇒lm w.
161
Induction: Height is n + 1. The tree must
look like
X1 X2 ... Xk
w1 w2 ... wk
1. If Xi ∈ T , then wi = Xi.
∗
2. If Xi ∈ V , then Xi ⇒ wi in G by the IH.
lm
162
Now we construct A ⇒∗lm w by an (inner) induc-
tion, by showing that
∀i : A ⇒∗lm w1w2 · · · wiXi+1Xi+2 · · · Xk .
163
(Case 2:) Xi ∈ V . By the IH there is a deriva-
tion Xi ⇒lm α1 ⇒lm α2 ⇒lm · · · ⇒lm wi. By the context-
free property of derivations we can proceed
with
A ⇒∗lm w1w2 · · · wi−1XiXi+1 · · · Xk
⇒lm w1w2 · · · wi−1α1Xi+1 · · · Xk
⇒lm w1w2 · · · wi−1α2Xi+1 · · · Xk
⇒lm · · ·
⇒lm w1w2 · · · wi−1wiXi+1 · · · Xk
164
Example: Let’s construct the leftmost deriva-
tion for the tree
[Parse tree omitted: the parse tree of a ∗ (a + b00) from slide 157.]
E ⇒lm E ∗ E ⇒lm I ∗ E ⇒lm a ∗ E ⇒lm a ∗ (E) ⇒lm a ∗ (E + E) ⇒lm a ∗ (I + E) ⇒lm a ∗ (a + E) ⇒lm a ∗ (a + I) ⇒lm a ∗ (a + I0) ⇒lm a ∗ (a + I00) ⇒lm a ∗ (a + b00)
166
From Derivations to Recursive Inferences
Observation: Suppose that A ⇒ X1X2 · · · Xk ⇒∗ w.
Then w = w1w2 · · · wk , where Xi ⇒∗ wi
The factor wi can be extracted from A ⇒∗ w by
looking at the expansion of Xi only.
Example: E ⇒∗ a ∗ b + a, and
E ⇒ E ∗ E + E, with X1 = E, X2 = ∗, X3 = E, X4 = +, X5 = E.
We have
E ⇒ E ∗ E ⇒ E ∗ E + E ⇒ I ∗ E + E ⇒ I ∗ I + E ⇒
I ∗ I + I ⇒ a ∗ I + I ⇒ a ∗ b + I ⇒ a ∗ b + a
168
∗
Induction: Suppose A ⇒ w in n + 1 steps.
G
Write the derivation as
∗
A ⇒ X1X2 · · · Xk ⇒ w
G G
169
Ambiguity in Grammars and Languages
In the grammar
1. E →I
2. E →E+E
3. E →E∗E
4. E → (E)
···
the sentential form E + E ∗ E has two deriva-
tions:
E ⇒E+E ⇒E+E∗E
and
E ⇒E∗E ⇒E+E∗E
This gives us two parse trees:
[Parse trees omitted: (a) root E with children E, +, E, the right child expanding to E ∗ E; (b) root E with children E, ∗, E, the left child expanding to E + E.]
170
The mere existence of several derivations is not
dangerous, it is the existence of several parse
trees that ruins a grammar.
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
the string a + b has several derivations, e.g.
E ⇒ E + E ⇒ I + E ⇒ a + E ⇒ a + I ⇒ a + b
and
E ⇒ E + E ⇒ E + I ⇒ E + b ⇒ I + b ⇒ a + b
However, their parse trees are the same, so the structure of a + b is unambiguous.
172
Removing Ambiguity From Grammars
173
Solution: We introduce more variables, each
representing expressions of same “binding strength.”
(a) Identifiers
174
We’ll let F stand for factors, T for terms, and E
for expressions. Consider the following gram-
mar:
1. I → a | b | Ia | Ib | I0 | I1
2. F → I | (E)
3. T →F |T ∗F
4. E →T |E+T
[Parse tree omitted: the unique tree for a + a ∗ a, with root E expanding to E + T and the T expanding to T ∗ F ; every leaf a is derived via I.]
175
Why is the new grammar unambiguous?
Intuitive explanation:
• A factor is either an identifier or (E), for some expression E.
• A term is a sequence
f1 ∗ f2 ∗ · · · ∗ fn−1 ∗ fn
of factors. Its only parse is the one that gives f1 ∗ f2 ∗ · · · ∗ fn−1
as a term and fn as a factor, as in the parse
tree on the next slide.
• An expression is a sequence
t1 + t2 + · · · + tn−1 + tn
of terms ti. It can only be parsed with
t1 + t2 + · · · + tn−1 as an expression and tn as
a term.
176
[Parse tree omitted: the left-leaning tree for a sequence of factors, with T ∗ F at every level.]
177
Leftmost derivations and Ambiguity
The two parse trees for a + a ∗ a
[Parse trees omitted.]
give rise to two derivations:
E ⇒lm E + E ⇒lm I + E ⇒lm a + E ⇒lm a + E ∗ E
⇒lm a + I ∗ E ⇒lm a + a ∗ E ⇒lm a + a ∗ I ⇒lm a + a ∗ a
and
E ⇒lm E ∗ E ⇒lm E + E ∗ E ⇒lm I + E ∗ E ⇒lm a + E ∗ E
⇒lm a + I ∗ E ⇒lm a + a ∗ E ⇒lm a + a ∗ I ⇒lm a + a ∗ a
178
In General:
179
Sketch of Proof: (Only If.) If the two parse
trees differ, they have a node at which dif-
ferent productions, say A → X1X2 · · · Xk and
B → Y1Y2 · · · Ym, are applied. The corresponding leftmost
derivations will use derivations based on these
two different productions and will thus be dis-
tinct.
180
Inherent Ambiguity
Example: Consider
L = {a^n b^n c^m d^m : n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n : n ≥ 1, m ≥ 1}
A grammar for L is
S → AB | C
A → aAb | ab
B → cBd | cd
C → aCd | aDd
D → bDc | bc
181
Let’s look at parsing the string aabbccdd.
[Parse trees omitted: (a) derives aabbccdd via S → AB; (b) derives aabbccdd via S → C.]
182
From this we see that there are two leftmost
derivations:
S ⇒lm AB ⇒lm aAbB ⇒lm aabbB ⇒lm aabbcBd ⇒lm aabbccdd
and
S ⇒lm C ⇒lm aCd ⇒lm aaDdd ⇒lm aabDcdd ⇒lm aabbccdd
183
Pushdown Automata
[Figure omitted: a PDA has a finite state control reading the input and a stack; its output is accept/reject.]
184
Example: Let’s consider
185
The PDA for Lwwr as a transition diagram:
[Transition diagram omitted: states q0, q1, q2; in q0 every input symbol is pushed; an ε-move guesses the middle and goes to q1; in q1 matching input and stack-top symbols are popped; with Z0 exposed an ε-move goes to the accepting state q2.]
186
PDA formally
A PDA is a seven-tuple:
P = (Q, Σ, Γ, δ, q0, Z0, F )
where Q is a finite set of states, Σ a finite input alphabet, Γ a finite stack alphabet, δ : Q × (Σ ∪ {ε}) × Γ → the finite subsets of Q × Γ∗ the transition function, q0 the start state, Z0 ∈ Γ the start symbol of the stack, and F ⊆ Q the set of accepting states.
187
Example: The PDA
[Transition diagram omitted: the PDA for Lwwr .]
is formally
P = ({q0, q1, q2}, {0, 1}, {0, 1, Z0}, δ, q0, Z0, {q2}),
where δ is given by the following table (set
brackets missing):
[Table omitted.]
188
Instantaneous Descriptions
A PDA’s configuration is described by an ID
(q, w, γ)
where q is the state, w the remaining input, and γ the stack content. A move is denoted ⊢;
we define ⊢∗ to be the reflexive-transitive clo-
sure of ⊢.
189
Example: On input 1111 the PDA
[Transition diagram omitted: the PDA for Lwwr .]
has the following computation tree of ID’s:
190
[Computation tree omitted: starting from (q0, 1111, Z0); the accepting branch ends (q1, ε, Z0) ⊢ (q2, ε, Z0).]
191
The following properties hold:
192
Theorem 6.5: ∀w ∈ Σ∗, γ ∈ Γ∗ :
(q, x, α) ⊢∗ (p, y, β) ⇒ (q, xw, αγ) ⊢∗ (p, yw, βγ).
Theorem 6.6:
(q, xw, α) ⊢∗ (p, yw, β) ⇒ (q, x, α) ⊢∗ (p, y, β).
193
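The relation ⊢∗ can be explored mechanically by a breadth-first search over IDs. A sketch for the Lwwr PDA (the dictionary encoding of δ is ours; this terminates because, for this PDA, only finitely many IDs are reachable on a given input):

```python
from collections import deque

# delta maps (state, input symbol or '', stack top) to a set of
# (new state, string pushed in place of the top).
delta = {('q1', '', 'Z'): {('q2', 'Z')}}
for a in '01':
    delta[('q1', a, a)] = {('q1', '')}          # pop on a match
    for X in '01Z':
        delta[('q0', a, X)] = {('q0', a + X)}   # push the input symbol
        delta[('q0', '', X)] = {('q1', X)}      # guess the middle

def accepts(w, start='q0', Z0='Z', final={'q2'}):
    # BFS over instantaneous descriptions (state, remaining input, stack).
    seen = {(start, w, Z0)}
    queue = deque(seen)
    while queue:
        q, x, stack = queue.popleft()
        if q in final and x == '':
            return True            # accept by final state, input consumed
        if not stack:
            continue
        top = stack[0]
        moves = [(x, m) for m in delta.get((q, '', top), ())]
        if x:
            moves += [(x[1:], m) for m in delta.get((q, x[0], top), ())]
        for rest, (p, push) in moves:
            nid = (p, rest, push + stack[1:])
            if nid not in seen:
                seen.add(nid)
                queue.append(nid)
    return False
```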
Acceptance by final state
194
(⊆-direction.)
Thus it is sufficient to show that if (q0, x, Z0) ⊢∗
(q1, ε, Z0) then x = ww^R , for some word w.
There are two moves for the PDA from ID (q0, x, α):
195
Move 1: The spontaneous (q0, x, α) ⊢ (q1, x, α).
Now (q1, x, α) ⊢∗ (q1, ε, β) implies that |β| < |α|,
which implies β ≠ α.
Move 2: Reading and pushing the first symbol, with x = a1a2 . . . an. The matching pop forces a1 = an and
(q0, a2 . . . an, a1α) ⊢∗ (q1, an, a1α).
By Theorem 6.6 we can remove an. Therefore
(q0, a2 . . . an−1, a1α) ⊢∗ (q1, ε, a1α).
Then, by the IH a2 . . . an−1 = yy^R . Then x =
a1yy^R an is a palindrome.
196
Acceptance by Empty Stack
197
From Empty Stack to Final State
Proof: Let
PF = (Q ∪ {p0, pf }, Σ, Γ ∪ {X0}, δF , p0, X0, {pf })
where δF (p0, , X0) = {(q0, Z0X0)}, and for all
q ∈ Q, a ∈ Σ∪{}, Y ∈ Γ : δF (q, a, Y ) = δN (q, a, Y ),
and in addition (pf , ) ∈ δF (q, , X0).
[Transition diagram omitted: a new start state p0 pushes Z0X0 and enters PN ’s start state q0; from every state of PN an ε-move on X0 leads to the new accepting state pf .]
198
We have to show that L(PF ) = N (PN ).
199
Let’s design PN for catching errors in strings
meant to be in the if-else-grammar G
S → ε | SS | iS | iSe.
Here e.g. {ieie, iie, iiee} ⊆ L(G), and e.g. {ei, ieeii} ∩ L(G) = ∅.
The diagram for PN is
[Transition diagram omitted: a single state q; reading i pushes a Z, reading e pops one.]
Formally,
200
From PN we can construct PF :
[Transition diagram omitted: a new start state p pushes ZX0 and enters q; an ε-move on X0 leads to the accepting state r.]
201
From Final State to Empty Stack
Proof: Let
[Transition diagram omitted: a new start state p0 pushes Z0X0 and enters PF ’s start state q0; from every accepting state of PF , ε-moves to a new state p pop the entire stack.]
202
We have to show that N (PN ) = L(PF ).
203
Equivalence of PDA’s and CFG’s
A language is generated by a CFG if and only if it is accepted by a PDA by empty stack, if and only if it is accepted by a PDA by final state.
[Diagram omitted: Grammar ↔ PDA by empty stack ↔ PDA by final state.]
Given G, we construct a PDA that simulates ⇒lm.
We write left-sentential forms as
xAα
where A is the leftmost variable in the form.
For instance,
(a+ E )
is written as x A α with x = (a+, A = E, α = ); Aα is called the tail of the form.
Proof: We show by an induction on the length of the derivation that if
S = γ1 ⇒lm γ2 ⇒lm · · · ⇒lm γn = w
then
(q, w, S) ⊢∗ (q, yi, αi),
where αi is the tail of γi and w = xiyi.
207
Basis: For i = 1, γ1 = S. Thus x1 = ε, and
y1 = w. Clearly (q, w, S) ⊢∗ (q, w, S).
Induction: IH is (q, w, S) ⊢∗ (q, yi, αi). We have
to show that
(q, yi, αi) ⊢∗ (q, yi+1, αi+1)
Now αi begins with a variable A, and the step γi ⇒lm γi+1 replaces A by some body β:
xiAχ ⇒lm xiβχ
(♣) If (q, x, A) ⊢∗ (q, ε, ε), then A ⇒∗ x.
209
We can now write x as x1x2 · · · xn, according
to the figure below, where Y1 = B, Y2 = a, and
Y3 = C.
[Figure omitted: the stack symbols Y1, Y2, Y3 are popped while consuming x1, x2, x3.]
210
Now we can conclude that
(q, xixi+1 · · · xk , Yi) ⊢∗ (q, xi+1 · · · xk , ε)
in fewer than n steps, for all i ∈ {1, . . . , k}. If Yi
is a variable we have by the IH and Theorem
6.6 that
Yi ⇒∗ xi
If Yi is a terminal, we have |xi| = 1, and Yi = xi.
Thus Yi ⇒∗ xi by the reflexivity of ⇒∗.
211
From PDA’s to CFG’s
The grammar uses variables of the form [pXq], generating exactly the strings that take the PDA from state p with X on top of the stack to state q with X popped.
[Figure omitted: popping Y1, Y2, . . . , Yk takes the PDA through states p0, p1, . . . , pk while consuming x1x2 · · · xk .]
213
Example: Let’s convert the one-state if-else PDA from slide 200.
[Transition diagram omitted.]
214
Example: Let P = ({p, q}, {0, 1}, {X, Z0}, δ, q, Z0),
where δ is given by
to a CFG.
215
We get G = (V, {0, 1}, R, S), where
S → [qZ0q]|[qZ0p]
[qZ0q] → 1[qXq][qZ0q]
[qZ0q] → 1[qXp][pZ0q]
[qZ0p] → 1[qXq][qZ0p]
[qZ0p] → 1[qXp][pZ0p]
[qXq] → 1[qXq][qXq]
[qXq] → 1[qXp][pXq]
[qXp] → 1[qXq][qXp]
[qXp] → 1[qXp][pXp]
216
From rule (3):
[qXq] → 0[pXq]
[qXp] → 0[pXp]
[qXq] →
[pXp] → 1
[pZ0q] → 0[qZ0q]
[pZ0p] → 0[qZ0p]
217
Theorem 6.14: Let G be constructed from a
PDA P as above. Then L(G) = N (P )
Proof: (⊆-direction.) We show by an induction on the length of the computation that
(♠) If (q, w, X) ⊢∗ (p, ε, ε) then [qXp] ⇒∗ w.
218
Induction: Length is n > 1, and ♠ holds for
lengths < n. We must have
219
We then get the following derivation sequence:
[qXrk ] ⇒ a[r0Y1r1] · · · [rk−1Yk rk ]
⇒∗ aw1[r1Y2r2][r2Y3r3] · · · [rk−1Yk rk ]
⇒∗ aw1w2[r2Y3r3] · · · [rk−1Yk rk ]
⇒∗ · · ·
⇒∗ aw1w2 · · · wk = w
220
(⊇-direction.) We shall show by an induction
on the length of the derivation ⇒∗ that
(♥) If [qXp] ⇒∗ w then (q, w, X) ⊢∗ (p, ε, ε)
Induction: Length of ⇒∗ is n > 1, and ♥ holds
for lengths < n. Then we must have
[qXrk ] ⇒ a[r0Y1r1][r1Y2r2] · · · [rk−1Yk rk ] ⇒∗ w
We can break w into aw1w2 · · · wk such that [ri−1Yiri] ⇒∗ wi. From the IH we get
(ri−1, wi, Yi) ⊢∗ (ri, ε, ε)
221
From Theorem 6.5 we get
(ri−1, wiwi+1 · · · wk , YiYi+1 · · · Yk ) ⊢∗ (ri, wi+1 · · · wk , Yi+1 · · · Yk )
222
Deterministic PDA’s
Example: the PDA for Lwwr becomes deterministic if we replace the guess of the middle by an explicit center-marker c, accepting Lwcwr = {wcw^R : w ∈ {0, 1}∗}:
[Transition diagram omitted: as for Lwwr , but the move from q0 to q1 reads c instead of ε.]
223
We’ll show that Regular ⊂ L(DPDA) ⊂ CFL
Proof: Homework
225
• We have seen that Regular⊆ L(DPDA).
228
Chomsky Normal Form
229
Eliminating Useless Symbols
• A symbol X is generating if X ⇒∗_G w, for some w ∈ T ∗
• A symbol X is reachable if S ⇒∗_G αXβ, for some {α, β} ⊆ (V ∪ T )∗
230
Example: Let G be
S → AB|a, A → b
232
Proof: We first prove that G1 has no useless
symbols:
Let X remain in V1 ∪ T1. Thus X ⇒∗ w in G1, for
some w ∈ T ∗. Moreover, every symbol used in
this derivation is also generating. Thus X ⇒∗ w
in G2 also.
Then, let w ∈ L(G). Thus S ⇒∗_G w. Each sym-
bol in this derivation is evidently both reach-
able and generating, so this is also a derivation
of G1.
Thus w ∈ L(G1).
234
We have to give algorithms to compute the
generating and reachable symbols of G = (V, T, P, S).
The generating symbols g(G) are computed by the following closure algorithm:
Basis: g(G) == T
Induction: If A → α ∈ P and every symbol of α is in g(G), add A to g(G).
235
Theorem 7.4: At saturation, g(G) contains
all and only the generating symbols of G.
Proof:
Proof: Homework.
237
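The closure computation of g(G) is only a few lines of code. A sketch, run on the grammar S → AB | a, A → b from slide 232 (the list-of-pairs encoding is ours):

```python
def generating(productions, terminals):
    # Basis: g = T.  Induction: add head A whenever some body
    # A -> alpha consists entirely of symbols already in g.
    g = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in g and all(s in g for s in body):
                g.add(head)
                changed = True
    return g

prods = [('S', ['A', 'B']), ('S', ['a']), ('A', ['b'])]
g = generating(prods, {'a', 'b'})
```

Here B never becomes generating, so the production S → AB would be removed.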
Eliminating ε-Productions
A variable A is nullable if A ⇒∗ ε. The nullable symbols n(G) are computed by the closure algorithm:
Basis: n(G) == {A : A → ε ∈ P }
Induction: If A → α ∈ P and every symbol of α is in n(G), add A to n(G).
239
Example: Let G be a grammar in which we delete nullable symbols from the bodies in all possible ways, e.g.
A → aAA | aA | aA | a
(the second and the third variant coincide: they arise from deleting either one of the two A’s)
B → bBB | bB | bB | b
We then delete rules with ε-bodies, and end up
with grammar G1 :
240
Theorem 7.9: L(G1) = L(G) \ {ε}.
Proof: We prove the stronger claim
(]) A ⇒∗ w in G1 if and only if w ≠ ε and A ⇒∗ w in G.
⊆-direction: Suppose A ⇒∗ w in G1. Then
clearly w ≠ ε (Why?). We’ll show by an in-
duction on the length of the derivation that
A ⇒∗ w in G also.
Induction: The first step uses a production A → X1X2 · · · Xk of G1, obtained from a production A → Y1Y2 · · · Ym of G by deleting nullable symbols.
Furthermore, w = w1w2 · · · wk , and Xi ⇒∗ wi in
G1 in fewer than n steps. By the IH we have
Xi ⇒∗ wi in G. Now we get
A ⇒_G Y1Y2 · · · Ym ⇒∗_G X1X2 · · · Xk ⇒∗_G w1w2 · · · wk = w
242
∗
⊇-direction: Let A ⇒∗_G w, and w ≠ ε. We’ll show
by induction on the length of the derivation
that A ⇒∗ w in G1.
The first step uses a production A → Y1Y2 · · · Ym; let X1X2 · · · Xk be the Yj that do not derive ε.
Now X1X2 · · · Xk ⇒∗_G w (Why?)
243
Each Xj ⇒∗_G wj in fewer than n steps, so by the
IH we have that if wj ≠ ε then Xj ⇒∗ wj in G1.
Thus
A ⇒ X1X2 · · · Xk ⇒∗ w in G1
244
Eliminating Unit Productions
A→B
is a unit production, whenever A and B are
variables.
I → a | b | Ia | Ib | I0 | I1
F→ I | (E)
T→ F |T ∗F
E→ T |E+T
245
We’ll expand rule E → T and get rules
E → F, E → T ∗ F
We then expand E → F and get
E → I|(E)|T ∗ F
Finally we expand E → I and get
E → a | b | Ia | Ib | I0 | I1 | (E) | T ∗ F
A → B, B → C, C → A
246
(A, B) is a unit pair if A ⇒∗ B using unit pro-
ductions only.
Note: In A → BC, C → ε we have A ⇒ B, but
not using unit productions only.
The unit pairs u(G) are computed by the closure algorithm:
Basis: (A, A) ∈ u(G) for every A ∈ V .
Induction: If (A, B) ∈ u(G) and B → C ∈ P is a unit production, add (A, C) to u(G).
Proof: Easy.
247
Given G = (V, T, P, S) we can construct G1 =
(V, T, P1, S) that doesn’t have unit productions,
and such that L(G1) = L(G) by setting
P1 = {A → α : α ∉ V, B → α ∈ P, (A, B) ∈ u(G)}
Pair Productions
(E, E) E →E+T
(E, T ) E →T ∗F
(E, F ) E → (E)
(E, I) E → a | b | Ia | Ib | I0 | I1
(T, T ) T →T ∗F
(T, F ) T → (E)
(T, I) T → a | b | Ia | Ib | I0 | I1
(F, F ) F → (E)
(F, I) F → a | b | Ia | Ib | I0 | I1
(I, I) I → a | b | Ia | Ib | I0 | I1
To clean up a grammar we
1. Eliminate ε-productions
2. Eliminate unit productions
3. Eliminate useless symbols
in this order.
249
Chomsky Normal Form, CNF
Every production is of one of the forms
• A → BC, where {A, B, C} ⊆ V , or
• A → α, where A ∈ V , and α ∈ T .
250
• For step 2, for every terminal a that appears
in a body of length ≥ 2, create a new variable,
say A, and replace a by A in all bodies.
Then add a new rule A → a.
A → B1 B2 · · · Bk ,
k ≥ 3, introduce new variables C1, C2, . . . Ck−2,
and replace the rule with
A → B1C1
C1 → B2C2
···
Ck−3 → Bk−2Ck−2
Ck−2 → Bk−1Bk
251
Illustration of the effect of step 3
[Figures omitted: (a) the chain of new variables C1, . . . , Ck−2 produced by step 3; (b) the original flat production A → B1B2 . . . Bk .]
252
Example of CNF conversion
We start from the cleaned-up expression grammar
E → E + T | T ∗ F | (E) | a | b | Ia | Ib | I0 | I1
T → T ∗ F | (E) | a | b | Ia | Ib | I0 | I1
F → (E) | a | b | Ia | Ib | I0 | I1
I → a | b | Ia | Ib | I0 | I1
For step 2 we introduce variables for the terminals in long bodies:
E → EP T | T M F | LER | a | b | IA | IB | IZ | IO
T → T M F | LER | a | b | IA | IB | IZ | IO
F → LER | a | b | IA | IB | IZ | IO
I → a | b | IA | IB | IZ | IO
A → a, B → b, Z → 0, O → 1
P → +, M → ∗, L → (, R → )
253
For step 3, we replace
E → EP T by E → EC1, C1 → P T
E → T M F, T → T M F by E → T C2, T → T C2, C2 → M F
E → LER, T → LER, F → LER by E → LC3, T → LC3, F → LC3, C3 → ER
The final CNF grammar is
E → EC1 | T C2 | LC3 | a | b | IA | IB | IZ | IO
T → T C2 | LC3 | a | b | IA | IB | IZ | IO
F → LC3 | a | b | IA | IB | IZ | IO
I → a | b | IA | IB | IZ | IO
C1 → P T, C2 → M F, C3 → ER
A → a, B → b, Z → 0, O → 1
P → +, M → ∗, L → (, R → )
254
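Step 3 is entirely mechanical. A sketch of the body-splitting transformation (the fresh names C1, C2, . . . are generated on demand; encoding is ours):

```python
from itertools import count

def binarize(head, body, fresh):
    # A -> B1 B2 ... Bk  becomes  A -> B1 C1, C1 -> B2 C2, ...,
    # C_{k-2} -> B_{k-1} B_k.  Bodies of length <= 2 are kept.
    if len(body) <= 2:
        return [(head, body)]
    rules, prev = [], head
    for sym in body[:-2]:
        c = next(fresh)
        rules.append((prev, [sym, c]))
        prev = c
    rules.append((prev, body[-2:]))
    return rules

fresh = (f'C{i}' for i in count(1))
rules = binarize('A', ['B1', 'B2', 'B3', 'B4'], fresh)
```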
The size of parse trees
Proof: Induction on n.
The decomposition z = uvwxy given by the pumping lemma for CFL’s satisfies
1. |vwx| ≤ n.
2. vx ≠ ε
[Parse-tree figure omitted: a path with a repeated variable Ai = Aj splits the yield as u v w x y.]
256
Proof:
[Figure omitted: a longest path in the tree, labelled A0, A1, A2, . . . , Ak and ending in a terminal a.]
257
Since G has only m variables, at least one vari-
able has to be repeated. Suppose Ai = Aj ,
where k − m ≤ i < j ≤ k (choose Ai as close to
the bottom as possible).
[Figure omitted: the subtree rooted at Ai = Aj with inner subtree rooted at Aj , splitting the yield as u v w x y.]
258
Then we can pump the tree in (a) as uv^0wx^0y
(tree (b)) or uv^2wx^2y (tree (c)), and in general
as uv^iwx^iy, i ≥ 0.
[Parse-tree figures omitted: (a) the original tree; (b) the tree with the v–x portion cut out; (c) the tree with the v–x portion repeated.]
259
Closure Properties of CFL’s
Consider a mapping
s : Σ → 2^(∆∗)
where Σ and ∆ are finite alphabets. Let w ∈ Σ∗,
where w = a1a2 . . . an, and define
s(w) = s(a1).s(a2). · · · .s(an) and s(L) = ⋃_{w∈L} s(w)
267
Example: Σ = {0, 1}, ∆ = {a, b},
s(0) = {anbn : n ≥ 1}, s(1) = {aa, bb}.
268
Proof: We start with grammars
G = (V, Σ, P, S)
for L, and Ga = (Va, Ta, Pa, Sa) for each s(a). We construct
G0 = (V 0, T 0, P 0, S)
where
V 0 = (⋃_{a∈Σ} Va) ∪ V
T 0 = ⋃_{a∈Σ} Ta
and P 0 consists of all the Pa’s together with P , in which each terminal a in a body is replaced by Sa.
269
Now we have to show that
• L(G0) = s(L).
[Parse trees omitted: a tree of G0 is a tree of G in which each leaf a is replaced by a tree of Ga with root Sa and yield xa ∈ s(a).]
271
Applications of the Substitution Theorem
For a homomorphism h, take the substitution
a ↦ {h(a)}
Then h(L) = s(L).
272
Theorem: If L is CF, then so is LR .
Proof sketch: let L = L(G) and form GR with productions
P R = {A → αR : A → α ∈ P }
Show at home by inductions on the lengths of
the derivations in G (for one direction) and in
GR (for the other direction) that (L(G))R =
L(GR ).
273
CFL’s are not closed under ∩
Example: {a^n b^n c^m : n, m ≥ 1} and {a^n b^m c^m : n, m ≥ 1} are CF, but their intersection {a^n b^n c^n : n ≥ 1} is not.
274
CF ∩ Regular = CF
Let L be accepted by PDA
P = (QP , Σ, Γ, δP , qP , Z0, FP )
by final state, and let R be accepted by DFA
A = (QA, Σ, δA, qA, FA)
[Figure omitted: the input is fed in parallel to the FA state and to the PDA with its stack.]
275
Formally, define
P 0 = (QP × QA, Σ, Γ, δ, (qP , qA), Z0, FP × FA)
with δ((q, p), a, X) = {((r, δA(p, a)), γ) : (r, γ) ∈ δP (q, a, X)}.
Prove at home by an induction on ⊢∗, both for P
and for P 0, that
(qP , w, Z0) ⊢∗_P (q, ε, γ) and δ̂(qA, w) ∈ FA
if and only if
((qP , qA), w, Z0) ⊢∗ ((q, δ̂(qA, w)), ε, γ)
276
Theorem 7.29: Let L, L1, L2 be CFL’s and
R regular. Then
1. L \ R is CF
2. L̄ is not necessarily CF
3. L1 \ L2 is not necessarily CF
Proof:
1. L \ R = L ∩ R̄ is CF.
2. If L̄ always were CF, then
L1 ∩ L2 = the complement of (L̄1 ∪ L̄2)
always would be CF.
277
Inverse homomorphism
[Figure omitted: on input a, the string h(a) is placed in a buffer and fed symbol by symbol to a PDA for L, which answers accept/reject.]
278
Let L be accepted by PDA

P = (Q, Θ, Γ, δ, q0, Z0, F )

and construct P ′ = (Q′, Σ, Γ, δ ′, (q0, ε), Z0, F × {ε}), where

• Q′ = {(q, x) : q ∈ Q, x ∈ suffix(h(a)), a ∈ Σ}

• δ ′((q, ε), a, X) = {((q, h(a)), X) : ε ≠ a ∈ Σ, q ∈ Q, X ∈ Γ}

• δ ′((q, bx), ε, X) contains ((p, x), γ) whenever δ(q, b, X) contains (p, γ) — the buffer is consumed symbol by symbol

Then

• (q0, h(w), Z0) ⊢∗ (p, ε, γ) in P if and only if ((q0, ε), w, Z0) ⊢∗ ((p, ε), ε, γ) in P ′.
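Operationally the buffer construction says: w ∈ h−1(L) iff h(w) ∈ L. A minimal sketch, where the homomorphism h and the language L are hypothetical examples:

```python
def apply_h(w, table):
    """Apply a homomorphism symbol by symbol: h(a1...an) = h(a1)...h(an)."""
    return "".join(table[a] for a in w)

def in_h_inverse(w, table, in_L):
    """w is in h^{-1}(L) iff h(w) is in L; in_L plays the machine for L
    that the buffered PDA feeds."""
    return in_L(apply_h(w, table))

# Hypothetical example: h(a) = 01, h(b) = 10; L = {0^n 1^n : n >= 1}
table = {"a": "01", "b": "10"}
in_L = lambda s: s != "" and s == "0" * (len(s) // 2) + "1" * (len(s) // 2)
assert in_h_inverse("a", table, in_L)        # h(a) = 01 is in L
assert not in_h_inverse("ab", table, in_L)   # h(ab) = 0110 is not
```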
279
Decision Properties of CFL’s
280
Converting between CFA’s and PDA’s
• Input size is n.
281
Avoidable exponential blow-up
(slide 210)
282
• Now, k will be ≤ 2 for all rules.
283
Converting into CNF
Good news:
(slides 229,232,234)
(slides 244,245)
(slide 248)
284
Bad news:
(slide 236)
285
Testing emptiness of CFL's

L(G) ≠ ∅ if and only if the start symbol S is generating.

[Figure: the data structure for the O(n) algorithm: each variable has a "generating?" flag, and each production keeps a count of the variables in its body not yet known to be generating, e.g. count 3 for a body with three occurrences of B, count 2 for the body BA.]
Creation and initialization of the array is O(n)
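The generating-symbols test can be sketched as a simple fixed-point computation; this plain version is O(n²), while the counting structure on the slide brings the whole test to O(n). Variables are uppercase and terminals lowercase by convention here:

```python
def is_empty(productions, start):
    """L(G) is empty iff the start symbol is not generating.
    A symbol is generating if some production rewrites it into a body
    consisting only of terminals and generating variables."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and all(
                    s.islower() or s in generating for s in body):
                generating.add(head)
                changed = True
    return start not in generating

P = [("S", "AB"), ("A", "a")]                 # B never generates
assert is_empty(P, "S")
P2 = [("S", "AB"), ("A", "a"), ("B", "b")]
assert not is_empty(P2, "S")
```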
287
w ∈ L(G)?

Inefficient way: convert G to CNF; a parse tree for w then has 2|w| − 1 internal nodes, so we could try all parse trees of that size — but there are exponentially many.
288
CYK-algo for membership testing

Input is w = a1a2 · · · an. We fill a triangular table whose entry Xij is the set of variables deriving ai · · · aj :

X15
X14  X25
X13  X24  X35
X12  X23  X34  X45
X11  X22  X33  X44  X55
a1   a2   a3   a4   a5
289
To fill the table we work row-by-row, upwards

Basis: Xii = {A : A → ai is in G}

Induction:

A ∈ Xij if and only if

A ⇒∗ aiai+1 · · · aj , if and only if

for some k < j, and A → BC, we have

B ⇒∗ aiai+1 · · · ak , and C ⇒∗ ak+1ak+2 · · · aj , i.e.,

B ∈ Xik , and C ∈ Xk+1,j
290
Example:
G has productions
S → AB|BC
A → BA|a
B → CC|b
C → AB|a
For w = baaba the completed table is

{S,A,C}
-        {S,A,C}
-        {B}      {B}
{S,A}    {B}      {S,C}    {S,A}
{B}      {A,C}    {A,C}    {B}      {A,C}
b        a        a        b        a
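The table-filling above can be run directly. A sketch of the CYK algorithm on the slide's grammar and the string baaba:

```python
def cyk(w, prods, start="S"):
    """CYK table for a CNF grammar: X[i][j] holds the variables that
    derive w[i..j] (0-based, inclusive)."""
    n = len(w)
    X = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(w):                      # basis: the X_ii row
        X[i][i] = {A for A, rhs in prods if rhs == a}
    for length in range(2, n + 1):                 # induction, upwards
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):                  # split w[i..k] | w[k+1..j]
                for A, rhs in prods:
                    if (len(rhs) == 2 and rhs[0] in X[i][k]
                            and rhs[1] in X[k + 1][j]):
                        X[i][j].add(A)
    return start in X[0][n - 1], X

prods = [("S", "AB"), ("S", "BC"), ("A", "BA"), ("A", "a"),
         ("B", "CC"), ("B", "b"), ("C", "AB"), ("C", "a")]
ok, X = cyk("baaba", prods)
assert ok and X[0][4] == {"S", "A", "C"}
```

Since S ∈ X15, the string baaba is in L(G).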
291
To compute Xij we need to compare at most n pairs of previously computed sets:

(Xi,i , Xi+1,j ), (Xi,i+1 , Xi+2,j ), . . . , (Xi,j−1 , Xj,j )

There are O(n2) entries, so the whole table is filled in O(n3) time.
293
294
Problems that computers cannot solve

main()
{
    printf("hello, world\n");
}
295
What about the program in Fig. 8.2 in the
textbook?
x^n + y^n = z^n
has a solution where x, y, and z are integers.
296
The hypothetical “Hello, world” tester H

[Figure: H takes a program P and an input I, and outputs yes if P prints hello, world on input I, and no otherwise.]

Modify H into H1, which behaves like H except that when H would output no, H1 prints hello, world instead.

[Figure: H1 takes P and I, and either outputs yes or prints hello, world.]
297
Modify H1 so that it takes only P as input, stores P and uses it both as P and I. We get program H2.

[Figure: H2 takes P alone, and either outputs yes or prints hello, world.]

Now run H2 on itself. If H2 outputs yes on input H2, then by definition H2 on input H2 prints hello, world — but it printed yes. If instead H2 prints hello, world on input H2, then H2 on input H2 does not print hello, world — but it just did. Either way we have a contradiction, so H cannot exist.
Turing Machines

[Figure: a finite control with a head scanning one cell of a two-way infinite tape · · · B B X1 X2 · · · Xi · · · Xn B B · · ·]
In a move, a TM will

1. Change state

2. Write a tape symbol in the cell scanned

3. Move the head left or right
299
A (deterministic) Turing Machine is a 7-tuple

M = (Q, Σ, Γ, δ, q0, B, F ),

where Q is the finite set of states, Σ the input alphabet, Γ ⊇ Σ the tape alphabet, δ : Q × Γ → Q × Γ × {L, R} the (partial) transition function, q0 ∈ Q the start state, B ∈ Γ \ Σ the blank, and F ⊆ Q the set of final states.
300
Instantaneous Description

X1X2 · · · Xi−1qXiXi+1 · · · Xn

where q is the current state, the head scans Xi, and X1 · · · Xn is the tape between the leftmost and rightmost nonblanks.
301
The moves and the language of a TM

We’ll use ⊢M to indicate a move by M from a configuration to another.

If δ(q, Xi ) = (p, Y, L), then

X1X2 · · · Xi−1qXiXi+1 · · · Xn ⊢M X1X2 · · · Xi−2pXi−1Y Xi+1 · · · Xn

If δ(q, Xi ) = (p, Y, R), then

X1X2 · · · Xi−1qXiXi+1 · · · Xn ⊢M X1X2 · · · Xi−1Y pXi+1 · · · Xn

The language of M is L(M ) = {w : q0w ⊢∗M αpβ for some p ∈ F }.
302
A TM for {0n 1n : n ≥ 1}

[Figure: transition diagram with states q0, . . . , q4: in q0 mark a 0 as X and move right (q1); in q1 run right over 0's and Y's to the first 1, mark it Y and move left (q2); in q2 run back to the X and repeat; when only Y's remain, q3 scans them (Y/Y) and accepts in q4 on the blank (B/B).]
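A small simulator makes the machine concrete. The transition table below is a reconstruction of the standard textbook machine for this language, an assumption about the figure rather than a transcription of it:

```python
def run_tm(delta, w, q0="q0", blank="B", accept=frozenset({"q4"}),
           limit=10_000):
    """Simulate a deterministic TM on input w; accept by entering a
    final state, reject by halting elsewhere (or on the step limit)."""
    tape = dict(enumerate(w))            # sparse tape, blanks implicit
    q, head = q0, 0
    for _ in range(limit):
        if q in accept:
            return True
        key = (q, tape.get(head, blank))
        if key not in delta:             # no move: halt and reject
            return False
        q, sym, d = delta[key]
        tape[head] = sym
        head += 1 if d == "R" else -1
    return False

# Reconstructed table: mark a 0 as X, find and mark a matching 1 as Y,
# repeat; accept when only X's and Y's remain.
delta = {
    ("q0", "0"): ("q1", "X", "R"), ("q0", "Y"): ("q3", "Y", "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"), ("q3", "B"): ("q4", "B", "R"),
}
assert run_tm(delta, "0011") and run_tm(delta, "000111")
assert not run_tm(delta, "0010") and not run_tm(delta, "")
```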
303
A TM with “output”

[Figure: transition diagram of a TM with states q0, . . . , q6 that computes proper subtraction ("monus") on input 0m10n, repeatedly cancelling a leading 0 against a 0 after the 1 and leaving 0 to the power m ∸ n on the tape.]
304
Programming Techniques for TM’s

• Storage in State

• Multiple Tracks, e.g. for {wcw : w ∈ {0, 1}∗}

[Figure: the finite control stores extra components (A, B, C); the tape is viewed as several tracks (track 1: X, track 2: Y, track 3: Z). For wcw two tracks suffice: one holds a checkoff mark ∗, the other the input symbol, so]

Γ = {B, ∗} × {0, 1, c, B}
306
• Subroutines

[Figure: a TM built around a “copy” subroutine: states q1–q5 copy a block of 0's (marking with X, writing with B/0), while states q10–q12 clean up the markers between calls.]
307
Variations of the basic TM

• Multitape TM’s

[Figure: one finite control with several two-way infinite tapes, each with its own head.]
308
Theorem: Every language accepted by a multi-tape TM M is RE.

Proof: simulate the k tapes of M on a one-tape TM with 2k tracks: for each tape of M , one track holds its contents (A1A2 · · · , B1B2 · · · ) and another marks its head position with an X.
309
• Nondeterministic TM’s

Every language accepted by an NTM is RE.

[Figure: a deterministic TM simulates the NTM breadth-first, keeping a queue of ID's (ID1 ∗ ID2 ∗ ID3 ∗ ID4 ∗ · · · ) on one tape and using a scratch tape: it repeatedly dequeues an ID and enqueues all of its successor ID's.]
310
Undecidability

We shall prove undecidable the language of pairs (M, w) such that:

1. M is a (suitably encoded) TM with input alphabet {0, 1}.

2. w ∈ {0, 1}∗.

3. M accepts w.
311
The landscape of languages (problems)

[Figure: three regions — the recursive languages; the languages that are RE but not recursive, containing Lu ; and the non-RE languages, containing Ld .]
312
Encoding TM’s

Rename the states q1, q2, . . . (q1 the start state), the tape symbols X1, X2, . . . (0 = X1, 1 = X2, B = X3), and the directions D1 = L, D2 = R. The rule δ(qi , Xj ) = (qk , Xl , Dm) is encoded as 0^i 1 0^j 1 0^k 1 0^l 1 0^m, and M as the concatenation of its rule codes separated by 11.
313
Example:

M = ({q1, q2, q3}, {0, 1}, {0, 1, B}, δ, q1, B, {q2}),

where

δ        0              1              B
→ q1     —              (q3 , 0, R)    —
? q2     —              —              —
q3       (q1 , 1, R)    (q2 , 0, R)    (q3 , 1, L)

The four rules are encoded as

δ(q1, 1) = (q3, 0, R):  0100100010100
δ(q3, 0) = (q1, 1, R):  0001010100100
δ(q3, 1) = (q2, 0, R):  00010010010100
δ(q3, B) = (q3, 1, L):  0001000100010010

and M itself as

01001000101001100010101001001100010010010100110001000100010010
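The rule codes above can be regenerated mechanically under the stated numbering (0 = X1, 1 = X2, B = X3, L = D1, R = D2):

```python
def encode_rule(i, j, k, l, m):
    """Code of delta(q_i, X_j) = (q_k, X_l, D_m): 0^i 1 0^j 1 0^k 1 0^l 1 0^m."""
    return "1".join("0" * n for n in (i, j, k, l, m))

# The four rules listed above, in order
rules = [(1, 2, 3, 1, 2),   # delta(q1, 1) = (q3, 0, R)
         (3, 1, 1, 2, 2),   # delta(q3, 0) = (q1, 1, R)
         (3, 2, 2, 1, 2),   # delta(q3, 1) = (q2, 0, R)
         (3, 3, 3, 2, 1)]   # delta(q3, B) = (q3, 1, L)
codes = [encode_rule(*r) for r in rules]
assert codes[0] == "0100100010100"
# Joining with the separator 11 reproduces the code of M
assert ("11".join(codes) ==
        "01001000101001100010101001001100010010010100110001000100010010")
```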
314
The diagonalization language Ld

Let wi be the i-th binary string and Mi the TM encoded by wi . Build an infinite table whose entry (i, j) is 1 exactly when Mi accepts wj :

         j = 1  2  3  4  . . .
i = 1        0  1  1  0  . . .
    2        1  1  0  0  . . .
    3        0  0  1  1  . . .
    4        0  1  0  1  . . .

Complementing the diagonal gives a characteristic vector that differs from every row, i.e. from the characteristic vector of every L(Mi).

Define Ld = {wi : wi ∉ L(Mi)}
Theorem: If L is recursive, so is L̄.

Proof: [Figure: take a TM M for L that always halts, and swap its accept and reject outcomes.]

Theorem: If both L and L̄ are RE, then L is recursive.

Proof: [Figure: on input w, run a TM M1 for L and a TM M2 for L̄ in parallel; accept if M1 accepts, reject if M2 accepts — exactly one of the two must happen.]
316
The universal language Lu

Lu = {(M, w) : M accepts w}, accepted by the universal TM U with L(U ) = Lu .

[Figure: U has a finite control and tapes holding the input M w, the simulated tape of M , the state of M , and scratch space.]
317
Operation of U :

1. Check that the input has the form enc(M )111w for a legal TM code enc(M ).

2. Initialize a tape to hold w in encoded form.

3. Write the code of the start state on the state tape and mark the first simulated cell as the head position.

4. Search the code of M for a rule matching the current state and scanned symbol.

5. Make the move: update the simulated tape, state, and head position, and accept if M enters an accepting state.
319
Theorem: Lu is RE but not recursive.

Proof: Lu is RE since U accepts it. Suppose Lu were recursive; then so would L̄u be, accepted by some always-halting M . Build M ′ that, on input wi , runs M on wi 111wi . Then

• wi ∈ L(M ′) ⇒ wi 111wi ∈ L̄u ⇒ wi ∉ L(Mi ) ⇒ wi ∈ Ld

so M ′ would accept Ld — contradicting the fact that Ld is not RE.
320
Reductions for proving lower bounds

A reduction from P1 to P2 maps yes-instances of P1 to yes-instances of P2 and no-instances to no-instances; if P1 is undecidable (or non-RE), then so is P2.

[Figure: instances of P1 mapped into instances of P2, yes to yes and no to no.]
Le = {enc(M ) : L(M ) = ∅}
Lne = {enc(M ) : L(M ) 6= ∅}

Lne is RE: [Figure: a nondeterministic TM for Lne guesses a string w and uses U to check whether Mi accepts w.]
322
Theorem: Lne = {enc(M ) : L(M ) 6= ∅} is not recursive.

Proof: reduce Lu to Lne . Given (M, w), construct M ′ that ignores its own input x, simulates M on the fixed string w, and accepts x exactly when M accepts w. Then

w ∈ L(M ) ⇔ L(M ′) 6= ∅

[Figure: M ′ feeds w to M and accepts its input x iff M accepts w.]
324
Properties of the RE languages

Rice’s Theorem: every nontrivial property P of the RE languages is undecidable, i.e. LP = {enc(M ) : L(M ) ∈ P} is not recursive.
325
Proof of Rice’s Theorem:

Suppose first ∅ ∉ P, and choose some L ∈ P, accepted by a TM ML .

[Figure: given (M, w), build M ′: on input x it first simulates M on w; if M accepts, it starts ML on x and accepts iff ML does. Hence L(M ′) = L ∈ P if M accepts w, and L(M ′) = ∅ ∉ P otherwise.]

We have reduced Lu to LP . If instead ∅ ∈ P, run the same argument on the complement property.
327
Post’s Correspondence Problem
Example:
List A List B
i wi xi
1 1 111
2 10111 10
3 10 0
Solution: i1 = 2, i2 = 1, i3 = 1, i4 = 3 gives
w2w1w1w3 = x2x1x1x3 = 101111110
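A bounded breadth-first search finds the solution above mechanically. The depth bound is an illustrative assumption; since PCP is undecidable, no fixed bound works for every instance:

```python
from collections import deque

def solve_pcp(A, B, max_len=8):
    """Breadth-first search for a PCP solution: 1-based indices i1..ik
    with A[i1]...A[ik] == B[i1]...B[ik], up to the given length."""
    queue = deque([((), "", "")])
    while queue:
        seq, a, b = queue.popleft()
        if seq and a == b:
            return seq
        if len(seq) >= max_len:
            continue
        for i in range(1, len(A) + 1):
            na, nb = a + A[i - 1], b + B[i - 1]
            # keep a branch only while one string is a prefix of the other
            if na.startswith(nb) or nb.startswith(na):
                queue.append((seq + (i,), na, nb))
    return None

A = ["1", "10111", "10"]
B = ["111", "10", "0"]
assert solve_pcp(A, B) == (2, 1, 1, 3)
```

The prefix pruning mirrors the hand analysis: any partial solution must keep one string a prefix of the other.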
Example: An instance with no solution:

List A    List B
i    wi    xi
1    10    101
2    011   11
3    101   011

If i1 = 2 we cannot match w2 = 011 against x2 = 11.
If i1 = 3 we cannot match w3 = 101 against x3 = 011.
Therefore i1 = 1, and a partial solution is A: 10, B: 101.
329
If i2 = 1 we cannot match w1w1 = 1010 against x1x1 = 101101.
If i2 = 2 we cannot match w1w2 = 10011 against x1x2 = 10111.
Only i2 = 3 is possible, giving w1w3 = 10101 and x1x3 = 101011.
Only i3 = 3 is possible, giving w1w3w3 = 10101101 and x1x3x3 = 101011011.
Only i4 = 3 is possible, giving w1w3w3w3 = 10101101101 and x1x3x3x3 = 101011011011.
The B-string stays one symbol ahead forever, so this instance has no solution.
330
The Modified PCP:
Example:
List A List B
i wi xi
1 1 111
2 10111 10
3 10 0
In the MPCP any solution must begin with i1 = 1: w1 = 1 against x1 = 111.
If i2 = 2 we cannot match w1w2 = 110111 against x1x2 = 11110.
If i2 = 3 we cannot match w1w3 = 110 against x1x3 = 1110.
If i2 = 1 we have to match w1w1 = 11 against x1x1 = 111111, and we are back to “square one”: the B-string is again a run of 1’s, now four symbols ahead. Hence this MPCP instance has no solution.
331
We reduce MPCP to PCP. Given MPCP lists w1, . . . , wk and x1, . . . , xk , let yi be wi with a ∗ after each symbol, and zi be xi with a ∗ before each symbol. Add the forced starting pair

y0 = ∗y1 and z0 = z1

and the final pair yk+1 = $, zk+1 = ∗$.
332
Example:
Let MPCP be
List A List B
i wi xi
1 1 111
2 10111 10
3 10 0
List A List B
i yi zi
0 *1* *1*1*1
1 1* *1*1*1
2 1*0*1*1*1* *1*0
3 1*0* *0
4 $ *$
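The starred lists can be generated mechanically. A sketch reproducing the table above from the MPCP instance:

```python
def mpcp_to_pcp(W, X):
    """Build the PCP lists: y_i puts * after each symbol of w_i, z_i puts
    * before each symbol of x_i; pair 0 (y0 = * y1, z0 = z1) forces the
    start, and the final pair ($, *$) closes a solution."""
    y = ["".join(c + "*" for c in w) for w in W]
    z = ["".join("*" + c for c in x) for x in X]
    return ["*" + y[0]] + y + ["$"], [z[0]] + z + ["*$"]

W = ["1", "10111", "10"]
X = ["111", "10", "0"]
Y, Z = mpcp_to_pcp(W, X)
assert Y == ["*1*", "1*", "1*0*1*1*1*", "1*0*", "$"]
assert Z == ["*1*1*1", "*1*1*1", "*1*0", "*0", "*$"]
```

The stars stagger the two lists so that only pair 0 can start a match, which is exactly what the MPCP requires.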
333
PCP is undecidable

Proof: we reduce Lu to MPCP and MPCP to PCP, so an algorithm for PCP would yield an algorithm for MPCP, and hence one for Lu .
334
Let M = (Q, Σ, Γ, δ, q0, B, F ) be the TM. WLOG
assume that M never prints a blank, and never
moves the head left of the initial position.
1. The starting pair:

List A List B
# #q0w#
2. For each X ∈ Γ
List A List B
X X
# #
3. ∀q ∈ Q \ F, ∀p ∈ Q, ∀X, Y, Z ∈ Γ
List A List B
qX Yp if δ(q, X) = (p, Y, R)
ZqX pZY if δ(q, X) = (p, Y, L), Z ∈ Γ
q# Y p# if δ(q, B) = (p, Y, R)
Zq# pZY # if δ(q, B) = (p, Y, L), Z ∈ Γ
335
4. ∀q ∈ F, ∀X, Y ∈ Γ
List A List B
XqY q
Xq q
qY q
5. Final pair
List A List B
q## #
336
Example: Lu instance: (M, 01)
337