Notes
Fall 2023
Alexis Maciel
Department of Computer Science
Clarkson University
Contents

Preface
1 Introduction
2 Finite Automata
  2.1 Turing Machines
  2.2 Introduction to Finite Automata
  2.3 More Examples
  2.4 Formal Definition
  2.5 Closure Properties
4 Regular Expressions
  4.1 Introduction
  4.2 Formal Definition
  4.3 More Examples
  4.4 Converting Regular Expressions into DFA's
  4.5 Converting DFA's into Regular Expressions
  4.6 Precise Description of the Algorithm
8 Undecidability
  8.1 Introduction
Index
Preface
If you’ve taken a few computer science courses, you may now have the feel-
ing that programs can always be written to solve any computational problem.
Writing the program may be hard work: it may involve learning a difficult
technique and spending many hours debugging. But with enough time and
effort, the program can be written.
So it may come as a surprise that this is not the case: there are computational
problems for which no program exists. And these are not ill-defined problems
— Can a computer fall in love? — or uninteresting toy problems. These are
precisely defined problems with important practical applications.
Theoretical computer science can be briefly described as the mathematical
study of computation. These notes will introduce you to this branch of com-
puter science by focusing on computability theory and automata theory. You
will learn how to precisely define what computation is and why certain com-
putational problems cannot be solved. You will also learn several concepts and
techniques that have important applications. Chapter 1 provides a more detailed
introduction to this rich and beautiful area of computer science.
These notes were written for the course CS345 Automata Theory and Formal
Languages taught at Clarkson University. The course is also listed as MA345 and
CS541. The prerequisites are CS142 (a second course in programming) and
MA211 (a course on discrete mathematics with a focus on writing mathematical
proofs).
These notes were typeset using LaTeX (MiKTeX implementation with the TeX-
works environment). The paper size and margins are set small to make it easier
to read the notes on the relatively small screen of a laptop or tablet. But then, if
you’re going to print the notes, don’t print them to “fit the page”. That would give
you pages with huge text and tiny margins. Instead, print them double-sided,
at “actual size” (100%) and centered. If your favorite Web browser doesn’t al-
low you to do that, download the notes and print them from a standalone PDF
reader such as Adobe Acrobat.
Feedback on these notes is welcome. Please send comments to
[email protected].
Chapter 1
Introduction
In this chapter, we introduce the subject of these notes, automata theory and
computability theory. We explain what this is and why it is worth studying.
Computer science can be divided into two main branches. One branch is con-
cerned with the design and implementation of computer systems, with a special
focus on software. (Computer hardware is the main focus of computer engi-
neering.) This includes not only software development but also research into
subjects like operating systems and computer security. This branch of computer
science is called applied computer science. It can be viewed as the engineering
component of computer science.
The other branch of computer science is the mathematical study of compu-
tation. One of its main goals is to determine which computational problems can
and cannot be solved, and which ones can be solved efficiently. This involves
discovering algorithms for computational problems, but also finding mathemat-
ical proofs that such algorithms do not exist. This branch of computer science is
called theoretical computer science or the theory of computation.1 It can be viewed
as the science component of computer science.2

1. The two main US-based theory conferences are the ACM's Symposium on Theory of Computing (STOC) and the IEEE's Symposium on Foundations of Computing (FOCS). One of the main European theory conferences is the Symposium on Theoretical Aspects of Computer Science (STACS).
To better illustrate what theoretical computer science is, consider the Halting
Problem. The input to this computational problem is a program P written in
some fixed programming language. To keep things simple and concrete, let’s
limit the problem to C++ programs whose only input is a single text file. The
output of the Halting Problem is the answer to the following question: Does the
program P halt on every possible input? In other words, does P always halt no
matter how it is used?
It should be clear that the Halting Problem is both natural and relevant.
Software developers already have a tool that determines if the programs they
write are correctly written according to the rules of the language they are using.
This is part of what the compiler does. It would clearly be useful if software
developers also had a tool that could determine if their programs are guaranteed
to halt on every possible input.
Unfortunately, it turns out such a tool does not exist. Not that it currently
does not exist and that maybe one day someone will invent one. No, it turns out
that there is no algorithm that can solve the Halting Problem.
How can something like that be known? How can we know for sure that a
computational problem has no algorithms — none, not now, not ever? Perhaps
the only way of being absolutely sure of anything is to use mathematics. What
this means is that we will define the Halting Problem precisely and then prove
a theorem that says that no algorithm can solve the Halting Problem.
Note that this also requires that we define precisely what we mean by an
algorithm or, more generally, what it means to compute something.

2. Note that many areas of computer science have both applied and theoretical aspects. Artificial intelligence is one example. Some people working in AI produce actual systems that can be used in practice. Others try to discover better techniques by using theoretical models.

Such a definition is called a model of computation. We want a model that's simple enough
so we can prove theorems about it. But we also want this model to be close
enough to real-life computation so we can claim that the theorems we prove say
something that’s relevant about real-life computation.
The general model of computation we will study is the Turing machine. We
won’t go into the details right now but the Turing machine is easy to define
and, despite the fact that it is very simple, it captures the essential operations of
real-life digital computers.3
Once we have defined Turing machines, we will be able to prove that no
Turing machine can solve the Halting Problem. And we will take this as clear
evidence that no real-life computer program can solve the Halting Problem.
The above discussion was meant to give you a better idea of what theoretical
computer science is. The discussion focused on computability theory, the study
of which computational problems can and cannot be solved. It’s useful at this
point to say a bit more about why theoretical computer science is worth studying
(either as a student or as a researcher).
First, theoretical computer science provides critical guidance to applied com-
puter scientists. For example, because the Halting Problem cannot be solved by
any Turing machine, applied computer scientists do not waste their time trying
to write programs that solve this problem, at least not in full generality. There
are many other examples, many of which are related to software verification.
Second, theoretical computer science involves concepts and techniques that
have found important applications. For example, regular expressions, and algo-
rithms that manipulate them, are used in compiler design and in the design of
many programs that process input.
3. In fact, the Turing machine was used by its inventor, Alan Turing, as a basic mathematical blueprint for the first digital computers.
Third, theoretical computer science is intellectually interesting, which leads
some people to study it just out of curiosity. This should not be underestimated:
many important scientific and mathematical discoveries have been made by peo-
ple who were mainly trying to satisfy their intellectual curiosity. A famous ex-
ample is number theory, which plays a key role in the design of modern cryp-
tographic systems. The mathematicians who investigated number theory many
years ago had no idea that their work would make possible electronic commerce
as we know it today.
The plan for the rest of these notes is as follows. The first part of the notes
will focus on finite automata, which are essentially Turing machines without
memory. The study of finite automata is good practice for the study of general
Turing machines. But we will also learn about the regular expressions we men-
tioned earlier. Regular expressions are essentially a way of describing patterns
in strings. We will learn that regular expressions and finite automata are equiv-
alent, in the sense that the patterns that can be described by regular expressions
are also the patterns that can be recognized by finite automata. And we will
learn algorithms that can convert regular expressions into finite automata, and
vice-versa. These are the algorithms that are used in the design of programs,
such as compilers, that process input. We will also learn how to prove that cer-
tain computational problems cannot be solved by finite automata, which also
means that certain patterns cannot be described by regular expressions.
Next, we will study context-free grammars, which are essentially an exten-
sion of regular expressions that allows the description of more complex patterns.
Context-free grammars are needed for the precise definition of modern program-
ming languages and therefore play a critical role in the design of compilers.
The final part of these notes will focus on general Turing machines and will
culminate in the proof that certain problems, such as the Halting Problem, can-
not be solved by Turing machines.
Chapter 2
Finite Automata
In this chapter, we study a very simple model of computation called a finite au-
tomaton. Finite automata are useful for solving certain problems but studying
finite automata is also good practice for the study of Turing machines, the gen-
eral model of computation we will study later in these notes.
  a → b, R
  b → b, L
  c → a, R

Figure 2.1: A transition table for a model of computation that's too simple
       a           b           c
  q0   q1, b, R    q0, a, L    q2, b, R
  q1   q1, a, L    q1, c, R    q0, b, R
  q2   q0, c, R    q2, b, R    q1, c, L

Figure 2.2: A sample transition table
This is a very simple model of computation but we may have gone too far:
since the value we write in each memory location depends only on the contents
of that memory location, there is no way to copy a value from one memory loca-
tion to another. To fix this without reintroducing more complicated instructions,
we simply add to our model a special register we call the state of the program.
Each instruction will now have to consider the current state in addition to the
current memory location. A sample transition table is shown in Figure 2.2, as-
suming that q0 , q1 , q2 are the possible states.
The model we have just described is called the Turing machine. (Each “pro-
gram” in this model is considered a “machine”.) To complete the description of
the model, we need to specify a few more details.

[Figure 2.3: a Turing machine: a control unit (transition table and state) connected to memory, producing a yes/no answer]

First, we restrict our attention to decision problems, which are computational problems in which the input
is a string and the output is the answer to some yes/no question about the in-
put. The Halting Problem mentioned in the previous chapter is an example of a
decision problem.
Second, we need to describe how the input is given to a Turing machine,
how the output is produced by the machine, and how the machine terminates
its computation. For the input, we assume that initially, the memory of the
Turing machine contains the input and nothing else. For the output, each Turing
machine will have special yes and no states. Whenever one of these states is
entered, the machine halts and produces the corresponding output. This takes
care of termination too.
Figure 2.3 shows a Turing machine. The control unit consists of the transition
table and the state.
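To make the model concrete, here is a sketch of a single step of the machine of Figure 2.2. The encoding is our own, not the notes': an instruction looks at the current state and the symbol in the current memory location, writes a symbol, moves left or right, and switches state. The treatment of unwritten memory cells is an assumption of ours.

```python
# Sketch of one step of the machine of Figure 2.2 (our encoding, not the notes').
# An entry (state, symbol) -> (new state, symbol to write, head move).

TABLE = {
    ("q0", "a"): ("q1", "b", +1), ("q0", "b"): ("q0", "a", -1), ("q0", "c"): ("q2", "b", +1),
    ("q1", "a"): ("q1", "a", -1), ("q1", "b"): ("q1", "c", +1), ("q1", "c"): ("q0", "b", +1),
    ("q2", "a"): ("q0", "c", +1), ("q2", "b"): ("q2", "b", +1), ("q2", "c"): ("q1", "c", -1),
}

def step(state, tape, head):
    """Apply one instruction; tape maps positions to symbols."""
    symbol = tape.get(head, "a")     # assumption: an unwritten cell reads as a
    new_state, write, move = TABLE[(state, symbol)]
    tape[head] = write
    return new_state, tape, head + move
```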
The Turing machine is the standard model that is used to study computa-
tion mathematically. (As mentioned earlier, the Turing machine was used by
its inventor, Alan Turing, as a basic mathematical blueprint for the first digital
computers.) The Turing machine is clearly a simple model but it is also relevant
2.2 Introduction to Finite Automata

[Figure: a finite automaton: a control unit reading a read-only input and producing a yes/no answer]

Figure 2.5: The transition table of a finite automaton that determines if the input
is a valid C++ identifier
Here are a few more details about finite automata. First, in addition to being
read-only, the input must be read from left to right. In other words, the input
head can only move to the right. Second, the computation of a finite automaton
starts at the beginning of the input and automatically ends when the end of
the input is reached, that is, after the last symbol of the input has been read.
Finally, some of the states of the automaton are designated as accepting states.
If the automaton ends its computation in one of those states, then we say that
the input is accepted. In other words, the output is yes. On the other hand, if
the automaton ends its computation in a non-accepting state, then the input is
rejected — the output is no.
Now that we have a pretty good idea of what a finite automaton is, let’s look
at a couple of examples. A valid C++ identifier consists of an underscore or a
letter followed by any number of underscores, letters and digits. Consider the
problem of determining if an input string is a valid C++ identifier. This is a
decision problem.
Figure 2.5 shows the transition table of a finite automaton that solves this
problem. As implied by the table, the automaton has three states. State q0 is the
start state of the automaton. From that state, the automaton enters state q1 if
the first symbol of the input is an underscore or a letter. The automaton will
remain in state q1 as long as the input consists of underscores, letters and digits.
In other words, the automaton finds itself in state q1 if the portion of the string
it has read so far is a valid C++ identifier.
On the other hand, the automaton enters state q2 if it has decided that the
input cannot be a valid C++ identifier. If the automaton enters state q2 , it will
never be able to leave. This corresponds to the fact that once we see a character
that causes the string to be an invalid C++ identifier, there is nothing we can see
later that could fix that. A non-accepting state from which you cannot escape is
sometimes called a garbage state.
State q1 is accepting because if the computation ends in that state, then this
implies that the entire string is a valid C++ identifier. The start state is not an
accepting state because the empty string, the one with no characters, is not a
valid identifier.
Let’s look at some sample strings. When reading the string input_file,
the automaton enters state q1 and remains there until the end of the string.
Therefore, this string is accepted, which is correct. On the other hand, when
reading the string input−file, the automaton enters state q1 when it sees the
first i but then leaves for state q2 when it encounters the dash. This will cause
the string to be rejected, which is correct since dashes are not allowed in C++
identifiers.
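The computation just traced can be sketched in Python. This is our own encoding of the table of Figure 2.5; the function and state names are ours:

```python
# A sketch of the three-state automaton of Figure 2.5, with the transition
# table stored as a dictionary (names and encoding are ours, not the notes').

def char_class(c):
    """Map a character to one of the categories used in the table."""
    if c == "_" or c.isalpha():
        return "underscore_or_letter"
    if c.isdigit():
        return "digit"
    return "other"

# q1 is the accepting state; q2 is the garbage state.
TABLE = {
    ("q0", "underscore_or_letter"): "q1",
    ("q0", "digit"): "q2",
    ("q0", "other"): "q2",
    ("q1", "underscore_or_letter"): "q1",
    ("q1", "digit"): "q1",
    ("q1", "other"): "q2",
    ("q2", "underscore_or_letter"): "q2",
    ("q2", "digit"): "q2",
    ("q2", "other"): "q2",
}

def is_identifier(s):
    state = "q0"
    for c in s:
        state = TABLE[(state, char_class(c))]
    return state == "q1"      # accept iff the computation ends in q1
```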
A finite automaton can also be described by using a graph. Figure 2.6 shows
the transition graph of the automaton for valid C++ identifiers. Each state is
represented by a node in this graph. Each edge connects two states and is labeled
by an input symbol. If an edge goes from q to q′ and is labeled a, this indicates
that when in state q and reading an a, the automaton enters state q′. Each
such step is called a move or a transition (hence the terms transition table and
transition graph). The start state is indicated with an arrow and the accepting
states have a double circle.
[Figure 2.6 transition graph: q0 --underscore, letter--> q1; q0 --digit, other--> q2; q1 --underscore, letter, digit--> q1; q1 --other--> q2; q2 --any--> q2; q1 accepting]

Figure 2.6: The transition graph of a finite automaton that determines if the
input is a valid C++ identifier
[Figure 2.7 transition graph: q0 -d-> q1 -d-> q2 -d-> q3; from q3, a dash leads to q4 and a digit leads to q5; q4 -d-> q5 -d-> q6 -d-> q7 -d-> q8; q8 accepting]

Figure 2.7: A finite automaton that determines if the input is a correctly formatted phone number
Transition graphs and transition tables provide exactly the same information.
But the graphs make it easier to visualize the computation of the automaton,
which is why we will draw transition graphs whenever possible. There are some
circumstances, however, where it is impractical to draw a graph. We will see
examples later.
Let’s consider another decision problem, the problem of determining if an
input string is a phone number in one of the following two formats: 7 digits, or
3 digits followed by a dash and 4 digits. Figure 2.7 shows a finite automaton
that solves this problem. The transition label d stands for digit.
In a finite automaton, from every state, there should be a transition for every
possible input symbol. This means that the graph of Figure 2.7 is missing many
transitions. For example, from state q0 , there is no transition labeled − and
there is no transition labeled by a letter. All those missing transitions correspond
to cases in which the input string should be rejected. Therefore, we can have
all of those missing transitions go to a garbage state. We chose not to draw
those transitions and the garbage state simply to avoid cluttering the diagram
unnecessarily.
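One way to realize this convention in code is to leave the missing transitions out of the table and send them to a garbage state by default. The sketch below is our own encoding of Figure 2.7; the state layout of the 7-digit branch is an assumption read off the diagram:

```python
# Sketch of the phone-number automaton of Figure 2.7 (names and layout ours).
# Missing transitions fall through to an implicit garbage state.

EDGES = {}
for i in range(3):
    EDGES[(f"q{i}", "d")] = f"q{i+1}"   # the first three digits: q0 -> q1 -> q2 -> q3
EDGES[("q3", "-")] = "q4"               # dash format: 3 digits, dash, 4 digits
EDGES[("q3", "d")] = "q5"               # assumed edge for the 7-digit format
for i in range(4, 8):
    EDGES[(f"q{i}", "d")] = f"q{i+1}"   # the remaining digits lead to q8

def is_phone_number(s):
    state = "q0"
    for c in s:
        symbol = "d" if c.isdigit() else c
        state = EDGES.get((state, symbol), "garbage")  # missing edge: garbage
    return state == "q8"                # q8 is the only accepting state
```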
It is interesting to compare the finite automata we designed in this section
with C++ programs that solve the same problems. For example, Figure 2.8
shows an algorithm that solves the C++ identifier problem. The algorithm is
if (end of input)
    return no
read char c
if (c is not underscore or letter)
    return no
while (not end of input)
    read char c
    if (c is not underscore, letter or digit)
        return no
return yes
Figure 2.8: A simple algorithm that determines if the input is a valid C++ iden-
tifier
q = start_state()
while (not end of input)
    read symbol c
    q = next_state(q, c)
if (is_accepting(q))
    return yes
else
    return no

Figure 2.9: An algorithm that simulates a finite automaton
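The pseudocode above translates almost line for line into a runnable program. In this sketch (names ours), the three DFA-specific operations are ordinary functions. The demo DFA plugged in below is our own toy example, not one from these notes: it accepts exactly the strings of even length.

```python
# A runnable version of the simulation algorithm (a sketch; the function
# names follow the pseudocode).

def run_dfa(start_state, next_state, is_accepting, input_string):
    q = start_state()
    for c in input_string:       # while not end of input: read symbol c
        q = next_state(q, c)
    return is_accepting(q)       # yes if q is accepting, no otherwise

# Toy DFA: q0 means "even number of symbols so far", q1 means "odd".
def start_state():
    return "q0"

def next_state(q, c):
    return "q1" if q == "q0" else "q0"   # every symbol flips the parity

def is_accepting(q):
    return q == "q0"
```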
Study Questions
2.2.1. Can finite automata solve problems that pseudocode algorithms cannot?
2.2.2. What are three advantages of finite automata over pseudocode algorithms?
Exercises
2.2.3. Consider the problem of determining if a string is an integer in the follow-
ing format: an optional minus sign followed by at least one digit. Design
a finite automaton for this problem.
2.2.5. Suppose that a valid C++ identifier is no longer allowed to consist of only
underscores. Modify the finite automaton of Figure 2.6 accordingly.
2.2.6. Add optional area codes to the phone number problem we saw in this
section. That is, consider the problem of determining if a string is a phone
number in the following format: 7 digits, or 10 digits, or 3 digits followed
by a dash and 4 digits, or 3 digits followed by a dash, 3 digits, another
dash and 4 digits. Design a finite automaton for this problem.
2.2.7. Convert the finite automaton of Figure 2.6 into a high-level pseudocode
algorithm by using the technique explained in this section. That is, write
pseudocode for the functions start_state(), next_state(q, c) and
is_accepting(q) of Figure 2.9.
2.2.8. Repeat the previous exercise for the finite automaton of Figure 2.7. (Don’t
forget the garbage state.)
2.3 More Examples
Definition 2.1 An alphabet is a finite set whose elements are called symbols.
This definition allows us to talk about the input alphabet of a finite automaton
instead of having to say the set of possible input symbols of a finite automaton. In
this context, symbols are sometimes also called letters or characters, as we did
in the previous section.
The length of a string is the number of symbols it contains. Note that a string
can be empty: the empty string has length 0 and contains no symbols. In these
notes, we will use ε to denote the empty string.
Now, each finite automaton solves a problem by accepting some strings and
rejecting the others. Another way of looking at this is to say that the finite
automaton recognizes a set of strings, those that it accepts.
Definition 2.4 The language recognized by a finite automaton M (or the lan-
guage of M ) is the set of strings accepted by M .
[Figure 2.10 transition graph: q0 -1-> q1, q0 -0-> q2; 0,1 self-loops on q1 and on q2; q1 accepting]
Figure 2.10: A DFA for the language of strings that start with 1
Example 2.5 Figure 2.10 shows a DFA for the language of strings that start
with 1. □
Example 2.6 Now consider the language of strings that end in 1. One difficulty
here is that there is no mechanism in a DFA that allows us to know whether the
symbol we are currently looking at is the last symbol of the input string. So we
[Figure 2.11 transition graph: q0 -1-> q1, q0 -0-> q2; q1 -1-> q1, q1 -0-> q2; q2 -1-> q1, q2 -0-> q2; q1 accepting]

Figure 2.11: A DFA for the language of strings that end in 1
have to always be ready, as if the current symbol was the last one.1 What this
means is that after reading every symbol, we have to be in an accept state if and
only if the portion of the input string we’ve seen so far is in the language.
Figure 2.11 shows a DFA for this language. Strings that begin with 1 lead to
state q1 while strings that begin with 0 lead to state q2. Further 0's and 1's
cause the DFA to move between these states as needed.
Notice that the starting state is not an accepting state because the empty
string, the string of length 0 that contains no symbols, does not end in 1. But
then, states q0 and q2 play the same role in the DFA: they’re both non-accepting
states and the transitions coming out of them lead to the same states. This
implies that these states can be merged to get the slightly simpler DFA shown in
Figure 2.12. □
1. To paraphrase a well-known quote attributed to Jeremy Schwartz, “Read every symbol as if it were your last. Because one day, it will be.”
[Figure 2.12 transition graph: q0 -1-> q1, q1 -1-> q1, q1 -0-> q0, q0 -0-> q0; q1 accepting]
Figure 2.12: A simpler DFA for the language of strings that end in 1
Example 2.7 Consider the language of strings of length at least two that begin
and end with the same symbol. A DFA for this language can be obtained by
combining the ideas of the previous two examples, as shown in Figure 2.13. □
Example 2.8 Consider the language of strings that contain the string 001 as
a substring. What this means is that the symbols 0, 0, 1 occur consecutively
within the input string. For example, the string 0100110 is in the language but
0110110 is not.
Figure 2.14 shows a DFA for this language. The idea is that the DFA remem-
bers the longest prefix of 001 that ends the portion of the input string that has
been seen so far. For example, initially, the DFA has seen nothing, so the starting
state corresponds to the empty string. If the DFA then sees a 0, it moves to state
q1 . If it then sees a 1, then the portion of the input string that the DFA has seen
so far ends in 01, which is not a prefix of 001. So the DFA goes back to state q0. □
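The prefix-remembering idea can be sketched directly in code. This is our own encoding of the transitions of Figure 2.14:

```python
# Sketch of the Figure 2.14 idea: the state records the longest prefix of 001
# that ends the input seen so far ("" -> q0, "0" -> q1, "00" -> q2, "001" -> q3).

DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q0",   # "01" ends in no prefix of 001
    ("q2", "0"): "q2", ("q2", "1"): "q3",   # another 0 still leaves us ending in 00
    ("q3", "0"): "q3", ("q3", "1"): "q3",   # once 001 is seen, stay accepting
}

def contains_001(s):
    state = "q0"
    for c in s:
        state = DELTA[(state, c)]
    return state == "q3"
```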
Example 2.9 Consider the language of strings that contain an even number of
1’s. Initially, the number of 1’s is 0, which is even. After seeing the first 1, that
number will be 1, which is odd. After seeing each additional 1, the DFA will
toggle back and forth between even and odd. This idea leads to the DFA shown
in Figure 2.15. Note how the input symbol 0 never affects the state of the DFA.
[Figure 2.13 transition graph: q0 -1-> q1, q0 -0-> q3; q1 -0-> q1, q1 -1-> q2, q2 -1-> q2, q2 -0-> q1; q3 -1-> q3, q3 -0-> q4, q4 -0-> q4, q4 -1-> q3; q2 and q4 accepting]
Figure 2.13: A DFA for the language of strings of length at least two that begin
and end with the same symbol
[Figure 2.14 transition graph: q0 -0-> q1 -0-> q2 -1-> q3; q0 -1-> q0, q1 -1-> q0, q2 -0-> q2, q3 -0,1-> q3; q3 accepting]
Figure 2.14: A DFA for the language of strings that contain the substring 001
[Figure 2.15 transition graph: q0 -1-> q1, q1 -1-> q0, 0 self-loops on both states; q0 accepting]
Figure 2.15: A DFA for the language of strings that contain an even number of
1’s
[Figure 2.16 transition graph: q0 -1-> q1 -1-> q2 -1-> q0, 0 self-loops on all three states; q0 accepting]
Figure 2.16: A DFA for the language of strings that contain a number of 1’s that’s
a multiple of 3
We say that in this DFA, and with respect to this language, the symbol 0 is neutral. □
Example 2.11 We can generalize this even further. For every number k ≥ 2,
consider the language of strings that contain a number of 1's that's a multiple of k.

[Figure 2.17 transition graph: q0 -1-> q1 -1-> q2 -1-> ··· -1-> qk−1 -1-> q0, 0 self-loops on every state; q0 accepting]

Figure 2.17: A DFA for the language of strings that contain a number of 1's that's
a multiple of k

Note that this defines an infinite number of languages, one for every possible
value of k. Each one of those languages can be recognized by a DFA since,
for every k ≥ 2, a DFA can be constructed to count modulo k, as shown in
Figure 2.17. u
t
Example 2.12 We end this section with three simple DFA's for the basic but
important languages Σ∗ (the language of all strings), ∅ (the empty language)
and {ε} (the language that contains only the empty string). The DFA's are shown
in Figure 2.18. □
Study Questions
2.3.1. What is an alphabet?
Exercises
2.3.5. Modify the DFA of Figure 2.13 so that strings of length 1 are also accepted.
2.3.6. Give DFA’s for the following languages. In all cases, the alphabet is {0, 1}.
a) The language of strings of length at least two that begin with 0 and
end in 1.
b) The language of strings of length at least two that have a 1 as their
second symbol.
c) The language of strings of length at least k that have a 1 in position
k. Do this in general, for every k ≥ 1. (You did the case k = 2 in part
(b).)
2.3.7. Give DFA’s for the following languages. In all cases, the alphabet is {0, 1}.
2.3.8. Give DFA’s for the following languages. In all cases, the alphabet is {0, 1}.
a) The language of strings of length at least two whose first two symbols
are the same.
b) The language of strings of length at least two whose last two symbols
are the same.
c) The language of strings of length at least two that have a 1 in the
second-to-last position.
• A transition table or graph that specifies a next state for every possible pair
(state, input character).
Actually, to be able to specify all the transitions, we also need to know what the
possible input symbols are. This information should also be considered part of
the DFA:
We can also define what it means for a DFA to accept a string: run the algo-
rithm of Figure 2.9 and accept if the algorithm returns yes.
The above definition of a DFA and its operation should be pretty clear. But
it has a couple of problems. The first one is that it doesn’t say exactly what a
transition table or graph is. That wouldn’t be too hard to fix but the second
problem with the above definition is more serious: the operation of the DFA is
defined in terms of a pseudocode algorithm. For this definition to be complete,
we would need to also define what those algorithms are. But recall that we are
interested in DFA’s mainly because they are supposed to be easy to define. DFA’s
are not going to be simpler than pseudocode algorithms if the definition of a
DFA includes a pseudocode algorithm.
In the rest of this section, we will see that it is possible to define a DFA and
its operation without referring to either graphs, tables or algorithms. This will
be our formal definition. The above definition, in terms of a graph or table, and
an algorithm, will be considered an informal definition.
Example 2.14 Consider again the identifier DFA whose transition graph was
given in Figure 2.6. Here's a formal description of this DFA. The DFA is
({q0 , q1 , q2 }, Σ, δ, q0 , {q1 }) where Σ is the set of all characters
that appear on a standard keyboard.
Before describing the transition function δ, it is useful to explain the role
that each state plays in the DFA. State q0 is just a start state. State q1 is entered
if the input is a valid identifier. State q2 is a garbage state.
[Figure 2.19: the transition graph of the C++ identifier DFA, repeated from Figure 2.6] □
There really was no need to formally describe the DFA of this example because
its transition graph (Figure 2.6) is very simple and crystal clear. Here
are two additional examples of DFA's. The second one will generalize the first
one and illustrate the usefulness of formal descriptions.
Example 2.15 Let’s go back the modulo 3 counting example and generalize it
in another way. Suppose that the input alphabet is now {0, 1, 2, . . . , 9} and
[Figure 2.20 transition graph: states q0 , q1 , q2 ; digits 0,3,6,9 loop on each state; digits 1,4,7 move qi to q(i+1) mod 3 ; digits 2,5,8 move qi to q(i+2) mod 3 ; q0 accepting]
Figure 2.20: A DFA for the language of strings whose digits add to a multiple
of 3
consider the language of strings that have the property that the sum of their
digits is a multiple of 3. For example, 315 is in the language because 3+1+5 = 9
is a multiple of 3. The idea is still to count modulo 3 but there are now more
cases to consider, as shown in Figure 2.20. □
Example 2.16 Now let’s generalize this. That is, over the alphabet
{0, 1, 2, . . . , 9}, for every number k, let’s consider the language of strings that
have the property that the sum of their digits is a multiple of k. Once again, the
idea is to count modulo k. For example, Figure 2.21 shows the DFA for k = 4.
It should be pretty clear that a DFA exists for every k. But the transition
diagram would be difficult to draw and would likely be ambiguous. This is an
example where we are better off describing the DFA by giving a formal description.

[Figure 2.21 transition graph: states q0 , q1 , q2 , q3 ; digits 0,4,8 loop; digits 1,5,9 advance the state by 1; digits 2,6 by 2; digits 3,7 by 3, all modulo 4; q0 accepting]

Figure 2.21: A DFA for the language of strings whose digits add to a multiple
of 4

Here it is: the DFA is (Q, Σ, δ, q0 , F ) where
Q = {q0 , q1 , q2 , . . . , qk−1 }
Σ = {0, 1, 2, . . . , 9}
δ(qi , c) = q(i+c) mod k , for every state qi and every digit c
F = {q0 }
□
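The formal description translates almost symbol for symbol into code. In this sketch (names ours), δ is an ordinary function and the DFA is the 5-tuple the definition calls for:

```python
# Sketch of the digit-sum DFA in its formal-description form: the 5-tuple
# (Q, delta, q0, F), with delta(q_i, c) = q_(i+c) mod k (encoding ours).

def digit_sum_dfa(k):
    Q = list(range(k))                     # state q_i is represented by i
    def delta(i, c):
        return (i + int(c)) % k            # add the digit, modulo k
    q0 = 0
    F = {0}
    return Q, delta, q0, F

def digit_sum_multiple_of(s, k):
    _, delta, q, F = digit_sum_dfa(k)
    for c in s:
        q = delta(q, c)
    return q in F
```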
We now define what it means for a DFA to accept its input string, without
referring to an algorithm. Instead, we will only talk about the sequence of states
that the DFA goes through while processing its input string.
Let w = w1 · · · wn be a string of length n.3 The computation of the DFA on w
is the sequence of states r0 , r1 , . . . , rn defined by

r0 = q0
ri = δ(ri−1 , wi ), for i = 1, . . . , n.

The DFA accepts w if rn is an accepting state.
Note how the sequence of states is defined by only referring to the transition
function, without referring to an algorithm that computes that sequence of states.
In the previous section, we defined the language of a DFA as the set of strings
accepted by the DFA. Now that we have formally defined what a DFA is and what
3. It is understood here that w1 , . . . , wn are the individual symbols of w.
it means for a DFA to accept a string, that definition is complete. Here it is again,
for completeness:
L(M ) = {w ∈ Σ∗ | w is accepted by M }
where Σ is the input alphabet of M and Σ∗ denotes the set of all strings over Σ.
Many interesting languages are regular but we will soon learn that many
others are not. Those languages require algorithms that are more powerful than
DFA’s.
Study Questions
2.4.1. What is the advantage of the formal definition of a DFA over the informal
definition presented at the beginning of this section?
2.4.3. What does it mean for a DFA to accept a string? (Give a formal definition.)
2.4.6. Give a DFA for the language of strings of length at least k that have a 1 in
position k from the end. Do this in general, for every k ≥ 1. The alphabet
is {0, 1}. (You did the case k = 2 in Exercise 2.3.8, part (c).)
2.4.7. Give DFA’s for the following languages. In both cases, the alphabet is the
set of digits {0, 1, 2, . . . , 9}.
2.4.8. This exercise asks you to show that DFA’s can add, at least when the num-
bers are presented in certain ways. Consider the alphabet that consists of
symbols of the form [abc ] where a, b and c are digits. For example,
[631] and [937] are two of the symbols in this alphabet. If w is a string
of digits, let n(w) denote the number represented by w. For example, if
w is the string 428, then n(w) is the number 428. Now give DFA’s for the
following languages.
a) The language of strings of the form [x0 y0 z0][x1 y1 z1] · · · [xn yn zn]
such that
n(xn · · · x1 x0) + n(yn · · · y1 y0) = n(zn · · · z1 z0).
2.5 Closure Properties
Figure 2.22: A DFA for the language of strings that contain the substring 001
This new language is the complement of the first one.4 And the above technique should work for any regular language L: by switching the accepting states of a DFA for L, we should get a DFA for its complement L̄.
In a moment, we will show in detail that the complement of a regular language is always regular. This is an example of what is called a closure property. In general, a set is said to be closed under an operation if applying that operation to elements of the set always produces another element of the set.
For example, the set of natural numbers is closed under addition and multiplication but not under subtraction or division because 2 − 3 is negative and 1/2 is not even an integer. The set of integers is closed under addition, subtraction and multiplication but not under division. The set of rational numbers is closed under those four operations but not under the square root operation since √2 is not a rational number.5
4 Note that to define precisely what is meant by the complement of a language, we need to know the alphabet over which the language is defined. If L is a language over Σ, then its complement is defined as follows: L̄ = {w ∈ Σ∗ | w ∉ L}. In other words, L̄ = Σ∗ − L.
5 The proof of this fact is a nice example of a proof by contradiction and of the usefulness of basic number theory. Suppose that √2 is a rational number. Let a and b be positive integers such that √2 = a/b. Since fractions can be simplified, we can assume that a and b have no
Theorem 2.21 The class of regular languages is closed under complementation.
Proof Suppose that A is regular and let M be a DFA for A. We construct a DFA M′ for Ā.
Let M′ be the result of switching the accepting status of every state in M. More precisely, if M = (Q, Σ, δ, q0, F), then M′ = (Q, Σ, δ, q0, Q − F). We claim that L(M′) = Ā.
To prove that, suppose that w ∈ A. Then, in M, w leads to an accepting state. This implies that in M′, w leads to the same state but this state is non-accepting in M′. Therefore, M′ rejects w.
A similar argument shows that if w ∉ A, then M′ accepts w. Therefore, L(M′) = Ā. □
Note that the proof of this closure property is constructive: it establishes the
existence of a DFA for A by providing an algorithm that constructs that DFA.
Proofs of existence are not always constructive. But when they are, the algo-
rithms they provide are often useful.
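The algorithm provided by this proof is short enough to write out in full. The following Python sketch (our own code; the helper accepts and the dictionary encoding of δ are our choices) flips the accepting states of a DFA for the language of strings with an even number of 0's:

```python
def accepts(M, w):
    """Run the DFA M = (delta, q0, F) on the string w."""
    delta, q0, F = M
    r = q0
    for a in w:
        r = delta[(r, a)]
    return r in F

def complement_dfa(M):
    """Theorem 2.21: same transitions, accepting status of every state flipped."""
    delta, q0, F = M
    Q = {q for (q, _) in delta}          # recover the state set from delta's keys
    return (delta, q0, Q - F)

# A DFA for "even number of 0's" over {0, 1}
delta = {("q0", "0"): "q1", ("q1", "0"): "q0",
         ("q0", "1"): "q0", ("q1", "1"): "q1"}
M = (delta, "q0", {"q0"})
Mc = complement_dfa(M)

# Every string is accepted by exactly one of M and its complement.
for w in ["", "0", "010", "1001"]:
    assert accepts(M, w) != accepts(Mc, w)
```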
At this point, it is natural to wonder if the class of regular languages is closed
under other operations. A natural candidate is the union operation: if A and B
are two languages over Σ, then the union of A and B is A ∪ B = {w ∈ Σ∗ | w ∈ A or w ∈ B}.6
common factors (other than 1). Now, the fact that √2 = a/b implies that 2 = a2/b2 and that 2b2 = a2. Therefore, a is even and 4 divides a2. But then, 4 must also divide 2b2, which implies that 2 divides b. Therefore, 2 divides both a and b, contradicting the fact that a and b have no common factors.
6 We could also consider the union of languages that are defined over different alphabets. In that case, the alphabet for the union would be the union of the two underlying alphabets: if A is a language over Σ1 and B is a language over Σ2, then A ∪ B = {w ∈ (Σ1 ∪ Σ2)∗ | w ∈ A or w ∈ B}.
In these notes, we will normally consider only the union of languages over the same alphabet
because this keeps things simpler and because this is probably the most common situation that
occurs in practice.
For example, consider the language of strings that either contain an even
number of 0’s or end in 1. Call this language L. It turns out that L is the union
of two languages we are familiar with:
L = L1 ∪ L2
where
L1 = {w ∈ {0, 1}∗ | w contains an even number of 0’s}
and
L2 = {w ∈ {0, 1}∗ | w ends in 1}
In particular, we already know how to construct DFA’s for L1 and L2 . These DFA’s
are the first two shown in Figure 2.23.
Now, a DFA for L can be designed by essentially simulating the DFA’s for L1
and L2 in parallel. That is, let M1 and M2 be the DFA’s for L1 and L2 . The DFA for
L, which we will call M , will “store” the current state of both M1 and M2 , and
update these states according to the transition functions of these DFA’s. This can
be implemented by having each state of M be a pair that combines a state of M1
with a state of M2 . The result is the third DFA shown in Figure 2.23.
In this DFA, the 0 transition coming out of q0 r0 goes to q1 r0 because in M1 ,
the 0 transition coming out of q0 goes to q1 and in M2 , the 0 transition coming
out of r0 stays at r0 . Similarly, the 1 transition coming out of q0 r0 goes to q0 r1
because in M1 , the 1 transition coming out of q0 stays at q0 while in M2 , the 1
transition coming out of r0 goes to r1 . The remaining transitions can be figured
out in the same way.
The accepting states of M are those pairs that have q0 as their first state or r1
as their second state. This is correct since M should accept the input whenever
either M1 or M2 accepts.
The above idea can be generalized to show that the union of any two regular
Figure 2.23: A DFA for the language of strings that contain an even number of 0's, a DFA for the language of strings that end in 1, and a DFA for the union of these two languages
languages is always regular.
Theorem 2.22 The class of regular languages is closed under union.
Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFA’s for these
languages. We construct a DFA M that recognizes A1 ∪ A2 .
The idea, as explained above, is that M is going to simulate M1 and M2 in
parallel. More precisely, if after reading a string w, M1 would be in state r1 and
M2 would be in state r2 , then M , after reading the string w, will be in state
(r1 , r2 ).
Here are the full details. Suppose that Mi = (Qi, Σ, δi, qi, Fi), for i = 1, 2. Then let M = (Q, Σ, δ, q0, F) where
Q = Q1 × Q2
q0 = (q1, q2)
F = {(r1, r2) | r1 ∈ F1 or r2 ∈ F2}
and, for every (r1, r2) ∈ Q and a ∈ Σ,
δ((r1, r2), a) = (δ1(r1, a), δ2(r2, a)).
Because the start state of M consists of the start states of M1 and M2 , and
because M updates its state according to the transition functions of M1 and M2 ,
it should be clear that after reading a string w, M will be in state (r1 , r2 ) where
r1 and r2 are the states that M1 and M2 would be in after reading w.7 Now, if
w ∈ A1 ∪ A2 , then either r1 ∈ F1 or r2 ∈ F2 , which implies that M accepts w. The
reverse is also true. Therefore, L(M) = A1 ∪ A2. □
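The pair construction is also short in code. The sketch below (our own Python, not from the notes; the names accepts and union_dfa are our choices) builds M from M1 and M2 and checks it on the example of Figure 2.23, "even number of 0's or ends in 1":

```python
from itertools import product

def accepts(M, w):
    delta, q0, F = M
    r = q0
    for a in w:
        r = delta[(r, a)]
    return r in F

def union_dfa(M1, M2, alphabet):
    """Theorem 2.22: simulate M1 and M2 in parallel on pairs of states."""
    d1, q1, F1 = M1
    d2, q2, F2 = M2
    Q1 = {q for (q, _) in d1}
    Q2 = {q for (q, _) in d2}
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in product(Q1, Q2) for a in alphabet}
    F = {(r1, r2) for (r1, r2) in product(Q1, Q2)
         if r1 in F1 or r2 in F2}
    return (delta, (q1, q2), F)

# M1: even number of 0's; M2: ends in 1
M1 = ({("q0", "0"): "q1", ("q1", "0"): "q0",
       ("q0", "1"): "q0", ("q1", "1"): "q1"}, "q0", {"q0"})
M2 = ({("r0", "0"): "r0", ("r0", "1"): "r1",
       ("r1", "0"): "r0", ("r1", "1"): "r1"}, "r0", {"r1"})
M = union_dfa(M1, M2, "01")

assert accepts(M, "00")     # even number of 0's
assert accepts(M, "01")     # ends in 1
assert not accepts(M, "0")  # odd number of 0's and ends in 0
```

Changing "or" to "and" in the definition of F gives the intersection construction of Corollary 2.23.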
7 We could provide more details here, if we thought our readers needed them. The idea is to refer to the details in the definition of acceptance. Suppose that w = w1 · · · wn is a string of length n and that r0, r1, . . . , rn is the sequence of states that M1 goes through while reading w. This means that r0 = q1 and that, for i = 1, . . . , n,
Corollary 2.23 The class of regular languages is closed under intersection.
Proof The pair construction of Theorem 2.22 works here too, except that M should accept only when both M1 and M2 accept. In other words, F = F1 × F2. □
So we now know that the class of regular languages is closed under the three
basic set operations: complementation, union and intersection. And the proofs
of these closure properties are all constructive.
We end this section with another operation that’s specific to languages. If A
and B are two languages over Σ, then the concatenation of A and B is
AB = {x y ∈ Σ∗ | x ∈ A and y ∈ B}.
That is, the concatenation of A and B consists of those strings we can form, in
all possible ways, by taking a string from A and appending to it a string from B.
ri = δ1(ri−1, wi).
Suppose that s0 , s1 , . . . , sn is the sequence of states that M2 goes through while reading w. Again,
this means that s0 = q2 and that, for i = 1, . . . , n,
si = δ2(si−1, wi).
Then (r0, s0), (r1, s1), . . . , (rn, sn) has to be the sequence of states that M goes through while reading w because (r0, s0) = (q1, q2) = q0 and, for i = 1, . . . , n,
(ri, si) = (δ1(ri−1, wi), δ2(si−1, wi)) = δ((ri−1, si−1), wi).
For example, suppose that
A = {0k | k is even}
B = {1k | k is odd}
Then AB is the language of strings that consist of an even number of 0's followed by an odd number of 1's:
AB = {0i1j | i is even and j is odd}.
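Concatenation is easy to explore on finite fragments of A and B. A small Python sketch (ours, not from the notes):

```python
# Finite samples of A = {0^k | k even} and B = {1^k | k odd}
A = {"0" * k for k in range(0, 6, 2)}    # "", "00", "0000"
B = {"1" * k for k in range(1, 6, 2)}    # "1", "111", "11111"

# AB: every string x from A followed by every string y from B
AB = {x + y for x in A for y in B}

assert "1" in AB        # "" is in A, so every string of B is in AB
assert "001" in AB      # "00" followed by "1"
assert "011" not in AB  # "0" has an odd number of 0's, so it is not in A
```

Note the role of the empty string: because ε ∈ A, the language B itself is contained in AB.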
Here’s another example that will demonstrate the usefulness of both the
union and concatenation operations. Let N be the language of numbers de-
fined in Exercise 2.2.4: strings that consist of an optional minus sign followed
by at least one digit, or an optional minus sign followed by any number of dig-
its, a decimal point and at least one digit. Let D∗ denote the language of strings
that consist of any number of digits. Let D+ denote the language of strings that
consist of at least one digit. Then the language N can be defined as follows:
N = ({ε} ∪ {−}) (D+ ∪ D∗ {.} D+).
In other words, a language that took 33 words to describe can now be defined with a mathematical expression that's about half a line long. In addition, the mathematical expression helps us visualize what the strings of the language look like.
The obvious question now is whether the class of regular languages is closed under concatenation, and whether we can prove this closure property constructively.
Figure 2.25: An idea for showing that the class of regular languages is closed under concatenation
Study Questions
2.5.1. What does it mean for a set to be closed under a certain operation?
Exercises
2.5.3. Give DFA’s for the complement of each of the languages of Exercise 2.3.6.
2.5.4. Each of the following languages is the union or intersection of two simpler
languages. In each case, give DFA’s for the two simpler languages and then
use the pair construction from the proof of Theorem 2.22 to obtain a DFA
for the more complex language. In all cases, the alphabet is {0, 1}.
2.5.5. Give DFA’s for the following languages. In all cases, the alphabet is
{0, 1, #}.
Chapter 3
Nondeterministic Finite Automata
So far, all our finite automata have been deterministic. This means that each
one of their moves is completely determined by the current state and the cur-
rent input symbol. In this chapter, we learn that finite automata can be non-
deterministic. Nondeterministic finite automata are a useful tool for showing
that languages are regular. They are also a good introduction to the important
concept of nondeterminism.
3.1 Introduction
In this section, we introduce the concept of a nondeterministic finite automaton
(NFA) through some examples. In the next section, we will formally define what
we mean.
Consider again the language of strings that contain the substring 001. Fig-
ure 3.1 shows the DFA we designed in the previous chapter for this language.
Now consider the finite automaton of Figure 3.2. This automaton is not a
DFA for two reasons. First, some transitions are missing. For example, state q1
Figure 3.1: A DFA for the language of strings that contain the substring 001
Figure 3.2: An NFA for the language of strings that contain the substring 001
has no transition labeled 1. If the NFA is in that state and the next input symbol
is a 1, we consider that the NFA is stuck. It can’t finish reading the input string
and is unable to accept it.
Second, some states have multiple transitions for the same input symbol. In
the case of this NFA, there are two transitions labeled 0 coming out of state q0 .
What this means is that when in that state and reading a 0, the NFA has a choice:
it can stay in state q0 or move to state q1 .
How does the NFA make that choice? We simply consider that if there is
an option that eventually leads the NFA to accept the input string, the NFA will
choose that option. In this example, if the input string contains the substring
001, the NFA will wait in state q0 until it reaches an occurrence of the substring
001 (there may be more than one), move to the accepting state as it reads the
substring 001, and then finish reading the input string while looping in the
accepting state.
On the other hand, if the input string does not contain the substring 001,
then the NFA will do something, but whatever it does will not allow it to reach
the accepting state. That’s because to move from the start state to the accepting
state requires that the symbols 0, 0 and 1 occur consecutively in the input string.
Here’s another example. Consider the language of strings that contain either
a number of 0’s that’s even or a number of 1’s that’s even. In the previous
chapter, we observed that this language is the union of two simpler languages
and then used the pair construction to obtain the DFA of Figure 2.23.
An NFA for this language is shown in Figure 3.3. It combines the DFA’s of the
two simpler languages in a much simpler way than the pair construction.
This NFA illustrates another feature that distinguishes NFA's from DFA's: transitions that are labeled ε. The NFA can use these ε transitions without reading any input symbols. This particular NFA has two ε transitions, both coming out of the start state. This means that in its start state, the NFA has two options: it
Figure 3.3: An NFA for the language of strings that contain either a number of 0's that's even or a number of 1's that's even.
can move to either state q10 or state q20 . And it makes that choice before reading
the first symbol of the input string.
This NFA operates as described earlier: given multiple options, if there is one
that eventually leads the NFA to accept the input string, the NFA will choose that
option. In this example, if the number of 1’s in the input string is even but the
number of 0’s is not, the NFA will choose to move to state q10 . If instead the
number of 0’s is even but the number of 1’s is not, the NFA will choose to move
to state q20 . If both the number of 0’s and the number of 1’s are even, the NFA
will choose to move to either state q10 or q20 , and eventually accept either way.
If neither the number of 0’s nor the number of 1’s is even, then the NFA will not
be able to accept the input string.
Here’s one more example. Consider the language of strings of length at least
3 that have a 1 in position 3 from the end. We can design a DFA for this language
Figure 3.4: A DFA for the language of strings of length at least 3 that have a 1 in position 3 from the end
by having the DFA remember the last three symbols it has seen, as shown in
Figure 3.4. The start state is q000 , which corresponds to assuming that the input
string is preceded by 000. This eliminates special cases while reading the first
two symbols of the input string.
Figure 3.5 shows an NFA for the same language. This NFA is surprisingly
simpler than the DFA. If the input string does contain a 1 in position 3 from the
end, the NFA will wait in the start state until it reaches that 1 and then move to
the accepting state as it reads the last three symbols of the input string.
On the other hand, if the input string does not contain a 1 in position 3 from
the end, then the NFA will either fall short of reaching the accepting state or it
will reach it before having read the last symbol of the input string. In that case,
it will be stuck, unable to finish reading the input string. Either way, the input
string will not be accepted.
Figure 3.5: An NFA for the language of strings of length at least 3 that have a 1 in position 3 from the end
This example makes two important points. The first is that NFA's can be much simpler than DFA's that recognize the same language and, consequently, it can be much easier to design an NFA than a DFA. The second is that we consider that an NFA accepts its input string only if it is able to read the entire input string.
To summarize, an NFA is a finite automaton that may have missing transitions, multiple transitions coming out of a state for the same input symbol, and transitions labeled ε that can be used without reading any input symbol. An NFA accepts a string if there is a way for the NFA to read the entire string and end up in an accepting state.
Earlier, we said that when confronted with multiple options, we consider
that the NFA makes the right choice: if there is an option that eventually leads
to acceptance, the NFA will choose it. Now, saying that the NFA makes the right
choice, if there is one, does not explain how the NFA makes that choice. One
way to look at this is to simply pretend that the NFA has the magical ability to make the right choice...
Of course, real computers aren’t magical. In fact, they’re deterministic. So
a more realistic way of looking at the computation of an NFA is to imagine that
the NFA explores all the possible options, looking for one that will allow it to
accept the input string. So it’s not that NFA’s are magical, it’s that they are
somewhat misleading: when we compare the NFA of Figure 3.5 with the DFA of
Figure 3.4, the NFA looks much simpler but, in reality, it hides a large amount
of computation.
So why are we interested in NFA’s if they aren’t a realistic model of compu-
tation? One answer is that they are useful. We will soon learn that there is a
simple algorithm that can convert any NFA into an equivalent DFA, that is, one
that recognizes the same language. And for some languages, it can be much
easier to design an NFA than a DFA. This means that for those languages, it is
much easier to design an NFA and then convert it to a DFA than to design the
DFA directly.
Another reason for studying NFA’s is that they are a good introduction to
the concept of nondeterminism. In the context of finite automata, NFA’s are no
more powerful than DFA’s: they do not recognize languages that can’t already
be recognized by DFA’s. But in other contexts, nondeterminism seems to result
in additional computational power.
Without going into the details, here’s the most famous example. Algorithms
are generally considered to be efficient if they run in polynomial time. There
is a wide variety of languages that can be recognized by polynomial-time algo-
rithms. But there are also many others that can be recognized by nondeterminis-
tic polynomial-time algorithms but for which no deterministic polynomial-time
algorithm is known. It is widely believed that most of these languages cannot
be recognized by deterministic polynomial-time algorithms. After decades of ef-
fort, researchers are still unable to prove this but their investigations have led
to deep insights into the complexity of computational problems.1
1 What we are referring to here is the famous P vs NP problem and the theory of NP-completeness. Let P be the class of languages that can be recognized by deterministic algorithms that run in polynomial time. Let NP be the class of languages that can be recognized by nondeterministic algorithms that run in polynomial time. Then proving that there are languages that can be recognized by nondeterministic polynomial-time algorithms but not by deterministic
Study Questions
3.1.1. What is an NFA?
Exercises
3.1.3. Give NFA’s for the following languages. Each NFA should respect the spec-
ified limits on the number of states and transitions. (Transitions labeled
with two symbols count as two transitions.) In all cases, the alphabet is
{0, 1}.
a) The language of strings of length at least two that begin with 0 and
end in 1. No more than three states and four transitions.
b) The language of strings of length at least two whose last two symbols
are the same. No more than four states and six transitions.
c) The language of strings of length at least two that have a 1 in the
second-to-last position. No more than three states and five transi-
tions.
d) The language of strings of length at least k that have a 1 in position k from the end. Do this in general, for every k ≥ 1. No more than k + 1 states and 2k + 1 transitions. (You did the case k = 2 in the previous part.)
polynomial-time algorithms is equivalent to showing that P is a strict subset of NP. The consensus among experts is that this is true but no proof has yet been discovered. However, researchers have discovered an amazing connection among a wide variety of languages that belong to NP but are not known to belong to P: if a single one of these languages were shown to belong to P, then every language in NP, all of them, would belong to P. This property is called NP-completeness. The fact that a language is NP-complete is considered strong evidence that there is no efficient algorithm that recognizes it. The P vs NP problem is considered one of the most important open problems in all of mathematics, as evidenced by the fact that the Clay Mathematics Institute has offered a million-dollar prize to the first person who solves it.
e) The language of strings that contain exactly one 1. No more than
two states and three transitions.
3.2 Formal Definition
Figure 3.6: An NFA for the language of strings that contain the substring 001
δ    0          1      ε
q0   {q0, q1}   {q0}   ∅
q1   {q2}       ∅      ∅
q2   ∅          {q3}   ∅
q3   {q3}       {q3}   ∅
Example 3.2 Consider the NFA shown in Figure 3.6. Here is the formal descrip-
tion of this NFA: the NFA is (Q, {0, 1}, δ, q0 , F ) where
Q = {q0, q1, q2, q3}
F = {q3}
and δ is defined by the table above. □
Example 3.3 Consider the NFA shown in Figure 3.8. This NFA is (Q, {0, 1}, δ, q0, F) where
Q = {q0, q10, q11, q20, q21}
F = {q10, q20}
Figure 3.8: An NFA for the language of strings that contain either a number of 0's that's even or a number of 1's that's even.
and δ is defined by the table shown in Figure 3.9. Note that in this table, we omitted the braces and used a dash (−) instead of the empty set symbol (∅). We will often do that to avoid cluttering the tables. □
We now define what it means for an NFA to accept an input string. We first do this for NFA's that don't contain any ε transitions.
In the case of DFA’s (see Definition 2.17), we were able to talk about the
sequence of states that the DFA goes through while reading the input string.
That was because that sequence was unique. But in the case of NFA’s, there may
δ     0     1     ε
q0    −     −     q10, q20
q10   q10   q11   −
q11   q11   q10   −
q20   q21   q20   −
q21   q20   q21   −
Figure 3.9: The transition function of the NFA of Figure 3.8
be multiple sequences of states for each input string, depending on how many
options the NFA has.
For example, consider the NFA of Figure 3.6. While reading the string w =
0001, the NFA could go through any of the following four sequences of states:
q0 −0→ q0 −0→ q0 −0→ q0 −1→ q0
q0 −0→ q0 −0→ q0 −0→ q1
q0 −0→ q0 −0→ q1 −0→ q2 −1→ q3
q0 −0→ q1 −0→ q2
Two of those sequences end prematurely because the NFA is stuck, unable to read
the next input symbol. The first sequence allows the NFA to read the entire input
string but it ends in a non-accepting state. We consider that the NFA accepts
because one of these four sequences, the third one, allows the NFA to read the
entire input string and end in an accepting state.
Figure 3.10: Another NFA for the language of strings that contain the substring 001
Definition 3.4 Let N = (Q, Σ, δ, q0, F) be an NFA without ε transitions and let w = w1 · · · wn be a string of length n over Σ. Then N accepts w if and only if there
is a sequence of states r0 , r1 , . . . , rn such that
r0 = q0 (3.1)
ri ∈ δ(ri−1 , w i ), for i = 1, . . . , n (3.2)
rn ∈ F. (3.3)
Equations 3.1 and 3.2 assert that the sequence of states r0 , r1 , . . . , rn is one
of the possible sequences of states that the NFA may go through while reading
(all of) w. Equation 3.3 says that this sequence of states leads to an accepting
state.
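Definition 3.4 suggests a direct way to test acceptance: search for a sequence of states that satisfies conditions 3.1 to 3.3. The Python sketch below (our own code; δ is encoded as a dictionary mapping a (state, symbol) pair to a set of states) tries every option of the NFA of Figure 3.6:

```python
def nfa_accepts(delta, q0, F, w):
    """Search for r0, ..., rn with r0 = q0, ri in delta(r(i-1), wi), rn in F."""
    def search(r, i):
        if i == len(w):
            return r in F                 # condition 3.3: rn is accepting
        return any(search(s, i + 1)       # condition 3.2: try every option
                   for s in delta.get((r, w[i]), set()))
    return search(q0, 0)

# The NFA of Figure 3.6 (strings that contain the substring 001)
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
         ("q1", "0"): {"q2"},
         ("q2", "1"): {"q3"},
         ("q3", "0"): {"q3"}, ("q3", "1"): {"q3"}}

assert nfa_accepts(delta, "q0", {"q3"}, "0001")
assert not nfa_accepts(delta, "q0", {"q3"}, "0100")
```

The recursive search is the "explore all the possible options" view of nondeterminism described in the previous section: a missing entry in δ is an empty set of options, which is exactly how a stuck computation fails.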
We now define acceptance for NFA’s that may contain " transitions. This
is a little trickier. Figure 3.10 shows another NFA for the language of strings
that contain the substring 001. Consider again the string w = 0001. The NFA
accepts this string because it could go through the following sequence of states
as it reads w:
q0 −0→ q1 −ε→ q0 −0→ q1 −0→ q2 −1→ q3
For the NFA to go through this sequence of states, the NFA needs to essentially "insert" an ε between the first and second symbols of w. That is, it needs to view w as 0ε001.
Another possible sequence of states is
q0 −0→ q1 −0→ q2 −ε→ q1 −0→ q2 −1→ q3
In this case, the NFA inserts an ε between the second and third symbols of w. That is, it views w as 00ε01.
A definition of acceptance for NFA's with ε transitions can be based on this idea of inserting ε into the input string. Note that inserting ε into a string does not change its "value"; that is, as strings, we have that 0001 = 0ε001 = 00ε01. (In fact, for every string x, xε = εx = x. We say that ε is a neutral element with respect to concatenation of strings.)
Definition 3.5 Let N = (Q, Σ, δ, q0, F) be an NFA and let w be a string over Σ. Then N accepts w if and only if w can be written as y1 y2 · · · ym, where each yi ∈ Σ ∪ {ε}, and there is a sequence of states r0, r1, . . . , rm such that
r0 = q0 (3.4)
ri ∈ δ(ri−1 , yi ), for i = 1, . . . , m (3.5)
rm ∈ F. (3.6)
Exercises
3.2.1. Give formal descriptions of the following NFA’s. In each case, describe
the transition function by using a transition table.
3.2.2. Consider the NFA of Figure 3.6. There are three possible sequences
of states that this NFA could go through while reading all of the string
001001. What are they? Does the NFA accept this string?
3.2.3. Consider the NFA of Figure 3.10. There are seven possible sequences
of states that this NFA could go through while reading all of the string
001001. What are they? Indicate where "’s need to be inserted into the
string. Does the NFA accept this string?
3.3 Equivalence with DFA's
Figure 3.11: An NFA for the language of strings that end in 1
Consider the NFA of Figure 3.11. This NFA recognizes the language of strings
that end in 1. It’s not the simplest possible NFA for this language, but if an
algorithm is going to be able to convert NFA’s into DFA’s, it has to be able to
handle any NFA.
Now consider the input string w = 1101101. The diagram of Figure 3.12
shows all the possible ways in which the NFA could process this string. This
diagram is called the computation tree of the NFA on string w. The nodes in this
tree are labeled by states. The root of the tree (shown at the top) represents
the beginning of the computation. Edges show possible moves that the NFA can
make. For example, from the start state, the NFA has two options when reading
symbol 1: stay in state q0 or go to state q1 .
Some nodes in the computation tree are dead ends. For example, the node
labeled q1 at level 1 is a dead end because there is no 1 transition coming out
of state q1 . (Note that the top level of a tree is considered to be level 0.)
The nodes at each level of the computation tree show all the possible states
that the NFA could be in after reading a certain portion of the input string. For
example, level 4 has nodes labeled q0 , q1 , q0 , q1 . These are the states that the
NFA could be in after reading 1101.
The nodes at the bottom level show the states that the NFA could be in after
Figure 3.12: The computation tree of the NFA of Figure 3.11 on input string 1101101
reading the entire string. We can see that this NFA accepts this input string
because the bottom level contains the accepting state q1 .
Computation trees help us visualize all the possible ways in which an NFA can
process a particular input string. But they also suggest a way of simulating NFA’s:
as we read the input string, we can move down the computation tree, figuring
out what nodes occur at each of its levels. And we don’t need to remember
the entire tree, only the nodes at the current level, the one that corresponds to
the last input symbol that was read.2 For example, in the computation tree of
Figure 3.12, after reading the string 1101, we would have figured out that the
current level of the computation tree consists of states q0 , q1 , q0 , q1 .
How would a DFA carry out this simulation? In the only other simulation we
have seen so far, the pair construction of Theorem 2.22, the DFA M simulates
the DFA’s M1 and M2 by keeping track of the state in which each of these DFA’s
would be. Each of the states of M is a pair that combines a state of M1 with a state of M2.
In our case, we would have each state of the DFA that simulates an NFA be
the sequence of states that appear at the current level of the computation tree
of the NFA on the input string. However, as the computation tree of Figure 3.12
clearly indicates, the number of nodes at each level can grow. For example, if an
NFA has two options at each move, which is certainly possible, then the bottom
level of the tree would contain 2^n nodes, where n is the length of the input string.
This implies that there may be an infinite number of possible sequences of states
that can appear at any level of the computation tree. Since the DFA has a finite
number of states, it can’t have a state for each possible sequence.
A solution to this problem comes from noticing that most levels of the tree
2 Some of you may recognize that this is essentially a breadth-first search. At Clarkson, this algorithm and other important graph algorithms are normally covered in the courses CS344 Algorithms and Data Structures and CS447 Computer Algorithms.
of Figure 3.12 contain a lot of repetition. And we only want to determine if
an accepting state occurs at the bottom level of the tree, not how many times
it occurs, or how many different ways it can be reached from the start state.
Therefore, as it reads the input string, the DFA only needs to remember the set
of states that occur at the current level of the tree (without repetition). This
corresponds to pruning the computation tree by eliminating duplicate subtrees,
as shown in Figure 3.13. The pruning locations are indicated by dots (· · · ).
Another way of looking at this is to say that as it reads the input string, the
DFA simulates the NFA by keeping track of the set of states that the NFA could
currently be in. If the NFA has k states, then there are 2^k different sets of states. Since k is a constant, 2^k is also a constant. This implies that the DFA can have a state for each set of states of the NFA.
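This set-tracking simulation takes only a few lines of Python. The sketch below (our own code, not from the notes) runs the NFA of Figure 3.11 by computing, level by level, the set of states at the current level of the pruned computation tree:

```python
def run_subset(delta, q0, F, w):
    """Track the set of states the NFA could be in after each symbol."""
    current = {q0}
    for a in w:
        current = {s for r in current for s in delta.get((r, a), set())}
    return bool(current & F)   # accept iff some possible state is accepting

# The NFA of Figure 3.11 (strings that end in 1)
delta = {("q0", "0"): {"q0"},
         ("q0", "1"): {"q0", "q1"},
         ("q1", "0"): {"q0"}}

assert run_subset(delta, "q0", {"q1"}, "1101101")
assert not run_subset(delta, "q0", {"q1"}, "110")
```

Because current is a set, duplicates disappear automatically: this is exactly the pruning of Figure 3.13, and the loop does a constant amount of work per input symbol once the number of NFA states is fixed.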
Here are the details of the simulation, first for the case of NFA's without ε transitions.
Theorem 3.6 Every NFA without ε transitions has an equivalent DFA.
Figure 3.13: The computation tree of Figure 3.12 with duplicate subtrees removed
Proof Suppose that N = (Q, Σ, δ, q0, F) is an NFA without ε transitions. We construct a DFA M that simulates N by keeping track, as it reads the input string, of the set of states that N could currently be in.
If R is the set of states that N can currently be in, then after reading one
more symbol a, N could be in any state that can be reached from a state in R by
a transition labeled a. That is, N could be in any state in the following set:
∪r∈R δ(r, a).
Therefore, from state R, M will have a transition labeled a going to the state
that corresponds to that set.
To summarize, the DFA M is (Q′, Σ, δ′, q0′, F′) where
Q′ = P(Q)
q0′ = {q0}
F′ = {R ⊆ Q | R ∩ F ≠ ∅}
and, for every R ∈ Q′ and a ∈ Σ, δ′(R, a) = ∪r∈R δ(r, a). □
Example 3.7 Let's construct a DFA that simulates the NFA of Figure 3.11. We get M = (Q′, {0, 1}, δ′, q0′, F′) where
δ′         0      1
∅          ∅      ∅
{q0}       {q0}   {q0, q1}
{q1}       {q0}   ∅
{q0, q1}   {q0}   {q0, q1}
Figure 3.14: The transition function of the DFA that simulates the NFA of Figure 3.11
Note that δ′(∅, a) = ∅ because an empty union gives an empty set. (This is
consistent with the fact that no states can be reached from any of the states in
an empty set.) Then δ′({q0}, a) = δ(q0, a) and similarly for {q1}. Finally,
δ′({q0, q1}, a) = δ(q0, a) ∪ δ(q1, a).
We now tackle the case of NFA's with ε transitions. The DFA now needs to account
for the fact that the NFA may use any number of ε transitions between any
two input symbols, as well as before the first symbol and after the last symbol.
[Figure 3.15: The DFA that simulates the NFA of Figure 3.11]
[Figure 3.16: A simplified DFA that simulates the NFA of Figure 3.11]
68 CHAPTER 3. NONDETERMINISTIC FINITE AUTOMATA
[Figure 3.17: An NFA for the language of strings that contain the substring 001]
Definition 3.8 The extension of a set of states R, denoted E(R), is the set of states
that can be reached from any state in R by using any number of ε transitions (none
included).3
Example 3.9 Figure 3.17 shows an NFA for the language of strings that contain
the substring 001. In this NFA, we have the following:
E({q0 }) = {q0 }
E({q1 }) = {q0 , q1 }
E({q2 }) = {q0 , q1 , q2 }
3
This can be defined more formally as follows: E(R) is the set of states q for which there is a
sequence of states r0, . . . , rk, for some k ≥ 0, such that

r0 ∈ R
ri ∈ δ(ri−1, ε), for i = 1, . . . , k
rk = q.
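Computing E(R) is exactly a reachability computation along ε transitions, and a short worklist loop does it. This is a sketch with an encoding of our own: eps maps a state to the states reachable by a single ε transition.

```python
def extension(R, eps):
    """E(R): all states reachable from R by chains of epsilon
    transitions (chains of length zero included)."""
    E = set(R)
    todo = list(R)
    while todo:
        q = todo.pop()
        for r in eps.get(q, ()):
            if r not in E:
                E.add(r)
                todo.append(r)
    return frozenset(E)
```

With ε transitions from q1 to q0 and from q2 to q1 (our reading of Figure 3.17), this reproduces the three values of Example 3.9.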
In addition,

E({q1, q2}) = E({q1}) ∪ E({q2}) = {q0, q1, q2}.

This illustrates how the extension of a set that contains more than one state
can be computed as the union of the extensions of singletons (sets that contain
exactly one state). □
This will account for the fact that N can use any number of ε transitions after
reading each input symbol (including the last one).
This is what we get: M = (Q′, Σ, δ′, q0′, F′) where

Q′ = P(Q)
q0′ = E({q0})
F′ = {R ⊆ Q | R ∩ F ≠ ∅}
and δ′ is defined as follows:

δ′(R, a) = ⋃_{r∈R} E(δ(r, a)), for R ⊆ Q and a ∈ Σ.
Example 3.11 Let’s construct a DFA that simulates the NFA of Figure 3.17. The
easiest way to do this by hand is usually to start with the transition table and
proceed as follows:
1. Compute δ′({r}, a) = E(δ(r, a)) for the individual states r of the NFA.
Those values are shown in the top half of Figure 3.18.

2. Start the bottom half of the table with the start state of the DFA, E({q0}).

3. Add states as needed (to the bottom half of the table) until no new states
are introduced. The value of the transition function for these states is
computed from the values in the top half of the table by taking advantage
of the following fact:
δ′(R, a) = ⋃_{r∈R} E(δ(r, a)) = ⋃_{r∈R} δ′({r}, a).
Note that in this table, we omitted the braces and used a dash (−) instead of
the empty set symbol (∅).
All that we have left to do now is to identify the accepting states of the DFA.
In this case, they’re the sets of states that contain q3 , the accepting state of the
NFA.
Figure 3.19 shows the transition diagram of the DFA. Note that the three
accepting states can be merged since all the transitions coming out of these states
δ′                  0                 1
q0                  q0, q1            q0
q1                  q0, q1, q2        −
q2                  −                 q3
q3                  q3                q3
q0, q1              q0, q1, q2        q0
q0, q1, q2          q0, q1, q2        q0, q3
q0, q3              q0, q1, q3        q0, q3
q0, q1, q3          q0, q1, q2, q3    q0, q3
q0, q1, q2, q3      q0, q1, q2, q3    q0, q3

Figure 3.18: The transition function of the DFA that simulates the NFA of Figure 3.17
[Figure 3.19: The transition diagram of the DFA that simulates the NFA of Figure 3.17]

[Figure 3.20: An NFA for the language of strings that contain either a number of
0's that's even or a number of 1's that's even]
stay within this group of states. If we merged these states, then this DFA would
be identical to the DFA we designed in the previous chapter (see Figure 2.14). □
Example 3.12 Let’s construct a DFA that simulates the NFA of Figure 3.20. Fig-
ure 3.21 shows the transition table of the DFA. The start state is E({q0 }) =
{q0 , q10 , q20 }, which is why this state is the first one that appears in the bot-
tom half of the table. The accepting states are all the states that contain either
q10 or q20 , the two accepting states of the NFA.
Note that states {q0 , q10 , q20 } and {q10 , q20 } could be merged since they are
both accepting and their transitions lead to exactly the same states. If we did
that, then we would find that this DFA is identical to the DFA we designed in the
previous chapter by using the pair construction algorithm (see Figure 2.23).
By the way, since the NFA has 5 states, in principle, the DFA has 2^5 = 32
δ′               0           1
q0               −           −
q10              q10         q11
q11              q11         q10
q20              q21         q20
q21              q20         q21
q0, q10, q20     q10, q21    q11, q20
q10, q21         q10, q20    q11, q21
q11, q20         q11, q21    q10, q20
q10, q20         q10, q21    q11, q20
q11, q21         q11, q20    q10, q21

Figure 3.21: The transition function of the DFA that simulates the NFA of Figure 3.20
states. But as the transition table indicates, there are only 5 states that are
reachable from the start state. □
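The by-hand procedure of this section, adding DFA states only as they are reached, can be mirrored directly in code. This sketch combines the extension operation with the subset construction; the encodings are our own, and the test data is our reading of the NFA of Figure 3.17 (ε transitions from q1 to q0 and from q2 to q1):

```python
def nfa_to_dfa_reachable(alphabet, delta, eps, q0, F):
    """Convert an NFA with epsilon transitions to a DFA, generating
    only the DFA states reachable from the start state E({q0}).

    delta maps (state, symbol) to a set of states; eps maps a state
    to the states reachable by one epsilon transition.
    """
    def E(R):
        closure, todo = set(R), list(R)
        while todo:
            q = todo.pop()
            for r in eps.get(q, ()):
                if r not in closure:
                    closure.add(r)
                    todo.append(r)
        return frozenset(closure)

    start = E({q0})
    states, table, todo = {start}, {}, [start]
    while todo:
        R = todo.pop()
        for a in alphabet:
            # One DFA move: ordinary transitions, then epsilon closure.
            S = E(set().union(*(delta.get((r, a), set()) for r in R)))
            table[(R, a)] = S
            if S not in states:
                states.add(S)
                todo.append(S)
    return states, table, start, {R for R in states if R & set(F)}
```

On the NFA of Figure 3.17 this generates exactly the six sets listed in the bottom half of the construction, three of which contain the accepting state q3.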
In this section, we used the technique described in the proofs of Theorems 3.6
and 3.10 to convert three NFA’s into equivalent DFA’s. Does that mean that these
proofs are constructive? Yes but with one caveat: in the proof of Theorem 3.10,
we didn’t specify how to compute the extension of a set of states. This is es-
sentially what is called a graph reachability problem. We will learn a graph
reachability algorithm later in these notes.
Another observation: a language is regular if it is recognized by some DFA.
This is the definition. But now we know that every NFA can be simulated by a
DFA. We also know that a DFA is a special case of an NFA. Therefore, we get the
following alternate characterization of regular languages: a language is regular
if and only if it is recognized by some NFA.
Study Questions
3.3.1. In an NFA, what is the extension of a state?
Exercises
3.3.2. Draw the computation tree of the NFA of Figure 3.6 on the input string
001001. Prune as needed.
3.3.3. By using the algorithm of this section, convert into DFA’s the NFA’s of
Figures 3.22 and 3.23.
3.3.4. By using the algorithm of this section, convert into DFA’s the NFA’s of
Figure 3.24.
[Figures 3.22, 3.23 and 3.24: the NFA's to be converted in Exercises 3.3.3 and 3.3.4]
Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFA’s for these
languages. We construct an NFA N that recognizes A1 ∪ A2 .
The idea is illustrated in Figure 3.25. We add to M1 and M2 a new start state
that we connect to the old start states with ε transitions. This gives the NFA the
option of processing the input string by using either M1 or M2. If w ∈ A1 ∪ A2,
then N will choose the appropriate DFA and accept. And if N accepts, it must
be that one of the DFA's accepts w. Therefore, L(N) = A1 ∪ A2.
The NFA can be described more precisely as follows. Suppose that Mi =
(Q i , Σ, δi , qi , Fi ), for i = 1, 2. Without loss of generality, assume that Q 1 and Q 2
are disjoint. (Otherwise, rename the states so the two sets become disjoint.) Let
q0 be a state not in Q1 ∪ Q2. Then N = (Q, Σ, δ, q0, F) where

Q = Q1 ∪ Q2 ∪ {q0}
F = F1 ∪ F2

and δ is defined by δ(q0, ε) = {q1, q2} and, for every state q ∈ Qi and symbol
a ∈ Σ, δ(q, a) = {δi(q, a)}; every other value of δ is ∅.
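As a sanity check, the construction can be simulated directly. In this sketch (the state names and tuple encodings are ours), the only ε transitions are the ones out of the new start state, so the simulation applies them once, before reading the input:

```python
def union_nfa(M1, M2):
    """Build the NFA N of the proof: a new start state with epsilon
    transitions to the start states of the two DFA's.

    Each Mi is (delta_i, start_i, accepting_i) with disjoint state
    names; delta_i maps (state, symbol) to a single state.
    """
    d1, s1, F1 = M1
    d2, s2, F2 = M2
    delta = {**d1, **d2}              # Q = Q1 union Q2 (plus the new start)
    return delta, {s1, s2}, F1 | F2   # {s1, s2}: the epsilon targets

def accepts(N, w):
    """Simulate N on w, tracking the set of states N could be in."""
    delta, starts, F = N
    current = set(starts)
    for a in w:
        current = {delta[(q, a)] for q in current if (q, a) in delta}
    return bool(current & F)
```

Using a DFA for "even number of 0's" and one for "strings ending in 1" (both made up for the test), the simulation accepts exactly the union of the two languages.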
3.4. CLOSURE PROPERTIES 79
[Figure 3.25: An NFA for the union of two regular languages] □
In the previous chapter, we proved closure under union by using the pair
construction. And we observed that the construction could be easily modified
to prove closure under intersection. We can’t do this here: there is no known
simple way to adapt the above construction for the case of intersection. But we
can still easily prove closure under intersection by using De Morgan's Law:
writing Ā for the complement of A, A1 ∩ A2 is the complement of Ā1 ∪ Ā2.
If A1 and A2 are regular, then Ā1 and Ā2 are regular, so Ā1 ∪ Ā2 is regular,
and then so is its complement. This implies that A1 ∩ A2 is regular. □
Let’s now turn to concatenation. Figure 2.25 illustrates one possible idea:
connect DFA’s M1 and M2 in series by adding transitions so that the accepting
states of M1 act as the start state of M2 . By using " transitions, we can simplify
this a bit.
Theorem 3.16 The class of regular languages is closed under concatenation.
[Figure 3.26: An NFA for the concatenation of two regular languages]
Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFA’s for these
languages. We construct an NFA N that recognizes A1 A2 .
The idea is illustrated in Figure 3.26. We add to the accepting states of M1
ε transitions to the start state of M2. This gives N the option of "switching" to
M2 every time it enters one of the accepting states of M1. We also make the
accepting states of M1 non-accepting.
Let’s make sure this really works. Suppose that w ∈ A1 A2 . That is, w = x y
with x ∈ A1 and y ∈ A2 . Then after reading x, N will be in one of the accepting
states of M1 . From there, it can use one of the new " transitions to move to the
start state of M2 . The string y will then take N to one of the accepting states of
M2 , causing N to accept w.
Conversely, if w is accepted by N, it must be that N uses one of the new
ε transitions. This means that w = xy with x ∈ A1 and y ∈ A2. Therefore,
L(N) = A1A2.
The formal description of N is left as an exercise. □
The star of a language A is defined as follows:

A∗ = {x1 · · · xk | k ≥ 0 and each xi ∈ A}.
That is, A∗ consists of those strings we can form, in all possible ways, by taking
any number of strings from A and concatenating them together.4 Note that ε
is always in the star of a language because, by convention, x1 · · · xk = ε when
k = 0.
For example, {0}∗ is the language of all strings that contain only 0's:
{ε, 0, 00, 000, . . .}. And {0, 1}∗ is the language of all strings that can be formed
with 0's and 1's: {ε, 0, 1, 00, 01, 10, 11, 000, . . .}. Note that this definition of
the star operation is consistent with our use of Σ∗ to denote the set of all strings
over the alphabet Σ.
Theorem 3.17 The class of regular languages is closed under the star operation.
Proof Suppose that A is regular and let M be a DFA that recognizes this language.
We construct an NFA N that recognizes A∗.
The idea is illustrated in Figure 3.27. From each of the accepting states of
M, we add ε transitions that loop back to the start state.
We also need to ensure that ε is accepted by N. One idea is to simply make
the start state of M an accepting state. But this doesn't work, as one of the
exercises below asks you to show.
4
The language A∗ is also sometimes called the Kleene closure of A.
[Figure 3.27: An NFA for the star of a regular language]
Study Questions
3.4.1. What is the star of a language?
Exercises
3.4.2. Give a formal description of the NFA of the proof of Theorem 3.16.
3.4.3. Give a formal description of the NFA of the proof of Theorem 3.17.
3.4.4. The proof of Theorem 3.17 mentions the idea illustrated in Figure 3.28.
From each of the accepting states of M, we add ε transitions that loop back
to the start state. In addition, we make the start state of M accepting to
ensure that ε is accepted.
a) Explain where the proof of the theorem would break down if this idea
was used.
[Figure 3.28: A bad idea for an NFA for the star of a regular language]
b) Show that this idea cannot work by providing an example of a DFA
M for which the NFA N of Figure 3.28 would not recognize L(M )∗ .
A+ = {x1 · · · xk | k ≥ 1 and each xi ∈ A}.
Show that the class of regular languages is closed under the “plus” opera-
tion.
Regular Expressions
In the previous two chapters, we studied two types of machines that recognize
languages. In this chapter, we approach languages from a different angle: in-
stead of recognizing them, we will describe them. We will learn that regular
expressions allow us to describe languages precisely and, often, concisely. We
will also learn that regular expressions are powerful enough to describe any reg-
ular language, that regular expressions can be easily converted to DFA’s, and that
it is often easier to write a regular expression than to directly design a DFA or
an NFA. This implies that regular expressions are useful not only for describing
regular languages but also as a tool for obtaining DFA’s for these languages.
4.1 Introduction
Many of you are probably already familiar with regular expressions. For exam-
ple, in the Unix or Linux operating systems, when working at the prompt (on a
console or terminal) we can list all the PDF files in the current directory (folder)
by using the command ls *.pdf. The string ls is the name of the command.
88 CHAPTER 4. REGULAR EXPRESSIONS
(It’s short for list.) The string *.pdf is a regular expression. In Unix regular
expressions, the star (*) represents any string. So this command is asking for a
list of all the files whose name consists of any string followed by the characters
.pdf.
Another example is rm project1.*. This removes all the files associated
with Project 1, that is, all the files that have the name project1 followed by
any extension.
Regular expressions are convenient. They allow us to precisely and concisely
describe many interesting patterns in strings. But note that a regular expression
corresponds to a set of strings, those strings that possess the pattern. Therefore,
regular expressions describe languages.
In the next section, we will define precisely what we mean by a regular ex-
pression. Our regular expressions will be a little different from Unix regular
expressions. In the meantime, here are two examples that illustrate both the
style of regular expressions we will use as well as the usefulness of regular ex-
pressions.
Example 4.1 Consider the language of valid C++ identifiers. Recall that these
are strings that begin with an underscore or a letter followed by any number of
underscores, letters and digits. Figure 4.1 shows the DFA we designed earlier
for this language. This DFA is not very complicated but a regular expression is
even simpler. Let D stand for any digit and L for any letter. Then the valid C++
identifiers can be described as follows:
(_ ∪ L)(_ ∪ L ∪ D)∗
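In Python's re notation, which here writes the unions as character classes, the same language can be checked directly. The pattern below is our translation of the expression above, with L standing for [A-Za-z] and D for [0-9]:

```python
import re

# (_ ∪ L)(_ ∪ L ∪ D)*  with  L = [A-Za-z]  and  D = [0-9]
identifier = re.compile(r"[_A-Za-z][_A-Za-z0-9]*")

def is_identifier(s):
    """True if s is a valid C++ identifier in the sense above."""
    return identifier.fullmatch(s) is not None
```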
[Figure 4.1: The DFA for the language of valid C++ identifiers]

[Figure: a DFA that reads phone numbers of the form ddd−dddd]
We can make the regular expression more precise by defining what we mean
by the symbols L and D. Here too, regular expressions can be used:

L = a ∪ b ∪ c ∪ · · · ∪ Z
D = 0 ∪ 1 ∪ 2 ∪ · · · ∪ 9
□
For example, phone numbers that consist of either seven digits, or three digits
followed by a dash and four more digits, can be described by D^7 ∪ D^3−D^4.
4.1.4. What does the union operator (∪) mean in a regular expression?
Exercises
4.1.5. Give regular expressions for the languages of the first four exercises of
Section 2.2. (Note that ε can be used in a regular expression.)
1. R = a, where a ∈ Σ.
2. R = ε.
3. R = ∅.
4. R = (R1 ∪ R2), where R1 and R2 are regular expressions.
5. R = (R1 ◦ R2), where R1 and R2 are regular expressions.
6. R = (R1∗), where R1 is a regular expression.
For example, 0 ∪ 01∗ = (0 ∪ (0 ◦ (1∗))).
Definition 4.4 Suppose that R is a regular expression over Σ. The language de-
scribed by R (or the language of R) is defined as follows:

1. L(a) = {a}, if a ∈ Σ.
2. L(ε) = {ε}.
3. L(∅) = ∅.
4. L(R1 ∪ R2) = L(R1) ∪ L(R2).
5. L(R1 ◦ R2) = L(R1)L(R2).
6. L(R1∗) = L(R1)∗.
These are the basic regular expressions we will use in these notes. When
convenient, we will augment them with the following "abbreviations". If Σ =
{a1, . . . , ak} is an alphabet, then Σ denotes the regular expression a1 ∪ · · · ∪ ak.
If R is a regular expression, then R+ denotes the regular expression RR∗ , so that
L(R+ ) = L(R)L(R)∗
= {x 1 · · · x k | k ≥ 1 and each x i ∈ L(R)}
= L(R)+ .
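Definition 4.4 can be turned directly into a (very inefficient) membership test: represent a regular expression as a nested tuple and recurse on its structure, trying every split of the input for concatenation and star. A sketch with an encoding of our own:

```python
from functools import lru_cache

# A regular expression is a nested tuple:
#   ('sym', a), ('eps',), ('empty',),
#   ('union', R1, R2), ('cat', R1, R2), ('star', R1)

@lru_cache(maxsize=None)
def matches(R, w):
    """Decide whether w is in L(R), case by case from Definition 4.4."""
    op = R[0]
    if op == 'sym':
        return w == R[1]
    if op == 'eps':
        return w == ''
    if op == 'empty':
        return False
    if op == 'union':
        return matches(R[1], w) or matches(R[2], w)
    if op == 'cat':
        return any(matches(R[1], w[:i]) and matches(R[2], w[i:])
                   for i in range(len(w) + 1))
    if op == 'star':
        if w == '':
            return True
        # Peel off a nonempty prefix in L(R1), then recurse on the rest.
        return any(matches(R[1], w[:i]) and matches(R, w[i:])
                   for i in range(1, len(w) + 1))
    raise ValueError(op)
```

For instance, the tuple for 0(0 ∪ 1)∗ accepts exactly the strings that begin with 0.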
2. The language of all strings that begin with 1 is described by the regular
expression 1Σ∗ .
4. The language of strings of length at least two that begin and end with the
same symbol is described by 0Σ∗ 0 ∪ 1Σ∗ 1.
5. The language of strings that contain the substring 001 is described by
Σ∗ 001Σ∗ .
□
L(R ∪ ∅) = L(R)
L(R∅) = ∅
L(Rε) = L(R)
L(∅∗) = ∅∗
= {x1 · · · xk | k ≥ 0 and each xi ∈ ∅}
= {ε}
Exercises
4.3.1. Give a regular expression for the language of strings that begin and end
with the same symbol. (Note that the strings may have length one.) The
alphabet is Σ = {0, 1}.
4.4. CONVERTING REGULAR EXPRESSIONS INTO DFA’S 95
4.3.2. Give regular expressions for the languages of Exercise 2.3.6.
4.3.5. Give regular expressions for the complement of each of the languages of
Exercise 2.3.6.
Example 4.8 Let’s convert the regular expression 0(0 ∪ 1)∗ into an NFA. Fig-
ures 4.4 and 4.5 show the various steps. Step 1 shows the basic NFA’s. Step 2
shows an NFA for 0 ∪ 1. Step 3 shows an NFA for (0 ∪ 1)∗ . Figure 4.5 shows the
final result, the concatenation of the NFA's from Steps 1 and 3. □
Exercises
4.4.1. By using the algorithm of this section, convert the following regular ex-
pressions into NFA’s. In all cases, the alphabet is Σ = {0, 1}.
a) 0∗ ∪ 1∗ .
b) Σ∗ 0.
c) 0∗ 10∗ .
d) (11)∗ .
[Figure 4.4: Steps 1 to 3 of the construction: NFA's for 0 and 1, an NFA for
0 ∪ 1, and an NFA for (0 ∪ 1)∗]

[Figure 4.5: The final NFA for 0(0 ∪ 1)∗]
and then show that these extended regular expressions can only describe
regular languages. (In other words, show that these extended regular
expressions are no more powerful than the basic ones.)
4.5 Converting DFA's into Regular Expressions
[Two one-state DFA's, each with a loop labeled 0, 1: one whose only state is
accepting and one whose only state is not]
To keep things concrete, we’re assuming that the input alphabet Σ is {0, 1}.
These DFA’s can be easily converted into regular expressions: Σ∗ and ;. This
was probably too easy to teach us much about the algorithm.
So let’s consider a DFA with two states:
[a two-state DFA: 0 loops on the start state, 1 leads to the accepting state,
where 1 loops and 0 leads back to the start state]
A regular expression for this DFA can be written by allowing the input string
to leave and return to the start state any number of times, and then go to the
accepting state and stay there. There are two basic ways to leave the start state
and return to it: either by reading a 0 or by reading a string of the form 11∗ 0.
There is only one way of going to the accepting state and staying there: by
reading a string of the form 11∗ . So the resulting regular expression is (0 ∪
11∗ 0)∗ 11∗ .
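We can spot-check this claim mechanically: the regular expression and the DFA should agree on every short string. A sketch using Python's re module (which writes union as |):

```python
import re
from itertools import product

def dfa_accepts(w):
    """The two-state DFA above: 0 loops on the start state, 1 moves to
    the accepting state, where 1 loops and 0 goes back."""
    state = 'start'
    for a in w:
        if state == 'start':
            state = 'start' if a == '0' else 'accept'
        else:
            state = 'accept' if a == '1' else 'start'
    return state == 'accept'

# (0 ∪ 11*0)* 11*  in re notation
pattern = re.compile(r"(0|11*0)*11*")

def agree_up_to(n):
    """Compare the DFA and the regular expression on every binary
    string of length at most n."""
    for k in range(n + 1):
        for tup in product('01', repeat=k):
            w = ''.join(tup)
            if dfa_accepts(w) != (pattern.fullmatch(w) is not None):
                return False
    return True
```

Checking all strings of length at most 8 finds no disagreement.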
Here’s another two-state DFA:
[a two-state DFA: both 0 and 1 lead from the start state to the accepting state,
where 1 loops and 0 leads back to the start state]
In this case, there is only one way to leave the start state and return to it: by
reading a string of the form (0 ∪ 1)1∗ 0. This leads to the regular expression
((0 ∪ 1)1∗ 0)∗ (0 ∪ 1)1∗ .
So transitions labeled with multiple symbols are easy to deal with. But any
of the transitions in a two-state DFA can be missing. This would lead to a large
number of cases. It would be much more convenient if we could come up with
a single “formula” that covers all the cases.
This can be done by considering that missing transitions are actually there
but labeled by the regular expression ∅. This makes sense because the set of
symbols that allow us to use a missing transition is empty. In addition, we can
consider that transitions labeled with multiple symbols are actually labeled by
the union of those symbols. This too makes sense since that’s the regular expres-
sion that describes the set of symbols that allow us to use that transition. If we
transform the previous DFA in this way, we get
[the same DFA with the missing transition labeled ∅ and the double transition
labeled 0 ∪ 1]
[a two-state automaton with transitions labeled by arbitrary regular expressions:
R1 loops on the start state, R2 leads to the accepting state, R3 loops on the
accepting state, and R4 leads back to the start state]

Again, using our earlier reasoning, this corresponds to the regular expression

(R1 ∪ (R2)(R3)∗(R4))∗ (R2)(R3)∗.
with three states, such as the one shown in Figure 4.6. To make things more
4.5. CONVERTING DFA’S INTO REGULAR EXPRESSIONS 103
[Figure 4.6: A DFA with three states]
interesting, the input alphabet is now {a, b, c}. Coming up with a regular ex-
pression for this DFA seems much more complicated than for DFA’s with two
states. So here’s an idea that may at first seem a little crazy: what about we
try to remove the middle state of the DFA? If this could be done, then the DFA
would have two states and we already know how to handle that.
To be able to remove state q1 from the DFA, we need to consider all the ways
in which state q1 can be used to travel through the DFA. For example, state q1
can be used to go from state q0 to state q2 . The DFA does that while reading a
string of the form b b∗ a. The eventual removal of state q1 can be compensated
for by adding the option bb∗ a to the transition that goes directly from q0 to q2 ,
as shown in Figure 4.7. Then, instead of going from q0 to q2 through q1 while
reading a string of the form bb∗ a, we can now go directly from q0 to q2 while
reading that same string.
Note that the resulting automaton is no longer a DFA because it now contains
a transition that is labeled by a regular expression and that regular expression is
not just ∅ or a union of symbols. But the meaning of this should be pretty clear:
if a transition is labeled R, then that transition can be used while reading any
[Figure 4.7: Adding the option bb∗a to the transition that goes directly from
q0 to q2]
string in the language described by R. In the next section, we will take the time
to formally define this new kind of automaton and how it operates.
Another possibility that needs to be considered is that state q1 can also be
used to travel from state q0 back to state q0 . The DFA does that while reading a
string of the form bb∗c. Once again, the removal of state q1 can be compensated
for by adding the option bb∗c to the transition that goes directly from q0 back
to q0, as shown in Figure 4.8.
To be able to fully remove q1 from the DFA, this compensation operation needs
to be performed for every pair of states other than q1 . Once this is done and q1
is removed, the result is the two-state automaton shown in Figure 4.9. And we
already know how to convert a two-state automaton to a regular expression.
Here’s a general description of the compensation operation. Consider the
top automaton of Figure 4.10. State q1 is the one we want to remove. That
state can be used to travel from q2 to q3 while reading a string of the form
(R2 )(R3 )∗ (R4 ). So we can compensate for the eventual removal of q1 by adding
the option (R2 )(R3 )∗ (R4 ) to the transition going directly from q2 to q3 , as shown
in the second automaton of Figure 4.10.
[Figure 4.8: Adding the option bb∗c to the loop on q0]

[Figure 4.9: The result of removing state q1 from the DFA of Figure 4.6]
[Figure 4.10: The compensation operation: the option (R2)(R3)∗(R4) is added
to the transition that goes directly from q2 to q3]
Note that in Figure 4.10, transitions are labeled by regular expressions, not
individual symbols. This allows us to deal with missing transitions and transi-
tions labeled with multiple symbols, as explained earlier.
In Figure 4.10, it could also be that q2 and q3 are the same state. In that case,
the transition going from q2 to q3 would be a loop.
Now, not all DFA’s with three states are similar to the DFA of Figure 4.6.
For example, it could be that state q1 is also an accepting state, as shown in
Figure 4.11. Removing state q1 would be problematic.
A solution is to modify the DFA so it has only one accepting state. This can be
done by adding a new accepting state, new ε transitions from the old accepting
states to the new one, and removing the accepting status of the old accepting
states, as shown in Figure 4.12. Once the new accepting state is added, we can
remove states q1 and q2, one by one, as before, and end up, once again, with
an automaton with two states. The result is shown in Figure 4.13.
[Figure 4.11: A three-state DFA whose middle state is accepting]

[Figure 4.12: Transforming the DFA of Figure 4.11 so it has only one accepting
state]
[Figure 4.13: Removing states q2 and q1 (in that order) from the automaton of
Figure 4.12]
This automaton has a loop labeled R1 on the start state and a transition labeled
R2 to the accepting state, where
R1 = a ∪ ca∗ b ∪ (b ∪ ca∗ c)(b ∪ aa∗ c)∗ (c ∪ aa∗ b)
and
R2 = ca∗ ∪ (b ∪ ca∗ c)(b ∪ aa∗ c)∗ (ε ∪ aa∗ ).

Note that the new accepting state still has no outgoing transitions.
To see why, consider again Figure 4.10, which illustrates the compensation op-
eration. If the new accepting state is involved in a compensation operation,
it must be as state q3 because the new accepting state has no outgoing transi-
tions. But then, it’s clear that the compensation operation does not add outgoing
transitions to that state. Therefore, in the final two-state automaton, the new
accepting state will not have any outgoing transitions. Because this results in a
simpler two-state automaton, we will always add a new accepting state to the
original automaton, even when it only has one accepting state.
In an automaton with three states, it could also be that the start state is an
accepting state. But that’s not a problem since adding a new accepting state will
turn the start state into a non-accepting state so we are back to the previous
case.
So we now know how to handle any automaton with three states: add a new
accepting state and then remove the two states that are not the start state or the
new accepting state. This results in a simple two-state automaton that’s easy to
convert to a regular expression.
What about an automaton with four states? We can simply use the same
strategy: add a new accepting state and then remove the three states that are
not the start state or the new accepting state. And clearly, this can be extended
to automata with any number of states. Here’s a complete description of the
algorithm:
2. Add a new accepting state, together with new " transitions from the old
accepting states to the new one. Remove the accepting status of the old
accepting states.
3. Remove each state, one by one, except for the start state and the accepting
state. This results in an automaton of the following form:
[a two-state automaton: a loop labeled R1 on the start state and a transition
labeled R2 to the accepting state]

The corresponding regular expression is (R1)∗(R2).
Exercises
4.5.1. By using the algorithm of this section, convert into regular expressions
the NFA’s of Figure 4.14.
[Figure 4.14: The NFA's of Exercise 4.5.1]
[Figure 4.15: The compensation operation: the transition from q2 to q3 is
relabeled R1 ∪ (R2)(R3)∗(R4)]
4.6 Precise Description of the Algorithm

Definition 4.9 A generalized nondeterministic finite automaton (GNFA) is a
5-tuple (Q, Σ, δ, q0, F) where Q, Σ, q0 and F are as in an NFA, and δ is a function
that assigns to every pair of states a regular expression over Σ.
The only difference between this definition and that of an ordinary NFA is
the specification of the transition function. In an NFA, the transition function
takes a state and a symbol, or ε, and gives us a set of possible next states. In
contrast, in a GNFA, the transition function takes a pair of states and gives us
the regular expression that labels the transition going from the first state to the
second one. Note how this neatly enforces the fact that in a GNFA, there is
exactly one transition going from each state to every other state.
Example 4.10 Consider the NFA shown in Figure 4.16. This NFA is almost a
4.6. PRECISE DESCRIPTION OF THE ALGORITHM 115
δ      q0    q1
q0     ∅     0 ∪ 1
q1     0     1

Figure 4.17: The transition function of the GNFA that corresponds to the NFA of
Figure 4.16.
GNFA. In fact, only two things prevent it from being a GNFA: there is no transi-
tion from q0 to q0 and there are two transitions going from q0 to q1 (one labeled
0, the other labeled 1). To turn this NFA into a GNFA, all we need to do is add
a transition labeled ∅ from q0 to q0 and merge the two transitions going from q0
to q1 into a single transition labeled 0 ∪ 1.
Here is a formal description of the resulting GNFA: the GNFA is
({q0 , q1 }, {0, 1}, δ, q0 , {q1 }) where δ is defined by the table shown in Fig-
ure 4.17. □
We now define how a GNFA operates; that is, we define what it means for a
GNFA to accept a string. The idea is that a string w is accepted if the GNFA reads
a sequence of strings y1 , . . . , yk with the following properties:
1. This sequence of strings takes the GNFA through a sequence of states
r0 , r1 , . . . , r k .
2. The sequence begins with the start state: r0 = q0.

3. The sequence ends in an accepting state: rk ∈ F.
4. The reading of each string yi is a valid move, in the sense that yi is in the
language of the regular expression that labels the transition going from
ri−1 to ri .
5. The concatenation of all the yi ’s corresponds to w.
All of this can be said more concisely, and more precisely, as follows: w is
accepted if w = y1 · · · yk, where each yi ∈ Σ∗, and there is a sequence of states
r0, r1, . . . , rk such that

r0 = q0
yi ∈ L(δ(ri−1, ri)), for i = 1, . . . , k
rk ∈ F.
We now describe precisely the state removal step of the algorithm. Let q1 be
the state to be removed. For every other pair of states q2 and q3 , let
R1 = δ(q2 , q3 )
R2 = δ(q2 , q1 )
R3 = δ(q1 , q1 )
R4 = δ(q1 , q3 )
as illustrated by the first automaton of Figure 4.15. Change to R1 ∪(R2 )(R3 )∗ (R4 )
the label of the transition that goes directly from q2 to q3 , as shown in the second
automaton of Figure 4.15. That is, set δ(q2 , q3 ) to R1 ∪ (R2 )(R3 )∗ (R4 ). Once all
pairs q2 and q3 have been considered, remove state q1 and all adjacent transitions
(those that come into, or come out of, q1 ).
And here’s a proof that this works:
Lemma 4.12 If the state removal step is applied to GNFA N , then the resulting
GNFA still recognizes L(N ).
Proof Let N′ be the GNFA that results from removing state q1 from GNFA N.
Suppose that w ∈ L(N). If w can be accepted by N without traveling through
state q1, then w is accepted by N′. Now suppose that to accept w, N must travel
through q1. Suppose that in one such instance, N reaches q1 from q2 and that it
goes to q3 after leaving q1, as shown in the first automaton of Figure 4.15. Then
N travels from q2 to q3 by reading strings y1, . . . , yk such that

y1 ∈ L(R2)
yi ∈ L(R3), for i = 2, . . . , k − 1
yk ∈ L(R4).

This implies that y1 · · · yk ∈ L((R2)(R3)∗(R4)). In N′, the transition going directly
from q2 to q3 is now labeled R1 ∪ (R2)(R3)∗(R4). This implies that N′ can move
directly from q2 to q3 by reading y1 · · · yk. And this applies to every instance in
which N goes through q1 while reading w. Therefore, N′ still accepts w.
Conversely, suppose that w ∈ L(N′). If N′ accepts w without using any
relabeled transition, then N also accepts w. Now suppose that while reading w,
N′ uses a relabeled transition to go directly from some state q2 to some state q3
while reading a string y. If y ∈ L(R1), then N can make the same move.
Otherwise, y ∈ L((R2)(R3)∗(R4)), which means that y can be written as y1 · · · yk
where
y1 ∈ L(R2 )
yi ∈ L(R3 ), for i = 2, . . . , k − 1
yk ∈ L(R4 ).
This implies that N can go from q2 to q3 while reading y, by going through q1
and reading y1, . . . , yk along the way. This applies to every instance in
which N′ uses a relabeled transition while reading w. Therefore, N accepts w
and this completes the proof that L(N′) = L(N). □
So we now know for sure that the state removal step works. As described in
the previous section, this step is repeated for every state except for the start state
and the new accepting state. As explained in the previous section, this always
results in a two-state GNFA of the following form:
[a two-state GNFA: a loop labeled R1 on the start state and a transition labeled
R2 to the accepting state]
We are now ready to describe the entire conversion algorithm and prove its
correctness. The algorithm can actually convert arbitrary NFA’s, not just DFA’s.
2. Transform N into a GNFA. This can be done as follows. For every pair of
states q1 and q2, if there is no transition from q1 to q2, add one labeled ∅. If
there are multiple transitions labeled a1, . . . , ak going from q1 to q2, replace
those transitions by a single transition labeled a1 ∪ · · · ∪ ak.
3. Add a new accepting state to N, add new ε transitions from the old ac-
cepting states to the new one and remove the accepting status of the old
accepting states.
4. One by one, remove every state other than the start state or the accepting
state. This can be done as follows. Let q1 be the state to be removed. For
every other pair of states q2 and q3 , let
R1 = δ(q2 , q3 )
R2 = δ(q2 , q1 )
R3 = δ(q1 , q1 )
R4 = δ(q1 , q3 )
Then set δ(q2, q3) to R1 ∪ (R2)(R3)∗(R4). Once all pairs have been
considered, remove q1 and all adjacent transitions. The final two-state
GNFA has a loop labeled R1 on the start state and a transition labeled R2
to the accepting state; the answer is the regular expression (R1)∗(R2).
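The whole algorithm fits in a short program. The sketch below is our own encoding, not a description from the notes: it builds regular expressions as Python re strings, with None standing for ∅ and the empty string standing for ε.

```python
import re
from itertools import product

EMPTY = None   # the label ∅: a transition that can never be used

def union_(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return f"(?:{a}|{b})"

def cat(a, b):
    if a is EMPTY or b is EMPTY:
        return EMPTY
    return f"(?:{a})(?:{b})"     # the empty string plays the role of epsilon

def star(a):
    if a is EMPTY or a == "":
        return ""                # ∅* = ε* = ε
    return f"(?:{a})*"

def dfa_to_regex(states, delta, start, accepting):
    """Steps 2 to 4 of the algorithm: build the GNFA transition table,
    add a new accepting state 'ACC' reached by epsilon transitions, rip
    out every state except the start state and 'ACC', and read off the
    answer (R1)*(R2)."""
    d = {(p, q): EMPTY for p in states for q in list(states) + ['ACC']}
    for (p, a), q in delta.items():
        d[(p, q)] = union_(d[(p, q)], a)
    for q in accepting:
        d[(q, 'ACC')] = union_(d[(q, 'ACC')], "")   # epsilon transition
    live = list(states) + ['ACC']
    for q1 in states:
        if q1 == start:
            continue
        live.remove(q1)
        for p, r in product(live, repeat=2):
            if p == 'ACC':
                continue   # the accepting state has no outgoing transitions
            d[(p, r)] = union_(
                d[(p, r)],
                cat(d[(p, q1)], cat(star(d[(q1, q1)]), d[(q1, r)])))
    return cat(star(d[(start, start)]), d[(start, 'ACC')])
```

On the two-state DFA of Section 4.5 (0 loops on the start state, 1 leads to the accepting state), the output is equivalent to (0 ∪ 11∗0)∗11∗, which can be confirmed by brute force over short strings.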
Exercises
4.6.1. Give a formal description of the GNFA of Figure 4.8.
Nonregular Languages
In this chapter, we show that not all languages are regular. In other words, we
show that there are languages that can’t be recognized by finite automata or de-
scribed by regular expressions. We will learn to prove that particular languages
are nonregular by using a general result called the Pumping Lemma.
[Figure 5.1: The computation of M on w: 0^i leads from q0 to qi, a loop at qi
reads 0^(j−i), and 0^(n−j) 1^n leads to an accepting state]
Now, what Figure 5.1 makes clear is that M accepts not only w but also the
string 0^i 0^(n−j) 1^n, which is simply w with j − i 0's removed. This string is
accepted by M because after reading 0^i, the DFA enters state qi. From there,
after reading 0^(n−j) 1^n, the DFA again reaches the same accepting state that
was reached when reading w. In other words, when reading this string, M simply
skips the loop shown in Figure 5.1.
But 0^i 0^(n−j) 1^n is not in L because it contains fewer 0's than 1's. So we have
found a string that is accepted by M but does not belong to L. This contradicts
the fact that M recognizes L and shows that a DFA that recognizes L cannot
exist. Therefore, L is not regular. □
The argument we developed in this example can be used to show that other
languages are not regular.
What this last example shows is that sometimes going around the loop more
than once is the way to obtain a contradiction and make the argument work.
Here’s another example.
Example 5.3 Let L be the language of strings of the form ww where w is any
string over the alphabet {0, 1}. Suppose that L is regular. Let M be a DFA
that recognizes L and let n be the number of states of M .
Consider the string w = 0n 0n . As M reads the 0’s in the first half of w, M
goes through a sequence of states q0 , q1 , q2 , . . . , qn . Because this sequence is of
length n + 1, there must be a repetition in the sequence.
Suppose that qi = qj with i < j. Then the computation of M on w is as
shown in Figure 5.2. This implies that the string 0^{2n−(j−i)} is also accepted. Un-
fortunately, we cannot conclude that this string is not in L. In fact, if j − i is
even, then this string is in L because the total number of 0's would still be even.
² Note that these i and j are not the same i and j that were used to define the language. It
may have been clearer to use different names in the definition of the language, for example, by
saying that L is the language of strings of the form 0^s 1^t where s ≤ t. But at some point, you
start running out of convenient names. So you can just view the i and j in the definition of L as
being local to that definition, just as in programming, variables can be local to a function, which
allows you to reuse their names for other purposes outside of that function.
[Figure 5.2: the computation of M on w = 0^n 0^n, with a loop labeled 0^{j−i} at qi between the segments 0^i and 0^{n−j} 0^n.]
[Figure 5.3: the computation of M on w = 0^n 1 0^n 1, with a loop labeled 0^{j−i} at qi between the segments 0^i and 0^{n−j} 1 0^n 1.]
To fix this argument, note that if the string is still in L after we delete some
0’s from its first half, it’s because the middle of the string will have shifted to the
right. So we just need to pick a string w whose middle cannot move. . .
Let w = 0^n 1 0^n 1. As before, we get a repetition within the first block of 0's,
as shown in Figure 5.3. This implies that the string 0^{n−(j−i)} 1 0^n 1 is also accepted
by M. If this string were in L, since its second half ends in 1, its first half would
also have to end in 1. But then, the two halves would not be equal. So this string
cannot be in L, which contradicts the fact that M recognizes L. Therefore, M
does not exist and L is not regular. □
This last example makes the important point that the string w, that is, the
string that we focus the argument on, sometimes has to be chosen carefully. In
fact, this choice can sometimes be a little tricky.
Exercises
5.1.1. Show that the language {0n 12n | n ≥ 0} is not regular.
[Figure 5.4: the computation of M on w = xyz, with x taking M from q0 to qi, a loop labeled y at qi, and z taking M from qi to an accepting state.]
The fact that the reading of y takes M from state qi back to state qi implies
that the strings xz and xy²z are also accepted by M. (In fact, it is also true that
xy^k z is accepted by M for every k ≥ 0.)
In each of the examples of the previous section, we then observed that one
of the strings xz or xy²z does not belong to L. (The details depend on x, y,
z and L.) This contradicts the fact that M recognizes L. Therefore, M cannot
exist and L is not regular.
To summarize, the argument starts with a language L that is assumed to be
regular and a string w ∈ L that is long enough. It then proceeds to show that w
can be broken into three pieces, w = xyz, such that xy^k z ∈ L, for every k ≥ 0.
There is more to it, though. In all three examples of the previous section,
we also used the fact that the string y occurs within the first n symbols of w.
This is because a repetition is guaranteed to occur while M is reading the first
n symbols of w. This is useful because it gives us some information about the
contents of y. For example, in the case of the language {0^n 1^n}, this told us that
y contained only 0's, which meant that xz contained fewer 0's than 1's.
In addition, the above also implicitly uses the fact that y is not empty. And
this is absolutely critical since, otherwise, xz would equal w and the fact that xz
is accepted by M could not possibly lead to a contradiction.
So a more complete summary of the above argument is as follows. The
argument starts with a language L that is assumed to be regular and a string w ∈ L
that is long enough. It then proceeds to show that w can be broken into three
pieces, w = x yz, such that
1. |xy| ≤ p,
2. y ≠ ε,
3. xy^k z ∈ L, for every k ≥ 0,
where p is the number of states of a DFA that recognizes L.
In other words, the above argument proves the following result, which is
called the Pumping Lemma:
Pumping Lemma If a language L is regular, then there is a number p (the
pumping length) such that every string w ∈ L with |w| ≥ p can be written as
w = xyz, where conditions (1) to (3) above hold.
In a moment, we are going to see how the Pumping Lemma can be used to
simplify proofs that languages are not regular. But first, here’s a clean write-up
of the proof of the Pumping Lemma.
Proof Let L be a regular language and let M be a DFA that recognizes L. Let p be the
number of states of M. Now, suppose that w is any string in L with |w| ≥ p.
As M reads the first p symbols of w, M goes through a sequence of states
q0 , q1 , q2 , . . . , q p . Because this sequence is of length p + 1, there must be a repe-
tition in the sequence.
Suppose that qi = qj with i < j. Then the computation of M on w is as shown
in Figure 5.4. In other words, if w = w1 · · · wm, let x = w1 · · · wi, y = wi+1 · · · wj
and z = wj+1 · · · wm. Clearly, w = xyz. In addition, |xy| = j ≤ p and |y| =
j − i > 0, which implies that y ≠ ε.
Finally, the fact that the reading of y takes M from state qi back to state qi
implies that the string xy^k z is accepted by M, and thus belongs to L, for every
k ≥ 0. □
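The proof is constructive, so the decomposition it describes can actually be computed. The sketch below assumes a DFA given as a nested dictionary delta[state][symbol] (a hypothetical encoding): it simulates the machine on w and returns x, y, z as soon as a state repeats.

```python
def pumping_decomposition(delta, start, w):
    """Split w into x, y, z as in the proof: y labels the first loop
    found while running the DFA on w."""
    q = start
    first_visit = {q: 0}               # state -> position at which first seen
    for k, symbol in enumerate(w):
        q = delta[q][symbol]
        if q in first_visit:           # repetition: q_i = q_j with i < j
            i, j = first_visit[q], k + 1
            return w[:i], w[i:j], w[j:]
        first_visit[q] = k + 1
    return None                        # |w| smaller than the number of states
```

For a two-state DFA that flips state on every 0, the string 000 is split into x = ε, y = 00 and z = 0; as the proof guarantees, |xy| is at most the number of states and y is nonempty.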
We now see how the Pumping Lemma can be used to prove that languages
are not regular.
Example 5.5 Let L be the language {0n 1n | n ≥ 0}. Suppose that L is regular.
Let p be the pumping length given by the Pumping Lemma. Consider the string
w = 0 p 1 p . Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping
Lemma, w can be written as x yz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Condition (1) implies that y contains only 0's. Condition (2) implies that y
contains at least one 0. Therefore, the string xz cannot belong to L because it
contains fewer 0's than 1's. This contradicts Condition (3). So it must be that our
initial assumption was wrong: L is not regular. □
It is interesting to compare this proof that {0n 1n } is not regular with the proof
we gave in the first example of the previous section. The new proof is shorter
but, perhaps more importantly, it doesn't need to establish that the string y can
be pumped. Those details are now hidden in the proof of the Pumping Lemma.³
Here’s another example.
Example 5.6 Let L be the language of strings of the form ww. Suppose that L
is regular. Let p be the pumping length given by the Pumping Lemma. Consider
the string w = 0 p 10 p 1. Clearly, w ∈ L and |w| ≥ p. Therefore, according to the
Pumping Lemma, w can be written as x yz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Condition (1) implies that y contains only 0's from the first half of w. Con-
dition (2) implies that y contains at least one 0. Therefore, xz is of the form
0^i 1 0^p 1 where i < p. This string is not in L, contradicting Condition (3). This
implies that L is not regular. □
Example 5.7 Let L be the language {1^{n²} | n ≥ 0}, the language of strings of
1's whose length is a perfect square. Suppose that L is regular. Let p be the
pumping length. Consider the string w = 1^{p²}. Clearly, w ∈ L and |w| ≥ p.
Therefore, according to the Pumping Lemma, w can be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
³ Yes, this is very similar to the idea of abstraction in software design.
3. xy^k z ∈ L, for every k ≥ 0.
Consider the string xy²z, which is equal to 1^{p² + |y|}. Since |y| ≥ 1, for this
string to belong to L, it must be that p² + |y| ≥ (p + 1)², the first perfect square
greater than p². But |y| ≤ p implies that p² + |y| ≤ p² + p = p(p + 1) < (p + 1)².
Therefore, xy²z does not belong to L, which contradicts the Pumping Lemma
and shows that L is not regular. □
Here’s an example that shows that the Pumping Lemma must be used care-
fully.
Example 5.8 Let L be the language {1n | n is even}. Suppose that L is regular.
Let p be the pumping length. Consider the string w = 12p . Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written as x yz
where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Let y = 1. Then xz = 12p−1 is of odd length and cannot belong to L. This
contradicts the Pumping Lemma and shows that L is not regular.
Of course, this doesn’t make sense because we know that L is regular. So
what did we do wrong in this “proof”? What we did wrong is that we chose the
value of y. We can’t do that. All that we know about y is what the Pumping
Lemma says: w = x yz, |x y| ≤ p, y 6= " and x y k z ∈ L, for every k ≥ 0. This
does not imply that y = 1.
To summarize, when using the Pumping Lemma to prove that a language is
not regular, we are free to choose the string w and the number k. But we cannot
choose the number p or the strings x, y and z. □
One final example.
Example 5.9 Let L be the language of strings that contain an equal number of
0’s and 1’s. We could show that this language is nonregular by using the same
argument we used for {0n 1n }. However, there is a simpler proof. The key is to
note that
L ∩ 0∗ 1∗ = {0n 1n }.
If L were regular, then {0^n 1^n} would also be regular because the intersection of
two regular languages is always regular. But {0^n 1^n} is not regular, which implies
that L cannot be regular. □
This example shows that closure properties can also be used to show that
languages are not regular. In this case, the fact that L is nonregular became a
direct consequence of the fact that {0n 1n } is nonregular.
Note that it is not correct to say that the fact that {0^n 1^n} is a subset of L
implies that L is nonregular, because the same argument would also imply that
{0, 1}∗ is nonregular. In other words, the fact that a language contains a non-
regular language does not imply that the larger language is nonregular. What
makes a language nonregular is not the fact that it contains certain strings:
it's the fact that it contains certain strings while excluding certain others.
Exercises
5.2.1. Use the Pumping Lemma to show that the language {0i 1 j | i ≤ j} is not
regular.
5.2.2. Use the Pumping Lemma to show that the language {1i #1 j #1i+ j } is not
regular. The alphabet is {1, #}.
5.2.3. Exercise 2.4.8 asked you to show that DFA’s can add when the numbers
are presented in a particular way. In contrast, this exercise asks you to
show that DFA’s cannot add when the numbers are presented in a more
usual way. That is, consider the language of strings of the form x # y #z
where x, y and z are strings of digits that, when viewed as numbers, satisfy
the equation x + y = z. For example, the string 123#47#170 is in this
language because 123 + 47 = 170. The alphabet is {0, 1, . . . , 9, #}. Use
the Pumping Lemma to show that this language is not regular.
5.2.4. Let L be the language of strings that start with 0. What is wrong with the
following “proof” that L is not regular?
Chapter 6

Context-Free Languages
6.1 Introduction
We know that the language of strings of the form 0n 1n is not regular. This
implies that no regular expression can describe this language. But here is a way
to describe this language:
S → 0S 1
S→"
This is called a context-free grammar (CFG). A CFG consists of variables and rules.
This grammar has one variable: S. That variable is also the start variable of the
grammar, as indicated by its placement on the left of the first rule.
Each rule specifies that the variable on the left can be replaced by the string
on the right. A grammar is used to derive strings. This is done by beginning with
the start variable and then repeatedly applying rules until all the variables are
gone. For example, this grammar can derive the string 0011 as follows:
S ⇒ 0S 1 ⇒ 00S 11 ⇒ 0011.
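This derivation process is mechanical enough to reproduce in code. A small sketch (the function name is my own) that applies the rule S → 0S1 n times and then S → ε, recording each sentential form:

```python
def derive_0n1n(n):
    """Derivation of 0^n 1^n with the grammar S -> 0S1 | epsilon."""
    steps = ['S']
    current = 'S'
    for _ in range(n):
        current = current.replace('S', '0S1', 1)   # apply S -> 0S1
        steps.append(current)
    current = current.replace('S', '', 1)          # apply S -> epsilon
    steps.append(current)
    return steps
```

With n = 2 this reproduces exactly the derivation shown above: S, 0S1, 00S11, 0011.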
F →_
F→L
Continuing in this way, we get the following grammar for the language of
valid identifiers:
I → FR
F →_| L
R → _R | LR | DR | ε
L → a | ··· | z | A | ··· | Z
D → 0 | ··· | 9
The most interesting rules in this grammar are probably the rules for the variable
R. These rules define the concept R in a recursive way. They illustrate how
repetition can be carried out in a CFG.
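As a sanity check, the grammar can be mirrored by a recognizer: the first symbol must be something F can derive, and every later symbol something the recursive R rules can derive. A sketch, where the sets of letters and digits are Python's ASCII ones (an assumption about the intended alphabet):

```python
import string

def is_identifier(s):
    """Recognizer mirroring I -> F R, with F -> _ | L and
    R -> _R | LR | DR | epsilon."""
    first = set('_') | set(string.ascii_letters)   # what F can derive
    rest = first | set(string.digits)              # R also allows digits
    return len(s) > 0 and s[0] in first and all(c in rest for c in s[1:])
```

This accepts x23 and _v2_, the two strings derived in the exercises below, and rejects strings that start with a digit.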
Study Questions
6.1.1. What does a CFG consist of?
6.1.4. What is the purpose of the vertical bar when it appears on the right-hand
side of a rule?
Exercises
6.1.5. Consider the grammar for valid identifiers we saw in this section. Show
how the strings x23 and _v2_ can be derived by this grammar. In each
case, give a derivation.
6.1.6. Give CFG’s for the languages of the first three exercises of Section 2.2.
Definition 6.2 Suppose that G = (V, Σ, R, S) is a CFG. Consider a string uAv where
A ∈ V and u, v ∈ (V ∪ Σ)∗ . If A → w is a rule, then we say that the string uwv can
be derived (in one step) from uAv and we write uAv ⇒ uwv.
Definition 6.3 Suppose that G = (V, Σ, R, S) is a CFG. If u, v ∈ (V ∪ Σ)∗ , then v
can be derived from u, or G derives v from u, if v = u or if there is a sequence
u1 , u2 , . . . , uk ∈ (V ∪ Σ)∗ , for some k ≥ 0, such that
u ⇒ u1 ⇒ u2 ⇒ · · · ⇒ uk ⇒ v.
In this case, we write u ⇒∗ v.
Definition 6.4 Suppose that G = (V, Σ, R, S) is a CFG. Then the language gener-
ated by G (or the language of G) is the following set:
L(G) = {w ∈ Σ∗ | S ⇒∗ w}.
S → 0S | 1S | ε
In this grammar, strings are derived from left to right, and then S is deleted. For
example,
S ⇒ 0S ⇒ 01S ⇒ 011S ⇒ 011.
In contrast, recall the grammar for the language of strings of the form 0n 1n :
S → 0S1 | ε
In this grammar, strings are derived from the outside towards the middle:
S ⇒ 0S1 ⇒ 00S11 ⇒ 0011, for example. □
S → 1T
T → 0T | 1T | ε
The rule for S must be used first and this rule ensures that the derived string
begins with a 1.
Here’s an alternative grammar for this language:
S → S0 | S1 | 1
The grammar in the previous example derived strings from left to right and then
deleted S. What this new grammar does is derive strings from right to left and
then replace S, which is now at the front, by 1. For example,
S ⇒ S0 ⇒ S10 ⇒ S010 ⇒ 1010.
□
Example 6.8 The language of strings that start and end with the same symbol:
S → 0T 0 | 1T 1 | 0 | 1
T → 0T | 1T | ε
Note that the following grammar would not work:
S → XTX | 0 | 1
X →0|1
T → 0T | 1T | ε
In this grammar, to derive a string of length greater than 1, we’d have to use the
first rule:
S ⇒ XTX
But then, there is no way to enforce that the same symbols will be derived from
the two occurrences of X . So this won’t guarantee that the string begins and
ends with the same symbol. □
Example 6.9 The language of strings that contain the substring 001:
S → T 001 T
T → 0T | 1T | ε
Note that here, in contrast to the previous example, we actually take advantage
of the fact that the two occurrences of T in the first rule are not tied together.
This allows us to derive different strings on either side of 001, which is what we
want. □
The languages in the above examples are all regular. In fact, we can show
that all regular languages are context-free.
Proof The proof is constructive: we give an algorithm that converts any regular
expression R into an equivalent CFG G.
Figure 6.1: Converting 1(0 ∪ 1)∗ into a CFG:
1 : S1 → 1
0 : S2 → 0
0 ∪ 1 : S3 → S1 | S2
(0 ∪ 1)∗ : S4 → S3 S4 | ε
1(0 ∪ 1)∗ : S5 → S1 S4
The algorithm has six cases, based on the form of R. If R = a, then G contains
a single rule: S → a. If R = ε, then G again contains a single rule: S → ε. If
R = ∅, then G simply contains no rules.
The last three cases are when R = R1 ∪ R2 , R = R1 R2 and R = (R1 )∗ . These
cases are recursive. First, convert R1 and R2 into grammars G1 and G2 , making
sure that the two grammars have no variables in common. (Rename variables if
needed.) Let S1 and S2 be the start variables of G1 and G2 , respectively. Then,
for the case R = R1 ∪ R2 , G contains all the variables and rules of G1 and G2
plus a new start variable S and the following rule: S → S1 | S2. The other cases
are similar except that the extra rule is replaced by S → S1 S2 and S → S1 S | ε,
respectively. □
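The six cases translate directly into a recursive function. The sketch below assumes regular expressions are given as nested tuples such as ('union', r1, r2) — a representation of my choosing, not the notes' — and returns the start variable together with a list of rules, with [] as an ε right-hand side:

```python
def regex_to_cfg(r, counter=None):
    """Convert a regex syntax tree into CFG rules, following the six
    cases of the conversion algorithm. Fresh variable names S1, S2, ...
    guarantee the recursive calls share no variables."""
    if counter is None:
        counter = [0]
    counter[0] += 1
    s = 'S%d' % counter[0]
    op = r[0]
    if op == 'sym':                       # R = a
        return s, [(s, [r[1]])]
    if op == 'eps':                       # R = epsilon
        return s, [(s, [])]
    if op == 'empty':                     # R = empty set: no rules
        return s, []
    s1, rules1 = regex_to_cfg(r[1], counter)
    if op == 'star':                      # R = (R1)*: S -> S1 S | epsilon
        return s, rules1 + [(s, [s1, s]), (s, [])]
    s2, rules2 = regex_to_cfg(r[2], counter)
    if op == 'union':                     # R = R1 | R2: S -> S1 | S2
        return s, rules1 + rules2 + [(s, [s1]), (s, [s2])]
    return s, rules1 + rules2 + [(s, [s1, s2])]   # concatenation: S -> S1 S2
```

Converting 1(0 ∪ 1)∗ with this sketch produces the same shape of grammar as Figure 6.1, though with different variable numbering.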
Example 6.11 Consider the regular expression 1Σ∗ = 1(0 ∪ 1)∗ . Let’s convert
this regular expression to a CFG.
Figure 6.1 shows the necessary steps. The resulting grammar is the combi-
nation of all these rules. The start variable is S5. □
Example 6.12 Over the alphabet of parentheses, {(, )}, consider the language
of strings that are properly nested. Given our background in computer science
and mathematics, we should all have a pretty clear intuitive understanding of
what these strings are.¹ But few of us have probably thought of a precise definition,
or of the need for one. Yet a precise definition is needed, to build compilers,
for example.
A precise definition can be obtained as follows. Suppose that w is a string of
properly nested parentheses. Then the first symbol of w must be a left paren-
thesis that “matches” a right parenthesis that occurs later in the string. Between
these two matching parentheses, all the parentheses should be properly nested
(among themselves). And to the right of the right parenthesis that matches the
first left parenthesis of w, all the parentheses should also be properly nested. In
other words, w should be of the form (u) v where u and v are strings of properly
nested parentheses. Note that either u or v may be empty.
The above is the core of a recursive definition. We also need a base case. That
is, we need to define the shortest possible strings of properly nested parentheses.
Let’s say that it’s the empty string. (An alternative would be the string ().)
Putting it all together, we get that a string of properly nested parentheses
is either " or a string of the form (u) v where u and v are strings of properly
nested parentheses.
A CFG that derives these strings is easy to obtain:
S → (S)S | ε
It should be clear that this grammar is correct because it simply paraphrases the
definition. (A formal proof would proceed by induction on the length of a string,
for one direction, and on the length of a derivation, for the other.) □
¹ To paraphrase one of the most famous phrases in the history of the U.S. Supreme Court, “We
know one when we see it.”
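The recursive definition turns directly into a recursive recognizer: find the right parenthesis that matches the first left parenthesis, then check u and v separately. A sketch:

```python
def nested(s):
    """Check properly nested parentheses by the recursive definition:
    a string is either empty or of the form (u)v with u, v nested."""
    if s == '':
        return True
    if s[0] != '(':
        return False
    depth = 0
    for i, c in enumerate(s):
        depth += 1 if c == '(' else -1
        if depth == 0:
            # s[1:i] is u and s[i+1:] is v
            return nested(s[1:i]) and nested(s[i+1:])
    return False   # the first '(' is never matched
```

The structure of the function is exactly the structure of the grammar S → (S)S | ε, which is one more way of seeing that the grammar paraphrases the definition.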
Exercises
6.3.1. Give CFG’s for the following languages. In all cases, the alphabet is
{0, 1}.
6.3.2. Give a CFG for the language {1i #1 j #1i+ j }. The alphabet is {1, #}.
6.3.3. By using the algorithm of this section, convert the following regular ex-
pressions into CFG’s.
a) 0∗ ∪ 1∗ .
b) 0∗ 10.
c) (11)∗ .
6.3.5. Suppose that we no longer consider that " is a string of properly nested
parentheses. In other words, we now consider that the string () is the
shortest possible string of properly nested parentheses. Give a revised
definition and CFG for the language of properly nested parentheses.
6.4 Ambiguity and Parse Trees
E → E+E | E * E | (E) | a
For example, in this grammar, the string (a+a)*a can be derived as follows:
E ⇒ E * E ⇒ (E) * E ⇒ (E + E) * E ⇒∗ (a + a) * a.
[Figure 6.2: the parse tree for (a+a)*a: the root E has children E * E; the left E derives ( E ), with the inner E deriving E + E and each of those E's deriving a; the right E derives a.]
But in the case of arithmetic expressions, the parse tree also indicates how the
expression should be evaluated: once the values of the operands are known, the
expression can be evaluated by moving up the tree from the leaves. Note that
interpreted in this way, the parse tree of Figure 6.2 does correspond to the correct
evaluation of the expression (a+a)*a. In addition, this parse tree is unique: in
this grammar, there is only one way in which the expression (a+a)*a can be
derived (and evaluated).
In contrast, consider the expression a+a*a. This expression has two differ-
ent parse trees, as shown in Figure 6.3. Each of these parse trees corresponds to
a different way of deriving, and evaluating, the expression. But only the parse
tree on the right corresponds to the correct way of evaluating this expression
because it follows the rule that multiplication has precedence over addition.
In general, a parse tree is viewed as assigning meaning to a string. When
[Figure 6.3: the two parse trees for a+a*a: one with E * E at the root and E + E below it on the left, the other with E + E at the root and E * E below it on the right.]
a string has more than one parse tree in a particular grammar, then the gram-
mar is said to be ambiguous. The above grammar for arithmetic expressions is
ambiguous.
In practical applications, we typically want unambiguous grammars. In the
case of arithmetic expressions, an unambiguous CFG can be designed by creating
levels of expressions: expressions, terms, factors, operands. Here’s one way of
doing this:
E → E+T | T
T → T *F | F
F → (E) | a
In this grammar, the string a+a*a has only one parse tree, the one shown in
Figure 6.4. And this parse tree does correspond to the correct evaluation of the
expression.
[Figure 6.4: the unique parse tree for a+a*a in the new grammar: E derives E + T, the left E derives a through T and F, and the T derives T * F with each operand deriving a.]
It is not hard to see that this last grammar is unambiguous and that it cor-
rectly enforces precedence rules as well as left associativity. Note, however, that
some context-free languages are inherently ambiguous, in the sense that they can
only be generated by ambiguous CFG’s.
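One payoff of the unambiguous grammar is that it can be implemented directly as a recursive-descent evaluator: each variable becomes a function, and the left-recursive rules become loops (which also gives left associativity). In this sketch, every a is replaced by a numeric value supplied by the caller — an addition of mine, not part of the grammar:

```python
def evaluate(expr, a=2):
    """Evaluate expr with the grammar E -> E+T | T, T -> T*F | F,
    F -> (E) | a, treating each 'a' as the value of the parameter a."""
    pos = [0]

    def parse_E():
        value = parse_T()
        while pos[0] < len(expr) and expr[pos[0]] == '+':
            pos[0] += 1
            value = value + parse_T()      # E -> E + T
        return value

    def parse_T():
        value = parse_F()
        while pos[0] < len(expr) and expr[pos[0]] == '*':
            pos[0] += 1
            value = value * parse_F()      # T -> T * F
        return value

    def parse_F():
        if expr[pos[0]] == '(':            # F -> (E)
            pos[0] += 1
            value = parse_E()
            pos[0] += 1                    # skip ')'
            return value
        pos[0] += 1                        # F -> a
        return a

    return parse_E()
```

With a = 2, the expression a+a*a evaluates to 6 rather than 8, so the parser does enforce the precedence of multiplication over addition, just as the unique parse tree of Figure 6.4 predicts.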
Study Questions
6.4.1. What is a parse tree?
S → 0S1 | ε
This grammar generates the 0's and 1's in pairs, starting from the outside and
progressing towards the middle of the string: S ⇒ 0S1 ⇒ 00S11 ⇒ 0011, for example.
Now consider the language {an bn cn | n ≥ 0}. One idea would be a rule such
as S → aS bS c but that would mix the different kinds of letters. Another idea
would be S → T1 T2 where T1 derives strings of the form a2n bn while T2 derives
strings of the form bn c2n . But that would not ensure equal numbers of a’s and
c’s.
It turns out that {an bn cn } is not context-free: there is no CFG that can gen-
erate it. And, as in the case of regular languages, this can be shown by using a
Pumping Lemma.
Pumping Lemma (for context-free languages) If a language L is context-free,
then there is a number p (the pumping length) such that every string w ∈ L
with |w| ≥ p can be written as w = uvxyz, where
1. |vxy| ≤ p,
2. vy ≠ ε,
3. uv^k x y^k z ∈ L, for every k ≥ 0.
Example 6.14 Let L be the language {a^n b^n c^n | n ≥ 0}. Suppose that L is
context-free and let p be the pumping length. Consider the string w = a^p b^p c^p.
Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping Lemma, w can
be written as uvxyz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k x y^k z ∈ L, for every k ≥ 0.
There are two cases to consider. First, suppose that either v or y contains
more than one type of symbol. Then uv²xy²z ∉ L because that string is not even
in a∗b∗c∗.
Second, suppose that v and y each contain only one type of symbol. Then
uv²xy²z contains additional occurrences of at least one type of symbol but not of
all three types. Therefore, uv²xy²z ∉ L.
In both cases, we have that uv²xy²z ∉ L. This is a contradiction and proves
that L is not context-free. □
Note that the condition |v x y| ≤ p was not used in this proof that {an bn cn }
is not context-free. Our next example will require that we use that condition.
Let L be the language of strings of the form ww. An exercise in the previous
chapter asked you to show that the language of strings of the form wwR
is context-free. The idea is to generate the symbols of wwR starting from the
outside and progressing towards the middle:
S → 0S0 | 1S1 | ε
But this idea does not work with strings of the form ww because the first symbol
of the string must match a symbol located in its middle.
Instead, we now use the Pumping Lemma to show that the language of
strings of the form ww is not context-free.
Example 6.15 Let L be the language of strings of the form ww. Suppose that L
is context-free. Let p be the pumping length. Consider the string w = 0 p 1 p 0 p 1 p .
Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping Lemma, w can
be written as uv x yz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k x y^k z ∈ L, for every k ≥ 0.
6.5.2. Use the Pumping Lemma to show that the language {ai b j ck | i ≤ j ≤ k}
is not context-free. The alphabet is {a, b, c}.
6.5.3. Use the Pumping Lemma to show that the language {1i #1 j #1i+ j | i ≤ j}
is not context-free. The alphabet is {1, #}.
6.5.4. Consider the language of strings of the form wwR that also contain equal
numbers of 0’s and 1’s. For example, the string 0110 is in this language
but 011110 is not. Use the Pumping Lemma to show that this language
is not context-free.
6.5.5. Use the Pumping Lemma to show that the language {1i #1 j #1i j } is not
context-free. The alphabet is {1, #}. (Recall that Exercises 5.2.2 and 6.3.2
asked you to show that the similar language for unary addition is context-
free but not regular. Therefore, when the problems are presented in this
way, unary addition is context-free but not regular, while unary multipli-
cation is not even context-free.)
[Figure 6.5: a nested repetition: the second A is derived from the first A, with the first A deriving vAy and the tree overall deriving uvxyz.]
The second tree in that figure represents the same derivation but without showing
the intermediate strings. (Note that, strictly speaking, these trees are not parse
trees but high-level outlines of parse trees.)
Figure 6.6 illustrates the second type of repetition, the non-nested type,
where the second A is derived from a variable in either u1 or u2 . The case that’s
illustrated is the one where the second A is derived from a variable in u2 . Once
again, the second tree represents the same derivation but without showing the
intermediate strings.
Recall that we are looking for a way to pump w. It’s not clear how a non-
nested repetition would allow us to pump w. But a nested repetition does. That’s
because the portion of the parse tree that derives vAy from the first A in Fig-
ure 6.5 can be omitted or repeated as illustrated in Figure 6.7. In this figure, the
cases k = 0 and k = 2 are shown but it should be clear that for every k ≥ 0, the
string uv k x y k z can be derived in a similar way.
Now, how can we ensure that a derivation of w contains a nested repetition?
[Figure 6.6: a non-nested repetition: the second A is derived from a variable in u2, not from the first A.]
[Figure 6.7: pumping the nested repetition of Figure 6.5: omitting the portion of the tree that derives vAy gives a tree for uxz (the case k = 0), while repeating it gives a tree for uv²xy²z (the case k = 2).]
Proof Let L be a CFL and let G be a CFG that generates L. Let b be the
maximum number of symbols on the right-hand side of any rule of G. Let p =
max(b^{|V|} + 1, b^{|V|+1}).² Now suppose that w is any string in L with |w| ≥ p.
Consider one of the smallest possible parse trees for w. Because |w| ≥ b^{|V|} + 1,
the height of that tree must be at least |V| + 1. Let π be one of the longest possible
paths in the tree. Then the length of π is at least |V| + 1, which implies that π
contains a repetition among its bottom |V| + 1 variables. Let A be one such
repeated variable. Then the tree is of the form shown in Figure 6.8.
Define u, v, x, y and z as suggested by Figure 6.8. Then w = uvxyz and, by
omitting or repeating the portion of the tree that derives vAy from the first A,
a parse tree can be constructed for every string of the form uv^k x y^k z with
k ≥ 0, showing that all of these strings are in L. This is illustrated in Figure 6.7.
² Note that if b ≥ 2, then p = b^{|V|+1}.
We also have that vy ≠ ε, since otherwise the first parse tree of Figure 6.7
would be a parse tree for w that's smaller than the original tree, and that's not
possible since we chose that tree to be one of the smallest possible parse trees
for w.
In Figure 6.8, the height of the subtree rooted at the higher A is at most
|V| + 1, because if that subtree contained a path longer than |V| + 1, as illustrated
in Figure 6.9, then that path, together with the path that goes from the top of
the entire tree to the higher A, would be longer than π, and that's not possible
since we chose π to be one of the longest possible paths in the entire tree. This
implies that |vxy| ≤ b^{|V|+1} ≤ p. □
S → TU
T → aTb | ε
U → cU | ε
Now, consider the language L2 = {ai bn cn | i, n ≥ 0}. This language too is
context-free:
S → UT
T → bTc | ε
U → aU | ε
Corollary 6.18 The class of context-free languages is not closed under complemen-
tation.
For example, the complement of {a^n b^n c^n | n ≥ 0} is
{w ∈ {a, b, c}∗ | w ∉ a∗b∗c∗} ∪ {a^i b^j c^k | i ≠ j or j ≠ k}.
It’s not hard to show that this language is context-free (see the exercises).
Here’s another example. We know that the language of strings of the form
ww is not context-free. The complement of this language is
6.7.2. Give a CFG for the language {a^i b^j c^k | i ≠ j or j ≠ k}. Show that this
implies that the complement of {a^n b^n c^n | n ≥ 0} is context-free.
6.7.3. Give a CFG for the language {xy | x, y ∈ {0, 1}∗, |x| = |y| but x ≠ y}.
Show that this implies that the complement of the language of strings of
the form ww is context-free.
6.7.4. Give a CFG for the language {x#y | x, y ∈ {0, 1}∗ and x ≠ y}. Use this
to construct another example of a CFL whose complement is not context-
free.
It’s not clear how this could be proved using CFG’s: how can a CFG be com-
bined with a regular expression to produce another CFG?
Instead, recall that for regular languages, closure under intersection was
proved by using DFA’s and the pair construction. To carry out a similar proof
for the intersection of a CFL and a regular language, we would need to identify
a class of automata that corresponds exactly to CFL’s.
The key is to use nondeterminism. Here's a first idea. Suppose that G is a
CFG with start variable S. We want an algorithm that, given a string w as input,
determines whether G can derive w:
set u = S
while (u contains at least one variable)
    let A be any variable in u
    nondeterministically choose a rule for A
    replace A by the right-hand side of that rule
if (u == w)
    accept
else
    reject
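The nondeterministic choice can be simulated deterministically by trying every rule, breadth-first. The sketch below adds a crude length bound of my own (not from the notes) so the search terminates; it is exponential in general:

```python
from collections import deque

def generates(rules, start, w):
    """Breadth-first search over derivations: every nondeterministic
    rule choice is explored. rules maps a variable (an uppercase
    letter) to its right-hand sides, with '' standing for epsilon."""
    limit = 2 * len(w) + 2             # crude bound on sentential forms
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        i = next((k for k, c in enumerate(u) if c.isupper()), None)
        if i is None:                  # no variables left: compare with w
            if u == w:
                return True
            continue
        for rhs in rules[u[i]]:        # try every rule for the leftmost variable
            v = u[:i] + rhs + u[i + 1:]
            terminals = sum(1 for c in v if not c.isupper())
            if v not in seen and terminals <= len(w) and len(v) <= limit:
                seen.add(v)
                queue.append(v)
    return False
```

With the grammar S → 0S1 | ε, this confirms that 0011 can be derived while 0101 cannot.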
E → E+T | T
T → T *F | F
F → (E) | a
In addition, as was the case with regular languages, the proof of this theorem
is constructive: there are algorithms for converting CFG’s into PDA’s, and vice
versa.
We can now sketch the proof that the class of CFL’s is closed under intersec-
tion with a regular language.
Note that the above proof sketch does not work for the intersection of two
CFL’s. The problem is that combining two PDA’s would produce an automaton
with two stacks and this is not a PDA. In fact, adding a second stack to PDA’s
produces a class of automata that can recognize languages that are not context-
free. Here’s an example.
Example 6.21 Consider the language {an bn cn | n ≥ 0}. A two-stack, single-
scan algorithm that recognizes this language can be designed as follows. The
idea is to push all the a’s onto one stack, all the b’s onto the other stack, and
then pop both stacks once for every c in the input. We then accept if and only if
both stacks become empty at the same time as the end of the input is reached.
This algorithm can be described in pseudocode as shown in Figure 6.13. □
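The two-stack algorithm can be sketched as follows. The explicit phase check that rejects out-of-order symbols is an added detail of mine; a real two-stack machine would track it in its control state:

```python
def accepts_anbncn(s):
    """Two-stack, single-scan recognizer for {a^n b^n c^n | n >= 0}.
    Assumes the input is over the alphabet {a, b, c}."""
    stack_a, stack_b = [], []
    phase = 0                          # 0: reading a's, 1: b's, 2: c's
    for x in s:
        p = 'abc'.index(x)
        if p < phase:
            return False               # symbols out of order
        phase = p
        if x == 'a':
            stack_a.append(x)          # push a's on the first stack
        elif x == 'b':
            stack_b.append(x)          # push b's on the second stack
        else:                          # pop both stacks for every c
            if not stack_a or not stack_b:
                return False
            stack_a.pop()
            stack_b.pop()
    return not stack_a and not stack_b # both stacks empty at end of input
```

Acceptance requires both stacks to empty exactly when the input ends, which is precisely the condition stated above.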
Example 6.22 Consider the language of strings of the form wwR. (Recall that wR is the reverse of w.)
A single-scan, nondeterministic stack algorithm that recognizes this language
can be designed as follows. The idea is to push all the symbols from the first
half of the input onto the stack. Then, as the symbols from the second half of
the input are read, they are matched, one by one, with the symbols contained
in the stack. For example, if the input is 0110, then 01 would be pushed on the
stack, with 0 at the bottom and 1 at the top. Then, the second half of the string,
10, would be read and match the symbols from the stack.
One difficulty with this idea is determining where the middle of the string
is. This is where nondeterminism is useful. The algorithm can simply guess
where the middle is. If the guess is correct, the end of the input will be reached
at exactly the same time as the stack becomes empty. Otherwise, the guess is
incorrect and the algorithm would reject. This algorithm works because if the
input is in the language, there is a correct guess that will cause the algorithm to
accept. But if the input is not in the language, then the algorithm will always
reject because even if it guesses the middle correctly, the algorithm will find a
mismatch.
while (true)
    read next char x
    push x on the stack
    nondeterministically choose to either continue with
        the loop or exit the loop
if (stack empty)
    accept
else
    reject // middle guessed too late
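Since the nondeterminism here amounts to guessing the middle, a deterministic simulation can simply try every possible middle and accept if some guess works. A sketch:

```python
def accepts_wwr(s):
    """Deterministic simulation of the nondeterministic stack algorithm
    for strings of the form w w^R: try every guess for the middle."""
    for middle in range(len(s) + 1):
        stack = list(s[:middle])      # push the guessed first half
        ok = True
        for x in s[middle:]:
            if not stack or stack.pop() != x:
                ok = False            # mismatch, or middle guessed too late
                break
        if ok and not stack:
            return True               # some guess leads to acceptance
    return False
```

This mirrors the argument above: the string is accepted exactly when at least one guess of the middle makes every symbol of the second half match the stack.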
Study Questions
We won’t prove this theorem in these notes but note that the theorem can
be proved constructively: there is an algorithm that given a CFG G, produces an
equivalent CFG G 0 in CNF.
CFG’s in CNF have the following useful property:
Study Questions
6.9.1. What forms of rules can appear in a grammar in CNF?
Exercises
6.9.2. A leftmost derivation is one where every step replaces the leftmost variable
of the current string of variables and terminals. Consider the following
modification of the algorithm described in this section
Show that this algorithm runs in time 2^O(n). Hint: The similar algorithm
described earlier in this section does not necessarily run in time 2^O(n).
6.9.3. Consider the language of strings of the form ww. Show that the comple-
ment of this language is not a DCFL.
⁴ At Clarkson, this is CS445/545 Compiler Construction.
Chapter 7
Turing Machines
7.1 Introduction
Recall that one of our goals in these notes is to show that certain computational
problems cannot be solved by any algorithm whatsoever. As explained in Chap-
ter 1 of these notes, this requires that we define precisely what we mean by an
algorithm. In other words, we need a model of computation. We’d like that model
to be simple so we can prove theorems about it. But we also want our model
to be relevant to real-life computation so that our theorems say something that
applies to real-life algorithms.
In Section 2.1, we described the standard model: the Turing machine. Fig-
ure 7.1 shows a Turing machine. The control unit of a Turing machine consists
of a transition table and a state. In other words, the control unit of a Turing
machine is essentially a DFA, which means that a Turing machine is essentially a
DFA augmented by memory.
The memory of a Turing machine is a string of symbols. That string is semi-
infinite: it has a beginning but no end. At any moment in time, the control unit
[Figure 7.1: a control unit connected to a memory, producing a yes/no answer]
has access to one symbol from the memory. We imagine that this is done with a
memory head.
Here’s an overview of how a Turing machine operates. Initially, the memory
contains the input string followed by an infinite number of a special symbol
called a blank symbol. The control unit is in its start state. Then, based on
the memory symbol being scanned and the internal state of the control unit, the
Turing machine overwrites the memory symbol, moves the memory head by one
position to the left or right, and changes its state. This is done according to the
transition function of the control unit. The computation halts when the Turing
machine enters one of two special halting states. One of these is an accepting
state while the other is a rejecting state.
It’s important to note that the amount of memory a Turing machine can use
is not limited by the length of the input string. In fact, there is no limit to how
much memory a Turing machine can use. A Turing machine is always free to
move beyond its input string and use as much memory as it wants.
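This mode of operation can be made concrete with a small Python simulator (our own sketch; in particular, the convention that a left move at the leftmost cell leaves the head in place is an assumption):

```python
def run_tm(delta, q0, q_acc, q_rej, input_str, blank='_'):
    """Run a Turing machine. delta maps (state, symbol) to
    (new_state, symbol_to_write, 'L' or 'R')."""
    tape = list(input_str) or [blank]   # memory: the input followed by blanks
    q, head = q0, 0                     # start state, head on the first cell
    while q not in (q_acc, q_rej):      # run until a halting state is entered
        q, write, move = delta[(q, tape[head])]
        tape[head] = write              # overwrite the scanned symbol
        head += 1 if move == 'R' else -1
        head = max(head, 0)             # the memory has a beginning but no end
        if head == len(tape):
            tape.append(blank)          # extend the memory with blanks on demand
    return q == q_acc
```

For example, a machine that accepts exactly the strings starting with 1 can be given as a three-entry transition table for a single non-halting state.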
In the next section, we will formally define what a Turing machine is. In
the meantime, here’s an informal example that illustrates what Turing machines
can do.
Consider the language {a^n b^n c^n | n ≥ 0}. A Turing machine that recognizes
this language can be designed as follows. Suppose that the input is
aaabbbccc
The basic idea is to make sure that every a has a matching b and a matching
c. The Turing machine can keep track of which symbols have been matched
by crossing them off. So the machine starts by crossing off the first a and then
moves its memory head to the right until it finds, and crosses off, a b and a c:
xaaxbbxcc
2. Scan the input from left to right to verify that it is of the form a∗ b∗ c∗ . In
other words, verify that there are no a’s after the first b, and that there
are no a’s or b’s after the first c. If that’s not the case, reject.
7. Repeat Steps 3 to 6 until all the a’s have been crossed off. When that
happens, scan the input to verify that all other symbols have been crossed
off. If so, accept. Otherwise, reject.
The above description is still missing several details. We can call it a sketch
or outline of a Turing machine. We will be able to give a more precise description
once we have formally defined what a Turing machine is in the next section.
Recall that the language {a^n b^n c^n | n ≥ 0} is not context-free. So the example
of this section shows that Turing machines can recognize languages that neither
DFA's nor PDA's can.
Exercises
7.1.1. In the style of this section, sketch the operation of a Turing machine for
the language {1^i # 1^j # 1^{i+j}}.
7.1.2. In the style of this section, sketch the operation of a Turing machine for
the language {1^i # 1^j # 1^{ij}}.
1. q ∈ Q is the state of M.
We are now ready to define what a Turing machine does at every step of its
computation.
L(M) = {w ∈ Σ* | w is accepted by M}
It is important to note that this is more than just one possible definition. It
turns out that every other reasonable notion of algorithm that has ever been
proposed has been shown to be equivalent to the Turing machine definition.
(We will see some evidence for this later in this chapter.) In other words, the
Turing machine appears to be the only definition possible. This phenomenon
is known as the Church-Turing Thesis. So the Turing machine definition of an
algorithm is more than just a definition: it can be viewed as a basic law or axiom
of computing. And the Church-Turing Thesis has an important consequence: by
eliminating the need for competing notions of algorithms, it brings simplicity
and clarity to the theory of computation.
7.3 Examples
In this section, we start exploring the power of Turing machines by considering
some simple examples.
Example 7.10 Over the alphabet {0, 1}, consider the language of strings of
length at least two whose first and last symbols are identical. A Turing machine
for this language can be designed based on the following basic idea: look at the
first symbol then move the memory head to the last symbol and make sure it is
the same.
Here’s a more detailed description:
2. Remember the first symbol. (By using the states of the machine. More
detail later.)
3. Move right.
4. If the current symbol is blank, reject. (Because the input has length 1.)
7. If the current symbol is identical to the first one, accept. Otherwise, reject.
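The head movements in this description can be mimicked directly in Python (a sketch; '_' stands for the blank symbol, and the step numbers in the comments refer to the list above):

```python
def first_equals_last(w):
    """Head-movement sketch: remember the first symbol, walk right to
    the blank, step back one cell, and compare."""
    tape = list(w) + ['_']     # the input followed by a blank
    if tape[0] == '_':
        return False           # empty input: length less than two
    first = tape[0]            # step 2: remember the first symbol (via states)
    i = 1                      # step 3: move right
    if tape[i] == '_':
        return False           # step 4: the input has length 1
    while tape[i] != '_':      # keep moving right until the blank is found
        i += 1
    i -= 1                     # step back onto the last symbol
    return tape[i] == first    # step 7: compare with the first symbol
```

The variable `first` plays the role of the machine's states remembering the first symbol.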
Note how the above description specifies how the Turing machine moves its
memory head. But it doesn’t specify the states or the transition function of the
machine. We call this a low-level description of the Turing machine.
Figure 7.2 gives a full, or complete, description. This description takes the
form of a state diagram similar to the state diagram of a DFA. All the details on
[Figure 7.2: state diagram with states q0 through q6, qA and qR, and transitions such as 0, 1 → R]
Figure 7.2: A Turing machine for the language of strings of length at least two
whose first and last symbols are identical
the states and the transition function of the machine are included in the state
diagram. This complete description of the machine is essentially the implemen-
tation of the preceding low-level description.
In general, in the state diagram of a Turing machine, a transition labeled
a → b, D from state q to state r means that δ(q, a) = (r, b, D). In other words,
that transition should be used if the current symbol is an a, that symbol should be
changed to a b and the head should move in the direction D. In case the symbol
is not to be changed, the shorter form a → D may be used as an abbreviation for
a → a, D. In that case, multiple transitions can also be combined into one, as
in a, b → D. When moving to the accepting or rejecting state, the symbol to be
written and the movement of the head are not important (because the machine
is about to halt) so we often omit them. The label of the transition is then just a
single symbol, as in a, or multiple symbols, as in a, b. □
This example showed how a Turing machine can remember individual sym-
bols from its input (or memory) as its head moves to other symbols. The tech-
nique used is essentially the same one we used with DFA’s, which isn’t surprising
since a Turing machine is essentially a DFA with memory.
The next example revisits the language {a^n b^n c^n}. We sketched a Turing
machine for this language at the end of Section 7.1. We now provide more
detail. This will illustrate how a Turing machine can take advantage of its ability
to change the contents of the memory.
2. If the first symbol is not an a, reject. (Because the input has more b’s or
c’s than a’s.)
3. Replace the a by an x.
4. Move right, skipping a’s and y’s, until a b is found. Replace that b with a
y. If no b was found, reject.
5. Move right, skipping b’s and z’s, until a c is found. Replace that c with a
z. If no c was found, reject.
6. Move left until an x is found. Move right. (The current symbol is the first
a that hasn’t been crossed off. Or a y.)
8. When that symbol is a y instead, scan the rest of the memory to verify that
all b’s and c’s have been crossed off. This can be done by moving right,
skipping y’s and z’s until a blank symbol is found. If other symbols are
found before the first blank, reject. Otherwise, accept.
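This low-level description can be simulated in Python on a mutable list that plays the role of the memory (our own sketch; the regular-expression check stands in for the initial a*b*c* form check):

```python
import re

def accepts_anbncn(w):
    """Simulation of the crossing-off machine: a's become x, b's become y,
    c's become z, one triple per iteration of the loop."""
    if not re.fullmatch(r'a*b*c*', w):
        return False                       # the form check (a's, then b's, then c's)
    tape = list(w)
    while True:
        i = 0
        while i < len(tape) and tape[i] == 'x':
            i += 1                         # skip crossed-off a's
        if i == len(tape) or tape[i] != 'a':
            # no a left: everything else must be crossed off too
            return all(ch in 'xyz' for ch in tape)
        tape[i] = 'x'                      # cross off one a
        i += 1
        while i < len(tape) and tape[i] in 'ay':
            i += 1                         # move right, skipping a's and y's
        if i == len(tape) or tape[i] != 'b':
            return False                   # no b to match this a
        tape[i] = 'y'                      # cross off the matching b
        i += 1
        while i < len(tape) and tape[i] in 'bz':
            i += 1                         # move right, skipping b's and z's
        if i == len(tape) or tape[i] != 'c':
            return False                   # no c to match this a
        tape[i] = 'z'                      # cross off the matching c
```

Each iteration of the outer loop corresponds to one pass of the machine's crossing-off loop.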
It is easy to see that this Turing machine will accept every string of the form
a^n b^n c^n. To see that it will only accept strings of that form, consider what the
memory contents could be after the first iteration of the loop. It has to be a
string of the form x a* y b* z Σ*. (We're omitting the blanks.) After two iterations,
the memory contains a string of the form x^2 a* y^2 b* z^2 Σ*. Suppose that the
[Figure 7.3: transition diagram with states q0 through q4 and qA, and transitions such as a → x, R and b → y, R]
loop runs for n iterations. At that point, the memory contents are of the form
x^n y^n b* z^n Σ*. The input will then be accepted only if the memory contents are
of the form x^n y^n z^n, which implies that the input was of the form a^n b^n c^n.
Figure 7.3 gives the transition diagram of a Turing machine that implements
the above low-level description. Note that all missing transitions go to the re-
jecting state. These transitions, as well as the rejecting state itself, are not shown
to reduce clutter in the state diagram. □
Proof The basic idea is simple. Suppose that M is a DFA. A Turing machine can
simulate M by reading the input from left to right while going through the same
states as M . Once the end of the input is reached, the Turing machine enters its
accepting state if the current state of the DFA is accepting; otherwise, it enters
its rejecting state.
□
The proof of this theorem gives us a general approach for designing Turing
machines for regular languages: design a DFA and then convert it into a Turing
machine. But note that it is often easier to design the Turing machine directly,
as we did in the first example of this section.
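The conversion in the proof can itself be written down. Here is a Python sketch (our own rendering; qA and qR name the TM's halting states, and the DFA's transition function is given as a dictionary):

```python
def dfa_to_tm(dfa_delta, dfa_accept, alphabet, blank='_'):
    """Sketch of the proof's construction: the TM copies the DFA's moves,
    reading left to right, and halts when it reaches the first blank."""
    states = {q for (q, _) in dfa_delta} | set(dfa_delta.values())
    tm_delta = {}
    for q in states:
        for a in alphabet:
            # same transition as the DFA; never change the tape
            tm_delta[(q, a)] = (dfa_delta[(q, a)], a, 'R')
        # at the end of the input, accept iff the DFA state is accepting
        tm_delta[(q, blank)] = ('qA' if q in dfa_accept else 'qR', blank, 'R')
    return tm_delta
```

The resulting table has one extra transition per state, for the blank symbol, and otherwise mirrors the DFA exactly.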
Exercises
7.3.1. Give a low-level and a full description of a Turing machine for the lan-
guage of strings of length at least two that end in 00. The alphabet is
{0, 1}.
7.3.2. Give a low-level and a full description of a TM for the language {w#w |
w ∈ {0, 1}∗ }. The alphabet is {0, 1, #}.
7.3.3. Give a low-level description of a TM for the language {ww | w ∈ {0, 1}∗ }.
The alphabet is {0, 1}.
δ : Q × Γ → Q × Γ × {L, R, S}
The definition of a move must also be extended by adding the following case:
i′ = i if D = S
δ(q1, a) = (q1,a, b, R)
Theorem 7.14 Every multitape Turing machine has an equivalent single-tape Tur-
ing machine.
The # symbol at the beginning of the tape will allow M1 to detect the beginning
of the memory.
The single-tape machine also needs to keep track of the location of all the
memory heads of the multitape machine. This can be done by introducing un-
derlined versions of each tape symbol of Mk . For example, if one of the memory
heads of Mk is scanning a 1, then, in the memory of M1 , the corresponding 1
will be underlined: 1.
Here is how M1 operates:
1. Initialize the tape so it corresponds to the initial configuration of Mk .
2. Scan the tape from left to right to record the k symbols currently being
scanned by Mk . (The scanning can stop as soon as the k underlined sym-
bols have been seen. M1 remembers those symbols by using its states.)
3. Scan the tape from right to left, updating the scanned symbols and head
locations according to the transition function of Mk .
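As a small illustration of the encoding, here is how a configuration of Mk might be laid out on M1's tape in Python (our own sketch; using '#' also as a separator between the k tape contents, and a combining underline as the stand-in for underlined symbols, are assumptions):

```python
def encode_configuration(tapes, heads, blank='_'):
    """Lay out k tapes on a single tape: contents separated by '#',
    with each scanned symbol marked by a combining underline."""
    parts = []
    for t, h in zip(tapes, heads):
        cells = list(t) or [blank]         # an empty tape is one blank cell
        cells[h] = cells[h] + '\u0332'     # "underline" the scanned symbol
        parts.append(''.join(cells))
    return '#' + '#'.join(parts) + '#'     # the leading '#' marks the beginning
```

Steps 2 and 3 of M1's operation then amount to scanning this string for the marked symbols and rewriting them.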
Exercises
7.4.1. Give a formal definition of multitape Turing machines.
7.4.2. The memory string of the basic Turing machine is semi-infinite because it
is infinite in only one direction. Say that a memory string is doubly infinite
if it is infinite in both directions. Show that Turing machines with doubly
infinite memory are equivalent to the basic Turing machine.
7.4.3. Show that the class of decidable languages is closed under complemen-
tation, union, intersection and concatenation.
7.4.4. Consider the language of strings of the form x#y#z where x, y and z are
strings of digits of the same length that when viewed as numbers, satisfy
the equation x + y = z. For example, the string 123#047#170 is in this
language because 123 + 47 = 170. The alphabet is {0, 1, . . . , 9, #}. Show
that this language is decidable. (Recall that according to Exercise 5.2.3,
this language is not regular. It’s also possible to show that it’s not context-
free.)
1. Move the memory head to location i. (This can be done by scanning the
memory from left to right while counting by using the states of the TM.)
2. Write the 32 bits of x to the 32 bits that start at the current memory loca-
tion.
Now, suppose that we add indirect addressing to this. Here’s how we can
simulate an instruction that sets to x the contents of the memory location whose
address is stored at memory location i. We’re assuming that memory addresses
have 32 bits.
2. Copy the 32 bits that start at the current memory location to an extra tape.
Call this value j.
3. Scan the memory from left to right. For each symbol, subtract 1 from j.
When j becomes 1, stop. (The head is now at memory location j.)
4. Write the 32 bits of x to the 32 bits that start at the current memory loca-
tion.
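In Python, the indirect store can be sketched over a tape of bits (a toy model: list indexing abstracts the head scans of Steps 2 and 3, and all names are our own):

```python
def write_indirect(tape, i, x, word=32):
    """Toy model of the indirect store: the address j is read from the
    word bits starting at location i, then x is written at location j."""
    bits = tape[i:i + word]                       # Step 2: copy the address bits
    j = int(''.join(map(str, bits)), 2)           # interpret them as a number j
    x_bits = [int(b) for b in format(x, f'0{word}b')]
    tape[j:j + word] = x_bits                     # Step 4: write x at location j
    return tape
```

On a real Turing machine, computing the position j requires the counting scan of Step 3; here the list index does that work for us.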
As can be seen from the above, while executing instructions, the Turing ma-
chine will need to use a small number of additional tapes to store temporary
values. The details of the subtraction in Step 3 are left as an exercise similar to
one from the previous section.
Other assembler instructions can be simulated in a similar way. This makes
it pretty clear that Turing machines can simulate assembler programs. And, as
explained earlier, by combining this with the fact that compilers can translate
C++ into assembler, we get that Turing machines can simulate C++ programs.
Up until now, all the descriptions we have given of Turing machines have
been either low-level descriptions, such as those in this section, or full descrip-
tions, such as the state diagrams of Figures 7.2 and 7.3. Now that we have
convincing evidence that Turing machines can simulate the constructs of high-
level programming languages, from now on, we will usually describe Turing
machines in pseudocode. We will say that these are high-level descriptions of
Turing machines. (From now on, unless otherwise indicated, you should assume
that exercises ask for high-level descriptions of Turing machines.)
Exercises
7.5.1. Describe how a Turing machine can simulate each of the following as-
sembler instructions. Give low-level descriptions. In each case, assume
that 32 bits are copied, added or tested.
7.5.2. Show that the class of decidable languages is closed under the star oper-
ation.
Chapter 8
Undecidability
8.1 Introduction
In this chapter, we will finally prove that there are computational problems that
cannot be solved by any algorithm. More precisely, using the concepts we have
studied in these notes, we will prove that there are languages that cannot be
decided by any Turing machine. We say that these languages are undecidable.
One of these languages corresponds to a problem that we mentioned in the
first chapter of these notes: the Halting Problem. In general, the input to this
problem is an algorithm and the output is the answer to the following question:
Does the algorithm halt on every possible input?
As we have learned in these notes, problems like these can be made more
precise by formulating them as languages. In this case, we will say that the Halt-
ing Problem is the set of Turing machines that halt on every input. But languages
are sets of strings, so this requires that every Turing machine be represented by
a string.
This can be done in a variety of ways. One way is to simply write out the
formal description of the Turing machine, that is, a description that follows the
formal definition of Section 7.2. This requires an alphabet that includes paren-
theses, commas, braces, letters and digits. One difficulty is that the input or
tape alphabets of the Turing machine may include symbols that are not in the
alphabet used for the encoding. A solution is to simply number these symbols
and rewrite the transition function so it uses those numbers instead of the orig-
inal symbols. The same can be done for the states in case their original names
include symbols that are not in the encoding alphabet.
If M is a Turing machine, let 〈M 〉 denote the encoding of M in whatever
encoding scheme we’ve decided to use. Then the Halting Problem can be defined
precisely — formally — as the language of strings of the form 〈M 〉 where M is
a TM that halts on every input. Later in this chapter, we will prove that this
language is undecidable.
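One hypothetical encoding scheme along these lines can be sketched in Python (the exact format is our own choice; all that matters is that the encoding is unambiguous and uses a fixed alphabet):

```python
def encode_tm(states, tape_alpha, delta, q0, q_acc, q_rej):
    """Encode a TM as a string <M> by numbering the states and tape
    symbols, as suggested in the text, then writing out the transitions."""
    s = {q: i for i, q in enumerate(states)}        # number the states
    a = {x: i for i, x in enumerate(tape_alpha)}    # number the symbols
    rules = ';'.join(
        f'{s[q]},{a[x]}->{s[r]},{a[y]},{d}'
        for (q, x), (r, y, d) in sorted(delta.items(),
                                        key=lambda kv: (s[kv[0][0]], a[kv[0][1]])))
    return f'({s[q0]},{s[q_acc]},{s[q_rej]}:{rules})'
```

Because states and symbols are replaced by numbers, the encoding alphabet is fixed (digits and a few punctuation symbols) no matter what alphabets M itself uses.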
It turns out that pretty much any computational problem concerning Turing
machines is undecidable. We will see several examples in this chapter, including
the following:
Before learning how to prove these undecidability results, we will first con-
sider the above problems in the context of finite automata and context-free gram-
mars. In those contexts, most of these problems are actually decidable.
Index