Discrete Mathematics in Computer Science 0132160528 9780132160520 Compress
COMPUTER SCIENCE
DONALD F. STANAT
Department of Computer Science
University of North Carolina at Chapel Hill
DAVID F. McALLISTER
Department of Computer Science
North Carolina State University
ISBN 0-13-216052-8
PREFACE x
Notation xiii
0 MATHEMATICAL MODELS 1
0.0 Introduction 1
0.1 Principles and Models 1
0.2 Mathematical Models 2
0.3 Purposes of Models 6
1 MATHEMATICAL REASONING 8
1.0 Introduction 8
1.1 Propositions 9
1.2 Predicates and Quantifiers 20
1.3 Quantifiers and Logical Operators 29
1.4 Logical Inference 39
1.5 Methods of Proof 47
1.6 Program Correctness 57
Axioms of Assignment 69
2 SETS 75
2.0 Introduction 75
2.1 The Primitives of Set Theory 75
‡2.2 The Paradoxes of Set Theory 79
2.3 Relations Between Sets 82
Introduction 120
Binary Relations and Digraphs 120
Trees 131
Search Trees 136
Tree Traversal Algorithms 140
Special Properties of Relations 145
Composition of Relations 149
Closure Operations on Relations 155
Order Relations 164
‡Some Additional Concepts for Posets 173
Equivalence Relations and Partitions 178
‡Sums and Products of Partitions 187
4 FUNCTIONS 193
Introduction 275
Finite and Infinite Sets 275
Countable and Uncountable Sets 279
Comparison of Cardinal Numbers 288
Cardinal Arithmetic 295
7 ALGEBRAS 300
Introduction 300
The Structure of Algebras 301
Some Varieties of Algebras 309
Semigroups 309
Monoids 310
Groups 311
Boolean Algebras 312
7.3 Homomorphisms 315
Congruence Relations 322
7.5 New Algebraic Systems from Old 327
Quotient Algebras 327
Product Algebras 329
BIBLIOGRAPHY 391
INDEX 393
PREFACE
the benefit of those who can easily understand them and should not cause concern
to those who cannot. A number sign (#) is used to denote the end of a collection
of examples.
Exercises are given at the end of each section in the approximate order in
which the topics are presented in the text; within topics, they are ordered according
to increasing difficulty. A problem marked ‡ treats material from an optional
subsection. The programming problems given at the end of some sections will usually
require additional specification before they can be worked by novice programmers.
For example, in a set theory problem, one might want to consider only sets with no
more than 100 elements.
This text has evolved over a period of several years. Preliminary versions
have been used extensively at the University of North Carolina at Chapel Hill
and North Carolina State University. It would be impossible to list all those
who have contributed to the final product. Jon Bentley, Don Johnson and
Neil Jones deserve particular mention; they provided comments and suggestions
on the entire manuscript. Others who made substantial contributions include
Peter Calingaert, James W. Hanson, Yale N. Patt, Stephen M. Pizer, James
Thatcher, Victor L. Wallace and Stephen F. Weiss. Anne Presnell and Dave Tolle
assisted in the preparation of problem solutions. Finally, we wish to thank the
many students who studied from the manuscript and contributed to its final form.
Our secretarial help has come from many quarters, but three individuals
deserve special mention. Nina Eaker worked on endless drafts and revisions in the
early stages of the manuscript, Gloria Edwards carried the work forward, and
Anne Edwards brought the manuscript to its final form. We thank them for their
help and support.
DONALD F. STANAT
DAVID F. McALLISTER
NOTATION
Logic
¬P not P
P ∨ Q P or Q
P ∧ Q P and Q
P ⇒ Q P implies Q
P ⇔ Q P if and only if Q
∀ Universal quantifier: for all...
∃ Existential quantifier: there exists...
∃! There exists a unique...
Numbers
⌈x⌉ the integer n such that x ≤ n < x + 1
⌊x⌋ the integer n such that x ≥ n > x − 1
N the set of natural numbers, or nonnegative integers: 0, 1, 2, ...
I the set of all integers: ..., −2, −1, 0, 1, 2, ...
I+ the set of positive integers: 1, 2, 3, ...
Q the set of rational numbers.
Q+ the set of positive rational numbers.
R the set of real numbers.
R+ the set of positive real numbers.
(a, b) the open interval in R from a to b:
(a, b) = {x | x ∈ R ∧ a < x < b}.
[a, b] the closed interval in R from a to b:
[a, b] = {x | x ∈ R ∧ a ≤ x ≤ b}.
(a, b] the half-open interval in R from a to b:
(a, b] = {x | a < x ≤ b}.
[a, b) the half-open interval in R from a to b:
[a, b) = {x | a ≤ x < b}.
(a, ∞) {x | x ∈ R ∧ x > a}.
[a, ∞) {x | x ∈ R ∧ x ≥ a}.
N_k the set of integers {0, 1, 2, ..., k − 1}.
Sets
a ∈ A a is an element of the set A.
a ∉ A a is not an element of the set A.
A ⊂ B the set A is contained in the set B.
A ⊄ B the set A is not contained in the set B.
∅ the empty, or void set.
A ∪ B the union of the sets A and B.
A ∩ B the intersection of the sets A and B.
A − B the relative complement of B with respect to A.
the absolute complement of A.
the power set of A.
⋃_{i∈S} A_i {x | ∃i[x ∈ A_i]}.
⋂_{i∈S} A_i {x | ∀i[x ∈ A_i]}.
the cartesian product of A with B.
the cartesian product of the sets A_i, 1 ≤ i ≤ n.
Functions
f(a) the value of the function f for the argument a.
f: A → B f is a function with domain A and codomain B.
f(A) the image of the set A under the function f.
f ∘ g, or fg the composite function of f with g.
A^B the set of functions from B to A.
1_A the identity function on the set A.
f⁻¹ the inverse of f.
f⁻¹(A) the inverse image of A under f.
f|_A the function f restricted to A.
χ_A the characteristic function of A.
p
P(n, r) the number of permutations of n objects taken r at a time.
the number of combinations of n objects taken r at a time.
Algebras
⟨S, ∘, k⟩ an algebra with carrier S, operation ∘, and constant k.
addition modulo k.
the product algebra of A with A’.
the quotient algebra of A’ with respect to the congruence
relation ~.
0
MATHEMATICAL MODELS
0.0 INTRODUCTION
The goal of this text is the development of mathematical concepts and techniques
which are fundamental to the field of computer science. We define computer science
broadly, as the discipline concerned with the representation and processing of
information. We consider computer science to lie somewhere between mathematics
and technology, close enough to each to be profoundly affected by developments
in either of these fields but dominated by neither. The mathematical topics we will
develop are classical ones which predate computer science, but which are generally
recognized as necessary and fundamental tools for the investigation of many
problems in the field. Our aim is to present these mathematical tools and illustrate
their use in characterizing the phenomena of computer science. In this chapter we
describe the ways in which mathematics can be used to represent objects of study.
Principles vary in their validity as well as their precision. They also vary in their
importance and the degree to which they affect the way we think and act.
The concept of “model” is even more vague than that of “principle.” Roughly
speaking, a model is an analogy for some object or phenomenon of interest. As
we will use the term, models are used to “explain” a process or to predict an event.
For example, a wind tunnel, used with a miniature replica of an aircraft, makes it
possible to predict some characteristics of the aircraft’s performance, since the
behavior of the full-sized craft is strongly related to that of the model. Similarly,
a world globe allows us to estimate distance between locations on the earth, and
an orrery provides a visual model of the movement of the planets about the sun.
Genetic models for the transfer of traits provide a basis for predicting the frequencies
with which inherited characteristics will appear in successive generations.
Models can also be misleading. A medieval model of human reproduction
proposed that babies develop from homunculi contained ab initio in a woman’s
body. Of course, female homunculi also contained other homunculi nested within.
Since it was felt that this nesting could not go on without limit, this model had the
uncomfortable implication that the race would become extinct, since reproduction
would cease after the innermost homunculi were born. A model of our universe
which was commonly accepted in the fifteenth century predicted that Columbus
would not return from his voyage to the west. This model of a flat earth of finite
extent was clearly an important one, partly because of its influence on exploration,
but it seems wrong to call it a valuable model. The value of a model might best be
defined as the degree to which it enables us to answer questions and make predictions
correctly.
Mathematics, because of its rigor and lack of ambiguity, has always provided
a good language for the expression of principles. Models based on mathematically
stated principles are called mathematical models. The purpose of this text is to
develop mathematics for expressing principles and constructing models in computer
science. While the mathematical topics we treat cannot be nicely categorized,
our emphasis will be on what is often referred to as discrete mathematics.
[Figure 0.2.1: a real-world process and a mathematical structure]
... correspondence between the two. Such a model is represented by Fig. 0.2.1. Some comments
will help to clarify the concept.
(a) The first component of a model is a phenomenon or process which we
wish to characterize mathematically. Examples include physical processes,
such as planetary motion, fluid flow, or the pattern of weather change,
as well as such things as economic processes, learning patterns, and so
on. Examples in computer science include the execution of a program,
the allocation of resources of a computation center, and the flow of
information in a computer network. Although the phenomena of interest
need not be taken from the “real world,” they usually are, and in our
discussion, the phrase “real world” will denote this component of a
mathematical model. The real world component is described quantita-
tively by such things as parameter values and times at which events occur.
(b) The second component of a model is an abstract mathematical structure.
The set of integers with the operations of addition and multiplication
provides one example of such a structure. In itself, this structure is abstract
and has no intrinsic relation to the real world. However, because of its
abstractness, the structure can be used to model many different phenom-
ena. Every mathematical structure has an associated language for
making assertions. In our familiar system of algebra, the assertions
5 + 6 < 10 and 7x + y = 18
can both be made, although one is incorrect. If a mathematical model
is successful, the language of its mathematical structure can be used to
make assertions about the object being modeled.
(c) The third component of a model is a specification of the way in which
the real world is represented by the mathematical structure, that is, a
Example
Every business must keep track of the cash received from sales each day. A
mathematical model is used for this purpose. The first component of the model is
the process of accumulating money from sales. The set of integers (denoting cents),
together with the operation of addition, provides a simple but appropriate mathe-
matical structure. Receiving cash from a sale corresponds to adding the amount of
the sale to the current receipts. The principal parameter of the model represents
cash received. This parameter takes on integer values; at the beginning of the day
its value is 0, and at any time during the day the value of the parameter is the current
amount received from sales. The occurrence of a cash sale is represented in the
structure by the operation of addition; selling an item worth k cents is represented by
adding k to the current value of the parameter. At the end of the business day,
the store owner can determine the total cash receipts by noting the value of the
parameter. #
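The three components of this model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `CashRegister` and the method name `sale` are hypothetical and do not come from the text. The integer attribute plays the role of the parameter, and addition is the only operation applied to it.

```python
class CashRegister:
    """Hypothetical sketch of the daily-receipts model: one integer
    parameter (in cents) that starts at 0 and changes only by addition."""

    def __init__(self):
        self.receipts = 0  # value of the parameter at the start of the day

    def sale(self, k):
        """Represent a cash sale of k cents by adding k to the parameter."""
        if not isinstance(k, int) or k < 0:
            raise ValueError("sale amount must be a nonnegative integer (cents)")
        self.receipts += k
        return self.receipts


register = CashRegister()
for amount in (250, 1299, 75):    # three sales, in cents
    register.sale(amount)
print(register.receipts)           # total receipts so far, in cents
```

As in the text, the sketch deliberately ignores which coins or bills were received; only the running total is represented.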
The above example illustrates all the crucial points of our description of a
mathematical model. It also illustrates that mathematical models ignore certain
aspects of the real world process. For example, the model described above does
not keep track of how many one dollar bills or how many pennies have been
received. This failure is not considered to be a defect of the model, since the store
owner is willing to assume that the actual form of currency received will not cause
him any particular inconvenience. If, however, all of his income for one day hap-
pened to be in pennies, he might find himself with a serious transportation problem
when it came time to take the day’s receipts to the bank. Other factors which are
ignored by the model may be more important. The model does not try to answer
such questions as how the storeowner can maximize his profits. It is legitimate to
use a mathematical model to deal with this kind of question, but the question is
beyond the scope of a model designed simply to keep track of the store’s daily
receipts. Thus, the suitability of a mathematical model depends strongly on the
problem at hand. Ideally, we want a model to represent everything that is impor-
tant about the process and ignore everything else. It is difficult to realize this ideal,
because we are often not sure what aspects of the real world are important. In
fact, the process of deciding which aspects are important can be one of the most
difficult and rewarding steps in specifying a mathematical model.
Without going into detail, we can give examples of more elaborate mathe-
matical models and describe how they are used.
Examples
(a) A set of simultaneous partial differential equations is useful as a mathematical
structure to describe planetary motion. Newton first proposed such a model
based on observations of the planets and his work on gravitational attraction.
(b) Differential equations are used to determine the flow of current in electrical
circuits by establishing a correspondence between the parts of an electrical
circuit and the terms of mathematical equations. The same equations can be
used to describe mechanical systems involving objects with mass, springs, and
damping devices called dashpots. Thus, the same mathematical structure can be
used in models of entirely different phenomena. These examples also show that
not all models need be mathematical: a mechanical system consisting of springs,
masses, and dashpots can be used as a mechanical model of an electrical circuit,
and vice versa. Analog computers exploit this fact and use electrical models to
solve problems which are expressed mathematically.
(c) Mathematical models are the basis for all computer simulations. Consider the
problem of simulating the operation of a computer center. We can view a com-
puter center as a system which accepts programs and program data as inputs
and produces outputs in a variety of forms, including program listings and
program output. At any time, the state of the system is described by parameter
values which specify what programs are being executed, which disk and tape
drives are busy, the length of the input queue, etc. Other parameters, such as
average turnaround time and the total number of programs processed, can be
used to measure the performance of the system. A mathematical model for
simulation of the system in discrete time steps will incorporate these parameters
into a set of mathematical equations which describe how the values of the
parameters and the system input at any time t can be used to determine the
values of the parameters at time t + 1. Different machine configurations and
different operation policies will be represented by different sets of equations. The
system is simulated by hypothesizing initial parameter values for time t = 0 and
then successively solving the equations for times t = 1, 2, 3, ..., n. If the simulation
is successful, then the system parameters at time t = n will accurately
forecast the behavior of the system. Such simulation models can be used as a
basis for choosing among various alternatives, e.g., the performance of a model
can be used to predict the result of a proposed change in either a hardware
configuration or in operations policy. #
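A discrete-time simulation of the kind described in example (c) can be sketched as follows. Everything specific here is an assumption made for illustration: the state holds only a queue length and a count of finished programs, and the update rule (a fixed number of programs completed per step) is a toy operations policy, not one taken from the text.

```python
def step(state, arrivals, capacity=2):
    """Compute the parameters at time t + 1 from those at time t.

    state    -- dict with 'queue' (programs waiting) and 'done' (programs processed)
    arrivals -- programs submitted during this step (the system input)
    capacity -- programs the center can finish per step (assumed policy)
    """
    waiting = state["queue"] + arrivals
    finished = min(waiting, capacity)
    return {"queue": waiting - finished, "done": state["done"] + finished}


def simulate(arrivals_per_step, capacity=2):
    """Hypothesize initial values at t = 0, then apply step for t = 1, ..., n."""
    state = {"queue": 0, "done": 0}
    history = [state]
    for arrivals in arrivals_per_step:
        state = step(state, arrivals, capacity)
        history.append(state)
    return history


history = simulate([3, 1, 0, 4], capacity=2)
print(history[-1])   # parameters at time t = n
```

Changing `capacity` or replacing `step` corresponds to modeling a different hardware configuration or operations policy with a different set of equations.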
The purposes of models fall into three categories. In the most straightforward
applications, models are used to present information in an easily assimilated form.
For example, graphs may be used to present genealogies and family trees. It is
much easier to decide if cousin Joseph is a descendant of great-grandfather John’s
sister Martha when we have a drawing of the family tree before us instead of a
written record of marriages and offspring. In the same way, a roadmap provides
a descriptive model of a highway network. Planning a trip would not be so easy if,
instead of a roadmap, we had a list of distances between adjacent cities.
A second use of models is to provide a convenient method for performing
certain computations. Familiar examples include optimization methods and
Fourier analysis. The choice of a model for the purpose of computation is often
directly affected by the set of available mathematical techniques. For example, a
system known to have nonlinear components may be modeled approximately with
a set of linear equations so that linear programming can be used to estimate a
solution.
Thirdly, models are used for investigation and prediction. Simulation, both
with physical models and with computers, is an excellent example. The Wright
brothers invented the wind tunnel so they could use physical models to compare
the lifts of different airfoils. Analogous experiments in water tanks use models of
ship hulls to determine which shapes produce the least turbulence and drag.
Models are frequently used to predict parameter values of events which have not
yet occurred, such as the time of tomorrow’s sunrise or the implications for the
national economy of a change in the tax laws. The equations used for calculating
the time of tomorrow’s sunrise are well established and thoroughly tested; con-
sequently, we have a great deal of faith in these predictions. The same is not true
for current models of the national economy, and our prediction in this case is not
likely to be so accurate. In many cases, the predictive ability of a model determines
its worth. The value of Newton’s model of planetary motion was established
beyond any doubt when deviations from the model’s predictions led to the discovery
of the planet Neptune. The location of Neptune was estimated by determining
what could be the source of observed deviations from the predicted orbit of
Uranus.
The first chapter of Maki and Thompson [1973] gives an excellent description
of how models are built and refined. Their discussion treats the roles of axioms
and theorems in models and provides a basis for some of the topics of our next
chapter. Chapter 2 of their book is a collection of case studies from a variety of
areas. The first chapter of Roberts [1976] is also a good description of model
types and the modeling process.
1
MATHEMATICAL REASONING
1.0 INTRODUCTION
1.1 PROPOSITIONS
Examples
The following are all propositions:
(a) The moon is made of green cheese.
(b) 4 is a prime number.
(c) 3+3=6.
(d) 2 is an even integer and 3 is not.
(e) It snowed on the island that is now called Manhattan on the day the King of
England signed the Magna Carta.
(f) My most recently written computer program always halts if allowed to run for
a sufficiently long time.
Of the above propositions, (a) and (b) are false, (c) and (d) are true, and (e)
may or may not be true; we have no way of ascertaining its truth value. Neverthe-
less, we assume the assertion is either true or false and therefore classify it as a
proposition. The truth of proposition (f) may be difficult to determine; establishing
the truth of such assertions is the subject of some profound mathematical results in
the theory of computation.
The following are not propositions:
(g) x + y > 4.
(h) x = 3.
(i) Are you leaving?
(j) Buy four of them.
The first example is an assertion but not a proposition because its truth value de-
pends on the values of x and y. Similarly, the truth value of the second assertion
depends on the value of x. Examples (i) and (j) are not assertions and are therefore
not propositions. #
†A system in which propositions must be either true or false is said to use a two-valued logic.
The characteristic that “a proposition which is not true is false, and vice-versa” is known as the
law of the excluded middle. Some mathematicians do not consider the law of the excluded middle
to be an accurate reflection of our reasoning. To understand some of the reasons for rejecting the
law of the excluded middle and for a description of logical systems with more than two truth
values, the reader is referred to Rescher [1969].
using words such as “and,” “or,” and “not.” For example, from the propositions
“John is six feet tall” and “There are four cows in the barn,” we can form
“John is six feet tall and there are four cows in the barn.”
“John is six feet tall or there are four cows in the barn.”
“John is not six feet tall.”
“P or Q”
“not P”
are assertions which can be formed from the propositional variables P and Q.
In expressions such as the above, the variables P and Q are called operands, and
the words “and,” “or,” and “not” are called logical operators, or logical connec-
tives. Logical connectives denote operations on propositions in the same way that
“plus” and “times” denote operations on numbers. This terminology is common
throughout mathematics; for example, in algebra the expression “4 + x” has 4
and x as operands and + as an operator.
An assertion which contains at least one propositional variable is called a
propositional form. When propositions are substituted for the variables of a propo-
sitional form, a proposition results. Thus, if P represents “John is six feet tall”
and Q represents “Two is a prime number,” the propositional form “P and Q”
represents the proposition “John is six feet tall and two is a prime number,” and
“not P” represents “It is false that John is six feet tall.” When no confusion will
result, we will often refer to propositional forms as propositions. The principal
distinction between propositions and propositional forms is that every proposition
has a truth value whereas a propositional form is an expression whose truth value
may not be determined until propositions are substituted for its propositional
variables.
When a logical operator is used to construct a new proposition from old ones,
the truth value of the new proposition depends on both the logical operator and
the truth values of the original propositions. We will now discuss how the logical
operators “and,” “or,” and “not” affect the truth value of propositions. We will
see that the meaning of the logical operators does not always coincide precisely
with English usage.
The logical operator “not,” or negation, is denoted by the symbol ¬. Let P
denote a proposition; then “P is not true” is a proposition which we represent by
“¬P” and refer to as “not P,” or the negation of P. It follows from the law of the
excluded middle that ¬P is true if P is false, and vice versa. The relationship
between the truth value of ¬P and that of P is defined by a truth table for the logical
operator ¬. The truth table of a logical operator specifies how the truth value
of a proposition using that operator is determined by the truth values of the operands.
A truth table lists all possible combinations of truth values of the operands
in the leftmost columns and the truth value of the resulting proposition in the
rightmost column. The truth table for ¬ is the following:
P | ¬P
F | T
T | F
In order to make truth tables easier to read, we will generally use the symbol
1 to denote true and 0 to denote false. Using this convention, the truth table for ¬
is given as
P | ¬P
0 | 1
1 | 0
While negation changes one proposition into another, other logical operators
combine two propositions to form a third. An example is the logical operator
“and,” which we will denote by the symbol ∧. If P and Q are propositions, then
“P and Q” is a proposition which we represent by “P ∧ Q” and refer to as the
conjunction of P and Q. The following truth table defines the logical operator ∧.
P Q | P ∧ Q
0 0 | 0
0 1 | 0
1 0 | 0
1 1 | 1
The truth table defines P ∧ Q to be true if and only if both P and Q are true.
Like “and,” the logical operator “or,” denoted by the symbol ∨, combines
two propositions to form a third. If P and Q are propositions, then the proposition
“P or Q” is called the disjunction of P and Q and is denoted by “P ∨ Q.”
The following truth table defines the logical operator ∨.
P Q | P ∨ Q
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1
It follows from the truth table that P ∨ Q is true if at least one of P or Q is true.
This operator is known as “logical or” or “inclusive or.” One can also define an
“exclusive or,” denoted here by ⊕, which is true when exactly one of its operands is true:
P Q | P ⊕ Q
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0
The English language uses the word “or” to denote both the “inclusive or”
and the “exclusive or.” For example, an “inclusive or” is intended in the sentence
“It will rain or snow today”
since the speaker would presumably not be branded a liar if it both rained and
snowed. On the other hand,
“You have to wash the dishes or you must clean the garage”
is not likely to be considered a true statement if, in fact, you are required to wash
the dishes and to clean the garage as well. In mathematics, we use different symbols
for the “inclusive or” and the “exclusive or” to preclude any ambiguity.
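The difference between the two readings of “or” is easy to see when both are written as functions. The sketch below is illustrative; the function names `inclusive_or` and `exclusive_or` are assumptions chosen here, and Python's own `or` and `!=` on booleans stand in for the two operators.

```python
def inclusive_or(p, q):
    """True if at least one operand is true."""
    return p or q


def exclusive_or(p, q):
    """True if exactly one operand is true."""
    return p != q


# Print P, Q, inclusive or, exclusive or for every combination of truth values.
for p in (False, True):
    for q in (False, True):
        print(int(p), int(q), int(inclusive_or(p, q)), int(exclusive_or(p, q)))
```

The two functions differ only on the line where both operands are true, which is exactly the case that separates the “rain or snow” sentence from the “dishes or garage” sentence.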
The logical operator “implies” is denoted by the symbol ⇒; the proposition
“P implies Q” is represented by “P ⇒ Q” and is called an implication. The operand
P is called the premise, hypothesis, or antecedent, and Q is called the conclusion
or consequence. The truth table for the operator ⇒ is the following:
P Q | P ⇒ Q
0 0 | 1
0 1 | 1
1 0 | 0
1 1 | 1
The proposition P ⇒ Q is false only when P is true and Q is false. Implications may
be stated in a number of ways; the assertion P ⇒ Q may be expressed as
“If P, then Q”
“P only if Q”
“P is a sufficient condition for Q”
“Q is a necessary condition for P”
“Q if P”
“Q follows from P”
“Q provided P”
“Q is a logical consequence of P”
“Q whenever P.”
The converse of P ⇒ Q is the proposition Q ⇒ P, and the contrapositive of P ⇒ Q
is the proposition ¬Q ⇒ ¬P. If P ⇒ Q is true, then P is said to be a stronger
Example
If P represents “oranges are purple” and Q represents “the earth is not flat,”
then P ⇒ Q represents “If oranges are purple, then the earth is not flat.” Although
no causal or inherent relationship holds between the color of oranges and the shape
of the earth, the implication P ⇒ Q is true since the premise is false and the conclusion
is true. #
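The truth-functional reading of implication can be checked mechanically. The following Python sketch is illustrative only; the helper name `implies` and the variable names are assumptions introduced here, not notation from the text.

```python
def implies(p, q):
    """P implies Q: false only when P is true and Q is false."""
    return (not p) or q


oranges_are_purple = False    # premise P (false)
earth_is_not_flat = True      # conclusion Q (true)
print(implies(oranges_are_purple, earth_is_not_flat))

# Columns: P, Q, P implies Q, contrapositive, converse.
# The contrapositive agrees with the implication on every line;
# the converse does not.
for p in (False, True):
    for q in (False, True):
        print(p, q, implies(p, q), implies(not q, not p), implies(q, p))
```

The loop makes the point that an implication and its converse are different propositions, while an implication and its contrapositive always have the same truth value.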
If P and Q have the same truth values, then they are said to be logically equivalent
propositions. A logical operator called “equivalence” and denoted by ⇔ produces
a true proposition if the operand propositions are logically equivalent. The
truth table which defines the operator “equivalence” is the following:
P Q | P ⇔ Q
0 0 | 1
0 1 | 0
1 0 | 0
1 1 | 1
Comparison of the truth tables for implication and equivalence shows that
if P ⇔ Q is true, then P ⇒ Q and Q ⇒ P are both true. Conversely, if both P ⇒ Q
and Q ⇒ P are true, then P ⇔ Q is true. For these reasons, the terminologies for
equivalence and implication are closely related. The proposition P ⇔ Q is read
“P is equivalent to Q,” “P is a necessary and sufficient condition for Q,” or “P
if and only if Q.” The abbreviation “iff” is often used to represent the phrase “if
and only if.”
Other logical operators can be defined and are of interest for a variety of
reasons; some of them will be described in the exercises of this section.
Truth tables for individual operators can be used to construct truth tables for
arbitrarily complex propositional forms. The truth table for a propositional form
specifies its truth value for every possible combination of truth values of its propo-
sitional variables. Each propositional variable can assume either of two values,
true or false. Therefore, if k variables occur in a proposition, the associated truth
table must describe 2^k cases. Each case occurs as a separate line in the truth table.
Examples
(a) Construct a truth table for the proposition (Q ∧ ¬P) ⇒ P.
P Q | ¬P | Q ∧ ¬P | (Q ∧ ¬P) ⇒ P
0 0 | 1 | 0 | 1
0 1 | 1 | 1 | 0
1 0 | 0 | 0 | 1
1 1 | 0 | 0 | 1
(b) [Truth table for a form in P, Q, and R with columns P ∧ Q, ¬R, and (P ∧ Q) ∨ ¬R; the entries are not recoverable from the source.]
In the above truth tables, we have used two conventions which aid readability:
(i) All propositional variables occur in the leftmost columns.
(ii) Truth values are assigned to the propositional variables by “counting in
binary” from 0 to 2^k − 1, where k is the number of propositional variables.
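Both conventions fall out naturally when a truth table is generated by program. The sketch below is a minimal illustration, assuming a propositional form is supplied as a Python function of k arguments; the names `truth_table` and `example` are hypothetical. It relies on `itertools.product`, which yields tuples in the same order as counting in binary.

```python
from itertools import product


def truth_table(form, k):
    """Return the rows of the truth table of `form`, a function of k values.

    Rows are produced by counting in binary from 0 to 2**k - 1, so the
    variables occupy the leftmost columns and the result the rightmost.
    """
    rows = []
    for values in product((0, 1), repeat=k):
        rows.append(values + (int(bool(form(*values))),))
    return rows


# (Q and not P) implies P, with "A implies B" written as (not A) or B.
def example(p, q):
    return (not (q and not p)) or p


for row in truth_table(example, 2):
    print(*row)
```

A form with k variables always produces 2**k rows, one per case.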
A tautology is a propositional form whose truth value is true for all possible
values of its propositional variables, e.g., P ∨ ¬P. A contradiction or absurdity
is a propositional form which is always false, such as P ∧ ¬P. A propositional
form which is neither a tautology nor a contradiction is called a contingency.
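These three definitions translate directly into an exhaustive check over all 2^k cases. The following is a sketch under the assumption that a propositional form is given as a Python function of k boolean arguments; the name `classify` is hypothetical.

```python
from itertools import product


def classify(form, k):
    """Classify a propositional form of k variables by checking all 2**k cases."""
    results = {bool(form(*values))
               for values in product((False, True), repeat=k)}
    if results == {True}:
        return "tautology"
    if results == {False}:
        return "contradiction"
    return "contingency"


print(classify(lambda p: p or not p, 1))     # P or not P
print(classify(lambda p: p and not p, 1))    # P and not P
print(classify(lambda p, q: p and q, 2))     # P and Q
```

Exhaustive checking is practical only for small k, since the number of cases doubles with each added variable.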
Properties of a propositional form can sometimes be determined by construct-
ing an “abbreviated” truth table. For example, if we wish to show that a proposi-
tional form is a contingency, it suffices to exhibit two lines of the truth table, one
of which makes the proposition true and another that makes it false. To determine
if a propositional form is a tautology, it is only necessary to check those lines of the
truth table for which the proposition could be false.
Example
Consider the problem of determining whether (P ∧ Q) ⇒ P is a tautology. We
will use an abbreviated truth table. If an implication A ⇒ B is false, then A must be
true and B must be false. The truth table for (P ∧ Q) ⇒ P has only one line where
the value of the premise P ∧ Q is true. Since this is the only instance where
(P ∧ Q) ⇒ P could be false, it suffices to consider this line.
P Q | P ∧ Q | (P ∧ Q) ⇒ P
1 1 | 1 | 1
Since the value of the propositional form for this line is true, it follows that the
proposition is a tautology. #
1. P ⇔ (P ∨ P) idempotence of ∨
2. P ⇔ (P ∧ P) idempotence of ∧
3. (P ∨ Q) ⇔ (Q ∨ P) commutativity of ∨
4. (P ∧ Q) ⇔ (Q ∧ P) commutativity of ∧
5. [(P ∨ Q) ∨ R] ⇔ [P ∨ (Q ∨ R)] associativity of ∨
6. [(P ∧ Q) ∧ R] ⇔ [P ∧ (Q ∧ R)] associativity of ∧
7. ¬(P ∨ Q) ⇔ (¬P ∧ ¬Q) DeMorgan’s Laws
8. ¬(P ∧ Q) ⇔ (¬P ∨ ¬Q) DeMorgan’s Laws
9. [P ∧ (Q ∨ R)] ⇔ [(P ∧ Q) ∨ (P ∧ R)] distributivity of ∧ over ∨
10. [P ∨ (Q ∧ R)] ⇔ [(P ∨ Q) ∧ (P ∨ R)] distributivity of ∨ over ∧
11. (P ∨ 1) ⇔ 1
12. (P ∧ 1) ⇔ P
13. (P ∨ 0) ⇔ P
14. (P ∧ 0) ⇔ 0
15. (P ∨ ¬P) ⇔ 1
16. (P ∧ ¬P) ⇔ 0
17. P ⇔ ¬(¬P) double negation
18. (P ⇒ Q) ⇔ (¬P ∨ Q) implication
19. (P ⇔ Q) ⇔ [(P ⇒ Q) ∧ (Q ⇒ P)] equivalence
20. [(P ∧ Q) ⇒ R] ⇔ [P ⇒ (Q ⇒ R)] exportation
21. [(P ⇒ Q) ∧ (P ⇒ ¬Q)] ⇔ ¬P absurdity
22. (P ⇒ Q) ⇔ (¬Q ⇒ ¬P) contrapositive
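Any identity in this list can be confirmed by comparing the two sides on every line of the truth table. The sketch below checks a few of them; the helper names `implies` and `equivalent` are assumptions introduced for illustration, and the numbers in the labels refer to the numbered identities above.

```python
from itertools import product


def implies(a, b):
    """Truth-functional implication: false only when a is true and b is false."""
    return (not a) or b


def equivalent(left, right, k):
    """True if two forms of k variables agree on every line of the truth table."""
    return all(bool(left(*v)) == bool(right(*v))
               for v in product((False, True), repeat=k))


checks = {
    "DeMorgan (7)": equivalent(lambda p, q: not (p or q),
                               lambda p, q: (not p) and (not q), 2),
    "distributivity (9)": equivalent(lambda p, q, r: p and (q or r),
                                     lambda p, q, r: (p and q) or (p and r), 3),
    "implication (18)": equivalent(lambda p, q: implies(p, q),
                                   lambda p, q: (not p) or q, 2),
    "exportation (20)": equivalent(lambda p, q, r: implies(p and q, r),
                                   lambda p, q, r: implies(p, implies(q, r)), 3),
}
for name, ok in checks.items():
    print(name, ok)
```

The same `equivalent` helper returns False for a pair that is not an identity, such as an implication and its converse.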
Example
Simplify the following propositional form:
[(A ⇒ B) ∨ (A ⇒ D)] ⇒ (B ∨ D).
The numbers at the right indicate which identities are applied at each step.
1. P ⇒ (P ∨ Q) addition
2. (P ∧ Q) ⇒ P simplification
3. [P ∧ (P ⇒ Q)] ⇒ Q modus ponens
4. [(P ⇒ Q) ∧ ¬Q] ⇒ ¬P modus tollens
5. [¬P ∧ (P ∨ Q)] ⇒ Q disjunctive syllogism
6. [(P ⇒ Q) ∧ (Q ⇒ R)] ⇒ (P ⇒ R) hypothetical syllogism
7. (P ⇒ Q) ⇒ [(Q ⇒ R) ⇒ (P ⇒ R)]
8. [(P ⇒ Q) ∧ (R ⇒ S)] ⇒ [(P ∧ R) ⇒ (Q ∧ S)]
9. [(P ⇔ Q) ∧ (Q ⇔ R)] ⇒ (P ⇔ R)
Example
A man who was captured by savages was promised his freedom if he could deter-
mine with a single “yes or no” question the color of the tribe’s idol. He knew the
idol was either white or black. Unfortunately, the tribe contained two kinds of
individuals: liars, who invariably gave the wrong answer to any question they were
asked, and truth-tellers, who invariably gave the right answer. Fortunately, the
victim was well-educated. He knew he must ask a question which would be answered
according to the following table:
               Color of Idol
               White   Black
Liars          Yes     No
Truth-tellers  Yes     No
However, since a liar always gave the wrong answer, he realized he must ask a
question whose correct answers could be tabulated as follows:
               Color of Idol
               White   Black
Liars          No      Yes
Truth-tellers  Yes     No
Whereupon he asked his nearest captor “Is it true that either you tell the truth and
the idol is white or that you lie and the idol is black?”† This question enabled him to
determine the color correctly, since an answer of yes meant the idol was white and no
meant it was black. Unfortunately, the savages thought it was just a lucky guess and
reneged on their promise. That’s why you never heard this story before. #
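The captive's question can be checked mechanically. The sketch below models a speaker as either truthful or lying and confirms that the spoken answer is "yes" exactly when the idol is white. The function names `spoken_answer` and `question` are illustrative assumptions, not part of the text.

```python
def spoken_answer(truthful, correct_answer):
    """A truth-teller reports the correct answer; a liar reports its opposite."""
    return correct_answer if truthful else not correct_answer

def question(truthful, idol_is_white):
    """Correct answer to: 'either you tell the truth and the idol is white,
    or you lie and the idol is black'."""
    return (truthful and idol_is_white) or (not truthful and not idol_is_white)

for truthful in (True, False):
    for idol_is_white in (True, False):
        said_yes = spoken_answer(truthful, question(truthful, idol_is_white))
        kind = "truth-teller" if truthful else "liar"
        color = "white" if idol_is_white else "black"
        print(kind, color, "yes" if said_yes else "no")
```

The question is the equivalence "you tell the truth ⇔ the idol is white"; a liar negates it, which cancels the effect of his lying.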
1. Using truth tables, show that if P ⇔ Q is true, then P ⇒ Q and Q ⇒ P are both
true. Conversely, show that if P ⇒ Q and Q ⇒ P are both true, then P ⇔ Q is true.
2. Show that P ⇒ Q has the same truth value as ¬P ∨ Q for all truth values of P
and Q, i.e., show that (P ⇒ Q) ⇔ (¬P ∨ Q) is a tautology.
†Simpler questions of equivalent power can be formulated, e.g., “Would the other kind of
person say yes if I asked him if the idol is black?”
For each of the following expressions, use identities to find equivalent expressions
which use only ∧ and ¬ and are as simple as possible.
(a) P ∨ Q ∨ ¬R
(b) P ∨ [(¬Q ∧ R) ⇒ P]
(c) P ⇒ (Q ⇒ P)
For each of the following expressions, use identities to find equivalent expressions
which use only ∨ and ¬ and are as simple as possible.
(d) (P ∧ Q) ∧ ¬P
(e) [P ⇒ (Q ∨ ¬R)] ∧ ¬P ∧ Q
(f) ¬P ∧ ¬Q ∧ (¬R ⇒ P)
Establish the following tautologies by simplifying the left side to the form of the
right side:
(a) [(P ∧ Q) ⇒ P] ⇔ 1
(b) ¬(¬(P ∨ Q) ⇒ ¬P) ⇔ 0
(c) [(Q ⇒ P) ∧ (¬P ⇒ Q) ∧ (Q ⇒ Q)] ⇔ P
(d) [(P ⇒ ¬P) ∧ (¬P ⇒ P)] ⇔ 0
Relate the following assertion to the logical operator ⇒: “If you start with a false
assumption, you can prove anything you like.” HINT: Consider the truth table
of ⇒.
An operation with two operands is said to be commutative if the order of the operands
does not affect the result. Thus, addition is commutative since x + y = y + x for
all values of x and y, but subtraction is not commutative since 4 − 2 ≠ 2 − 4. A
logical operator with two operands is commutative if reversing the order of the
operands produces a logically equivalent proposition.
(a) Determine which of the following logical operators are commutative: ∧,
∨, ⇒, ⇔, ⊕.
(b) Prove your assertions by using truth tables.
10. Let “□” denote a logical operator with two operands; the expression x □ y denotes
the result of applying □ to the operands x and y. The operator □ is said to be
associative if x □ (y □ z) and (x □ y) □ z are logically equivalent for all operands
x, y, and z.
(a) Determine which of the logical operations ∧, ∨, ⇒, ⇔, and ⊕ are associative.
(b) Prove your assertions using truth tables.
Q|P\Q
Q|PiQ
For each of the following, find equivalent expressions which use only the nor
operator.
(i) ¬P
(ii) P ∨ Q
(iii) P ∧ Q
Programming Problem
The language of propositions is not sufficiently powerful to make all the assertions
needed in mathematics. We also need to make assertions such as “x = 3,” “x > y,”
and “x + y = z.” Such assertions are not propositions, since they are not neces-
sarily either true or false. However, if values are assigned to the variables, each of
these assertions becomes a proposition. Similar assertions occur in English where
pronouns and improper nouns are often used as variables; e.g.,
“He is tall and blonde,” (“x is tall and blonde”).
“She lives in the city,” (“x lives in y”).
These assertions are formed using variables in a “template” which expresses a
property of an object or a relationship between objects. These templates are called
predicates. Assertions made with predicates and variables become true or false
when the variables are replaced by specific values. In the assertion “x is tall and
blonde,” x is a variable and “is tall and blonde” is a predicate; in the assertion
“x lives in y,” x and y are variables and “lives in” is a predicate. For ease of dis-
cussion, we will often refer to an assertion containing a predicate simply as a
“predicate.”
Sec. 1.2 PREDICATES AND QUANTIFIERS = 21
Example
Predicates are commonly used in control statements in high-level programming
languages. For example, a statement of the form
universe U, and the values c₁, c₂, ..., cₙ which make P(c₁, c₂, ..., cₙ) true are said
to satisfy P. If P is not satisfiable in the universe U, then we say P is unsatisfiable
in U. Note that a predicate is permitted to have zero arguments. Since a predicate
constant must have a value of either true or false when values are assigned to all
its arguments, it follows that a predicate constant with no arguments is a proposi-
tion. Similarly, a predicate variable with zero arguments is a propositional variable.
In order to change a predicate into a proposition, each individual variable of
the predicate must be bound; this may be done in two ways. The first way to bind
an individual variable is by assigning a value to it.
Example
Consider the predicate “x + y = 3” which we will denote by P(x, y). If the
value 1 is assigned to x, and 2 to y, the predicate is changed into a proposition
P(1, 2) whose truth value is true. On the other hand, if we assign the values 2 and 6
to x and y, respectively, the resulting proposition, P(2, 6), is false. #
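In programming terms, a predicate behaves like a Boolean-valued function, and binding a variable by assignment corresponds to supplying an argument. The following sketch mirrors the example above; the name `P_with_x_bound` and the use of `functools.partial` are illustrative choices rather than anything the text prescribes.

```python
from functools import partial

def P(x, y):
    """The predicate 'x + y = 3'."""
    return x + y == 3

# Binding both variables yields a proposition with a definite truth value.
print(P(1, 2))   # True
print(P(2, 6))   # False

# Binding only x leaves a predicate in the single free variable y.
P_with_x_bound = partial(P, 1)
print(P_with_x_bound(2))   # True
print(P_with_x_bound(5))   # False
```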
Examples
The following propositions are formed by universal quantification:
(a) ∀x[x < x + 1] (for all x, x is less than x + 1)
(b) ∀x[x = 3] (for any x, x = 3)
If the universe is the set of integers I, the predicate x < x + 1 is true for all
values of x, but “x = 3” is false when x is assigned the value of 1. Consequently,
for this universe (a) is true and (b) is false.
(c) If A is an integer array with 50 entries, A[1], A[2], ..., A[50], then we can
assert that all entries are nonzero as follows:
∀i{(1 ≤ i ∧ i ≤ 50) ⇒ A[i] ≠ 0}.
The entries of the array are sorted in nondecreasing order if the following
assertion holds.
∀i{(1 ≤ i ∧ i < 50) ⇒ A[i] ≤ A[i + 1]}.
We may also use more than one quantifier with predicates which have more
than one variable, e.g., the assertion
(d) ∀x ∀y[x + y > x] is read “for all x and all y, x + y is greater than x.” This
proposition is true if the universe of discourse consists of positive integers I+
and false if the universe is the set of all integers I. #
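The two array assertions in example (c) translate directly into bounded universal checks. This is a minimal sketch: the function names are illustrative, and index 0 of the list is treated as unused padding so that the indices 1 through 50 match the text.

```python
def all_nonzero(A):
    """For all i with 1 <= i <= 50: A[i] != 0."""
    return all(A[i] != 0 for i in range(1, 51))

def nondecreasing(A):
    """For all i with 1 <= i < 50: A[i] <= A[i + 1]."""
    return all(A[i] <= A[i + 1] for i in range(1, 50))

# Index 0 is padding, so A[1], ..., A[50] hold the values 1, ..., 50.
A = [None] + list(range(1, 51))
print(all_nonzero(A), nondecreasing(A))   # True True

# Zeroing one entry falsifies both assertions.
B = list(A)
B[10] = 0
print(all_nonzero(B), nondecreasing(B))   # False False
```

Restricting the range of `i` plays the role of the antecedent (1 ≤ i ∧ i ≤ 50): indices outside the range make the implication vacuously true and need not be examined.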
Examples
The variable x is existentially quantified in the following propositions:
(a) ∃x[x < x + 1] (There exists an x such that x is less than x + 1.)
(b) ∃x[x = 3] (There exists an x such that x = 3.)
Both of these are true propositions if the universe of discourse is the set of
integers. The proposition
(c) ∃x[x = x + 1] (There exists an x such that x = x + 1.)
is false, since no matter what value we assign to x, the assertion “x = x + 1”
is false. #
A third form of quantification can be used to assert that there is one and only
one element of the universe of discourse which makes a predicate true. This quantifier
is denoted ∃!, and the sequence of symbols ∃!x is read “There exists a
unique x such that...” or “There is one and only one x such that...”
Examples
Let the universe of discourse be the set of natural numbers N. Then the fol-
lowing propositions are true.
(a) ∃!x[x < 1]
(b) ∃!x[x = 3]
In (a), assigning the value of 0 to x makes the assertion x < 1 true; no other
value will do. In (b), the unique value of x is 3. For the same universe, the assertion
(c) ∃!x[x > 1] is false, since the assertion “x > 1” is true if x is assigned any
value other than 0 or 1. #
An assertion with quantified variables can be expressed using propositions
obtained by assigning values to the individual variables of the predicates which
occur in the assertion. This relationship can be made explicit by considering a
finite universe of discourse. Let the universe consist of the integers 1, 2, and 3.
Then the proposition
∀xP(x)
is equivalent to the conjunction
P(1) ∧ P(2) ∧ P(3),
and the proposition
∃xP(x)
is equivalent to the disjunction
P(1) ∨ P(2) ∨ P(3).
The proposition
∃!xP(x)
is equivalent to the proposition
[P(1) ∧ ¬P(2) ∧ ¬P(3)] ∨ [P(2) ∧ ¬P(1) ∧ ¬P(3)]
∨ [P(3) ∧ ¬P(1) ∧ ¬P(2)].
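Over a finite universe the three quantifiers reduce to a conjunction, a disjunction, and an "exactly one" test. The sketch below implements them for the universe {1, 2, 3} and confirms that the unique-existence quantifier agrees with its three-term expansion under every interpretation of P. All function names are illustrative assumptions.

```python
from itertools import product

UNIVERSE = [1, 2, 3]

def forall(P, universe=UNIVERSE):
    """Finite conjunction P(1) and P(2) and P(3)."""
    return all(P(x) for x in universe)

def exists(P, universe=UNIVERSE):
    """Finite disjunction P(1) or P(2) or P(3)."""
    return any(P(x) for x in universe)

def exists_unique(P, universe=UNIVERSE):
    """True when exactly one element of the universe satisfies P."""
    return sum(1 for x in universe if P(x)) == 1

def exists_unique_expanded(P):
    """The explicit three-term expansion for the universe {1, 2, 3}."""
    return ((P(1) and not P(2) and not P(3))
            or (P(2) and not P(1) and not P(3))
            or (P(3) and not P(1) and not P(2)))

# Compare the two forms under all 8 interpretations of P.
for values in product([False, True], repeat=3):
    table = dict(zip(UNIVERSE, values))
    assert exists_unique(table.get) == exists_unique_expanded(table.get)

greater_than_1 = lambda x: x > 1
print(forall(greater_than_1), exists(greater_than_1), exists_unique(greater_than_1))
# False True False
```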
If the universe of discourse is infinite, a quantified assertion cannot always be
represented by a finite conjunction or disjunction of propositions without quanti-
fiers. However, the concept can be extended, and it is often convenient to consider
a universally quantified assertion over an infinite universe as an infinite conjunction
and an existentially quantified assertion as an infinite disjunction.
Example
Consider the universe of nonnegative integers, and let P(x) denote the assertion
“x > 3.” Then the proposition
∀xP(x)
can be interpreted as the infinite conjunction
P(0) ∧ P(1) ∧ P(2) ∧ P(3) ∧ ···
which is false, since some of the operands, e.g., P(0), are false. The proposition
∃xP(x)
can be interpreted as the infinite disjunction
Examples
The predicate P(x, y, z) representing “x + y = z,” has three variables, all of
which are free in the assertion
P(x, y, z).
If we assign x the value of 2, the result is the predicate P with a bound variable,
P(2, y, z).
This assertion is equivalent to a predicate with two free variables, which we can
denote by Q(y, z), where Q(y, z) is true if 2 + y = z. Similarly,
∃yP(x, y, z)
is an assertion with two free variables. The truth value of this assertion is equivalent
to that of a predicate with two variables which we will call R(x, z); if the universe is
the natural numbers, then
If y does not occur as an individual variable in P(x₁, x₂, ..., xₙ), then the
assertions ∀yP(x₁, x₂, ..., xₙ) and ∃yP(x₁, x₂, ..., xₙ) are both equivalent to
P(x₁, x₂, ..., xₙ), since none of the individual variables of P are bound by the
quantification. As a special case, if P is a proposition, then the truth value of
∃xP or ∀xP is equal to the truth value of P.
If more than one quantifier is applied to a predicate, the order in which the
variables are bound is the same as their order in the quantifier list; for example,
∀x ∀yP(x, y) denotes ∀x[∀yP(x, y)].
The binding order can profoundly affect the meaning of an assertion. For example,
the sequence “∀x ∃y” can be paraphrased informally as “No matter what value
of x is chosen, a value of y can be found such that ...” In this quantifier sequence,
since y is chosen after x, the value of y may depend on the value of x. In contrast,
the sequence “∃y ∀x” asserts “A value of y can be chosen so that no matter what
value is chosen for x...” In this case, since y is bound first, the value of y must
be specified independently of the value of x.
Examples
Let the universe of discourse be the set of married persons. Then
(a) ∀x ∃y[x is married to y] is true. However,
(b) ∃y ∀x[x is married to y] asserts that there is some person in the universe who
is married to everyone; this is false.
Now let the universe of discourse be the integers I. The assertion
(c) ∀x ∃y[x + y = 0] (For all x, there exists a y such that x + y = 0.) is true,
since for any value of x there is a value of y (i.e., y is equal to −x) which makes
the assertion “x + y = 0” true. The proposition
(d) ∃y ∀x[x + y = 0] (There exists a y such that for all x, x + y = 0.) asserts
that the value of y can be chosen independently of the value of x. Since no y
exists which yields zero when added to an arbitrary integer, this proposition is
false. The proposition
(e) ∀x ∀y ∃!z[x + y = z] asserts that for every pair of integers x and y, there
is a unique integer z equal to their sum; the assertion is true. If we interchange
the last two quantifiers of part (e), we obtain the proposition
(f) ∀x ∃!z ∀y[x + y = z] which asserts that for every x, a unique z can be
chosen such that no matter what y is added to x, x + y = z. This proposition
is false. The proposition
(g) ∃!x[x·6 = 0] is true since the equation x·6 = 0 is true if and only if x = 0. The
proposition
(h) ∃!x ∀y[x·y = 0] is true, but
(i) ∀y ∃!x[x·y = 0] is false, since, if y = 0, any value of x will yield zero.
Similarly,
(j) ∀y ∃!x[x + y < 0] is false, since for any value of y there are many values of
x for which the sum of x and y is negative. #
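The effect of quantifier order can be observed directly on a finite universe. Because the integers cannot be enumerated, the sketch below substitutes an assumed stand-in: the universe {0, 1, 2, 3, 4} with addition taken modulo 5, so that every element still has an additive inverse. Nested `all` and `any` calls follow the binding order of the quantifiers.

```python
UNIVERSE = range(5)   # assumed finite stand-in for the integers

def P(x, y):
    """The predicate 'x + y = 0', with addition taken modulo 5."""
    return (x + y) % 5 == 0

# For all x there exists y: y may depend on x.
forall_exists = all(any(P(x, y) for y in UNIVERSE) for x in UNIVERSE)

# There exists y for all x: a single y must work for every x.
exists_forall = any(all(P(x, y) for x in UNIVERSE) for y in UNIVERSE)

print(forall_exists)   # True
print(exists_forall)   # False
```

The outermost quantifier becomes the outermost loop, which is the programming counterpart of "the variable bound first is chosen first."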
Although the order in which individual variables are bound cannot always
be changed without affecting the meaning of an assertion, there are two notable
exceptions: the sequence ∀x ∀y can always be replaced by ∀y ∀x, and the
sequence ∃x ∃y can always be replaced by ∃y ∃x.
Example
Let the universe be the nonnegative integers. For any predicate P, the propo-
sition
∀x ∀yP(x, y)
can be expanded† to
†Throughout this chapter, we will frequently expand quantified statements into infinite
conjunctions or disjunctions, rearrange the terms using the identities of Table 1.1.1, and derive a new
quantified assertion. This technique does not always constitute a careful mathematical argument
and in fact cannot be applied to some universes. We use it as an intuitive aid for understanding
quantified assertions.
which represents
which has no free variables. We will follow this convention of deleting universal
quantifiers in later chapters, but will refrain from doing so for the present.
The notions of predicates and quantified variables described in this section
provide a strong extension to the language of propositions. Most substantive
mathematical arguments involve quantification, and the tools introduced in this
section will be used throughout the remainder of this text.
1. Let S(x, y, z) denote the predicate “x + y = z,” P(x, y, z) denote “x·y = z,” and
L(x, y) denote “x < y.” Let the universe of discourse be the natural numbers N.
Using the above predicates, express the following assertions. The phrase “there is an
x” does not imply that x has a unique value.
Examples
Let the universe be the integers and let N(x) denote “x is a nonnegative integer,”
E(x) denote “x is even,” O(x) denote “x is odd,” and P(x) denote “x is prime.” The
following examples illustrate the transcription of assertions into logical notation.
(a) There exists an even integer.
∃xE(x)
(b) Every integer is even or odd.
∀x[E(x) ∨ O(x)]
(c) All prime integers are nonnegative.
∀x[P(x) ⇒ N(x)]
(d) The only even prime is two.
∀x[(E(x) ∧ P(x)) ⇒ x = 2]
(e) There is one and only one even prime.
∃!x[E(x) ∧ P(x)]
(f) Not all integers are odd.
¬∀xO(x), or ∃x¬O(x)
(g) Not all primes are odd.
¬∀x[P(x) ⇒ O(x)], or ∃x[P(x) ∧ ¬O(x)]
(h) If an integer is not odd, then it’s even.
∀x[¬O(x) ⇒ E(x)]. #
Examples
Consider the universe of integers and let P(x, y, z) denote “xy = z”. The fol-
lowing are examples of mathematical statements and equivalent formulations in
logical notation. Note that informal statements of propositions frequently omit the
universal quantification of individual variables.
The preceding examples illustrate a variety of ways in which assertions can involve
predicates, quantifiers and logical operators. In constructing proofs, we frequently
need to establish relationships between assertions. For example, consider the
statements
∃x[P(x) ⇒ Q(x)] and ∃xP(x) ⇒ ∃yQ(y).
Are they equivalent, or does one imply the other, or is no statement of this kind
possible? In order to resolve such questions, it is necessary to understand the ways
in which logical operators, quantifiers, and predicates interact.
An assertion involving predicate variables is valid if it is true for every universe
of discourse no matter how the predicate variables are interpreted. An assertion
is satisfiable if there exists a universe and some interpretation of the predicate
variables which makes it true. If an assertion is not true for any universe or
interpretation, it is unsatisfiable. Valid, satisfiable and unsatisfiable assertions are the
analogs of tautologies, contingencies, and contradictions in the language of
propositions. In this section, we will develop some fundamental identities which can
be used to determine the validity of assertions. In our discussion we will often
refer to “equivalent” assertions. Two assertions A₁ and A₂ are said to be (logically)
equivalent if and only if for every universe of discourse and every interpretation
of the predicate variables, A₁ is true if and only if A₂ is true. In other words,
A₁ and A₂ are equivalent if and only if the assertion A₁ ⇔ A₂ is valid.
We first consider how the negation operation affects quantified assertions.
Let P(x) be a predicate and consider the meaning of the proposition ¬∀xP(x).
We can interpret this proposition as “the assertion ‘∀xP(x)’ is false,” which is
equivalent to the statement “for some x, P(x) is not true,” or “∃x¬P(x).” This
leads to the valid assertion
¬∀xP(x) ⇔ ∃x¬P(x).
Similarly, the proposition ¬∃xP(x) asserts that “it is false that there exists an
x such that P(x) is true.” This is equivalent to the assertion that “there does not
exist an x such that P(x) is true” or “for all x, P(x) is false.” This establishes the
valid assertion
¬∃xP(x) ⇔ ∀x¬P(x).
Sec. 1.3 QUANTIFIERS AND LOGICAL OPERATORS 31
These two equivalences can be used to propagate negation signs through a sequence
of quantifiers, as illustrated by the following example.
Example
¬∃x ∀y ∀zP(x, y, z) ⇔ ∀x ¬∀y ∀zP(x, y, z)
                    ⇔ ∀x ∃y ¬∀zP(x, y, z)
                    ⇔ ∀x ∃y ∃z ¬P(x, y, z) #
Propagation of negations through quantifier sequences is often useful in construct-
ing proofs and counterexamples. Consider the following assertion:
For every pair of integers x and y, there exists a z such that x + z = y.
This statement can be formulated as follows:
∀x ∀y ∃z[x + z = y].
This proposition is true for the universe of integers I, but false for the natural
numbers N. We establish its falsity for the universe N by showing that its negation
is true. The negation has the form
¬∀x ∀y ∃z[x + z = y]
which is somewhat difficult to interpret. The equivalent form
∃x ∃y ∀z ¬[x + z = y], or ∃x ∃y ∀z[x + z ≠ y]
is more tractable and can easily be shown to be true for the nonnegative integers
by choosing x > y.
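The negated form suggests how to hunt for a counterexample: look for a pair (x, y) for which no z works. The sketch below does this over an initial segment of N. The cutoff `BOUND` is a hypothetical parameter; a bounded search only suggests that no z exists, and the actual justification is that x + z ≥ x > y for every natural number z.

```python
BOUND = 50                 # hypothetical cutoff; N itself is infinite
NATURALS = range(BOUND)

def some_z_works(x, y):
    """Bounded check of: there exists z in N with x + z = y."""
    return any(x + z == y for z in NATURALS)

# Collect the pairs (x, y) with small entries for which no z was found.
witnesses = [(x, y) for x in range(5) for y in range(5)
             if not some_z_works(x, y)]

print(witnesses[:3])                        # [(1, 0), (2, 0), (2, 1)]
print(all(x > y for x, y in witnesses))     # True
```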
The scope of a quantifier is the part of an assertion in which variables are
bound by the quantifier.
Examples
(a) In the assertion
AV Wx[P(x) V Q(x)]
the scope of the universal quantifier is [P(x) V Q(x)].
(b) In the assertion
[∀xP(x)] ⇒ [∃xQ(x)]
the scope of the universal quantifier is P(x), and the scope of the existential
quantifier is Q(x).
(c) In the assertion
∀x[P(x) ⇒ (A ∨ B)]
the scope of the universal quantifier is [P(x) ⇒ (A ∨ B)]. #
Parentheses and brackets can be used to make the scope of a quantifier explicit.
We adopt the convention that the scope of a quantifier is the smallest subexpression
possible, consistent with the parentheses of the expression. Consequently, the
assertion
∀xP(x) ∨ Q(x)
Example
In the expression
∀x[P(x) ∨ Q(y) ∨ R(x, z)]
both occurrences of x refer to the same variable. But in the expression
We now consider the way quantifiers affect conjunctions and disjunctions.
We first note that if a proposition occurs in a disjunction or conjunction within
the scope of a quantifier, it can be removed from the scope of the quantifier. Thus,
Example
We show that ∃ does not distribute over ⇒; that is, the assertion
∃x[P(x) ⇒ Q(x)] ⇔ [∃xP(x) ⇒ ∃xQ(x)]
is not valid.
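One way to see that an assertion is not valid is to exhibit a universe and an interpretation that falsify it. The following sketch searches every interpretation of P and Q over an assumed two-element universe {0, 1} and collects those on which the two sides disagree. The helper `implies` and the variable names are illustrative.

```python
from itertools import product

UNIVERSE = (0, 1)   # assumed two-element universe

def implies(a, b):
    """Material implication."""
    return (not a) or b

counterexamples = []
for p_values in product([False, True], repeat=len(UNIVERSE)):
    for q_values in product([False, True], repeat=len(UNIVERSE)):
        P = dict(zip(UNIVERSE, p_values))
        Q = dict(zip(UNIVERSE, q_values))
        # Left side: there exists x with P(x) => Q(x).
        left = any(implies(P[x], Q[x]) for x in UNIVERSE)
        # Right side: (exists x P(x)) => (exists x Q(x)).
        right = implies(any(P.values()), any(Q.values()))
        if left != right:
            counterexamples.append((P, Q, left, right))

for P, Q, left, right in counterexamples:
    print(P, Q, left, right)
```

In this universe the two sides differ only when P holds for exactly one element and Q holds for none; in each such case the left side is true and the right side is false, so only the implication from right to left survives.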
We can construct a truth table for the propositional form of this assertion, taking
the components ∀xP(x), ∃xQ(x), and ∃xP(x) as propositional variables. However,
since ∃xP(x) is true whenever ∀xP(x) is true, two lines of the truth table do
not apply.
∀xP(x)  ∃xP(x)  ∃xQ(x) | ∀xP(x) ⇒ ∃xQ(x) | ∃xP(x) ⇒ ∃xQ(x)
  0       0       0    |        1        |        1
  0       0       1    |        1        |        1
  0       1       0    |        1        |        0
  0       1       1    |        1        |        1
  1       0       0    |       n.a.      |       n.a.
  1       0       1    |       n.a.      |       n.a.
  1       1       0    |        0        |        0
  1       1       1    |        1        |        1
Considering the last two columns of the table, we conclude that the implication
holds in one direction,
3.  ∀x¬P(x) ⇔ ¬∃xP(x)
4.  ∀xP(x) ⇒ ∃xP(x)
5.  ∃x¬P(x) ⇔ ¬∀xP(x)
6.  [∀xP(x) ∧ Q] ⇔ ∀x[P(x) ∧ Q]
7.  [∀xP(x) ∨ Q] ⇔ ∀x[P(x) ∨ Q]
8.  [∀xP(x) ∧ ∀xQ(x)] ⇔ ∀x[P(x) ∧ Q(x)]
9.  [∀xP(x) ∨ ∀xQ(x)] ⇒ ∀x[P(x) ∨ Q(x)]
10. [∃xP(x) ∧ Q] ⇔ ∃x[P(x) ∧ Q]
11. [∃xP(x) ∨ Q] ⇔ ∃x[P(x) ∨ Q]
12. ∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)]
13. [∃xP(x) ∨ ∃xQ(x)] ⇔ ∃x[P(x) ∨ Q(x)]
Using these conventions, the formal statement of assertions becomes both more
compact and more readable. Furthermore, the compact notation allows a negation
sign to be propagated through a sequence of quantifiers in the same manner as
was illustrated earlier.
Example
Consider the limit of a function defined over the real line. The definition is
usually expressed as follows.
Definition:
lim_{x→c} f(x) = k ⇔ ∀_{ε>0} ∃_{δ>0} ∀x[|x − c| < δ ⇒ |f(x) − k| < ε].
To show that lim_{x→c} f(x) ≠ k, we form the negation of both sides of the above
definition, giving
lim_{x→c} f(x) ≠ k ⇔ ∃_{ε>0} ∀_{δ>0} ∃x[|x − c| < δ ∧ |f(x) − k| ≥ ε].
This establishes that lim_{x→c} f(x) ≠ k if and only if there exists an ε > 0 such that
for every δ > 0 there is an x with |x − c| < δ and |f(x) − k| ≥ ε.
The virtues of the compact notation will be obvious to anyone who writes out
the definition of a limit using the conventional logical notation.
In this section we have described ways in which quantifiers and logical oper-
ators interact with each other. These interactions are often subtle, and dealing
with them requires some care, but a facility with them is invaluable in the construc-
tion of sound mathematical arguments.
becomes “If y is the assertion w ∨ x, z is the assertion x ∨ w, and y is provable, then
z is provable.”
(a) ∀x[P(x) ⇒ T(x)]
(b) Vax{T(x) V mS]
(c) ∃x[T(x) ∧ ¬P(x)]
(d) ∀x ∀y ∀z{[D(x, y, z) ∧ P(z)] ⇒ [P(x) ∨ P(y)]}
(e) Wx{T(x) > Vy VaLDG@, y, 2) > T)]}
Put the following into logical notation. Choose predicates so that each assertion
requires at least one quantifier.
(a) There is one and only one even prime.
(b) No odd numbers are even.
(c) Every train is faster than some cars.
(d) Some cars are slower than all trains but at least one train is faster than every car.
(e) If it rains tomorrow, then somebody will get wet.
Find an assertion which is logically equivalent to ∀xP(x) but uses only the quantifier
∃ and the logical operator ¬. Similarly, express ∃xP(x) in terms of ∀ and ¬.
Find an assertion which is logically equivalent to ∃!xP(x) but which uses only the
quantifiers ∀ and ∃ together with the predicate for equality and logical operators.
Show that the following propositions are valid.
(a) [∀xP(x) ⇒ Q] ⇔ ∃x[P(x) ⇒ Q]
(b) ∀x[P ⇒ Q(x)] ⇔ [P ⇒ ∀xQ(x)]
For the following assertions, establish those which are true and find interpretations
for P and Q which provide counterexamples for those which are false.
(a) ∀x[P(x) ⇒ Q(x)] ⇒ [∀xP(x) ⇒ ∀xQ(x)]
(b) [∀xP(x) ⇒ ∀xQ(x)] ⇒ ∀x[P(x) ⇒ Q(x)]
(c) [∃xP(x) ⇒ ∀xQ(x)] ⇒ ∀x[P(x) ⇒ Q(x)]
(d) ∀x[P(x) ⇒ Q(x)] ⇒ [∃xP(x) ⇒ ∀xQ(x)]
(a) For a universe containing only the elements 0 and 1, expand
∃x[P(x) ∧ Q(x)] and [∃xP(x) ∧ ∃xQ(x)]
into propositions involving P(0), P(1), ... etc., and without quantifiers. Rearrange
the terms of this expansion to show
∃x[P(x) ∧ Q(x)] ⇒ [∃xP(x) ∧ ∃xQ(x)].
(b) Show that the converse of the implication of part (a) is not valid.
(c) For the same universe, show
∀x[P(x) ⇔ Q(x)] ⇒ [∀xP(x) ⇔ ∀xQ(x)].
(d) Show that the converse of the implication of part (c) is not valid.
Show that the following are valid for the universe of natural numbers N either by
expanding the statement or by applying identities.
(a) ∀x ∀y[P(x) ∨ Q(y)] ⇔ [∀xP(x) ∨ ∀yQ(y)]
(b) ∃x ∃y[P(x) ∧ Q(y)] ⇒ ∃xP(x)
(c) ∀x ∀y[P(x) ∧ Q(y)] ⇔ [∀xP(x) ∧ ∀yQ(y)]
(d) ∃x ∃y[P(x) ⇒ P(y)] ⇒ [∀xP(x) ⇒ ∃yP(y)]
(e) ∀x ∀y[P(x) ⇒ Q(y)] ⇔ [∃xP(x) ⇒ ∀yQ(y)]
Sec. 1.4 LOGICAL INFERENCE 39
10. (a) Write out the definition of lim_{x→c} f(x) = k in the usual logical notation rather
than the compact notation used in the last example of this section.
(b) Find the condition for lim_{x→c} f(x) ≠ k by forming the negation of both sides
of the definition.
11. Let A be a two-dimensional integer array with 20 rows (indexed from 1 to 20) and 30
columns (indexed from 1 to 30). Using compact logical notation, make the following
assertions. Assume the universe of discourse is the set of integers I.
(a) All entries of A are nonnegative.
(b) All entries of the 4th and 15th rows are positive.
(c) Some entries of A are zero.
(d) The entries of A are sorted in row-major order (the entries are in order within
rows, and every entry of the ith row is less than or equal to every entry of the
(i + 1)st row).
are intended to serve both as convincing arguments and as models of proof techniques.
The exercises are intended to provide practice in the construction of proofs.
A proof of an assertion is a sequence of statements which represents an argument
that the theorem is true. Some of the assertions which occur in a proof may
be known to be true a priori; these include axioms or previously proved theorems.
Other assertions may be hypotheses of the theorem, assumed to be true in the
argument. Finally, some assertions may be inferred from other assertions which
occurred earlier in the proof. Thus, to construct proofs, we need a means of drawing
conclusions or deriving new assertions from old ones. This is done using rules of
inference. Rules of inference specify conclusions which can be drawn from assertions
known or assumed to be true.
Perhaps the most fundamental rules of inference are those which permit substitutions.
Thus, we are generally allowed to replace any expression in an assertion
by another expression which is equivalent to it; we consider the new assertion to
be true if and only if the original assertion was true. We learn this rule of inference
at an early age; it is sometimes expressed as “equals can be substituted for equals.”
Other rules governing substitution are commonly used in mathematics, and we
will apply them freely without explicitly stating them. For example, if S is a tautology
in which propositional variables occur, substitution of propositions for
the propositional variables in the usual way results in a new tautology.
Another rule of inference can be stated as follows: If it is known that a statement
P is true, and also that the statement P ⇒ Q is true, then we can conclude
that the statement Q is true.
Example
Suppose we know “Samson is strong” and “If Samson is strong, then it will take
a woman to do him in.” We can conclude “It will take a woman to do Samson in.”
This rule of inference is called modus ponens; it is often presented in the form of
an argument as follows:
P
P ⇒ Q
─────
∴ Q
In such a tabular presentation of an argument, the assertions above the
horizontal line are called hypotheses or premises; the assertion below the line is
the conclusion. The symbol ∴ is read “therefore” or “it follows that,” or “hence.”
An argument is said to be valid if, whenever all the premises are true, the con-
clusion is true. A rule of inference is an argument form which is taken to be
valid in the same sense that an axiom is taken to be true.
The rule of inference known as modus ponens is related to the tautology
[P ∧ (P ⇒ Q)] ⇒ Q in the language of propositions. Other rules of inference
have similar interpretations; we have listed some of the most important rules of
inference in Table 1.4.1.
Rule of inference        Tautological form                                   Name

P                        P ⇒ (P ∨ Q)                                         addition
∴ P ∨ Q

P ∧ Q                    (P ∧ Q) ⇒ P                                         simplification
∴ P

P                        [P ∧ (P ⇒ Q)] ⇒ Q                                   modus ponens
P ⇒ Q
∴ Q

¬Q                       [¬Q ∧ (P ⇒ Q)] ⇒ ¬P                                 modus tollens
P ⇒ Q
∴ ¬P

P ∨ Q                    [(P ∨ Q) ∧ ¬P] ⇒ Q                                  disjunctive syllogism
¬P
∴ Q

P ⇒ Q                    [(P ⇒ Q) ∧ (Q ⇒ R)] ⇒ [P ⇒ R]                       hypothetical syllogism
Q ⇒ R
∴ P ⇒ R

P                                                                            conjunction
Q
∴ P ∧ Q

(P ⇒ Q) ∧ (R ⇒ S)        [(P ⇒ Q) ∧ (R ⇒ S) ∧ (P ∨ R)] ⇒ [Q ∨ S]             constructive dilemma
P ∨ R
∴ Q ∨ S

(P ⇒ Q) ∧ (R ⇒ S)        [(P ⇒ Q) ∧ (R ⇒ S) ∧ (¬Q ∨ ¬S)] ⇒ [¬P ∨ ¬R]         destructive dilemma
¬Q ∨ ¬S
∴ ¬P ∨ ¬R
Examples
Fallacious arguments are often the result of incorrect inferences. Here we pre-
sent some examples of common fallacies.
Presented in the form of our rules of inference, this argument can be presented
as follows:
P ⇒ Q
Q
─────
∴ P
The argument is not correct because the conclusion P can be false even though
the hypotheses P ⇒ Q and Q are true; i.e., the assertion [(P ⇒ Q) ∧ Q] ⇒ P
is not a tautology: the source of the butler’s discomfort may not have been
guilt but rather the behavior of the stock market on the day that he was questioned.
(b) The Fallacy of Denying the Antecedent
This form of fallacious argument can be represented as
P ⇒ Q
¬P
─────
∴ ¬Q
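Both fallacies can be exposed by listing the truth assignments that make every premise true and the conclusion false. The sketch below does this for the two argument forms; the helper names `implies` and `falsifying_lines` are illustrative choices.

```python
from itertools import product

def implies(a, b):
    """Material implication."""
    return (not a) or b

def falsifying_lines(form, num_vars):
    """Truth assignments on which the propositional form is false."""
    return [values for values in product([False, True], repeat=num_vars)
            if not form(*values)]

def affirming_the_consequent(p, q):
    """[(P => Q) and Q] => P"""
    return implies(implies(p, q) and q, p)

def denying_the_antecedent(p, q):
    """[(P => Q) and not P] => not Q"""
    return implies(implies(p, q) and not p, not q)

print(falsifying_lines(affirming_the_consequent, 2))   # [(False, True)]
print(falsifying_lines(denying_the_antecedent, 2))     # [(False, True)]
```

Both forms fail on the same line, P false and Q true: the implication P ⇒ Q holds there vacuously, so it says nothing about P.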
The following example illustrates the correct application of some of the rules of
inference given in Table 1.4.1.
Example
Consider the following argument:
If horses fly or cows eat artichokes, then the mosquito is the national bird. If
the mosquito is the national bird, then peanut butter tastes good on hot dogs.
But peanut butter tastes terrible on hot dogs. Therefore, cows don’t eat arti-
chokes.
The first three assertions are the hypotheses of the argument; the last assertion is the
conclusion. We are asked to determine whether the truth of the conclusion is implied
by the truth of the hypotheses. We begin by representing the component propositions
as follows:
F denotes the proposition “horses fly”;
A denotes the proposition “cows eat artichokes” ;
M denotes the proposition “the mosquito is the national bird” ;
P denotes the proposition “peanut butter tastes good on hot dogs.”
Assertions 1, 2, and 3 are the hypotheses, and 4 is the conclusion. One way to test
whether the conclusion is implied by the hypotheses is to construct a truth table for
the implication which has the conjunction of the hypotheses as its antecedent and
the conclusion as its consequent; in the present case this is the implication
Proof:
Assertion             Reasons
1. (F ∨ A) ⇒ M        Hypothesis 1
2. M ⇒ P              Hypothesis 2
3. (F ∨ A) ⇒ P        Steps 1 and 2 and hypothetical syllogism
4. ¬P                 Hypothesis 3
5. ¬(F ∨ A)           Steps 3 and 4 and modus tollens
6. ¬F ∧ ¬A            Step 5 and DeMorgan’s law (identity 7, Table 1.1.1)
7. ¬A ∧ ¬F            Step 6 and commutativity of ∧ (identity 4, Table 1.1.1)
8. ¬A                 Step 7 and simplification ∎
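The truth-table test mentioned earlier can serve as an independent check on the proof. The sketch below enumerates all sixteen assignments to F, A, M, and P and confirms that the conclusion holds whenever all three hypotheses hold. The function `is_valid_argument` is an illustrative helper, not a procedure from the text.

```python
from itertools import product

def implies(a, b):
    """Material implication."""
    return (not a) or b

def is_valid_argument(hypotheses, conclusion, num_vars):
    """True if the conclusion holds on every line where all hypotheses hold."""
    return all(conclusion(*values)
               for values in product([False, True], repeat=num_vars)
               if all(h(*values) for h in hypotheses))

# Variables in order: F (horses fly), A (cows eat artichokes),
# M (mosquito is the national bird), P (peanut butter tastes good on hot dogs).
hypotheses = [
    lambda f, a, m, p: implies(f or a, m),   # (F or A) => M
    lambda f, a, m, p: implies(m, p),        # M => P
    lambda f, a, m, p: not p,                # not P
]
conclusion = lambda f, a, m, p: not a        # not A

print(is_valid_argument(hypotheses, conclusion, 4))   # True
```

The table confirms validity but gives no insight into why the conclusion follows; the step-by-step proof supplies that.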
Example
Consider the following argument:
Every man has two legs. John Smith is a man.
Hence, John Smith has two legs.
Let M(x) denote the assertion “x is a man,”
L(x) denote the assertion “x has two legs,” and
J denote John Smith.
Expressed in logical notation, the argument is
1. ∀x[M(x) ⇒ L(x)]
2. M(J)
3. L(J)
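For readers who use a proof assistant, the same argument can be stated and checked formally. The following Lean sketch is one possible rendering, not the text's own proof: the type name `Person` and the hypothesis names `h1` and `h2` are assumptions introduced for illustration. Applying `h1` to `J` instantiates the universal quantifier, and applying the result to `h2` is modus ponens.

```lean
-- M x : "x is a man";  L x : "x has two legs";  J : John Smith.
example (Person : Type) (M L : Person → Prop) (J : Person)
    (h1 : ∀ x, M x → L x)   -- hypothesis 1: every man has two legs
    (h2 : M J)              -- hypothesis 2: John Smith is a man
    : L J :=                -- conclusion: John Smith has two legs
  h1 J h2
```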
A formal proof is as follows:
Assertion Reasons
In this section we have dealt with the problem of logical inference, i.e., infer-
ring the truth of one statement from the known or assumed truth of others. A rule
of inference is an explicit statement of when such an inference can be made. We
commonly apply rules of inference in mathematical arguments without explicit
reference to them; this is one reason why mathematical arguments are sometimes
difficult to follow. By treating these rules explicitly, we aim to provide a basis for
the understanding, construction, and description of mathematical arguments.
1. For each of the following sets of premises, list the relevant conclusions which can be
drawn and the rules of inference used in each case.
(a) I’m either fat or thin. I’m certainly not thin.
(b) If I run I get out of breath. I’m not out of breath.
(c) If the butler did it, then his hands are dirty. The butler’s hands are dirty.
(d) Blue skies make me happy and gray skies make me sad. The sky is either blue or
gray.
(e) If my program runs, then I am happy. If I am happy, the sun shines. It’s 11:00
p.m. and very dark.
(f) All trigonometric functions are periodic functions and all periodic functions are
continuous functions.
(g) All cows are mammals. Some mammals chew their cud.
(h) All even integers are divisible by 2. The integer 4 is even but 3 is not.
(i) What’s good for the auto industry is good for the country. What’s good for the
country is good for you. What’s good for the auto industry is for you to buy an
expensive car.
2. Show that the tautological forms of the following rules of inference given in Table
1.4.1 are tautologies:
(a) modus tollens
(b) disjunctive syllogism
(c) constructive dilemma
(d) destructive dilemma
3. Construct a proof for each of the following arguments, giving all necessary additional
assertions. Specify the rules of inference used at each step. (The word “or” denotes
the “logical or” rather than the “exclusive or.”)
(a) It is not the case that IBM or Xerox will take over the copier market. If RCA
returns to the computer market, then IBM will take over the copier market.
Hence, RCA will not return to the computer market.
(b) (My program runs successfully) or (the system bombs and I blow my stack).
Furthermore, (the system does not bomb) or (I don’t blow my stack and my
program runs successfully). Therefore, my program runs successfully.
4. Supply the missing assertions to prove the following argument. Justify the inclusion of
each assertion in the proof.
(P \ Q)=(RAS)
(T>Q) A (S>U)
(W=> P) A(T > VU)
—-R
Wo mT
5. Determine which of the following arguments are valid. Construct proofs for the valid
arguments. For those which are not valid, show why the conclusion does not follow
from the hypotheses.
(a) AAB (b) AV B
Ax>C Ax>C
CAB “CV B
(c) A>B (d) A>(BV C)
A»>C D>—7C
.C>B B= 7A
A
Pp
“BA 7B
6. Determine which of the following are valid arguments. Construct proofs for those that
are valid and describe the fallacies of those that are not.
(a) If today is Tuesday, then I have a test in Computer Science or a test in Econ. If
my Econ professor is sick, then I will not have a test in Econ. Today is Tuesday
and my Econ professor is sick. Therefore, I have a test in Computer Science.
(b) I am happy if my program runs. My happiness is a necessary condition for me to
enjoy life. Hence, if my program runs, then, if I enjoy life, then I am happy.
(c) It is not the case that some trigonometric functions are not periodic. Some
periodic functions are continuous. Therefore, it is not true that all trigonometric
functions are not continuous.
(d) Some trigonometric functions are periodic. Some periodic functions are
continuous. Therefore, some trigonometric functions are continuous.
Sec. 1.5 METHODS OF PROOF 47
In the preceding section, we described the use of rules of inference to infer the
truth of one assertion from others. Rules of inference are characterizations of
the syntactic constraints which a proof must obey; in a formal mathematical sys-
tem, where the structure of proofs is precisely specified, the rules of inference
enable us to determine if an argument is a proof. In this section, we are concerned
with the structure of proofs as well as strategies for their construction. Although
it is not possible to consider all proof techniques, we will describe some of the most
common ones, give examples of their use, and relate them to the rules of inference
described in the previous section.
The most elementary form of theorem is the tautology. A tautology is a
theorem because of its sentential structure rather than its content; its truth is
Example
Consider the universe of integers. Denote by E(x) the assertion “x is even” and
by O(x) the assertion “x is not even”; i.e., O(x) ⇔ ¬E(x). If we read O(x) as “x is
odd,” then we can prove the theorem
The integer 3 is either even or odd.
The theorem is stated as
E(3) ∨ O(3),
or alternatively
E(3) ∨ ¬E(3),
which, if we use the letter P to denote E(3), can be written
P ∨ ¬P.
From the truth table of the proposition P ∨ ¬P, we know it is a tautology, and the
theorem is established. #
A theorem is often expressed as a propositional form which is not a tautology.
The truth of such an assertion is dependent on both the logical structure of the
assertion and the meaning of the component propositions. Because the component
propositions cannot assume all possible truth values, certain lines of the truth
table cannot occur; the theorem is proved by showing that all the lines which can
occur result in a value of true. We will treat such theorems by considering the most
important of the logical operators.
Let T be an assertion of the form ¬P, where P is a proposition. In order to
prove T, we must establish that P is false. Similarly, if T is of the form P ∧ Q, then
we must show that both P and Q are true. An assertion of the form P ∨ Q is
often established by proving the logically equivalent proposition ¬P ⇒ Q (or,
by symmetry, ¬Q ⇒ P). A truth table can be used to show the logical equivalence
of P ∨ Q and ¬P ⇒ Q.
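The truth table showing that P ∨ Q and ¬P ⇒ Q are logically equivalent can be generated directly. The sketch below is an illustration of ours in Python; the names `implies` and `truth_table_rows` are not from the text.

```python
from itertools import product

def implies(p, q):
    """Truth value of the implication p => q."""
    return (not p) or q

def truth_table_rows():
    """Rows of the form (P, Q, P or Q, (not P) => Q)."""
    rows = []
    for P, Q in product([False, True], repeat=2):
        rows.append((P, Q, P or Q, implies(not P, Q)))
    return rows

for row in truth_table_rows():
    print(row)
```

The two forms are equivalent exactly when the last two entries agree in every row.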
A variety of proof techniques are used for proving implications, and because
these techniques are so common, they are frequently referred to by name. Recall
that the truth table for P ⇒ Q has the following form:
 P   Q | P ⇒ Q
 F   F |   T
 F   T |   T
 T   F |   F
 T   T |   T
The four most common techniques for proving implications are the following:
1. Vacuous Proof of P ⇒ Q
The truth value of P ⇒ Q is true if that of P is false. Consequently, if we
can establish that P is false, only the first two lines of the above truth
table can possibly apply, and it follows that the assertion P ⇒ Q is true.
A vacuous proof of P ⇒ Q is constructed by establishing that the truth
value of P is false.
While vacuous proofs appear to be of little value, they are often important in
establishing limiting or special cases. We will point out many examples of vacuous
proofs in the next chapter.
2. Trivial Proof of P ⇒ Q
If it is possible to establish that Q is true, only the second and fourth lines
of the truth table for implication can apply, and it follows that the theorem
P ⇒ Q is true. Construction of a trivial proof of P ⇒ Q requires showing
that the truth value of Q is true.
Like the vacuous proof, the trivial proof has limited applicability and yet is
extremely important. It is frequently used to establish special cases of assertions.
3. Direct Proof of P ⇒ Q
A direct proof of P ⇒ Q shows that the truth of Q follows logically from
the truth of P, i.e., the third line of the truth table for implication cannot
hold. Such a proof begins by assuming P is true. Then, using whatever
information is available, such as previously proved theorems, it is shown
that Q must be true. Since all the lines of the truth table except the third
have the value true assigned to P ⇒ Q, the assertion is established.
The following examples illustrate the use of direct proofs.
Examples
(b) Theorem: Let S be a set of one- and two-digit integers such that each of
the digits 0 through 9 occurs exactly once in the set S. Then the sum of the
elements of S is divisible by 9.
Proof: Assume that the hypothesis of the theorem is true. The digits 0
through 9 sum to 45. In any set S, some of the digits will occur in the 10’s
position and the remainder will occur in the 1’s position. Let T denote the sum
of digits which occur in the 10’s position. Each of these digits contributes ten
times its value, and the remaining digits sum to 45 − T. Then the sum of the
elements of S can be expressed as 10T + 45 − T, which can be put in the form
9T + 45. Since both terms of this sum are divisible by 9, the sum is also divisible
by 9, regardless of the value of T. ∎ #
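The identity used in the proof, namely that the sum of the elements equals 9T + 45, can be spot-checked on randomly generated sets. The Python sketch below is ours; `random_digit_set` and `check` are hypothetical helper names, and we assume that the tens digit of a two-digit integer is nonzero. For instance, S = {10, 23, 45, 67, 89} has T = 1 + 2 + 4 + 6 + 8 = 21 and sum 234 = 9·21 + 45.

```python
import random

def random_digit_set(rng):
    """Build one- and two-digit integers that use each digit 0-9 exactly once.
    Returns the list of elements and T, the sum of the tens digits."""
    digits = list(range(10))
    rng.shuffle(digits)
    elements, tens_sum = [], 0
    while digits:
        d = digits.pop()
        # Pair d with another digit as a two-digit integer when possible;
        # the tens digit must be nonzero.
        if d != 0 and digits and rng.random() < 0.5:
            elements.append(10 * d + digits.pop())
            tens_sum += d
        else:
            elements.append(d)
    return elements, tens_sum

def check(trials=1000, seed=0):
    """Verify sum(S) = 9T + 45 and divisibility by 9 on random sets."""
    rng = random.Random(seed)
    for _ in range(trials):
        elements, T = random_digit_set(rng)
        total = sum(elements)
        assert total == 9 * T + 45
        assert total % 9 == 0
    return True

print(check())
```

Such a check illustrates the theorem but does not replace the proof, which covers every admissible set S at once.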
Example
A perfect number is an integer which is equal to the sum of all its divisors except
the number itself. Thus, 6 is a perfect number, since 6 = 1 + 2 + 3, and so is 28.
We will prove the following theorem by establishing the contrapositive.
In summary, to establish P ⇒ Q by a proof of the contrapositive,
Example
Let “⊔” denote the operation “max” on the set of integers I; if a ≥ b then
a ⊔ b = b ⊔ a = a. For example, 4 ⊔ 2 = 4 and 1 ⊔ 3 = 3.
Theorem: The binary operation “max” is associative; that is, for any integers
a, b, and c, (a ⊔ b) ⊔ c = a ⊔ (b ⊔ c).
Proof: For any three integers a, b, and c, one of the following six cases must hold:
a ≥ b ≥ c, a ≥ c ≥ b, b ≥ a ≥ c, b ≥ c ≥ a, c ≥ a ≥ b, or c ≥ b ≥ a.
Case 1: Assume a ≥ b ≥ c. Then (a ⊔ b) ⊔ c = a ⊔ c = a and
a ⊔ (b ⊔ c) = a ⊔ b = a.
Case 2: Assume a ≥ c ≥ b. Then (a ⊔ b) ⊔ c = a ⊔ c = a and
a ⊔ (b ⊔ c) = a ⊔ c = a.
There are four other cases; the proofs are all similar. ∎ #
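The case analysis can be mirrored by an exhaustive check over a small range of integers. This Python sketch is an illustration of ours; `join` and `associative_on` are hypothetical names, and `join` follows the definition of “max” given in the example.

```python
from itertools import product

def join(a, b):
    """The operation "max": the result is a when a >= b, and b otherwise."""
    return a if a >= b else b

def associative_on(values):
    """Check (a max b) max c == a max (b max c) for all triples from values."""
    values = list(values)
    return all(join(join(a, b), c) == join(a, join(b, c))
               for a, b, c in product(values, repeat=3))

print(associative_on(range(-3, 4)))
```

Every ordering of a, b, and c among the six cases occurs in the range tested, but only the proof by cases establishes the result for all integers.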
Examples
(a) Theorem: There is no largest prime number.
The proof is by contradiction; we begin by assuming that a largest prime number
exists, and then show how to construct another which is larger.
Proof: Assume a largest prime exists; call it p. Because all primes are greater
than 1 and none are greater than p, there must be a finite number of them. Form the
product of all these primes and call it r; r = 2·3·5·7· ... ·p. We now assert that
r + 1 is a prime. For if we divide r + 1 by any prime between 2 and p, the remainder
is 1, which means that r + 1 cannot be expressed as a product of any two integers
other than r + 1 and 1. Since r > p, r + 1 is a prime number greater than p. This
contradicts the assumption that p is the largest prime number, and the theorem is
proved. ∎
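The key computational fact in the proof is that r + 1 leaves remainder 1 when divided by each prime up to p. The Python sketch below is ours; `primes_up_to`, `smallest_prime_factor`, and `euclid_number` are illustrative names. It also shows why the assumption matters: without the (false) hypothesis that p is the largest prime, r + 1 need not itself be prime, although each of its prime factors must exceed p. With p = 13, r + 1 = 30031 = 59 · 509.

```python
def primes_up_to(n):
    """All primes <= n, found by trial division (adequate for small n)."""
    return [k for k in range(2, n + 1)
            if all(k % d for d in range(2, int(k ** 0.5) + 1))]

def smallest_prime_factor(m):
    """Smallest prime dividing m, for m >= 2."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

def euclid_number(p):
    """r + 1, where r is the product of all primes <= p."""
    r = 1
    for q in primes_up_to(p):
        r *= q
    return r + 1

p = 13
n = euclid_number(p)
print(n, [n % q for q in primes_up_to(p)], smallest_prime_factor(n))
```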
The logical structure of the preceding proof can be described as follows. Let P
denote “there is no largest prime number,” and Q denote “p is the largest prime
number.” The proof proceeds by assuming the theorem is false:
(i) ¬P
It follows that (for some particular integer p),
(ii) ¬P ⇒ Q
We then show how to construct a prime greater than p, i.e., we show
(iii) Q ⇒ ¬Q
From (ii) and (iii), applying the rule of hypothetical syllogism, we conclude
(iv) ¬P ⇒ ¬Q
From (i) and (ii) and modus ponens, it follows that
(v) Q
and from (i) and (iv) and modus ponens,
(vi) ¬Q
Then from the rule of conjunction applied to (v) and (vi), we conclude
(vii) Q ∧ ¬Q
This is a contradiction. We conclude that the hypothesis (i) is false and the theorem
is proved.
(b) Consider the problem of determining whether a program P will terminate
normally, i.e., not as the result of such things as exceeding its allotted execution
time or register overflow. It is conceivable that a computer program could be written
which would decide, for any program P, whether P will halt; such a program would
be a “decision procedure” to solve what is known as the halting problem. We can
show by means of a proof by contradiction that no procedure exists which will solve
the halting problem.
For ease of exposition, we restrict our discussion to procedures which do not
read any input, although they may call other procedures. This corresponds to a
subproblem of the original problem; if we cannot devise a decision procedure for
the input-free procedures, then we clearly cannot devise one for arbitrary
procedures. Let P be an input-free procedure. We assume (as a hypothesis to be proved
false) that there exists a decision procedure HALT such that the value of HALT(P)
is “yes” if the procedure P halts and otherwise the value of HALT(P) is “no.” Then
the following procedure could be executed:
procedure ABSURD:
if HALT(ABSURD) = “yes” then
while true do print “ha”
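The construction of ABSURD can be imitated in Python to see why any proposed HALT must answer incorrectly on it. The sketch below is ours and rests on loudly stated assumptions: `make_absurd` and `candidate_is_wrong` are hypothetical names, a candidate HALT is modeled as any Python function returning "yes" or "no", and a bounded loop of `max_steps` iterations stands in for “runs forever” so that the example itself terminates.

```python
def make_absurd(halt):
    """Build the procedure ABSURD for a candidate decision procedure halt."""
    def absurd(max_steps=1000):
        # Returns True if the procedure halts, and False if it is still
        # looping after max_steps iterations (our stand-in for never halting).
        if halt(absurd) == "yes":
            steps = 0
            while True:
                steps += 1          # the book's loop prints "ha" here
                if steps >= max_steps:
                    return False
        return True
    return absurd

def candidate_is_wrong(halt):
    """True if the candidate's answer about ABSURD disagrees with what
    ABSURD actually does."""
    absurd = make_absurd(halt)
    claimed_to_halt = (halt(absurd) == "yes")
    actually_halts = absurd()
    return claimed_to_halt != actually_halts

print(candidate_is_wrong(lambda proc: "yes"))
print(candidate_is_wrong(lambda proc: "no"))
```

Whichever answer the candidate gives, ABSURD does the opposite, which is the contradiction the argument requires.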
The proof methods described so far are often inadequate for proving quantified
assertions. We now describe some additional proof techniques based on the rules
of inference for quantified statements. We will discuss techniques for proving asser-
tions in each of the following forms:
¬∃xP(x), ∃xP(x), ¬∀xP(x), and ∀xP(x).
An assertion of the form ¬∃xP(x) is most often proved by contradiction:
to show something does not exist, we assume it does and arrive at a contradiction.
This technique was used in our earlier proof that there is no largest prime number;
we assumed there was a largest prime and derived a contradiction of the form
Q ∧ ¬Q. We also note that ¬∃xP(x) is equivalent to ∀x¬P(x). Hence, our
later remarks on proving universally quantified statements will sometimes apply.
†This program and those in the remainder of the book will be written in an informal ALGOL-
like language described in the Appendix.
Proofs of assertions of the form ∃xP(x) are referred to as existence proofs.
Existence proofs are classified as either constructive or nonconstructive. A
constructive existence proof establishes the assertion by exhibiting a value c such that
P(c) is true. By applying the rule of existential generalization, we conclude that
∃xP(x) is true. Sometimes, rather than exhibiting a specific value of c, a
constructive existence proof specifies an algorithm for obtaining such a value.
A nonconstructive existence proof establishes the assertion ∃xP(x) without
indicating how to find a value c such that P(c) is true. Such a proof most commonly
involves a proof by contradiction; it shows that ¬∃xP(x) implies an absurdity
or the negation of some previous result.
A constructive existence proof specifies an element precisely, while a noncon-
structive proof may not provide any information other than an assertion of exist-
ence. Some results in mathematics fall between these two extremes. For example,
the mean value theorem of differential calculus asserts the existence of a parameter
value with a special property. Although the proof places bounds on the parameter
value (and thus provides useful information), the exact value of the parameter is
not specified. Theorems of this character are common in numerical analysis.
Assertions of the form ¬∀xP(x) are often most naturally proved by proving
the equivalent assertion ∃x¬P(x). Both constructive and nonconstructive
existence proofs can then be used. A constructive existence proof involves finding an
element c of the universe of discourse such that P(c) is false; such an element is
called a counterexample to the assertion ∀xP(x). The element c forms the basis of
a proof by counterexample of the assertion ¬∀xP(x).
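A constructive search for a counterexample is easy to express as a program. The Python sketch below is ours, and the assertion it tests is our own choice of illustration rather than one from the text: “for every nonnegative integer n, n² + n + 41 is prime.” The names `is_prime` and `first_counterexample` are hypothetical.

```python
def is_prime(m):
    """Trial-division primality test."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_counterexample(predicate, universe):
    """Return the first c in universe with predicate(c) false, else None."""
    for c in universe:
        if not predicate(c):
            return c
    return None

c = first_counterexample(lambda n: is_prime(n * n + n + 41), range(100))
print(c, c * c + c + 41)
```

The search returns c = 40, for which the value is 1681 = 41², so the element 40 is a counterexample to the universally quantified assertion. A result of None would only mean that no counterexample lies in the range searched.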
Counterexamples can also be used to show that assertions involving predicate
variables are not valid. Construction of such a counterexample requires that we
exhibit a universe of discourse and an interpretation of the predicate variables
which makes the assertion false.
Example
Construct a counterexample to show the following assertion is not valid:
∃x[P(x) ⇒ Q(x)] ⇒ (∃xP(x) ⇒ ∃xQ(x)).
A disproof requires that we exhibit a universe and predicates P and Q such that the
assertion is false; to disprove the above assertion we must find a universe and
interpretations for predicates P and Q such that
(a) ∃x[P(x) ⇒ Q(x)] is true and
(b) ∃xP(x) ⇒ ∃xQ(x) is false.
From (b) it must happen that
(c) ∃xP(x) is true and
(d) ∃xQ(x) is false.
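Conditions (a) through (d) can be met by a search over small finite universes. The Python sketch below is ours and assumes the assertion under test is ∃x[P(x) ⇒ Q(x)] ⇒ (∃xP(x) ⇒ ∃xQ(x)); the names `assertion_holds` and `find_countermodel` are hypothetical. An interpretation of P or Q over the universe {0, ..., size − 1} is represented as a tuple of truth values.

```python
from itertools import product

def implies(p, q):
    """Truth value of the implication p => q."""
    return (not p) or q

def assertion_holds(universe, P, Q):
    """Evaluate the assertion in the finite model (universe, P, Q)."""
    antecedent = any(implies(P[x], Q[x]) for x in universe)
    consequent = implies(any(P[x] for x in universe),
                         any(Q[x] for x in universe))
    return implies(antecedent, consequent)

def find_countermodel(size):
    """Return interpretations (P, Q) falsifying the assertion, or None."""
    universe = range(size)
    for p_vals in product([False, True], repeat=size):
        for q_vals in product([False, True], repeat=size):
            if not assertion_holds(universe, p_vals, q_vals):
                return p_vals, q_vals
    return None

print(find_countermodel(1))
print(find_countermodel(2))
```

No one-element universe falsifies the assertion, but a two-element universe does: let P hold of exactly one element and let Q hold of none. Then P(x) ⇒ Q(x) is true of the element where P fails, ∃xP(x) is true, and ∃xQ(x) is false.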
Example
Theorem: For all integers x, x is even if and only if x² is even.
Since the proof was for arbitrary x, we can apply universal generalization to
conclude that
∀x(x is even ⇔ x² is even). ∎ #
(a) That it be powerful enough to prove all valid assertions, that is, all
those assertions which are true regardless of the universe of discourse
and the interpretation of the predicate symbols.
(b) That it be powerful enough to prove all assertions which are true of some
particular universe with a specified interpretation of certain predicate
symbols. An example would be the universe of natural numbers with
predicates corresponding to equality and identities of arithmetic.
Without going into detail, we can say that mathematics has been rather successful
with (a), but not with (b). It has been established that, to a considerable extent, our
lack of success in (b) is inherent in our mathematical methods. For example, a
result due to Gödel asserts that if a formal system is powerful enough to express
assertions about integer arithmetic but permits only true assertions about arith-
metic to be proved, then there are other assertions which are true of arithmetic
but cannot be proved in the system.
The development of an understanding of the distinction between assertions
which are true and those which are formally provable was a magnificent accom-
plishment of mathematics; the work has profound implications for both philosophy
and mathematics. To explore further in this area, the student should consult the
excellent book of DeLong [1970].
When an argument is presented within a formal system, whether it is a proof
can be decided algorithmically, but formal systems do not encompass all of mathe-
matics. When an argument is presented outside a formal system, as most proofs
are, its validity must be determined by mathematicians; they must decide whether
the argument is convincing. Thus, the question is usually decided by consensus;
an argument is accepted as a proof if no one can perceive any flaws in its structure.
Agreement in such matters is very good, but the mechanism is not foolproof.
Although mathematical proofs are intended to be the quintessence of careful
argument, perceiving the flaws of an alleged proof can be a profoundly difficult
task. Examples exist of arguments which were widely accepted as proofs for many
years but were then shown to be fallacious by someone who discovered a possi-
bility which had been overlooked in the original argument. Sometimes such a
discovery results in a new argument being devised, which is then accepted as a
proof of the original assertion. But it is not uncommon for the overlooked possi-
bility to provide a basis for a counterexample to the original assertion, thus
disproving it. In summary, while a purported proof which is generally accepted is
rarely shown to be fallacious, examples of such occurrences do exist, and we must
conclude that “proof” is not a label which can never be removed.
Example
     Assertion     Reasons
  1. P             Premise 1
  2. P ⇒ Q         Premise 2
  3. ¬Q            Assumption (negation of conclusion)
  4. ¬P ∨ Q        2, implication
  5. ¬P            3, 4, disjunctive syllogism
  6. P ∧ ¬P        1, 5, conjunction
Sec. 1.6 PROGRAM CORRECTNESS 57
Writing good computer programs is not a well-defined process, and criteria for
the evaluation of programs are often vague and ill-formed. There are, however,
three questions that are commonly used to assess the quality of a program:
(a) Is the program “well written”?
(b) Is the program efficient?
(c) Does the program do what it is supposed to do?
The first question addresses the matters of style, clarity, and ease of modification;
evaluation of these properties will probably always be difficult and, to some
degree, subjective. The second question concerns the cost of program execution,
usually measured in terms of storage requirements and program execution time;
the study of program efficiency, often called algorithm analysis, will be treated in
Chapter 5. To answer the third question, we must first specify precisely what task
is to be performed. Then we must prove that the program is correct in the sense
that it performs the specified task. Establishing that a program is correct, also
known as program verification, is generally more difficult than writing the program,
but the costs which result from an incorrect program can easily exceed the cost of
verification. As a consequence, techniques for establishing program correctness are
of singular importance to the computer scientist.
Most program errors can be classified as either syntactic or logical. A syn-
tactic error is one which violates the definition of a well-formed program in the
given programming language. Syntactic errors are generally detected by the lan-
guage translator program (i.e., the compiler or interpreter) and can usually be cor-
rected easily. After all syntactic errors have been eliminated, a program is usually
tested for errors in logic by executing the program on a selected set of input data. But
correct performance of a program on test data does not guarantee that the program
is correct unless the program is tested with every possible input. Because it is
usually impractical to test all possible inputs, logical errors may remain even if
the program produces the correct results for the test data. As a consequence, pro-
gram verification usually requires the use of proof methods similar to those de-
scribed earlier in this chapter.
In this section we will describe a method for program verification based on
assertions about the program variables before, during, and after program execu-
tion; we will call such assertions program assertions. For simplicity we will restrict
our examples to integer arithmetic, that is, the universe of discourse for numerical
variables is taken to be the integers. Furthermore, as is customary in treatments of
this topic, we will ignore such potential problems as storage limitations and register
overflow.
Program assertions characterize properties of program variables and
relationships between them at various stages of program execution. These assertions can
utilize whatever predicates are appropriate, such as
“x is nonnegative”
“x = y”
“x < y”
“x − y < 2”
Definition 1.6.1: A program or program segment 𝒫 is correct with respect to
an initial assertion I and a final assertion F if,† whenever I is true of the program
variables prior to execution of 𝒫, and 𝒫 terminates, then F will be true of the
program variables after execution of 𝒫 is complete.
†Here and throughout the book we follow mathematical convention for definitions and use
“if” where in fact “if and only if” is intended. For example, when we assert “An integer is prime
if it is greater than 1 and has no positive divisors other than 1 and itself,” the intention is “An
integer is prime if and only if it is greater than 1 and has no positive divisors other than 1 and
itself.” This convention is used only in stating definitions.
We now describe some notation which will be useful in treating program cor-
rectness. Let Ai and Aj be program assertions, and let S be a program segment.
We will use the notation
Ai {S} Aj
to denote “if Ai is true prior to the execution of S, and S is executed and terminates,
then Aj will be true immediately following the termination of S.” Using this
notation we can restate Definition 1.6.1 by saying a program 𝒫 is correct with respect
to an initial assertion I and a final assertion F if and only if I {𝒫} F. When S
consists of a number of program statements it will sometimes be more convenient to
state that “The program segment
Ai
S
Aj
is correct” rather than using the notation Ai {S} Aj.
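The meaning of Ai {S} Aj can be made concrete by testing it on sample states. The Python sketch below is ours, not the book's: `triple_holds_on` and `increment` are hypothetical names, a state is a dictionary of variable values, and assertions are functions from states to truth values. As an illustration we take S to be x ← x + 1, Ai to be “x ≥ 0,” and Aj to be “x > 0.”

```python
def triple_holds_on(pre, segment, post, states):
    """Test Ai {S} Aj on each sample state: whenever pre holds before the
    segment runs, post must hold afterwards."""
    for state in states:
        if pre(state):
            after = segment(dict(state))  # run S on a copy of the state
            if not post(after):
                return False
    return True

def increment(s):
    """The segment x <- x + 1."""
    s["x"] = s["x"] + 1
    return s

states = [{"x": x} for x in range(-5, 6)]
print(triple_holds_on(lambda s: s["x"] >= 0, increment,
                      lambda s: s["x"] > 0, states))
print(triple_holds_on(lambda s: s["x"] >= 0, increment,
                      lambda s: s["x"] > 1, states))
```

The first triple passes on every sample state; the second fails for the state with x = 0. Passing such a test shows only that no sampled state refutes the triple, which is why the rules of inference in this section are needed to establish correctness.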
Example†
The program segment
A1: true
x ← 1;
y ← 2
A2: x = 1 ∧ y = 2
†The early examples of this section will rely on the reader’s understanding of the effect of
executing an assignment statement. A careful treatment of this topic will be given later in this section.
effect as first executing S₁ and then executing S₂. The first rule of inference, called
the rule of composition, states that if both Q₁ {S₁} Q₂ and Q₂ {S₂} Q₃, then it
follows that Q₁ {S₁; S₂} Q₃; that is, if the program assertion Q₁ is initially true of
the program variables and S₁; S₂ is executed, then after termination of the
segment S₁; S₂, the assertion Q₃ will be true. Presented in the tabular form of our
previous rules of inference, this is stated as follows:
Q₁ {S₁} Q₂
Q₂ {S₂} Q₃
∴ Q₁ {S₁; S₂} Q₃        Rule of Composition
We can interpret the rule of composition in terms of both flowcharts and
programs by adding the program assertions to the flowchart or the program text.
Note that program assertions are associated with states of the computation rather
than actions. For this reason, program assertions are associated with the edges of
flowcharts, and they either precede or follow the statements of a program.
Whenever an edge of a flowchart is traversed, the associated program assertion is true.
Immediately before a program statement is executed, the program assertion which
precedes it is true.
The rule of composition can be interpreted with flowchart diagrams as follows:
[Flowchart: rule of composition]
Example
If we can establish that
are both correct, then we can conclude from an application of the rule of composi-
tion that
A1: true
x ← 1;
y ← x + z
A3: y = z + 1
is correct. #
I {S₁} Q₁, Q₁ {S₂} Q₂, ..., Qₙ₋₂ {Sₙ₋₁} Qₙ₋₁, and Qₙ₋₁ {Sₙ} F,
it will follow from repeated applications of the rule of composition that I {𝒫} F.
The next rules of inference, called rules of consequence, state that a program
assertion which precedes a program segment can be replaced by a stronger one, and
an assertion which follows a program segment can be replaced by a weaker one
without affecting the correctness of the segment. (Recall that P is stronger than
Q if P ⇒ Q.) The rules are given as follows:
Q₁ ⇒ Q₂              Q₁ {S} Q₂
Q₂ {S} Q₃            Q₂ ⇒ Q₃
∴ Q₁ {S} Q₃          ∴ Q₁ {S} Q₃        Rules of Consequence
The two rules of consequence allow us to ignore information about the program
variables if it is not important for the proof of correctness. For example, the value
of an index variable might play an important role in the program assertions which
hold during execution of a loop, but when execution proceeds past the loop, the
value of this variable may not be significant.
In the rule of consequence the variables Q₁, Q₂, and Q₃ denote program
assertions. The implications Q₁ ⇒ Q₂ and Q₂ ⇒ Q₃ which appear as hypotheses in
the rules are propositions which relate two program assertions. These implications
are proved using the techniques of the previous sections of this chapter; this is
done independently of any consideration of the program.
Example
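If the program segment
A1: true
x ← 1;
z ← y + x
A2: z = y + 1
is correct, then, since (z = y + 1) ⇒ (z > y), it follows from a rule of consequence that
A1: true
x ← 1;
z ← y + x
A2′: z > y
is correct. #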
We next treat the rules of inference which are concerned with some of the
control statements of our programming language. The control statements include
conditional branches and loops; they can cause program statements to be executed
in an order different from that in which they appear in the program text. We will
treat three fundamental types: “if condition then S,” “if condition then S₁ else S₂,”
and “while condition do S.” In each statement type, condition is an assertion (but
not a program assertion) about the values of the program variables; whenever
condition is evaluated, it is either true or false. For each statement type, the portion
of the program to be executed next is determined by the truth value of condition.
The precise effect of executing each statement type is characterized by a rule of
inference.
When the statement “if condition then S” is executed, the program statement
S is executed if and only if condition is true. (Note that S can be a single statement
or a sequence of statements enclosed in a begin ... end pair.) A rule of inference
for this statement type must involve preceding and following program assertions
which will be true whether or not the statement S is executed. The rule, called the
if-then rule, is the following:
(Q₁ ∧ condition) {S} Q₂
(Q₁ ∧ ¬condition) ⇒ Q₂
∴ Q₁ {if condition then S} Q₂        The if-then Rule
Note that the implication (Q₁ ∧ ¬condition) ⇒ Q₂ is a proposition which must
be proved without reference to the program. The if-then rule can be interpreted
using flowcharts in the following way. (Note that when edges of a flowchart con-
verge, the point of convergence is treated as a node and different assertions can
appear on the edges which enter and leave it.)
[Flowchart: the if-then rule]
“If (Q1 ∧ ¬condition) ⇒ Q2
is true and
A1: Q1 ∧ condition
S
A2: Q2
is correct, then
A1: Q1
if condition then S
A2: Q2
is correct.”
Example
To show that
A1: true
if x < 0 then y ← 0
A2: x ≥ 0 ∨ y = 0
is correct, we first observe that, since y is assigned the value 0 and the value of x is
not changed,
A1′: true ∧ x < 0
y ← 0
A2′: x < 0 ∧ y = 0
is correct. Since A2′ ⇒ A2, it follows from a rule of consequence that
A1′: true ∧ x < 0
y ← 0
A2: x ≥ 0 ∨ y = 0
is correct. Moreover, (true ∧ ¬(x < 0)) ⇒ A2, so both hypotheses of the if-then
rule hold and the original segment is correct. #
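The two hypotheses of the if-then rule for this example can be checked on a range of integer values. The Python sketch below is ours; it takes the statement to be “if x < 0 then y ← 0” with final assertion A2: x ≥ 0 ∨ y = 0, and the names `segment`, `A2`, and `if_then_hypotheses_hold` are hypothetical.

```python
def segment(state):
    """The statement: if x < 0 then y <- 0."""
    x, y = state
    if x < 0:
        y = 0
    return x, y

def A2(state):
    """Final assertion: x >= 0 or y = 0."""
    x, y = state
    return x >= 0 or y == 0

def if_then_hypotheses_hold(values):
    """Check both hypotheses of the if-then rule with Q1 = true."""
    for x in values:
        for y in values:
            # Hypothesis 1: (true and x < 0) {y <- 0} A2
            if x < 0 and not A2((x, 0)):
                return False
            # Hypothesis 2: (true and not (x < 0)) => A2
            if not (x < 0) and not A2((x, y)):
                return False
    return True

values = range(-5, 6)
print(if_then_hypotheses_hold(values))
print(all(A2(segment((x, y))) for x in values for y in values))
```

Both checks succeed on the sampled values, in agreement with the conclusion that the rule draws for all integers.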
When the statement “if condition then S₁ else S₂” is executed, if condition is
true, then S₁ is executed; otherwise S₂ is executed. The if-then-else rule of inference
is the following:
(Q₁ ∧ condition) {S₁} Q₂
(Q₁ ∧ ¬condition) {S₂} Q₂
∴ Q₁ {if condition then S₁ else S₂} Q₂        The if-then-else Rule
We leave the flowchart and program formulations of the if-then-else rule as exer-
cises.
Example
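In order to establish that
A1: true
if x < 0 then y ← −1 else y ← 1
A2: (x < 0 ∧ y = −1) ∨ (x ≥ 0 ∧ y = 1)
is correct, it suffices by the if-then-else rule to show that
A1′: true ∧ x < 0
y ← −1
A2: (x < 0 ∧ y = −1) ∨ (x ≥ 0 ∧ y = 1)
and
A1″: true ∧ ¬(x < 0)
y ← 1
A2: (x < 0 ∧ y = −1) ∨ (x ≥ 0 ∧ y = 1)
are correct. #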
relationship which holds among the program variables each time condition is
evaluated and consequently after every execution of S. Formulation of the proper
loop invariant relation is often a difficult step in proving a program correct. The
rule of iteration is the following (where the program assertion Q is the loop
invariant relation):
(Q ∧ condition) {S} Q
∴ Q {while condition do S} (¬condition ∧ Q)        Rule of Iteration
The rule of iteration can be characterized using flowcharts as follows:
[Flowchart: rule of iteration]
“If
A1: Q ∧ condition
S
A2: Q
is correct, then
A1: Q
while condition do S
A2: Q ∧ ¬condition
is correct.”
Example
The procedure PRODUCT given in Figure 1.6.1 sets y equal to the product of a
and b, where a is a nonnegative integer. The procedure multiplies a and b by re-
peated addition, that is, y is initialized to 0 and then b is added to y a times.
procedure PRODUCT:
comment: set y = ab, where a ≥ 0.
A1: a ≥ 0
begin
    i ← 0;
    A2: a ≥ 0 ∧ i = 0
    y ← 0;
    A3: a ≥ 0 ∧ i = 0 ∧ y = 0
    A4: y = ib ∧ i ≤ a
    while i < a do
        A5: y = ib ∧ i < a
        begin
            y ← y + b;
            A6: y = (i + 1)b ∧ i < a
            i ← i + 1
            A4: y = ib ∧ i ≤ a
        end
    A7: y = ab
end
The procedure has been annotated with program assertions, one of which holds
after each step of the computation; Al is the initial assertion and A7 is the final
assertion. We will now describe how to prove PRODUCT is correct with respect to
Al and A7.
The proof of correctness can be divided into two parts by proving the following
two lemmas.†
†We continue to rely on the reader’s understanding of the effect of an assignment statement.
A1 {i ← 0; y ← 0} A4.
This completes the proof of Lemma 1.
Proof of Lemma 2: To prove the while loop is correct with respect to the initial
assertion A4 and the final assertion A7, we must first establish that the hypothesis of
the rule of iteration holds, that is,
(A4 ∧ i < a) {y ← y + b; i ← i + 1} A4.
Observe that (A4 ∧ i < a) ⇔ A5, so it suffices to show
A5 {y ← y + b; i ← i + 1} A4.
We use the intermediate assertion A6 and first show
A5 {y ← y + b} A6.
Since the value of y is changed by the assignment statement, let y′ denote the value
of y before the assignment statement is executed. Then A5 is the assertion
y′ = ib ∧ i < a.
We then show that (A4 ∧ i ≥ a) ⇒ A7 and apply a rule of consequence to complete
the proof of Lemma 2.
It follows from Lemma 1 and Lemma 2 and the rule of composition that
PRODUCT is correct with respect to A1 and A7. ∎ #
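The annotated procedure translates directly into a program whose assertions are checked at run time. The following Python sketch is our transcription, not the book's notation; the function name `product` is ours, and each `assert` corresponds to one of the program assertions A1 through A7.

```python
def product(a, b):
    """Set y = a*b for a >= 0 by repeated addition, checking A1-A7."""
    assert a >= 0                              # A1
    i = 0
    assert a >= 0 and i == 0                   # A2
    y = 0
    assert a >= 0 and i == 0 and y == 0        # A3
    assert y == i * b and i <= a               # A4, the loop invariant
    while i < a:
        assert y == i * b and i < a            # A5
        y = y + b
        assert y == (i + 1) * b and i < a      # A6
        i = i + 1
        assert y == i * b and i <= a           # A4 holds again
    assert y == a * b                          # A7
    return y

print(product(4, 7))
```

A run in which no assertion fails confirms the assertions for that input only; the proof by Lemma 1 and Lemma 2 is what establishes them for every a ≥ 0 and every b.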
mulate initial and final assertions for each subtask. In every case in which
a program segment S₁ may be executed immediately prior to a segment
S₂, the final assertion of S₁ should imply the initial assertion of S₂.
3. Prove that each program segment is correct with respect to its initial
and final assertions.
4. Conclude that the program is correct with respect to its initial and final
assertions.
Note that if intermediate assertions have been chosen correctly for a program
and the initial assertion was true prior to program execution, then each program
assertion is true at the appropriate point of the computation. It follows as a special
case that if execution reaches the end of the program (that is, if the program ter-
minates) then the final assertion will be true when execution is complete.
Formally, a program is correct so long as it has performed the correct task
whenever it halts. In fact, according to Definition 1.6.1, a program that never
halts is correct for every pair of initial and final assertions; it follows that proving
program termination is just as important as proving correctness. It is common to
refer to what we have called “correctness” as “partial correctness,” and to call a
program “correct” if it is both “partially correct” and always halts if the initial
assertion is true prior to execution. We will treat one technique for proving pro-
gram termination in Section 3.6.
Axioms of Assignment
The preceding discussion of the formal rules of program verification has only
treated rules of inference. Rules of inference are always of the form “if we know
one thing is true, then we can conclude something else is true.” Unless we have a
characterization of some true statements, we cannot apply the rules of inference;
thus we need some axioms for our system in order to complete the specification of
our proof mechanism. The axioms for program verification describe the effect of
executing an assignment statement.
Consider a program with variables x₁, x₂, ..., xₙ. An assignment statement
has the form
xᵢ ← E(x₁, x₂, ..., xₙ),
where E(x₁, x₂, ..., xₙ) is an expression involving (some of) the variables x₁,
x₂, ..., xₙ. If the program assertion A(x₁, x₂, ..., xₙ) holds prior to the execution
of the assignment statement, then the following assertion will hold after the
assignment statement:
∃y[A(x₁, x₂, ..., xᵢ₋₁, y, xᵢ₊₁, ..., xₙ) ∧ xᵢ = E(x₁, x₂, ..., xᵢ₋₁, y, xᵢ₊₁, ..., xₙ)]
This program assertion states that there exists some value for y (namely the former
value of xᵢ) which makes the assertion
A(x₁, x₂, ..., xᵢ₋₁, y, xᵢ₊₁, ..., xₙ)
true, and if this value of y is substituted for xᵢ in the expression E together with the
current values of the other variables xⱼ, where j ≠ i, the result will be the current
value of xᵢ. Not only is the above assertion correct, but it is the strongest correct
assertion which can be made based only on the knowledge that A(x₁, x₂, ..., xₙ)
holds prior to the statement execution. Because this way of constructing program
assertions generates them in the same order as program execution, it can be used
for the “forward” construction of program assertions for assignment statements.
The forward construction of the strongest possible program assertion is characterized
by the following axiom concerning the effect of the assignment statement.
A(x₁, x₂, ..., xₙ) {xᵢ ← E(x₁, x₂, ..., xₙ)} ∃y[A(x₁, x₂, ..., xᵢ₋₁, y, xᵢ₊₁, ..., xₙ) ∧ xᵢ = E(x₁, x₂, ..., xᵢ₋₁, y, xᵢ₊₁, ..., xₙ)]
This assertion is obtained by replacing every occurrence of xᵢ in A(x₁, x₂, ..., xₙ)
by the expression on the right side of the assignment statement; it is the weakest
statement which will assure that A(x₁, x₂, ..., xₙ) will hold after execution of the
assignment statement. The backward construction of assertions is formalized by
the following axiom of assignment, which can be used in place of the one given
previously.
A(x₁, x₂, ..., xᵢ₋₁, E(x₁, x₂, ..., xₙ), xᵢ₊₁, ..., xₙ)
{xᵢ ← E(x₁, x₂, ..., xₙ)} A(x₁, x₂, ..., xₙ). Alternate Axiom of Assignment.
This axiom is commonly used to construct a program assertion to precede an
assignment statement based on the assertion which follows the statement. The
backward construction of assertions for assignment statements is usually easier than
constructing them in the forward direction. The two axioms of assignment are
redundant in that only one is required for program verification.
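The backward construction can be imitated mechanically. In the sketch below, a program state is modelled as a Python dictionary and an assertion as a function from states to truth values; substituting the expression for the variable in the assertion is then the same as evaluating the assertion in the state produced by the assignment. The names assign, backward and triple_holds, and the sample statement x ← x + 1 with final assertion x > y, are hypothetical illustrations and do not come from the text.

```python
from itertools import product as cartesian

def assign(state, var, expr):
    """Execute var <- expr(state) and return the resulting state."""
    new_state = dict(state)
    new_state[var] = expr(state)
    return new_state

def backward(post, var, expr):
    """Alternate axiom: the assertion post with var replaced by expr."""
    return lambda state: post(assign(state, var, expr))

def triple_holds(pre, var, expr, post, states):
    """Check pre {var <- expr} post on every state of a finite sample."""
    return all(post(assign(s, var, expr)) for s in states if pre(s))

# Hypothetical example: the statement x <- x + 1 and the final assertion x > y.
expr = lambda s: s["x"] + 1
post = lambda s: s["x"] > s["y"]
pre = backward(post, "x", expr)   # behaves like the assertion x + 1 > y

# A small finite sample of states; a check over it is evidence, not a proof.
states = [{"x": x, "y": y} for x, y in cartesian(range(-3, 4), repeat=2)]
```

Here pre plays the role of the assertion that precedes the assignment statement; triple_holds merely tests the resulting triple on the sample states.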
Examples
(a) Consider the program segment
A1
x ← x + y + 4z
A2
If A1 is the assertion
A1: x + y + 2z = 9
A1: x = x′ ∧ y = y′
temp ← x;
x ← y;
y ← temp
A4: x = y′ ∧ y = x′
The assertions A1 and A4 involve variables x′ and y′ which are not program
variables. They are auxiliary variables bound by assigning them the original
values of x and y respectively.
To prove the program segment is correct with respect to A1 and A4, we
use the backward construction of assertions. From A4 we construct A3 by
substitution of temp for y in A4; thus A3 is
A3: x = y′ ∧ temp = x′
Using backward construction from A3, we obtain
A2: y = y′ ∧ temp = x′
Applying backward construction to A2 yields A1. By the rule of composition,
it follows that the program segment is correct with respect to A1 and A4. #
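The four assertions of the exchange example can be placed between the statements of a small Python function and checked at run time. This is a sketch under our own naming: the function swap is hypothetical, and the local names x0 and y0 stand in for the auxiliary variables x′ and y′.

```python
def swap(x, y):
    """Exchange x and y through temp, checking assertions A1-A4."""
    x0, y0 = x, y                      # auxiliary variables x' and y'
    assert x == x0 and y == y0         # A1
    temp = x
    assert y == y0 and temp == x0      # A2
    x = y
    assert x == y0 and temp == x0      # A3
    y = temp
    assert x == y0 and y == x0         # A4
    return x, y
```

Reading the checks from bottom to top reproduces the backward construction: each assertion is the one below it with the assigned variable replaced by the right side of the assignment.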
1. (a) Give a flowchart interpretation of the if-then-else rule of inference.
(b) Give an informal statement of the if-then-else rule of inference using program
segments.
2. Write a program segment which is correct with respect to the initial assertion true and
the final assertion false. (Hint: Study Definition 1.6.1.)
3. Prove the following program segments are correct. Use both forward and backward
construction of assertions.
4. Prove the following program segments are correct. State which rules of inference are
used.
(a) In the following, x′ is an auxiliary variable.
5. Consider the following program segment which sets d equal to max(a, b, c).
A1: true
d ← a;
if b > d then d ← b;
if c > d then d ← c
AF: (d = a ∨ d = b ∨ d = c) ∧ d ≥ a ∧ d ≥ b ∧ d ≥ c
A1: n > 0
and the final assertion
AF: ∀j[1 ≤ j ≤ n ⇒ V[j] = 0].
Use the following loop invariant relation
i ≤ n + 1 ∧ ∀j[1 ≤ j < i ⇒ V[j] = 0].
procedure ZERO:
comment: Set all entries of V[1 : n] to zero.
begin
    i ← 1;
    while i ≤ n do
        begin
            V[i] ← 0;
            i ← i + 1
        end
end
7. The procedure PRODUCT which was proved correct in this section is not the only
procedure which is correct with respect to the given initial and final assertions. Con-
sider the following procedure.
procedure SNEAKY:
A1: a ≥ 0
begin
    b ← 0;
    y ← 0
    AF: y = ab
end
How could the initial and final assertion be changed so that SNEAKY would not be
correct with respect to A1 and AF? Address the general question of how to rule out
unintended solutions.
procedure SUM:
comment: Set sum equal to sum of entries of V[1 : n].
A1: n > 0
begin
    sum ← 0;
    i ← 1;
    while i ≤ n do
        begin
            sum ← sum + V[i];
            i ← i + 1
        end
end
The concepts and terminology of this chapter come principally from the field
of mathematical logic. Wilder [1965] gives a very readable treatment of many of
the basic issues in this area; his book is relevant to later chapters as well as this
one. Shoenfield [1967] gives an excellent introduction to mathematical logic, including
treatments of formal systems and the mathematical theory of models. DeLong
[1970] describes the historical development of mathematical logic, the nature of its
results, and its philosophical implications. The halting problem and related questions
are treated nicely in Minsky [1967].
The original papers by Floyd [1967] and Hoare [1969] provide an excellent
introduction to the topic of program verification. The survey by Elspas, et al.,
[1972] treats several topics associated with proving program correctness; their
article is broader in scope and more difficult reading than those by Floyd and
Hoare. The text by Manna [1974] treats program verification for both flowchart
programs and programs in an ALGOL-like language.
SETS
2.0 INTRODUCTION
A set is any collection of objects which can be treated as an entity, and an object
in the collection is said to be an element, or member, of the set. Given any object
x and set S, if x is an element of the set S, we will write x € S; if x is not an
Examples
Almost anything which would be called a set in ordinary conversation is an
acceptable set in the mathematical sense. The following examples will illustrate this
point.
(a) The set of nonnegative integers less than 4. This is a finite set with four members:
0, 1, 2, and 3.
(b) The set of books in the New York Public Library at the present time. This is also
a finite set. It would be difficult to list the members of this set because of the
constant flux in the Library’s collection, but the difficulties are practical ones
rather than theoretical.
(c) The set consisting of the names of the people who spoke to Charlemagne on May
10, 810 A.D. This set is finite and probably contains at least one element. It
has the disturbing characteristic that there may not be a way to determine the
members of the set. Most mathematicians, however, would not regard this as
detracting from its acceptability as a mathematical set.
(d) The set of live dinosaurs in the basement of the British Museum. Assuming there
have been no sinister experiments in the basement of the British Museum, this
set has the property of not having any members, and is called a null, or empty,
set.
(e) The set of integers greater than 3. Even though this set is infinite, there is no
difficulty in determining whether a specified integer is a member.
(f) The set of all programs in the ALGOL language which can be punched on no more
than 500 cards. This set is very large, but finite, and a correctly operating
compiler can determine whether or not a program is an element of this set.
(g) The set of all programs in the ALGOL language which would halt if run for a
sufficiently long time on a computer with unbounded storage. This set is not finite
because no matter how large a program we write, it is possible to write a larger
one by inserting another statement. (The statement need not perform any
useful task.) Although there is a maximum size of ALGOL programs which
can be run on any given computer, there is nothing about the ALGOL language
itself which limits the size of a program. Computability theory has
established that no algorithm exists to determine whether an arbitrary program
is an element of this set; such a set is called undecidable.
(h) The set of true assertions about the integers. This is an infinite set, as we can
easily demonstrate by considering assertions of the form
3+1=4,
The assertion
For every natural number n, ∑ᵢ₌₀ⁿ i = n(n + 1)/2
is considerably less obvious, but can be proven. There are still other statements
which are conjectured to be true, but have never been proved. The following
assertion, known as “Fermat’s Last Theorem,” is an example.
Fermat’s Last Theorem: If x, y, z, and n are positive integers and
xⁿ + yⁿ = zⁿ, then n ≤ 2.
This assertion has been a source of frustration to mathematicians for centuries.
In spite of much effort, neither a proof nor a counterexample is known.
(i) The set with two members, one of which is the set of even integers and the other
the set of odd integers. This example illustrates that sets can have other sets as
members. Denote the set of even integers by A and the set of odd integers by
B, and let C be the set with elements A and B. Then C has only two elements,
each of which is a set: A ∈ C and B ∈ C. Note that 2 ∈ A, 2 ∉ B and 2 ∉ C.
#
Since a set is characterized by its members, a set can be specified by stating
when an object is in the set. A finite set can be specified explicitly by listing its
elements. The elements of the list are separated by commas, and the list enclosed in
braces.
Examples
The following are explicit specifications of finite sets.
(a) The set which contains the elements A, B, and C is denoted by {A, B, C}.
(b) The set which contains all the even, nonnegative integers less than 10 is specified
by {0, 2, 4, 6, 8}. #
Examples
The following are implicit specifications of sets. The first two examples are
infinite sets; the third is finite.
(a) The set of integers greater than 10 is specified by
{x | x ∈ I ∧ x > 10}.
(b) The set of even integers can be specified as
{x | ∃y[y ∈ I ∧ x = 2y]}.
Less formal means are often used to describe sets. One technique is to partly
specify the predicate by the entry to the left of the vertical bar.
Examples
(a) The set of integer multiples of 3 can be specified by {3x | x ∈ I} rather than
{x | ∃y[y ∈ I ∧ x = 3y]}.
(b) The set of rational numbers can be specified by {x/y | x, y ∈ I ∧ y ≠ 0}. #
If a set is finite but too large to list easily, or if a set is infinite, ellipses can be
used to specify the set implicitly.
Examples
The following specifications use ellipses to characterize a list of the elements of
a set.
(a) The set of integers from 1 to 50 is specified by {1, 2, 3, ..., 50}.
(b) The set of nonnegative even integers is specified by {0, 2, 4, 6, ...}. #
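Readers who program may recognize implicit specification in the set comprehensions of languages such as Python, where the predicate appears after the keyword if and the entry to the left plays the role of the expression before the vertical bar. The sketch below is only an approximation: a program cannot enumerate all of I, so the name I_WINDOW denotes an arbitrarily chosen finite range of integers standing in for it, and every set built from it is correspondingly truncated.

```python
# Finite stand-in for the integers I; the bounds are an arbitrary assumption.
I_WINDOW = range(-20, 21)

# {x | x in I and x > 10}, restricted to the window
greater_than_10 = {x for x in I_WINDOW if x > 10}

# {x | there exists y with y in I and x = 2y}, restricted to the window
evens = {x for x in I_WINDOW if any(x == 2 * y for y in I_WINDOW)}

# {3x | x in I}: the predicate is partly specified to the left of the bar
multiples_of_3 = {3 * x for x in range(-6, 7)}

# {1, 2, 3, ..., 50}: the ellipsis becomes an explicit range
one_to_fifty = set(range(1, 51))
```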
Axiom of Extension: Two sets A and B are equal, A = B, if and only if they
have the same members (i.e., every element of A is an element of B and every
element of B is an element of A).
The axiom of extension can be expressed in logical notation in two ways:
(a) A = B ⇔ ∀x[x ∈ A ⇔ x ∈ B]
(b) A = B ⇔ [∀x[x ∈ A ⇒ x ∈ B] ∧ ∀x[x ∈ B ⇒ x ∈ A]]
The axiom of extension asserts that if two sets have the same members, then
regardless of how the sets are specified, they are equal. It follows that if a set is
specified explicitly with a list, the order of the listing is immaterial; the set denoted
by {A, B, C} is the same as (equal to) the sets denoted by {C, B, A} and {B, A, C}.
Furthermore, it is of no consequence if an element appears in such a list more than
once; {A, B, A}, {A, B}, and {A, A, A, B, B} are different specifications of the same
set. A finite set can be characterized either explicitly or implicitly, as with the
specifications {1, 2, 3, 4, 5} and {x | x ∈ I ∧ 1 ≤ x ≤ 5}. Moreover, the same set
can be specified implicitly with different predicates, e.g., the sets {x | x = 0} and
{x | x ∈ I ∧ −1 < x < 1} are equal.
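The consequences of the axiom of extension listed above can be observed with Python's built-in set type, whose equality test compares members only. The helper same_members is a hypothetical name of ours that spells out form (b) of the axiom for finite sets; the range used for the implicit specification is an arbitrary finite stand-in for I.

```python
def same_members(a, b):
    """Form (b) of the axiom: every element of a is in b, and conversely."""
    return all(x in b for x in a) and all(x in a for x in b)

# Order of listing is immaterial.
order_irrelevant = {"A", "B", "C"} == {"C", "B", "A"} == {"B", "A", "C"}

# Repeated listing of an element is of no consequence.
repeats_irrelevant = set(["A", "B", "A"]) == {"A", "B"} == set(["A", "A", "A", "B", "B"])

# An explicit and an implicit specification of the same set.
explicit = {1, 2, 3, 4, 5}
implicit = {x for x in range(-100, 101) if 1 <= x <= 5}
```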
2.2 THE PARADOXES OF SET THEORY
As we indicated in the introduction to this chapter, the naive set theory which we
have described was ultimately found to lead to logical inconsistencies known as
paradoxes. Although set theory had its bitter opponents, by the time the paradoxes
were discovered around the turn of the century, the theory was widely accepted
and work was under way to establish it as the foundation of logic and mathe-
matics. Discovery of the paradoxes seemed to threaten this fundamental role of
set theory. But the paradoxes were not generally viewed as a basis for abandoning
set theory and starting over again; instead, mathematicians felt that the theory had
to be patched in some way which would eliminate the paradoxes but not affect
the usefulness of the theory. In this section, we will describe the best known para-
dox and briefly indicate some of the means of modifying the theory to avoid such
paradoxes. These modifications can be imposed by axiomatizing set theory in such
a way that the paradoxes cannot occur.
A paradox similar to the one which will concern us is the “liar paradox.”
Consider a man who asserts
“I am lying.”
Is he lying or is he speaking the truth?
If he is lying, then what he asserts is false; since he claims he is lying, he must
actually be telling the truth. We conclude that if he is lying, then he is telling the
truth.
On the other hand, if he speaks the truth, then what he says is true, namely
that he is lying. We conclude that if he is telling the truth, then he is lying.
From the above analysis, we conclude he must be neither lying nor telling
the truth. Thus, the assertion “I am lying,” which appears to be a proposition,
cannot in fact be assigned a truth value.
The liar paradox has been known since antiquity and has no obvious relation
to set theory. Yet it resembles the first widely known paradox, commonly known as
Russell’s paradox, which was discovered by Bertrand Russell in 1901 and inde-
pendently by E. Zermelo. This paradox exploits the absence of restrictions in
naive set theory on the ways in which sets can be characterized. In order to present
the paradox, we consider the possibility of a set being a member of itself. Most
sets which occur to us are not elements of themselves; e.g., {1} ∉ {1}. However,
the set of concepts is itself a concept, and hence this set is apparently a member of
itself. The assertions x ∈ x and x ∉ x are therefore predicates which can be used
to define sets.
Russell proposed the following paradox. Let the universe of discourse be the
set of all sets, and define S to be the following set:
S = {x | x ∉ x}
Thus, S is the set of all sets which are not members of themselves. We now ask
“Is S a member of itself?”
Suppose S is not a member of itself. Then S satisfies the predicate x ∉ x
which defines the set S and therefore S ∈ S. On the other hand, if S ∈ S, then S
must satisfy the predicate which defines S and therefore S ∉ S.
Thus, we are led to a contradiction analogous to that of the liar paradox:
neither S ∈ S nor S ∉ S can be true. A “set,” such as S, which leads to a
contradiction is said to be not well-defined.
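A loose computational analogy may help. Suppose a “set” is modelled by its membership predicate, so that x ∈ y is computed as y(x). Russell's S then becomes a function which asks whether its argument fails to contain itself, and asking whether S ∈ S never settles on an answer. The names russell, everything, nothing and is_member_of_itself are hypothetical, and the analogy is informal: failure to terminate is not the same thing as a logical contradiction.

```python
def russell(x):
    """Membership test for S = {x | x not in x}, with 'x in y' modelled as y(x)."""
    return not x(x)

def everything(x):
    """A 'set' containing every object, hence a member of itself."""
    return True

def nothing(x):
    """The empty 'set', which is not a member of itself."""
    return False

def is_member_of_itself(s):
    """Return s(s), or None when the question cannot be settled by evaluation."""
    try:
        return s(s)
    except RecursionError:
        return None
```

For ordinary predicates the question is answered at once, but evaluating russell(russell) requires the value of russell(russell), and the interpreter abandons the computation.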
The Russell paradox established that set theory, as originally conceived, led to
inconsistencies. Mathematicians were faced with the necessity of abandoning the
theory or modifying it in some way which would eliminate the paradoxes. The
difficulty was felt to originate in the unrestricted way in which sets could be defined;
in particular, the concept of a set being a member of itself was considered suspect.
A number of approaches were developed, each of which used axioms to restrict
the way in which sets can be specified.
Russell and Whitehead, in the Principia Mathematica, developed what they
called the “theory of types.” This is a set theory in which sets exist in a hierarchy.
The lowest level of the hierarchy contains “individuals.” All other levels of the
hierarchy contain sets whose members must be elements of the next lower level of
the hierarchy. Each level of the hierarchy is called a type. Since x can be a member
of y only if y is a level higher in the hierarchy than x, a set cannot be a member of
another set of the same type. Thus, in the theory of types, expressions such as
x ∈ x are not meaningful and we are spared the problem of dealing with them.
Other formulations of set theory have been created which also avoid the
Russell paradox. In each of these formulations, there are restrictions on the ways
in which sets can be related, and these restrictions imply that no set is permitted
to be a member of itself. The axiomatic formulations of these theories are too
complex to present here, and we will forego a description of them even though they
are currently more popular than the theory of types created by Russell and White-
head.
Having axiomatized set theory in a way which avoids the Russell paradox, it
is natural to ask if we can be sure that no other paradoxes are lurking in the formal
structure we have created. Using the mathematical techniques which are currently
available, there is no way to show that new paradoxes will not arise. A logical
theory which does not lead to paradoxes is called a consistent theory; more
formally, a logical theory is consistent if it is impossible to prove both an assertion
P and its negation ¬P. Since we only want to prove assertions which are true and
no assertion is admitted to be both true and false, we naturally want to use logical
systems which are consistent. However, consistency by itself is not enough, since
a theory which does not permit any theorems to be proved is consistent but worth-
less. A system in which it is possible to prove all the theorems that are true is called
complete. A trivial example of a complete system is one in which every assertion
can be proved, but such a system is obviously not consistent. What we really want
is a logical system which is both complete and consistent; in this case, we can prove
everything that is true and nothing that isn’t. It has been proved that no axiomatic
formulation of set theory can be both complete and consistent. Furthermore, in
order to prove the consistency of one of these formulations, we must construct
the proof in a more powerful system. But to be sure that such a proof is acceptable,
the more powerful system must itself be proved consistent, which requires a still
more powerful system, and so on. It follows that there does not exist any way to
establish that new paradoxes will not arise in set theory.
1. (The Barber Paradox) The only barber of a small town vowed that he would only
shave those citizens who did not shave themselves. If only a barber is permitted to
shave someone other than himself, how did the barber get shaved?
2. Show that the assertion
This statement is false.
is not a proposition.
3. Define an adjective to be homological if it applies to itself and heterological if it does
not. The words “ugly,” “English,” “erudite” and “eroneous” are homological, because
There are two fundamental relations that can hold between two sets: equality and
containment. The relation of set equality has already been defined by the Axiom of
Extension. The set containment relation is defined as follows:
Examples
(a) The set of even integers is a proper subset of the integers.
(b) The set of men is a subset (and also a proper subset) of the set of humans.
(c) The set {1, 2, 3, 4, 5} is a subset (but not a proper subset) of the set
{x | x ∈ I ∧ 0 < x < 6} #
Theorem 2.3.1: Let U be the universe of discourse and A a set. Then A ⊂ U.
Proof: The proof is an example of a trivial proof based on the fact that
x ∈ U for every element x. The set A is a subset of U if and only if the implication
x ∈ A ⇒ x ∈ U
is true. But x ∈ U is always true; hence the implication is true. Since x was
arbitrary, it follows by universal generalization that
∀x[x ∈ A ⇒ x ∈ U]
and therefore A ⊂ U. ∎
The next theorem establishes the relationship between set equality and set con-
tainment.
[A = B ⇒ A ⊂ B] ∧ [A = B ⇒ B ⊂ A]
which is equivalent to
(A = B) ⇒ [(A ⊂ B) ∧ (B ⊂ A)].
(b) (the “if” part): [A ⊂ B ∧ B ⊂ A] ⇒ A = B.
Suppose A ⊂ B and B ⊂ A. By Definition 2.3.1,
A ⊂ B ⇔ ∀x[x ∈ A ⇒ x ∈ B] and B ⊂ A ⇔ ∀x[x ∈ B ⇒ x ∈ A].
Hence,
(A ⊂ B ∧ B ⊂ A) ⇒ [∀x[x ∈ A ⇒ x ∈ B] ∧ ∀x[x ∈ B ⇒ x ∈ A]].
Thus,
[A ⊂ B ∧ B ⊂ A] ⇒ (A = B). ∎
The preceding theorem will be used in many of our proofs of set equality;
rather than showing directly that A = B, we will show A ⊂ B and B ⊂ A, and
then conclude that they are equal.
The following corollary is a consequence of the preceding theorem. The proof
is left as an exercise.
Definition 2.3.2: A set with no members is called an empty, null, or void set.
A set with one member is called a singleton set.
The next theorem establishes that there exists one and only one empty set;
this is often stated as “the empty set is unique.”
Theorem 2.3.5: Let ∅ and ∅′ be sets which are both empty. Then ∅ = ∅′.
Proof (Direct): Since ∅ is empty, it follows from Theorem 2.3.4 that ∅ ⊂ ∅′.
Similarly, ∅′ ⊂ ∅. Therefore, by Theorem 2.3.2, ∅ = ∅′. ∎
Traditionally, the symbol ∅ is reserved to denote the empty set. Note that the
set ∅ is distinct from the set {∅}; the latter has one element, namely the empty set.
The empty set can be used to construct an infinite sequence of distinct sets. In the
sequence
Examples
(a) The set {a, b} has four distinct subsets: {a, b}, {a}, {b} and ∅. Note that
{a} ⊂ {a, b} and a ∈ {a, b}, but {a} ∉ {a, b} and a ⊄ {a, b}. Furthermore,
∅ ⊂ {a, b} but ∅ ∉ {a, b}.
(b) The set {{a}} is a singleton set; its sole member is (the set) {a}. Every singleton
set has exactly two subsets; the subsets of {{a}} are {{a}} and ∅. #
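The distinction between membership and containment drawn in these examples can be tested directly in Python, where the operator in tests membership and <= tests containment. Since Python sets may only contain hashable objects, sets which are to be members of other sets are written as frozenset values; this is a restriction of the language, not of set theory, and the variable names are our own.

```python
a_b = frozenset({"a", "b"})
just_a = frozenset({"a"})
empty = frozenset()

subset_not_member = just_a <= a_b and just_a not in a_b   # {a} is a subset, not a member
member = "a" in a_b                                       # a is a member of {a, b}
empty_subset_not_member = empty <= a_b and empty not in a_b

# {{a}} is a singleton whose sole member is the set {a}.
nested = frozenset({just_a})
subsets_of_nested = [s for s in (empty, nested) if s <= nested]

# The empty set is distinct from the set whose only member is the empty set.
empty_differs = frozenset({empty}) != empty
```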
In general, a set with n elements has 2ⁿ distinct subsets. We will prove this
in a later section.
Programming Problem
Write a program which decides if two input sets are equal or if one is contained
in the other. Assume all sets are finite subsets of the set of natural numbers N.
An operation on sets uses given sets (called the operands) to specify a new set
(called the resultant). We will first treat binary operations; a binary operation com-
bines two operands to produce a resultant.
As in the previous sections, we assume that all sets are constructed from some
implicitly specified universe of discourse U.
A ∩ B = {x | x ∈ A ∧ x ∈ B}.
(c) The difference of A and B, or relative complement of B with respect to A,
denoted A − B, is the set
A − B = {x | x ∈ A ∧ x ∉ B}.
Examples
Let A = {0, 1, 2} and B = {1, 2, 3}. Then
(a) A ∪ B = {0, 1, 2, 3}
(b) A ∩ B = {1, 2}
(c) A − B = {0}
(d) B − A = {3} #
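The same four resultants can be computed with Python's built-in set type, whose operators |, & and - correspond to union, intersection and difference. The variable names are ours.

```python
A = {0, 1, 2}
B = {1, 2, 3}

union = A | B            # A ∪ B
intersection = A & B     # A ∩ B
a_minus_b = A - B        # A − B
b_minus_a = B - A        # B − A
```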
Definition 2.4.2: If A and B are sets and A ∩ B = ∅, then A and B are
disjoint. If C is a collection of sets such that any two distinct elements of C are
disjoint, then C is a collection of (pairwise) disjoint sets.
Example
If C = {{0}, {1}, {2}, ...} = {{i} | i ∈ N}, then C is a collection of disjoint sets.
#
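For a finite collection, Definition 2.4.2 can be checked by examining every pair of distinct members. The following sketch uses a hypothetical function name, pairwise_disjoint, and can only be applied to a finite portion of an infinite collection such as the one in the example.

```python
from itertools import combinations

def pairwise_disjoint(collection):
    """True if any two distinct sets in the finite collection are disjoint."""
    distinct = set(map(frozenset, collection))   # equal sets count once
    return all(s.isdisjoint(t) for s, t in combinations(distinct, 2))

# A finite portion of C = {{0}, {1}, {2}, ...}
first_ten_singletons = [{i} for i in range(10)]
```

Converting the members to frozenset values first ensures that a set listed twice is not compared with itself, in keeping with the word “distinct” in the definition.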
We next define some important classes of binary operations. Note that the
following definition is not restricted to operations on sets.
Definition 2.4.3: Let □ denote a binary operation, and let x □ y denote the
resultant obtained by applying the operation □ to the operands x and y. Then
(a) The operation □ is commutative if x □ y = y □ x.
(b) The operation □ is associative if (x □ y) □ z = x □ (y □ z).
Examples
For the integers, the binary operation of addition is commutative and associa-
tive since for all integers x, y and z,
x + y = y + x
(x + y) + z = x + (y + z)
However, the operation of subtraction is neither commutative nor associative, e.g.,
6 − 4 ≠ 4 − 6
(6 − 4) − 2 ≠ 6 − (4 − 2) #
Theorem 2.4.1: The set operations of union and intersection are commuta-
tive and associative, i.e., for arbitrary sets A, B, and C,
(a) A ∪ B = B ∪ A
(b) A ∩ B = B ∩ A
(c) (A ∪ B) ∪ C = A ∪ (B ∪ C)
(d) (A ∩ B) ∩ C = A ∩ (B ∩ C)
The proofs of assertions (a)-(d) use the commutativity and associativity of the
logical operators ∨ and ∧. We will illustrate by proving assertions (a) and (c).
Proof:
(a) Let x be an arbitrary element of the universe U. Then
x ∈ A ∪ B ⇔ x ∈ A ∨ x ∈ B          Definition of ∪
          ⇔ x ∈ B ∨ x ∈ A          Commutativity of ∨
          ⇔ x ∈ B ∪ A              Definition of ∪
Since x was arbitrary, it follows that
∀x[x ∈ A ∪ B ⇔ x ∈ B ∪ A].
Hence, A ∪ B = B ∪ A.
(c) Let x be an arbitrary element. Then
x ∈ A ∪ (B ∪ C) ⇔ x ∈ A ∨ x ∈ (B ∪ C)          Definition of ∪
                ⇔ x ∈ A ∨ (x ∈ B ∨ x ∈ C)      Definition of ∪
                ⇔ (x ∈ A ∨ x ∈ B) ∨ x ∈ C      Associativity of ∨
                ⇔ x ∈ (A ∪ B) ∨ x ∈ C          Definition of ∪
                ⇔ x ∈ (A ∪ B) ∪ C              Definition of ∪
Since x was arbitrary, it follows that
∀x[x ∈ A ∪ (B ∪ C) ⇔ x ∈ (A ∪ B) ∪ C].
Hence, A ∪ (B ∪ C) = (A ∪ B) ∪ C. ∎
Examples
For the set of integers, multiplication distributes over addition:
x · (y + z) = x · y + x · z
Addition does not distribute over multiplication, e.g.,
4 + (6 · 2) ≠ (4 + 6) · (4 + 2) #
Theorem 2.4.2: The set operations of union and intersection distribute over
each other, i.e., for arbitrary sets A, B and C,
(a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(b) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
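Before proving identities such as those of Theorems 2.4.1 and 2.4.2, it can be reassuring to test them exhaustively over a small universe. The sketch below does so for every choice of A, B and C among the subsets of a three-element universe; the universe {0, 1, 2} and the names SUBSETS, LAWS and holds_everywhere are arbitrary choices of ours. Such a test can expose a false identity, but it establishes nothing about arbitrary sets and is no substitute for the proofs.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})
# All subsets of U, generated by size.
SUBSETS = [frozenset(c) for r in range(len(U) + 1)
           for c in combinations(sorted(U), r)]

LAWS = {
    "2.4.1(a)": lambda A, B, C: A | B == B | A,
    "2.4.1(b)": lambda A, B, C: A & B == B & A,
    "2.4.1(c)": lambda A, B, C: (A | B) | C == A | (B | C),
    "2.4.1(d)": lambda A, B, C: (A & B) & C == A & (B & C),
    "2.4.2(a)": lambda A, B, C: A | (B & C) == (A | B) & (A | C),
    "2.4.2(b)": lambda A, B, C: A & (B | C) == (A & B) | (A & C),
}

def holds_everywhere(law):
    """Test a law for every triple (A, B, C) of subsets of U."""
    return all(law(A, B, C) for A, B, C in product(SUBSETS, repeat=3))

results = {name: holds_everywhere(law) for name, law in LAWS.items()}

# By contrast, set difference is not commutative.
difference_commutes = holds_everywhere(lambda A, B, C: A - B == B - A)
```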
From parts (j) and (k) of Theorem 2.4.3, it follows that for any subset A of
a universe U, A ∪ U = U and A ∩ U = A. When the universe of discourse is
understood, a unary operation of complementation is defined.
Examples
(a) If U = {1, 2, 3, 4} and A = {1, 2}, then Ā = {3, 4}.
(b) If U = N and A = {x | x > 0}, then Ā = {0}.
(c) If U = I and A = {x | x ≥ 0}, then Ā = {x | x < 0}. #
The proofs follow directly from the previous theorem and are left as exer-
cises.
The following theorem states another useful relationship between a set and its
complement.
When the number of sets is small, the result of many set operations can be
represented pictorially using Venn diagrams. Examples of these diagrams are given
in Fig. 2.4.1. In each case, the rectangle represents the universe and the circles
represent arbitrary sets A, B and C. The shaded portion of each diagram represents
the expression which appears below.
The binary operations of union and intersection can be considered as special
cases of operations which form unions and intersections of any number of sets.
These more general operations are defined over collections of sets.
ral numbers {0, 1, 2, ..., n} then the union and intersection of the members of
C can be denoted by using notation similar to the summation notation. Let
C = {A₀, A₁, ..., Aₙ}; then
⋃_{S∈C} S = ⋃_{i=0}^{n} Aᵢ = ⋃_{0≤i≤n} Aᵢ = ⋃_{i∈{0,1,...,n}} Aᵢ = A₀ ∪ A₁ ∪ ··· ∪ Aₙ
In general the set of indices need not be a subset of N, but can be an arbitrary set.
Examples
Let the universe be the set of real numbers R.
(a) If C = {{0}, {0, 1}, {0, 1, 2}}, then ⋃_{S∈C} S = {0, 1, 2}, and ⋂_{S∈C} S = {0}.
(b) Let (a, b) denote the open interval from a to b, i.e., (a, b) = {x | a < x < b}. If
C = {(−n, n) | n ∈ I ∧ n > 0}, then ⋃_{S∈C} S = (−∞, ∞) = R, and
⋂_{S∈C} S = (−1, 1).
(c) Let C = {Aᵢ | i ∈ {a, b, c}}, where A_a = {0, 1, 2}, A_b = {4, 5, 6} and A_c = {2}.
Then ⋃_{i∈{a,b,c}} Aᵢ = {0, 1, 2, 4, 5, 6} and ⋂_{i∈{a,b,c}} Aᵢ = ∅. #
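For finite collections of finite sets, the generalized operations can be computed by passing every member of the collection to Python's union and intersection methods. The function names big_union and big_intersection are ours; the sketch covers examples (a) and (c) but not (b), whose members are infinite sets. The intersection is computed only for a nonempty collection.

```python
def big_union(collection):
    """Union of all sets in a finite collection (empty for an empty collection)."""
    return set().union(*collection)

def big_intersection(collection):
    """Intersection of all sets in a nonempty finite collection."""
    sets = [set(s) for s in collection]
    if not sets:
        raise ValueError("the collection must be nonempty")
    return set.intersection(*sets)

example_a = [{0}, {0, 1}, {0, 1, 2}]
# Example (c): the index set is {a, b, c} rather than a set of numbers.
indexed = {"a": {0, 1, 2}, "b": {4, 5, 6}, "c": {2}}
```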
We will often refer to the set of subsets of a set. Since the set of subsets of
a given set A is unique, we can define a unary operation on sets whose value is the
set of subsets of the operand.
Definition 2.4.7: Let A be a set. The power set of A, denoted 𝒫(A), is the set
of all subsets of A.
Examples
(a) If A = ∅, then 𝒫(A) = {∅}.
(b) If A = {1}, then 𝒫(A) = {∅, {1}}.
(c) If A = {1, 2} then 𝒫(A) = {∅, {1}, {2}, {1, 2}}.
(d) If A is any (finite or infinite) set of natural numbers then A ∈ 𝒫(N). #
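For a finite set, the power set can be generated by collecting the subsets of each size. The following is one possible sketch, with power_set a name of our choosing; the subsets are represented as frozenset values so that they can themselves be members of a set. It applies only to finite sets and so says nothing about example (d).

```python
from itertools import chain, combinations

def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    by_size = (combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(c) for c in chain.from_iterable(by_size)}
```

Applied to a set with n elements, the result has 2ⁿ members, in agreement with the count of subsets stated in Section 2.3.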
(b) Give a formula which denotes the shaded portion of each of the following Venn
diagrams.
[Figure: Venn diagrams (i), (ii), (iii)]
(b) A ∩ (A ∪ B) = A
(c) A − B = A ∩ B̄
(d) A ∪ (Ā ∩ B) = A ∪ B
(e) A ∩ (Ā ∪ B) = A ∩ B
11. In each of the following, find ⋃_{S∈C} S and ⋂_{S∈C} S.
(a) C = {∅}
(b) C = {∅, {∅}}
(c) C = {{a}, {b}, {a, b}}
(dd) C={Hlie B
12. Let A, B, and C be subsets of some universe U, and let D be the following collection.
D = {A ∩ B ∩ C, A ∩ B ∩ C̄, A ∩ B̄ ∩ C, A ∩ B̄ ∩ C̄,
Ā ∩ B ∩ C, Ā ∩ B ∩ C̄, Ā ∩ B̄ ∩ C, Ā ∩ B̄ ∩ C̄}
(a) Construct a Venn diagram for the elements of the collection D.
(b) Prove that A ∩ B ∩ C and A ∩ B ∩ C̄ are disjoint. Is D a disjoint collection
of sets?
(c) Prove that ⋃_{S∈D} S = U.
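13. Let C be a nonempty collection of subsets of some universe U. Prove the following
generalization of DeMorgan’s laws.
(a) $\overline{\bigcup_{S \in C} S} = \bigcap_{S \in C} \overline{S}$
(b) $\overline{\bigcap_{S \in C} S} = \bigcup_{S \in C} \overline{S}$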
14. Specify the power set for each of the following sets.
(a) {a, b, c}
(b) {{a, b}, {c}}
(c) {{a, b}, {b, a}, {a, b, b}}
15. Let Sₙ = {a₀, a₁, ..., aₙ} and Sₙ₊₁ = {a₀, a₁, ..., aₙ, aₙ₊₁}. Describe how 𝒫(Sₙ₊₁)
is related to 𝒫(Sₙ). (Hint: 𝒫(Sₙ₊₁) contains 𝒫(Sₙ).)
16. Let x and y be real numbers and define the operation x Δ y to be x^y (x raised to the
power y).
(a) Show that the operation Δ is neither commutative nor associative.
(b) Let ∘ represent multiplication. Determine which of the following distributive
laws hold.
(i) x ∘ (y Δ z) = (x ∘ y) Δ (x ∘ z)
(ii) (y Δ z) ∘ x = (y ∘ x) Δ (z ∘ x)
(iii) x Δ (y ∘ z) = (x Δ y) ∘ (x Δ z)
(iv) (y ∘ z) Δ x = (y Δ x) ∘ (z Δ x)
Programming Problems
1. Write a program to generate the power set of {0, 1, 2, ..., n} for any natural number
n given as input.
2. (a) Write a program which accepts specifications of two finite sets A and B, where
A, B ⊂ N, and prints a nonredundant list of the elements of A ∪ B and A ∩ B.
(b) Write a program to determine for a given set A and an arbitrary n ∈ N whether
n ∈ A.
2.5 INDUCTION
Earlier in this chapter, we described how finite sets can be defined either
explicitly by listing the elements of the set, or implicitly by using a predicate with
free variables; we also observed that infinite sets can only be specified implicitly.
But predicates do not always provide a convenient means of characterizing an
infinite set. For example, there is no convenient or obvious predicate to specify
the set of ALGOL, PL/I, or FORTRAN programs, or even such a basic structure
as the set of natural numbers N. Such sets are often most naturally defined using an
inductive definition.†
An inductive definition of a set always consists of three distinct components.
1. The basis, or basis clause, of the definition establishes that certain objects
are in the set. This part of the definition has the dual function of establish-
ing that the set being defined is not empty and of characterizing the
“building blocks” which will be used to construct the remainder of the
set.
2. The induction, or inductive clause, of an inductive definition establishes
the ways in which elements of the set can be combined to obtain new
elements. The inductive clause always asserts that if objects x, y, ..., z
are elements of the set, then they can be combined in certain specified ways
to create other objects which are also in the set. Thus, while the basis
clause describes the building blocks of the set, the inductive clause de-
scribes the operations which can be performed on objects in order to
construct new elements of the set.
3. The extremal clause asserts that unless an object can be shown to be
a member of the set by applying the basis and inductive clauses a finite
number of times, then the object is not a member of the set. The extremal
clause of an inductive definition of a set S has a variety of forms, such as
(i) “No object is a member of S unless its being so follows from a finite
number of applications of the basis and inductive clauses.”
(ii) “The set S is the smallest set which satisfies the basis and inductive
clauses.”
(iii) “The set S is the set such that S satisfies the basis and inductive
clauses and no proper subset of S satisfies them (i.e., if T is a sub-
set of S such that T satisfies the basis and inductive clauses, then
T = S).”
(iv) “The set S is the intersection of all sets which satisfy the properties
specified by the basis and inductive clauses.”
In fact, all these forms of the extremal clause are equivalent in consequence though
not in form, and all serve the purpose of establishing that nothing is a member
of the set being defined unless it is required to be so by the first two steps of the
definition. Often the extremal clause is not stated explicitly in an inductive
definition; this rarely leads to misunderstandings.
†The term “recursive definition” is often used to denote what we call an “inductive definition.”
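The three clauses can also be read as a recipe for computing with an inductively defined set. The following Python sketch is our own illustration and not part of the formal development: the names inductive_closure, basis, rules, and limit are hypothetical, the rules are restricted to one-argument operations for brevity, and because the set being defined may be infinite the construction is cut off by a caller-supplied limit predicate. As the example we assume the usual inductive definition of the nonnegative even integers (0 is in the set; if n is in the set, so is n + 2).

```python
def inductive_closure(basis, rules, limit):
    """Return the elements obtainable from `basis` by finitely many
    applications of the one-argument `rules`, keeping only elements
    accepted by `limit` (an assumed cutoff so the loop terminates)."""
    members = set(basis)      # basis clause: these objects are in the set
    frontier = set(basis)
    while frontier:
        new = set()
        for x in frontier:
            for rule in rules:            # inductive clause: build new elements
                y = rule(x)
                if limit(y) and y not in members:
                    new.add(y)
        members |= new
        frontier = new
    # extremal clause: nothing else was ever added
    return members


# Nonnegative even integers up to 20: basis {0}, rule n -> n + 2.
evens = inductive_closure({0}, [lambda n: n + 2], lambda n: n <= 20)
```

Only objects produced by the basis or by a rule ever enter the result, which is the computational counterpart of the extremal clause.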
Example
We will now introduce some notation and terminology that will enable us
to give some further examples of inductively defined sets. We use Σ to denote
a finite and nonempty set of symbols or characters; Σ is called an alphabet. A string
of a finite number of symbols, each of which is an element of Σ, is called a word
or string (or sometimes a sentence) over the alphabet Σ. Let x be a word over
Σ; if x = a₁a₂a₃...aₙ, where n ∈ N and aᵢ ∈ Σ for each 1 ≤ i ≤ n, then the
length of x is n, the number of symbols in the word x. The string of length 0, denoted
Λ, is called the empty (or null) string. If x and y are strings of symbols over Σ,
x = a₁a₂...aₙ and y = b₁b₂...bₘ, where aᵢ ∈ Σ and bⱼ ∈ Σ for all i and j,
then x concatenated with y, denoted xy, is the string
xy = a₁a₂...aₙb₁b₂...bₘ;
if x = Λ then xy = y and if y = Λ then xy = x. If z = xy, then x is a prefix of
z and y is a suffix. If x ≠ z, then x is a proper prefix; if y ≠ z then y is a proper
suffix. If w = xyz then y is a substring of w and if y ≠ w, then y is a proper substring.
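The terminology just introduced maps directly onto string operations in most programming languages. The sketch below is an illustration only, assuming that words are represented as Python str values and that the empty string Λ is represented by ""; the function names are our own.

```python
EMPTY = ""  # stands for the empty string Λ


def is_prefix(x: str, z: str) -> bool:
    """x is a prefix of z if z = xy for some string y."""
    return z.startswith(x)


def is_suffix(y: str, z: str) -> bool:
    """y is a suffix of z if z = xy for some string x."""
    return z.endswith(y)


def is_substring(y: str, w: str) -> bool:
    """y is a substring of w if w = xyz for some strings x and z."""
    return y in w


def is_proper_prefix(x: str, z: str) -> bool:
    """A proper prefix is a prefix different from the whole word."""
    return is_prefix(x, z) and x != z


x, y = "ab", "ba"
xy = x + y  # concatenation of x with y
```

Concatenation with EMPTY leaves a word unchanged, matching the rule that xΛ = Λx = x.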
The following two definitions describe sets which are widely used in computer
science. In later parts of this text we will develop some of the properties of these
sets, and we will often refer to them in examples.
We will not distinguish between the symbol a ∈ Σ and the word over Σ which consists of
the single symbol a. These two objects are not the same, but the distinction is generally not an
important one for our purposes.
Example
If Σ = {a, b}, then Σ⁺ = {a, b, aa, ab, ba, bb, aaa, aab, ...}. #
The set of all finite strings of symbols from the alphabet Σ is denoted by Σ*. The
set Σ* includes the empty string and can be defined as Σ* = Σ⁺ ∪ {Λ}, or it can
be defined inductively.
Examples
(a) If Σ = {a, b}, then Σ* = {Λ, a, b, aa, ab, ba, bb, aaa, aab, ...}.
(b) If Σ = {0, 1}, then Σ* is the set of all finite binary sequences, including the
empty sequence. #
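Although Σ* and Σ⁺ are infinite, any finite initial portion can be listed mechanically. The following sketch is illustrative only: strings_up_to and its parameters are hypothetical names, the alphabet is passed as a Python string of single-character symbols, and "" plays the role of Λ.

```python
from itertools import product


def strings_up_to(alphabet: str, max_len: int, include_empty: bool = True):
    """List the strings over `alphabet` of length at most `max_len`,
    shortest first; with include_empty=False the empty string is omitted,
    which corresponds to an initial portion of Σ⁺ rather than Σ*."""
    start = 0 if include_empty else 1
    return ["".join(symbols)
            for n in range(start, max_len + 1)
            for symbols in product(alphabet, repeat=n)]


sigma_star_part = strings_up_to("ab", 2)
sigma_plus_part = strings_up_to("ab", 2, include_empty=False)
```

Over an alphabet of k symbols there are k^n strings of length n, so the listing grows quickly with max_len.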
Examples
(a) The set of arithmetic expressions includes sequences of symbols such as
“((5 + 6)/2)” and “((4/2) − 13)” but does not include sequences such as
“+ 6+” and “+) (”, even though all these expressions are sequences of
symbols from the same alphabet. We will illustrate how to define the set of
well-formed arithmetic expressions by means of an inductive definition. For
simplicity we will restrict our definition to the set of arithmetic expressions
involving only integers, the unary operations of + and −, and the binary
operations of +, −, / and *.
1. (Basis) If D = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and x ∈ D⁺, then x is an
arithmetic expression.
2. (Induction) If x and y are arithmetic expressions, then
(i) (+ x) is an arithmetic expression,
(ii) (— x) is an arithmetic expression,
(iii) (x + y) is an arithmetic expression,
(iv) (x — y) is an arithmetic expression,
(v) (x/y) is an arithmetic expression, and
(vi) (x * y) is an arithmetic expression.
3. (Extremal) A sequence of symbols is an arithmetic expression if and only
if it can be obtained by a finite number of applications of clauses 1 and 2.
The set of arithmetic expressions characterized by this definition includes
346, 0000, (−64), (3 + 7), (3 * (−61)), and (+(−(+(6/7)))).
(b) The set of propositional forms is another set which is most naturally defined
inductively. Let V = {P, Q, R, ...} be a set of propositional variables, where
V does not contain any of the following symbols: (, ), ∧, ∨, ⇒, ⇔, ¬, 0, 1.
Then
1. (Basis) 0 is a propositional form.
1 is a propositional form.
If x ∈ V, then x is a propositional form.
2. (Induction) If E and F are propositional forms, then
(¬E),
(E ∨ F),
(E ∧ F),
(E ⇒ F), and
(E ⇔ F) are all propositional forms.
3. (Extremal) The set of propositional forms is the set of all expressions
which can be formed by a finite number of applications of clauses 1 and 2.
Using this definition, if V = {P, Q, R}, then ((P ∧ Q) ⇒ R) is a propositional
form over V. This can be established as follows: From the basis clause, it
follows that P, Q, and R are all propositional forms. Applying the induction clause
to P and Q, it follows that (P ∧ Q) is a propositional form, and by another
application of the inductive clause, this time to (P ∧ Q) and R, it follows that
((P ∧ Q) ⇒ R) is a propositional form. Thus one can show that an element is a
member of an inductively defined set by exhibiting a sequence of applications of
the basis and inductive steps which produces the element in question. #
Recursive Procedures
of an inductive definition. As we use the terms,† not all recursive definitions are
inductive; we will give examples to illustrate the difference in a later chapter.
In programming, a recursive procedure, or recursive subroutine, is one which
can call itself, either directly or indirectly. Recursive procedures are based on
recursive definitions, although the definition need not be of a set. If a recursive
procedure is based on an inductive definition, the segments of the procedure often
correspond in a natural way to the basis and induction clauses of the definition.
It is often necessary to write procedures to determine whether an input has
a specified property. If the set of elements which have the property is defined
inductively, a recursive procedure is a natural and powerful mechanism for deter-
mining set membership.
Examples
(a) Consider the universe I, and let E be the set of nonnegative even integers
defined inductively in the first example of this section. The recursive procedure
EVEN(n) given in Fig. 2.5.1 returns “yes” if an input n ∈ I is an element of
the set E; otherwise it returns “no.” The procedure has three parts. The first
part causes “no” to be returned if the input is too small; this part of the
procedure does not correspond to any part of the inductive definition of E. The
second part of the procedure tests if n = 0; this corresponds to the basis clause
of the definition of E. The third part corresponds to the inductive clause of the
definition and causes EVEN to call itself with the parameter n − 2.
procedure EVEN(n):
  comment: If n is even and n ≥ 0, then return “yes.”
           Otherwise, return “no.”
  if n < 0 then return “no”
  else
    if n = 0 then return “yes”
    else
      return EVEN(n − 2)
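For readers who wish to run the procedure, the following is a direct transcription of EVEN into Python. It is a sketch rather than a recommended implementation: the function name even is our own, the answers are kept as the strings "yes" and "no" to match the text, and because each call recurses on n − 2 a very large input would exceed Python's default recursion limit.

```python
def even(n: int) -> str:
    """Decide membership of n in E, the nonnegative even integers."""
    if n < 0:
        # Input too small; no clause of the inductive definition applies.
        return "no"
    if n == 0:
        # Basis clause: 0 is in E.
        return "yes"
    # Inductive clause: n is in E exactly when n - 2 is in E.
    return even(n - 2)
```

An odd input eventually reaches −1 and is rejected by the first test, which is why that test is needed even though it matches no clause of the definition.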
†A distinct but related meaning of the term “recursive” is used in mathematical logic and the
theory of computable functions, but a discussion of the relationship between the two uses is beyond
our scope. We will only use the term in the informal sense described above.
procedure ARITH(exp):
  comment: If exp is an arithmetic expression, then return “yes.”
           Otherwise return “no.”
  begin
    comment: Determine if exp is generated by the basis clause.
    if exp is a string of digits then return “yes”
    else begin
      comment: Determine if exp is generated by the inductive clause.
      if exp contains a substring exp_1 such that either exp = (+ exp_1)
          or exp = (− exp_1)
        then return ARITH(exp_1)
      else if exp contains substrings exp_1 and exp_2 such that
          exp = (exp_1 □ exp_2)
          where □ is an operation symbol (+, −, / or *)
          and ARITH(exp_1) = “yes”
          and ARITH(exp_2) = “yes”
        then return “yes”
    end;
    comment: exp is not produced by either basis or inductive clauses.
    return “no”
  end
Fig. 2.5.2 Recursive procedure ARITH to determine whether a
string of symbols is an arithmetic expression
string of digits. If so, the procedure returns “yes.” If exp is not a string of digits,
then ARITH breaks exp into nonoverlapping substrings to determine if exp is
generated from other arithmetic expressions by the induction clause. If this is
not the case, ARITH concludes that exp is not an arithmetic expression and
returns “no.” #
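A runnable counterpart of ARITH can be sketched in Python as follows. Several choices here are assumptions of ours and not part of Fig. 2.5.2: the function is named arith and returns a Boolean instead of “yes” or “no”, expressions are written without blanks and with the ASCII characters + - / *, and the search for exp_1 and exp_2 simply tries every operator position, which is easy to follow but not efficient.

```python
DIGITS = "0123456789"
BINARY_OPS = "+-/*"


def arith(exp: str) -> bool:
    """Return True if exp is an arithmetic expression per the inductive
    definition (digit strings, (+x), (-x), (x+y), (x-y), (x/y), (x*y))."""
    # Basis clause: a nonempty string of digits.
    if exp != "" and all(ch in DIGITS for ch in exp):
        return True
    # Every other expression is enclosed in one outer pair of parentheses.
    if len(exp) < 3 or exp[0] != "(" or exp[-1] != ")":
        return False
    inner = exp[1:-1]
    # Inductive clause, unary case: (+ exp_1) or (- exp_1).
    if inner[0] in "+-" and arith(inner[1:]):
        return True
    # Inductive clause, binary case: try each operator as the split point.
    for i in range(1, len(inner) - 1):
        if inner[i] in BINARY_OPS and arith(inner[:i]) and arith(inner[i + 1:]):
            return True
    # Not produced by the basis or the inductive clause.
    return False
```

Note that a string such as (5) is rejected, since the definition never wraps a bare digit string in parentheses.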
Inductive Proofs
Inductive definitions not only provide a method of defining infinite sets, but
they also form the basis of some powerful techniques for proving theorems. If
a set is finite, a statement of the form ∀xP(x) can in principle be established by an
Most commonly, proofs by induction deal with the natural numbers. In order
to discuss these proofs, it will be useful to have the following inductive charac-
terization of N.
1. (Basis) 0 ∈ N.
2. (Induction) If n ∈ N, then (n + 1) ∈ N.
3. (Extremal) If S ⊂ N, and S has the properties
(i) 0 ∈ S,
(ii) For every n ∈ N, if n ∈ S then (n + 1) ∈ S,
then S = N.
In fact, this does not suffice to define the natural numbers because we have
not carefully specified what is meant by the basis and inductive steps; we will
present a proper definition of N in the next section. However, the above charac-
terization will enable us to discuss inductive proofs for the universe N. The extre-
mal clause in the above characterization of N is the form customarily used in
definitions of the natural numbers; it is called the First Principle of Mathematical
Induction. This form of the extremal clause implies the procedure to be used for
inductive proofs of assertions of the form ∀xP(x) for the universe of natural
numbers. Such a proof proceeds as follows:
1. (Basis) We first show that P(0) is true, using whatever proof technique is
appropriate.
2. (Induction) We next show ∀n[P(n) ⇒ P(n + 1)].
The inductive step of the proof is usually a direct proof of the implication
P(n) ⇒ P(n + 1), where the implication is established for arbitrary n ∈ N. The
assertion P(n) is known as the induction hypothesis. The induction hypothesis is often stated
as “Assume P(n) is true for arbitrary n ∈ N”. Note that this is not equivalent to
assuming the truth of the theorem; P(n) is assumed only for the purpose of proving
the universally quantified assertion ∀n[P(n) ⇒ P(n + 1)]. Once P(n) ⇒ P(n + 1)
has been proven for arbitrary n, it follows (by the rule of inference known as
Universal Generalization) that ∀n[P(n) ⇒ P(n + 1)]. Then from the First Principle
of Mathematical Induction we can conclude ∀xP(x). For suppose S is the subset
of N such that P(n) is true for every n ∈ S. The basis step of the proof establishes
that 0 ∈ S. The inductive step establishes that for every n ∈ N, if n ∈ S, then
(n + 1) ∈ S. By the extremal clause of the definition of N, it follows that S = N,
i.e., ∀xP(x).
To illustrate proofs by induction over N, we will prove the following.
The following theorem gives algebraic expressions for two more finite sums
which will occur in Chapter 5 when we treat the analysis of algorithms. The proofs
are by induction and are left as exercises.
Often sets which have been inductively defined are used as a base for other
inductive definitions. Such “secondary” inductive definitions require no extremal
clause because the extremal clause of the underlying set fulfills the appropriate
function.
Example
The following is an inductive definition of the exponential aⁿ for nonnegative
integer values of n. The underlying inductively defined set is N.
Definition 2.5.4: Let a ∈ R+ and n ∈ N. The value of aⁿ is defined inductively
as follows:
1. (Basis) a⁰ = 1.
2. (Induction) aⁿ⁺¹ = aⁿ · a.
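Definition 2.5.4 translates clause by clause into a recursive function. The Python sketch below is an illustration with names of our own choosing (power, a, n); it follows the definition literally and therefore performs n multiplications, and it rejects negative n because the definition covers only n ∈ N.

```python
def power(a: float, n: int) -> float:
    """Compute a**n by the inductive definition of the exponential."""
    if n < 0:
        raise ValueError("the definition covers only natural numbers n")
    if n == 0:
        return 1.0                 # basis clause: a^0 = 1
    return power(a, n - 1) * a     # inductive clause: a^(n+1) = a^n * a
```

No extremal clause is needed in the code either: the recursion on n bottoms out at 0 because of the structure of N itself.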
The inductive definition can be used to establish the following:
P(0)
∀n[P(n) ⇒ P(n + 1)]
∴ ∀xP(x)
We often wish to prove that a predicate P holds for all x ≥ k for some integer
k. A proof by induction is still appropriate but the basis step must be changed to
prove P(k). The rule of inference is then
P(k)
∀n[P(n) ⇒ P(n + 1)]
Example
We use the Second Principle of Mathematical Induction to prove that all
integers n ≥ 2 can be written as a product of prime numbers. The induction
hypothesis asserts that for arbitrary n,
For every k such that 2 ≤ k < n, k can be written as a product of prime
numbers.
On the basis of this assumption we must show that n can be written as a product of
primes.
The proof is by cases.
Case 1: If n is a prime, then n is a product of one prime.
Case 2: If n is not a prime, then n = ab, where 2 ≤ a, b < n. By the induction
hypothesis, both a and b can be written as products of primes and therefore their
product can be written in this form. #
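The structure of this proof is also the structure of a recursive procedure: the two cases of the proof become the two branches of the procedure, and the appeal to the induction hypothesis becomes a recursive call on the smaller numbers a and b. The sketch below is our own illustration (the name prime_factors is hypothetical) and uses simple trial division to find a factor.

```python
from math import isqrt, prod


def prime_factors(n: int) -> list:
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            b = n // a
            # Case 2: n = ab with 2 <= a, b < n; recurse on both factors.
            return prime_factors(a) + prime_factors(b)
    # Case 1: no divisor was found, so n is prime.
    return [n]


factors = prime_factors(60)
```

Termination of the recursion mirrors the Second Principle of Mathematical Induction: every recursive call is made on a strictly smaller integer that is still at least 2.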
n! = ∏_{i=1}^{n} i
when n ≥ 1.
Prove by induction that (1 + 2 + 3 + ... + n)² = 1³ + 2³ + 3³ + ... + n³ for
all n ∈ I+.
Let a be a positive number. Prove
() Σ_{i=0}^{n} (i · i!) = (n + 1)! − 1
(4) 1 + 2n ≤ 3ⁿ
Prove Theorem 2.5.3.
A polygon is convex if every line joining two points of the polygon lies within the
polygon. Prove that the sum of the interior angles of a convex polygon with n
sides is equal to (n − 2) · 180° for all n ≥ 3. (Hint: If n > 3, the polygon can be divided
into two parts by connecting nonadjacent vertices.)
Find predicates P and Q over the natural numbers which will establish that the basis
step and the induction step of an inductive proof are independent, i.e., neither
logically implies the other. Specifically, find a predicate P such that P(0) is true
and ∀n[P(n) ⇒ P(n + 1)] is false and a predicate Q such that Q(0) is false and
∀n[Q(n) ⇒ Q(n + 1)] is true.
10. What is wrong with the following proof that all people are the same size? We purport
to prove that for all n and for all S, if S is a set with n people, then all people in S
are the same size.
1. (Basis) Let S be an empty set of people. Then for all x and y, if x € S and
y € S, then x is the same size as y.
2. (Induction) Assume the assertion is true for all sets containing n people. We
show it is true for sets containing n + 1 people. Any set consisting of n + 1
people contains two nonequal subsets of n people which must overlap. Denote
these sets by S’ and S”. Then by induction hypothesis, all people in S’ are the
same size and all people in S” are the same size. Since S’ and S” overlap, all
people in S = S’ U S” are the same size.
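11. Let {A₁, A₂, ..., Aₙ} be a nonempty collection of sets. Prove the following
generalizations of DeMorgan’s Laws by induction on n.
(a) $\overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i}$
(b) $\overline{\bigcap_{i=1}^{n} A_i} = \bigcup_{i=1}^{n} \overline{A_i}$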
12. A binary operation □ is said to be associative if a □ (b □ c) = (a □ b) □ c. From
this “associative law” we infer a much stronger result, namely that in any expression
involving only the operation □, the placement of parentheses does not affect the
result, that is, only the operands and the order in which they occur in the expression
are important. In order to prove this “generalized associative law,” we define the “set
of □ expressions” as follows:
1. (Basis) A single operand aᵢ is a □ expression.
2. (Induction) Let e₁ and e₂ be □ expressions. Then (e₁ □ e₂) is a □ expression.
3. (Extremal) There are no □ expressions other than those which can be constructed
from 1 and 2 in a finite number of steps.
The generalized associative law can now be stated as follows:
Let e be a □ expression with n operands a₁, a₂, ..., aₙ which appear in that
order in the expression e. Then
e = (a₁ □ (a₂ □ (a₃ □ (... (aₙ₋₁ □ aₙ) ...)))).
Prove this generalized associative law. (Hint: Use the Second Principle of Mathe-
matical Induction.)
In this section, we will exhibit a careful set theoretic definition of the natural
numbers. In the previous section, we used the operation of addition to give an
inductive characterization of N. Since the definition of addition of natural numbers
must be based on the set N, the characterization we gave is circular and hence
unacceptable as a formal definition of N. To avoid this circularity, N must be
defined without using addition. The following is a better (but not yet successful)
characterization of N which uses n’ to denote the “successor” of a natural number
n; informally, we interpret n’ as n + 1.
1. (Basis) 0 ∈ N.
2. (Induction) If n ∈ N, then n’ ∈ N.
3. (Extremal) If S ⊂ N and S satisfies clauses 1 and 2, then S = N.
To rule out such models, the definition of N must guarantee that if x’ = y’ then
x = y, that is, a natural number can have at most one predecessor.
A definition of N which satisfies all of these constraints can be constructed
using set theory. Each natural number will be a set. The first natural number is
defined to be ∅, changing the basis step to
1. (Basis) ∅ is a natural number.
For each natural number n, its successor, n’, is constructed as follows.
2. (Induction) If n is a natural number, then n ∪ {n} is a natural number.
The extremal step remains unchanged. The result is the following definition.
Definition 2.6.1: The set of natural numbers N is the set such that
1. (Basis) ∅ ∈ N,
2. (Induction) If n ∈ N, then n ∪ {n} ∈ N,
3. (Extremal) If S ⊂ N and S satisfies clauses 1 and 2, then S = N.
The set of natural numbers, according to this definition, has as its elements the
sets ∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, ... which we denote by the numerals 0, 1,
2, 3, ... Many of the familiar properties of the natural numbers can now be
established, including the following theorems. (The proofs can be found in Chapter
1 of Cohn [1965].)
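The construction in Definition 2.6.1 can be carried out concretely for small numbers. The following Python sketch is an illustration only; it assumes that sets are modeled by frozenset (so that a set can itself be an element of a set), and the names zero, successor, and numeral are our own.

```python
zero = frozenset()  # basis clause: the empty set is the natural number 0


def successor(n: frozenset) -> frozenset:
    """Inductive clause: the successor of n is n ∪ {n}."""
    return n | frozenset({n})


def numeral(k: int) -> frozenset:
    """Build the set that the numeral k denotes by applying successor k times."""
    n = zero
    for _ in range(k):
        n = successor(n)
    return n


two = numeral(2)    # {∅, {∅}}
three = numeral(3)  # {∅, {∅}, {∅, {∅}}}
```

In this representation the set denoted by k has exactly k elements, namely the sets denoted by 0, 1, ..., k − 1, so the membership and proper-subset relations both mirror “less than” on the sets built here.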
Examples
(a) If Σ = {a, b} and x = ab, then x⁰ = Λ, x¹ = ab, x² = abab, and
x³ = ababab.
(b) The set {aⁿbⁿ | n ≥ 0} denotes the set {Λ, ab, aabb, aaabbb, ...}. #
Examples
(a) The set {a, ab, abb} is a language over Σ = {a, b}.
(b) The set of strings consisting of sequences of a’s followed by sequences of b’s,
{aⁿbᵐ | n, m ∈ N}, is a language over {a, b}.
(c) The set of ALGOL programs is a language over the alphabet consisting of the
ALGOL character set. #
Since every language is a set, the usual collection of set operations introduced
earlier in this chapter can be applied to languages. However, because they are
collections of strings, other important operations on languages can be defined as
well, many of which are based on the operation of concatenation. The principal
goal of this section is to introduce these operations on languages and describe some
of their properties. These operations are important in a variety of application areas
as well as for the study of models of computation.
Definition 2.7.4: Let A and B be languages over Σ. The set product of A with
B, denoted A·B, or simply AB, is the language AB = {xy | x ∈ A ∧ y ∈ B}.
Example
Let Σ = {a, b}, A = {Λ, a, ab} and B = {a, bb}. Then
AB = {a, bb, aa, abb, aba, abbb},
BA = {a, aa, aab, bb, bba, bbab}.
Note that, in general, AB ≠ BA; i.e., the operation of set product is not
commutative. #
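For finite languages the set product can be computed directly from Definition 2.7.4. The sketch below is illustrative: languages are modeled as Python sets of str, Λ is written "", and set_product is a name of our own choosing. It recomputes the example above.

```python
def set_product(A: set, B: set) -> set:
    """Return AB = {xy | x in A and y in B} for finite languages A and B."""
    return {x + y for x in A for y in B}


A = {"", "a", "ab"}   # {Λ, a, ab}
B = {"a", "bb"}

AB = set_product(A, B)
BA = set_product(B, A)
```

Because the result is a set, a word that arises from more than one pair (x, y) is listed only once, so AB may have fewer than |A|·|B| elements.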
Example
Let Σ = {a, b} and A = {Λ, a, ab}. Then A⁰ = {Λ}, A¹ = A = {Λ, a, ab}, and
A² = A·A = {Λ, a, aa, aab, ab, aba, abab}. #
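Powers of a finite language can be computed by repeated set product. The sketch below assumes the usual inductive reading, A⁰ = {Λ} and Aⁿ⁺¹ = Aⁿ·A, which agrees with the example just given; the function name language_power is hypothetical and Λ is again written "".

```python
def language_power(A: set, n: int) -> set:
    """Return A^n for a finite language A, with A^0 = {Λ} and A^(n+1) = A^n · A."""
    result = {""}                      # basis: A^0 = {Λ}
    for _ in range(n):
        # induction: multiply the current power by A on the right
        result = {x + y for x in result for y in A}
    return result


A = {"", "a", "ab"}
A_squared = language_power(A, 2)
```

Note that A⁰ = {Λ} even when A is empty, whereas every higher power of the empty language is empty.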
(a) AᵐAⁿ = Aᵐ⁺ⁿ
(b) (Aᵐ)ⁿ = Aᵐⁿ
(c) A ⊂ B ⇒ Aⁿ ⊂ Bⁿ
Proof: The proofs of parts (a) and (b) are left as exercises. The proof of
part (c) is by induction on n:
1. (Basis) Since A⁰ = {Λ} and B⁰ = {Λ}, it follows that Aⁿ ⊂ Bⁿ if n = 0.
2. (Induction) We wish to prove that for all n, if Aⁿ ⊂ Bⁿ, then Aⁿ⁺¹ ⊂ Bⁿ⁺¹.
By Theorem 2.7.1(d), if Aⁿ ⊂ Bⁿ and A ⊂ B, then Aⁿ·A ⊂ Bⁿ·B, i.e.,
Aⁿ⁺¹ ⊂ Bⁿ⁺¹. ∎
We have used the notation Σ* to denote the set of all finite strings formed by
concatenating elements of Σ. This notation can be extended in a natural way to
any subset of Σ*. We use the symbols “*” and “+” to denote unary operations
(called closure operations) on languages.
Definition 2.7.6: Let A be a subset of Σ*. Then the set A* (read “A star”)
is defined to be
A* = ⋃_{n∈N} Aⁿ.
Examples
(a) If A = {a}, then
A⁺ = {a} ∪ {aa} ∪ {aaa} ∪ ···
= {aⁿ | n ≥ 1};
A* = {Λ} ∪ A⁺
= {aⁿ | n ≥ 0}.
(b) ∅* = {Λ} ∪ ∅ ∪ ∅ ∪ ∅ ∪ ···
= {Λ};
∅⁺ = ∅. #
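Since A* is usually infinite, a program can only list the part of it below some length bound. The following sketch is our own illustration: star_up_to and plus_up_to are hypothetical names, languages are finite Python sets of str, and Λ is written "". It uses the facts that A* is the union of the powers of A and that A⁺ = A*A.

```python
def star_up_to(A: set, max_len: int) -> set:
    """Return the strings of A* whose length is at most max_len."""
    result = {""}            # A^0 = {Λ} is always part of A*
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from A.
        frontier = {x + y for x in frontier for y in A
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


def plus_up_to(A: set, max_len: int) -> set:
    """Return the strings of A+ = A*A whose length is at most max_len."""
    return {x + y for x in star_up_to(A, max_len) for y in A
            if len(x + y) <= max_len}


a_star = star_up_to({"a"}, 3)
a_plus = plus_up_to({"a"}, 3)
```

The loop stops as soon as a round produces no new strings, which also handles the cases A = ∅ and A = {Λ}.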
The following theorem characterizes some important properties of the
language closure operations.
Theorem 2.7.3: Let A and B be languages over Σ and let n ∈ N. Then the
following relationships hold.
(a) A* = {Λ} ∪ A⁺
(b) Aⁿ ⊂ A* for n ≥ 0
(c) Aⁿ ⊂ A⁺ for n ≥ 1
(d) A ⊂ AB*
(e) A ⊂ B*A
(f) (A ⊂ B) ⇒ (A* ⊂ B*)
(g) (A ⊂ B) ⇒ (A⁺ ⊂ B⁺)
(h) AA* = A*A = A⁺
(i) Λ ∈ A ⇔ A⁺ = A*
(j) (A*)* = (A⁺)* = A*
(k)
(l) A*A⁺ = A⁺A* = A⁺
(m) (A*B*)* = (A ∪ B)* = (A* ∪ B*)*
Proof: Parts (a), (b), and (c) are immediate from the definition of Aⁿ, A⁺,
and A*.
(d) (A ⊂ AB*.) By part (a), B* = {Λ} ∪ B⁺. Therefore, AB* = A({Λ} ∪ B⁺)
= A ∪ AB⁺ which contains A. A similar proof establishes (e).
(f) (A ⊂ B ⇒ A* ⊂ B*.) If x ∈ A*, then x ∈ Aⁿ for some n ≥ 0. But
A ⊂ B so by Theorem 2.7.2, Aⁿ ⊂ Bⁿ. Therefore, x ∈ Bⁿ and from part
(b) it follows that x ∈ B*. A similar argument holds for part (g).
(h) We show only A*A = A⁺. An intuitively appealing argument can be
constructed by noting A* = A⁰ ∪ A¹ ∪ A² ∪ A³ ∪ ··· and therefore
A*A = (A⁰ ∪ A¹ ∪ A² ∪ A³ ∪ ···)A
= A⁰A ∪ A¹A ∪ A²A ∪ ···
= A¹ ∪ A² ∪ A³ ∪ ···
= A⁺.
The preceding argument, while valid, uses the fact that set product
distributes over infinite unions, which we have not proved. The following
alternative argument does not use this fact.
x ∈ A*A ⇔ x = yz for some y ∈ A* and z ∈ A
⇔ x = yz for some y ∈ Aⁿ and z ∈ A and n ∈ N.
The following theorem, due to Dean Arden, has many important applications in
the study of finite automata and formal languages.
Examples
(a) If A = {a} and B = ∅, then the equation X = AX ∪ B has the unique
solution X = A*B = ∅.
(b) If A = {a, ab} and B = {cc}, then the equation X = AX ∪ B has the
solution
X = {a, ab}*{cc}. #
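The claim that X = A*B satisfies X = AX ∪ B can be spot-checked by machine on all strings up to a length bound. The sketch below is a finite sanity check and not a proof; the helper names star_up_to and concat_up_to and the bound LIMIT are assumptions of ours, and it uses the sets of example (b).

```python
def star_up_to(A: set, max_len: int) -> set:
    """Strings of A* of length at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {x + y for x in frontier for y in A
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


def concat_up_to(A: set, B: set, max_len: int) -> set:
    """Strings of the set product AB of length at most max_len."""
    return {x + y for x in A for y in B if len(x + y) <= max_len}


LIMIT = 6
A = {"a", "ab"}
B = {"cc"}

X = concat_up_to(star_up_to(A, LIMIT), B, LIMIT)            # A*B, truncated
right_side = concat_up_to(A, X, LIMIT) | {b for b in B if len(b) <= LIMIT}
```

Truncation is harmless here because every factor of a string of length at most LIMIT also has length at most LIMIT, so X and right_side agree exactly when the equation holds on all strings within the bound.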
1. Let A = {A, a}, B = {ab}. List the elements of the following sets.
(a) A²
(b) B³
(c) AB
(d) A*
(e) B*
2. Let A, B, and C be languages over Σ. Prove the following relationships.
(a) A(BC) = (AB)C
(b) AᵐAⁿ = Aᵐ⁺ⁿ for all m, n ≥ 0. (This implies that {Λ}A = A{Λ} = A.)
(c) (Aᵐ)ⁿ = Aᵐⁿ for all m, n ≥ 0
3. Let A and B be languages such that A? = B. Does it follow that A = B? Prove
your assertion.
While A* = A⁺ ∪ {Λ}, it is not generally true that A⁺ = A* − {Λ}. For Σ = {a},
find the smallest set A such that A⁺ ≠ A* − {Λ}.
(a) Prove that the operation of set product distributes over infinite union, i.e., show
that
A(⋃_{i∈N} Bᵢ) = ⋃_{i∈N} (ABᵢ).
A similar proof can be used to show the other distributive law,
(⋃_{i∈N} Bᵢ)A = ⋃_{i∈N} (BᵢA).
(b) Prove that
A*B = ⋃_{i=0}^{∞} AⁱB.
Let A and B be arbitrary languages over Σ. Prove the following.
(a) (A*)* = A*
(b) Λ ∈ A ⇔ A⁺ = A*
(©) (4%) =a"
(d) A*A* = At
(e) (A*B*)* = (A* ∪ B*)*
Show that if A ≠ ∅ and A² = A, then A* = A.
Let A, B, and C be languages over Σ. Determine which of the following assertions
are true and give counterexamples for those that are false.
(a) (A*)ⁿ = (Aⁿ)* for any n ∈ N
(b) (AB)* = (BA)*
(c) (A − B)C = AC − BC
(d) A* ⊂ B* ⇒ A ⊂ B
(e) (A*B*)* = (B*A*)*
(f) A ∪ B ∪ C ⊂ A*B⁺C⁺
(g) (A⁺)* = A⁺
(h) (Ā)* = $\overline{A^*}$, where B̄ = Σ* − B
(i) (AB)*A = A(BA)*
(j) (A*B)*A* = (A* ∪ B*)*
(k) A⁺ = A⁺A⁺
Let E₁, E₂, ..., Eₙ be subsets of Σ*. Is it always true that
(E₁ ∪ E₂ ∪ ··· ∪ Eₙ)* = (E₁*E₂* ... Eₙ*)*?
Prove your assertion.
Complete the proof of Theorem 2.7.4 by showing that X = A*B is a solution to the
equation X = AX U B.
Assume the same hypotheses on A and B as in Theorem 2.7.4. Find the solutions to
the equation X = XA U B. Prove your assertion.
12. Suppose X = AX ∪ B and Λ ∈ A. Show that if C ⊃ B then X = A*C is a solution.
13. Let A = {a}, B = {b}. Using Theorem 2.7.4, find subsets X₁, X₂ of {a, b}* which
solve the following set of simultaneous set equations. (Hint: Solve for one variable
in terms of the remaining variables and then substitute.)
(a) X; = AX; U BX,
(b) X, = AX,
14. Use finite sets and set operations to characterize the following languages over
Σ = {a, b}. For example, the set of strings of even length is {aa, ab, ba, bb}*.
(a) The set of strings of odd length.
(b) The set of strings which contain exactly one occurrence of a.
(c) The set of strings which either begin with an a or end with 2 b’s or both.
(d) The set of strings which contain at least 3 consecutive a’s.
(e) The set of strings which contain the substring “bbab.”
BINARY RELATIONS
3.0 INTRODUCTION
Relations characterize structure. In the last chapter we studied sets and their
elements. In this section we will study some basic forms of structure which can be
represented by relationships between elements of sets. Relations are of fundamental
importance to both the theory and applications areas of computer science. A com-
posite data structure, such as an array, list, or tree, is generally used to represent
a set of data objects together with a relation which holds between members of the
set. Relations which are a part of a mathematical model are often implicitly rep-
resented by relations within a data structure. Numerical applications, information
retrieval, and network problems are examples of application areas where rela-
tions occur as a part of the problem description, and manipulation of the relations
is important in solution procedures. Relations also play an important role in the
theory of computation, including program structure and analysis of algorithms.
In this chapter we will develop some of the fundamental tools and concepts asso-
ciated with relations.
Sec. 3.1
BINARY RELATIONS AND DIGRAPHS 121
Examples
Let A = {1, 2}, B = {m, n}, C = {0} and D = ∅. Then
(a) A × B = {<1, m>, <1, n>, <2, m>, <2, n>},
(b) A × C = {<1, 0>, <2, 0>},
(c) A × D = ∅.
When A and B are sets of real numbers, then A × B can be represented as
a set of points in the cartesian plane. For example, let A = {x | 1 ≤ x ≤ 2} and
B = {y | 0 ≤ y ≤ 1}. Then
(d) A × B = {<x, y> | 1 ≤ x ≤ 2 ∧ 0 ≤ y ≤ 1}, and
(e) B × A = {<y, x> | 1 ≤ x ≤ 2 ∧ 0 ≤ y ≤ 1}.
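For finite sets, the cartesian products in examples (a) through (c) can be computed with the standard library function itertools.product. The sketch below is illustrative; ordered pairs are modeled as Python tuples and the elements m and n are represented by the strings "m" and "n", both of which are choices of ours.

```python
from itertools import product

A = {1, 2}
B = {"m", "n"}
C = {0}
D = set()          # the empty set

AxB = set(product(A, B))   # all ordered pairs <a, b> with a in A, b in B
AxC = set(product(A, C))
AxD = set(product(A, D))   # a product with the empty set is empty
BxA = set(product(B, A))
```

Comparing AxB with BxA shows concretely that the order of the factors matters: the two sets have the same size but no pair in common here.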
†Associativity is sometimes an annoying problem when treating cartesian products.
Definition 3.1.2 distinguishes between the sets A₁ × A₂ × A₃, (A₁ × A₂) × A₃, and A₁ × (A₂ × A₃)
because the elements of these sets are of the forms <a₁, a₂, a₃>,
<<a₁, a₂>, a₃>, and <a₁, <a₂, a₃>>
respectively. These distinctions are sometimes important, but we will usually wish to use the set
A₁ × A₂ × A₃. We will therefore treat the binary operation of cartesian product as though it
were associative, unless specific mention is made to the contrary.
These relations are represented by the shaded areas in the following diagrams.
[Diagrams: shaded regions A × B and B × A in the cartesian plane, with each axis marked 1, 2, 3.]
The preceding examples show that the operation of binary cartesian product
is not commutative, i.e., it is generally not true that A × B = B × A. The
following theorem establishes that the operation of binary cartesian product
distributes over union and intersection.
Examples
(a) Let the universe of discourse be the set A = {1, 2, 3}. The three variable
predicate “x + y = z” on the universe A corresponds to the relation
R = {<x, y, z> | x + y = z} on A.
(b) Consider the universe N. The property “x is an even integer” can be charac-
terized by a unary predicate
P(x) <> x is even,
or a unary relation
{<x>|x is even},
or a subset
{x | x is even}. #
Binary Relations
The most important class of relations is the set of binary relations. Because
binary relations are referred to more frequently than others, the unqualified term
Definition 3.1.5: Let R be a binary relation over A × B. The set A is the
domain of R; B is the codomain. We denote <a, b> ∈ R by the infix notation aRb
and <a, b> ∉ R is denoted by aR̸b.
Examples
(a) Let L be the relation on the integers I of “less than.” Then we write 4 < 6 to
denote <4, 6> ∈ L and 6 ≮ 4 to denote <6, 4> ∉ L.
(b) Let M denote the relation “is a multiple of” for the universe N. Then 4M2 but
2M̸4. More generally, xMy if and only if x = ky for some k ∈ N. Thus for
all x, 0Mx and xM1. If p > 1, then p is prime if pMx implies that either
x = 1 or x = p. A number x is odd if xM̸2.
(c) When a compiler translates a computer program it constructs a symbol table
which contains the symbolic names which occur in the program, the attributes
associated with each name, and the program statements in which each name
occurs. Thus if S is the set of symbols, A is the set of possible attributes and P
is the set of program statements, then the symbol table includes information
which represents binary relations from S to A and S to P.
(d) Let A be a set of documents in a library, and B be a set of descriptors used to
describe the documents. Let R be the relation from A to B such that aRb
if and only if the descriptor b applies to document a. For example, if X is an
article on automatic word recognition, then <X, “pattern recognition”> and
<X, “speech processing”> might be elements of R. Such relations form a basis
for automatic document retrieval systems. The user of such a system describes
his interests by choosing a set of appropriate descriptors; the document
retrieval system uses the relation R to determine what documents in the library
are likely to be relevant to the user’s needs.
(e) Binary relations on the set of real numbers can be represented graphically in the
cartesian plane. The following is a graph of the relation {<x, y> | |x| + |y| = 1}.
Example
The relation “less than” over the natural numbers N can be defined inductively
as follows (the corresponding “ordered pair” formulation is given on the right for
the basis and induction clauses):
If D = <A, R> is a digraph and A is a finite set, then D is called a finite digraph.
A finite digraph <A, R> can be represented graphically by denoting the elements
of A by labeled points. An arc xRy is represented by an arrow from x to y.
We will frequently represent digraphs with such diagrams, and in fact we will
call such a diagram a digraph, even though it is only a convenient representation.
Examples
(a) Let D = <A, R>, where A = {a, b, c, d} and R = {<a, c>, <b, c>, <a, a>}. The
digraph D is represented by the following diagram.
†The definitions and terminology used in graph theory vary considerably among different
authors. We have chosen the nomenclature most appropriate for our purposes but the reader is
advised to be alert for differences in definitions when consulting other works.
126 BINARY RELATIONS Ch. 3
(b) Let D = <N, R>, where the relation R consists of all integer pairs of the form
<x, x + 2>. Although N is infinite, we can represent this digraph by the
following (incomplete) diagram:
[Diagram: nodes 0, 1, 2, 3, 4, ... with an arc from each node x to node x + 2.] #
Digraphs constitute an important class of data structures. They may be
represented in a computer memory in a variety of ways, each of which has its
particular advantages. If the vertices of the digraph are indexed from 1 to n, the
digraph can be represented by an n × n binary matrix M, called the incidence
matrix, where the entry in the ith row and jth column of M, denoted M[i, j], is 1
if there is an arc from the ith node to the jth node; otherwise M[i, j] = 0.
An alternate representation of a digraph consists of a list of ordered pairs
where <i, j> is included in the list if and only if there is an arc from node i to node j.
Still another representation is a linked list, where each node of the graph is rep-
resented by its label and a list of pointers to the other nodes of the graph; each
pointer represents an arc. These representations, illustrated in Figure 3.1.1, are
only some of many possible ways to represent a digraph.
Figure 3.1.1
Incidence Matrix        List of Ordered Pairs
     1  2  3  4
 1   0  1  0  0         (1, 2)
 2   1  0  1  0         (2, 1)
 3   0  0  0  0         (2, 3)
 4   1  0  0  0         (4, 1)
Linked List: [diagram of the four labeled nodes with one pointer per arc]
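The three representations can be sketched in Python for the four-node digraph of Figure 3.1.1, whose arcs are (1, 2), (2, 1), (2, 3), and (4, 1). This is only an illustrative sketch: the function names are assumptions, and a dictionary of successor lists stands in for the linked list of pointers.

```python
# Arcs of the digraph, stored as a list of ordered pairs.
ARCS = [(1, 2), (2, 1), (2, 3), (4, 1)]
N = 4  # the nodes are indexed 1..N


def incidence_matrix(arcs, n):
    """Return the n x n binary matrix M with M[i-1][j-1] = 1 iff <i, j> is an arc."""
    m = [[0] * n for _ in range(n)]
    for i, j in arcs:
        m[i - 1][j - 1] = 1
    return m


def linked_lists(arcs, n):
    """Return a dict mapping each node to the list of nodes its arcs point to."""
    successors = {i: [] for i in range(1, n + 1)}
    for i, j in arcs:
        successors[i].append(j)
    return successors


def pairs_from_matrix(m):
    """Recover the list of ordered pairs from an incidence matrix."""
    return [(i + 1, j + 1)
            for i, row in enumerate(m)
            for j, bit in enumerate(row) if bit]
```

The matrix answers "is there an arc from i to j?" in one lookup but always uses n × n entries; the pair list and the successor lists grow only with the number of arcs.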
Definition 3.1.7: Let D = <A, R> be a digraph. If aRb, then the arc <a, b>
originates at a and terminates at b. An arc of the form <a, a> is called a loop. The
number of arcs which originate at a node a is called the outdegree of node a; the
number of arcs which terminate at a is called the indegree of node a.
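As a small illustration of Definition 3.1.7, the following Python sketch computes indegrees, outdegrees, and loops for the digraph with A = {a, b, c, d} and R = {<a, c>, <b, c>, <a, a>} from an earlier example. The function names and the representation of arcs as a set of pairs are assumptions made for the sketch.

```python
from collections import Counter


def degrees(nodes, arcs):
    """Map each node to the pair (indegree, outdegree)."""
    out_count = Counter(a for a, _ in arcs)  # arcs originating at a
    in_count = Counter(b for _, b in arcs)   # arcs terminating at b
    return {v: (in_count[v], out_count[v]) for v in nodes}


def loops(arcs):
    """Return the arcs of the form <a, a>."""
    return {(a, b) for a, b in arcs if a == b}


A = {"a", "b", "c", "d"}
R = {("a", "c"), ("b", "c"), ("a", "a")}
```

Note that a loop <a, a> counts once toward the indegree of a and once toward its outdegree, since it both originates and terminates at a.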
Examples
(a) Let D be the following digraph:
[Diagram of the digraph D on the nodes a, b, c, and d.]
Then <a, c>, <a, b, c>, <a, c, a, c> and <a, b, b, c> are directed paths from a to
c; of these, the first two are simple and the last two are not. The sequences
<c, b, d> and <c, a, b, d> are undirected paths from c to d. The sequences
<a, c, a> and <a, b, c, a> are simple cycles; <a, c, a, c, a> and <a, b, b, c, a> are
cycles but not simple. The path <a> is a simple cycle of length 0. Node a has
indegree 1 and outdegree 2; node d has indegree 1 and outdegree 0.
(b) Algorithms are often represented by flowcharts; a flowchart is a directed graph
with labeled nodes and arcs. The node labels are represented by boxes of
various shapes, together with notations written inside the boxes; the labelled
nodes represent starting points, exits, operations and tests. If a node has only
one outgoing arc, the arc is commonly left unlabeled; in the case of a test node,
outgoing arcs are labeled to indicate the results of the test, e.g., true and false,
< and >, etc. A careful characterization of the class of flowcharts would
include other constraints on the form of a flowchart graph; for example, it
would be reasonable to require that each flowchart have exactly one start node
and at least one stop node.
A computation consisting of the execution of an algorithm represented by
a flowchart corresponds to a path which begins at the start node of the flowchart.
The computation halts if the path terminates at a stop node. A proof of
correctness of the algorithm must treat every directed path from the start node
to a stop node; for this reason, proofs of correctness often take the form of
proofs by cases. #
We often wish to refer to parts of a digraph. For this purpose, we define
subdigraphs and partial subdigraphs. A subdigraph is obtained from a digraph by
taking a subset of nodes and all arcs between nodes of the subset. A partial subdigraph
also contains a subset of nodes but need only contain some of the arcs
between nodes of the subset.
Examples
If D = <A, R> is represented by
[Diagram of D.]
The following is a partial subdigraph but not a subdigraph of D, since the loop
<a, a> is not included.
[Diagram of the partial subdigraph.]
Example
Consider the following digraphs:
[Diagrams of three digraphs, labeled (i), (ii), and (iii).] #
Definition 3.1.12: Let A be a set with n elements. The complete digraph over
A is the digraph <A, A × A>, that is, A together with the universal binary relation
on A.
Example
The following diagrams are complete digraphs over sets with 1, 2, and 3 elements.
[Diagrams of the three complete digraphs.]
1. Let A = {0, 1, 2, 3,4}. For each of the predicates given below, specify the set of
n-tuples in the n-ary relation over A which corresponds to the predicate. For parts
(d)-(f), draw the digraph which represents the relation.
(a) P(x) ⇔ x < 1
(b) P(x) ⇔ 3 > 2
(c) P(x) ⇔ 2 > 3
(d) P(x, y) ⇔ x < y
(e) P(x, y) ⇔ ∃k[x = ky ∧ k < 2]
(f) P(x, y) ⇔ [x = 0 ∨ 2x < 3]
(g) P(x, y, z) ⇔ x² + y = z
2. For the following digraphs A and B,
(A)
(a) Find all simple paths from node a to node c. Give the path lengths.
(b) Find the indegree and outdegree of each node.
(c) Find all simple cycles with initial and terminal node a.
(d) Find the subdigraph containing the nodes a and c.
(e) Determine how many partial subdigraphs exist which contain only nodes a and c.
3. For each of the following, sketch a digraph of the given binary relation on A. State
whether the digraph is disconnected, connected or strongly connected, and state
how many components the digraph has.
(a) {<1, 2>, <1, 3>, <2, 4>}, where A = {1, 2, 3, 4}
(b) {<1, 2>, <3, 1>, <3, 3>}, where A = {1, 2, 3, 4}
(c) {<x, y> | 0 < x < y < 3}, where A = {0, 1, 2, 3, 4}
(d) {<x, y> | 2 < x, y < 7 ∧ x divides y}, where A = {n | n ∈ N ∧ n < 10}
(e) {<x, y> | 0 < x − y < 3}, where A = {0, 1, 2, 3, 4}
(f) {<x, y> | x and y are relatively prime}, where A = {2, 3, 4, 5, 6}.
4. Construct the incidence matrix for the following binary relation on {0, 1, 2, 3, 4, 5, 6}:
{<x, y> | x < y ∨ x is prime}.
5. For each of the following, give an inductive definition for the relation R on N. In
each case, use your definition to show x € R.
(a) R= {a,b|a>b};x =G,1>
(b) R = {<a, b> | a = 2b}; x = <6, 3>
(c) R = {<a, b, c> | a + b = c}; x = <1, 1, 2>
(b) Defining ordered triples is not completely straightforward. Show that the
following definition of ordered triples does not have the property for equality
specified in Definition 3.1.1.
An ordered triple <a, b, c> with first element a, second element b,
and third element c is the set {{a}, {a, b}, {a, b, c}}.
3.2 TREES
The set of digraphs known as trees represent an important class of binary relations.
Trees provide a way to represent hierarchical structures, such as a family gen-
ealogy, the administrative structure of a corporation, or a categorization of
a collection of objects into classes. We will consider a few of the many applications
of trees in computer science, including data structures and the design and analysis
of algorithms.
Trees denote a particular kind of binary relation. Because the graphical representation
is such a natural one, definitions and theorems are usually couched in
the terminology of the digraphs rather than that of the binary relations.
Definition 3.2.1: A tree is a digraph with a nonempty set of nodes such that
(i) there is exactly one node, called the root of the tree, which has indegree 0;
(ii) every node other than the root has indegree 1;
(iii) for every node a of the tree, there is a directed path from the root to a.
We will represent trees with the root node at the top and all arcs directed down-
ward, leaving the arrowheads of the arcs implicit.
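The three conditions of Definition 3.2.1 can be checked mechanically. The following Python sketch is one way to do so, assuming the digraph is given as a set of nodes and a set of arcs (ordered pairs) whose endpoints all belong to the node set; the name is_tree is hypothetical.

```python
def is_tree(nodes, arcs):
    """Return True iff the digraph <nodes, arcs> satisfies Definition 3.2.1."""
    if not nodes:
        return False  # a tree has a nonempty set of nodes
    indegree = {v: 0 for v in nodes}
    sons = {v: [] for v in nodes}
    for a, b in arcs:
        indegree[b] += 1
        sons[a].append(b)
    # (i) exactly one node, the root, has indegree 0
    roots = [v for v in nodes if indegree[v] == 0]
    if len(roots) != 1:
        return False
    root = roots[0]
    # (ii) every node other than the root has indegree 1
    if any(indegree[v] != 1 for v in nodes if v != root):
        return False
    # (iii) every node is reachable from the root by a directed path
    seen = {root}
    stack = [root]
    while stack:
        v = stack.pop()
        for w in sons[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == set(nodes)
```

Condition (iii) is not redundant: a root together with a separate two-node cycle satisfies (i) and (ii) but is rejected by the reachability check.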
Examples
(a) The following digraphs are trees. The root of each tree is node a.
[Diagrams of several trees, each with root a.]
Because trees are such an important class of digraphs, there is a rich ter-
minology associated with them. Different authors, however, do not use the terms
consistently. We will use just a few of the most widely accepted terms.
Example
Consider the following tree.
[Tree diagram: root a with sons b and c; node b has sons d, e, and f.]
The root of the tree is node a. The root a has two sons, b and c; node b has three
sons and d has no sons. The father of d is b. The leaves of the tree are the nodes
c, d, e, and f; a and b are the only interior nodes. The height of the tree is 2. The
subdigraph with nodes {b, d, e, f} is a subtree with root b. The subdigraph consisting
only of node d is a subtree of height 0 with root d. #
The usefulness of trees is due in part to the restrictions on paths which are
implied by their definition. These restrictions make it possible to traverse a tree
algorithmically (visit all its nodes) and perform searches for data more efficiently
than is possible with the general class of digraphs. The following theorems estab-
lish some of the most important properties of paths in trees.
Theorem 3.2.1: Let T be a tree with root r and let a be any node of T. Then
there is a unique directed path from r to a.
induction hypothesis, the path <r, b_1, b_2, ..., b_{n-1}> is the only directed
path in T from r to b_{n-1}. Since the indegree of a is 1, there is only one
directed path of the form <b_{n-1}, a>; i.e., there is a unique element b_{n-1}
such that <r, b_1, b_2, ..., b_{n-1}, a> is a directed path. Thus the only directed
path from r to a consists of the unique path from r to b_{n-1} followed by
the unique path from b_{n-1} to a. ∎
Examples
(((6 + 4) * 8) − (4 * 5))
can be represented by the following labeled ordered tree.
[Labeled ordered tree diagram.]
operand of the expression is nested in parentheses to a depth that equals its distance
from the root of the tree. Because expressions in innermost parentheses are evaluated
first, the tree is evaluated by starting at the bottom and assigning values to each
interior node. A node labeled with an operator is assigned the value which results
from performing the operation on the values of its sons. The process can be viewed
as a collapsing of the tree upward; this is illustrated by the following sequence of
trees. Each tree of the sequence is obtained from its predecessor by collapsing a
subtree consisting of a node and the two leaves which are its sons.
The procedure described above is a “bottom-up” evaluation of the tree representing
the expression. Such trees can also be evaluated in a “top-down” fashion by using
a recursive procedure to express the value of each node in terms of the values of
its sons. We leave it as an exercise to write a recursive procedure for top-down
evaluation of trees which represent algebraic expressions. #
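The bottom-up "collapsing" process can be sketched in Python. The sketch assumes, purely for illustration, that a tree is either a number (a leaf) or a tuple (operator, left son, right son), and it takes (((6 + 4) * 8) − (4 * 5)) as the input expression. The names collapse_once and evaluate_bottom_up are hypothetical, and the choice to collapse the leftmost eligible subtree first is one of several valid orders.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def collapse_once(tree):
    """Collapse the leftmost operator node whose two sons are leaves.

    Returns (new_tree, collapsed), where collapsed says whether a
    subtree was replaced by its value.
    """
    if not isinstance(tree, tuple):
        return tree, False            # a leaf cannot be collapsed
    op, left, right = tree
    if not isinstance(left, tuple) and not isinstance(right, tuple):
        return OPS[op](left, right), True
    new_left, done = collapse_once(left)
    if done:
        return (op, new_left, right), True
    new_right, done = collapse_once(right)
    return (op, left, new_right), done


def evaluate_bottom_up(tree):
    """Return the sequence of trees produced by repeated collapsing."""
    steps = [tree]
    while isinstance(tree, tuple):
        tree, _ = collapse_once(tree)
        steps.append(tree)
    return steps


EXPR = ("-", ("*", ("+", 6, 4), 8), ("*", 4, 5))
```

Each element of the returned list corresponds to one tree in the collapsing sequence; the last element is the value of the whole expression.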
Search Trees
One of the most important uses of trees is for storing collections of records,
where each record may consist of several associated data items. Such a collection
of records is called a file. The choice of how a file is stored is based on a number
of factors, including the frequency with which certain operations are performed
on the file. Common operations on a file include insertion of a new record, deletion
of a record, and searching for a record in the file. The most straightforward
search techniques are based on the value of some specific field or item in each record
called the search key. For example, a file consisting of employee records might use
the social security number as a search key; each record would then have the
employee’s social security number as its search key value. In many search techniques,
the value of the search key of the record sought is used to direct the search;
if the value of the search key in each record is unique, then the search key can be
used for record identification as well.
A file can be organized for fast access using a search key by means of a
type of binary tree known as a binary search tree. To illustrate search trees and
their use, we will assume a file whose key values are all distinct, and a search tree
in which a single record of the file is stored at each node of the tree. A binary
search tree is constructed so that if node b is the left son of node a, then the key of
every descendant of b (including b itself) is less than the key of a. On the other
hand, if node b is the right son of node a, then the key of every descendant of b is
greater than or equal to the key of a.
Example
The following is a binary search tree. Each node label is the key of the record
stored at the node.
[Diagram of a binary search tree.] #
We will illustrate the use of binary search trees by describing two search pro-
cedures. To construct these programs, we need a way to refer to the key of the
record stored at a node of a tree as well as the left son and right son of the node.
If node is a program variable whose value is a tree node, then KEY (node) will
denote the value of the key stored in the record at node. LEFTSON (node) and
RIGHTSON (node) will have as values the left son and right son of node respec-
tively, if these sons exist; if no left son exists, the value of LEFTSON (node) will
be the distinguished value null and similarly for RIGHTSON (node).
A search algorithm for records stored in a binary search tree is given in Figure
3.2.1. To find a record in the tree, we call TREESEARCH1 (root, arg), where the
value of arg is the key of the record sought and the value of root is the root node
of the search tree unless the file contains no records, in which case the value of
root is null. After node is set equal to root, if node ≠ null, then arg is compared
with KEY (node). If arg = KEY (node), then the record has been found and is
stored at the root node of the search tree. If arg < KEY (node), then either the
record is not in the file, or it is in the subtree whose root is LEFTSON (node).
If arg > KEY (node), then either the record is not in the file or it is stored in the
subtree whose root is RIGHTSON (node).
The search proceeds by progressing down into the tree, at each step examin-
ing a node which is a son of the node previously examined. If the record is in the
file, the procedure will eventually find it by following the (unique) simple directed
path from the root of the tree to the correct node. If the record is not in the file,
the search will eventually either reach a node whose key value is greater than arg
and which has no left son, or it will reach a node whose key value is less than arg
and which has no right son. In these cases, the search procedure will terminate after
assigning the value of null to node.
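The iterative search described above can be sketched in Python as follows. This is a reconstruction from the prose, not the procedure of the figure itself: the Node class and the names tree_search_1, key, left, and right are assumptions standing in for the search procedure, KEY, LEFTSON, and RIGHTSON, and None plays the role of null.

```python
class Node:
    """A tree node holding one key, a left son, and a right son."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search_1(root, arg):
    """Return the node whose key equals arg, or None if there is none."""
    node = root
    # Move down one level per iteration until the key is found
    # or the search falls off the tree.
    while node is not None and arg != node.key:
        if arg < node.key:
            node = node.left    # the record can only be in the left subtree
        else:
            node = node.right   # the record can only be in the right subtree
    return node


# A small search tree with keys 1, 2, 4, 5, 8, 9 (hypothetical data).
SAMPLE = Node(5, Node(2, Node(1), Node(4)), Node(8, None, Node(9)))
```

The loop examines at most one node per level, so the number of comparisons is bounded by the height of the tree plus one.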
The procedure TREESEARCH1 given in Fig. 3.2.1 is called an iterative
procedure because the principal computation is done in a loop; in this case, the
loop uses a while statement. A search of a binary tree can also be done recursively.
The recursive search procedure rests on the following inductive definition of
binary trees. (This recursive definition of binary trees is equivalent to the non-
recursive characterization given earlier in this section, but we will not prove the
equivalence.)
examining the root and then, if necessary, searching either the left or right subtree
of the root.” TREESEARCH2 is called in the same way and returns the same
value as TREESEARCH1.
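A recursive search in this spirit might look like the following Python sketch. It is an illustration written from the description above, not the book's own procedure; the Node dataclass and the name tree_search_2 are assumptions, and None stands for null.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A tree node holding one key, a left son, and a right son."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_search_2(root, arg):
    """Return the node whose key equals arg, or None if there is none."""
    if root is None:
        return None                    # an empty tree contains no record
    if arg == root.key:
        return root                    # found at the root of this subtree
    if arg < root.key:
        return tree_search_2(root.left, arg)
    return tree_search_2(root.right, arg)


# A small search tree with keys 1, 2, 4, 5, 8, 9 (hypothetical data).
SAMPLE = Node(5, Node(2, Node(1), Node(4)), Node(8, None, Node(9)))
```

Each recursive call descends into one subtree, so the recursion depth is bounded by the height of the tree plus one.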
The height of a binary search tree is a measure of the maximum number of
steps it will take to locate a record in the file. The following theorems relate the
size of the file to the height of a binary tree. The proofs are left as exercises.
We have described binary search trees with records stored at all nodes of the
tree. In some circumstances it is advantageous to store records only at the interior
nodes or only at the leaves. If records are stored at interior nodes, each leaf can
have an associated action which is to be taken if a search fails at that leaf. If records
are stored only at the leaves, each interior node contains a value for comparison
rather than an entire record. In this case, each leaf may be a single record, or it
may be a “bucket” which contains a subfile. A search for a record in such a tree
need not read all records of the file into main storage, since the interior of the
tree can be searched and the result used to bring only the appropriate bucket into
main storage.
Using only interior nodes or only leaves for record storage significantly in-
creases the number of nodes of the tree, but it has only a small effect on the height
of the tree; as a consequence, the number of steps of a search procedure in such
a tree is not much larger than that for other search trees. We leave it as an exercise
to show that if all leaves of a tree are approximately the same distance from the
root, then the height of a search tree with records stored only at the leaves is only
slightly greater than the height of one with records stored at all nodes.
If records are stored only at leaves of a search tree, then there must be some
value, which we will call a discriminator, associated with each internal node of the
tree. The discriminator of a node is used to direct the search process in the same
way as the key of a record stored at the node.
Example
The following graphs are binary search trees for a file with the key set
{0, 2, 4, 7, 8, 9}. In the tree on the left, records have been stored in all nodes and
each node is labelled with the key of its record. In the tree on the right, all records
are stored at the leaves, which we have drawn as squares. Labels of the internal
nodes of this tree are discriminators and need not be members of the key set.
†Unless specified otherwise, all logarithms in this book are to the base 2.
By storing more than one discriminator at each node, it is possible to implement
ternary or higher-order searches. For example, for a ternary search, each
node has two discriminators d_1 and d_2 and an outdegree of 3 or less. When searching
for a record with key k, if k < d_1, then the left subtree is searched, if d_1 ≤ k < d_2,
the middle subtree is searched, and if k ≥ d_2, then the right subtree is searched.
Example
The following graph represents a ternary search tree. The two discriminators
of each internal node are given as a node label x : y. Records are stored only at the
leaves of the tree.
[Diagram of a ternary search tree.]
Tree Traversal Algorithms
When using trees as data structures, it is often necessary to traverse the tree,
that is, to inspect each data item stored in the tree. We will describe three traversal
algorithms for binary trees; each traversal scheme will be defined by specifying
an order for processing the three components of root, left subtree and right subtree.
We consider the following three orders.
Visit the root, then the left subtree, then the right subtree.
Visit the left subtree, then the root, then the right subtree.
Visit the left subtree, then the right subtree, then the root.
Whatever choice is made, it is natural to apply the same strategy to the subtrees
as was chosen for the tree, making the traversal algorithm recursive. To describe
the three algorithms, we assume a binary tree T with root r, a left subtree T_L and
a right subtree T_R; note that T_L and T_R may not exist. The order in which the nodes
of T are visited is called preorder, inorder, or postorder depending on whether the
root is visited first, second, or third. The following are recursive definitions of the
three traversal algorithms.
Preorder: 1. Process the root node r of T.
2. If T_L exists, then process T_L in preorder.
3. If T_R exists, then process T_R in preorder.
Inorder: 1. If T_L exists, then process T_L in inorder.
2. Process the root node r of T.
3. If T_R exists, then process T_R in inorder.
Postorder: 1. If T_L exists, then process T_L in postorder.
2. If T_R exists, then process T_R in postorder.
3. Process the root node r of T.
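The three recursive definitions translate almost line for line into code. The Python sketch below assumes, for illustration only, that a binary tree is either None (the subtree does not exist) or a tuple (label, left subtree, right subtree); "processing" a node is modeled as appending its label to a list.

```python
def preorder(tree):
    """Root first, then the left subtree, then the right subtree."""
    if tree is None:          # the subtree does not exist
        return []
    label, left, right = tree
    return [label] + preorder(left) + preorder(right)


def inorder(tree):
    """Left subtree first, then the root, then the right subtree."""
    if tree is None:
        return []
    label, left, right = tree
    return inorder(left) + [label] + inorder(right)


def postorder(tree):
    """Left subtree first, then the right subtree, then the root."""
    if tree is None:
        return []
    label, left, right = tree
    return postorder(left) + postorder(right) + [label]


# Root a with left son b and right son c; b has a left son d only.
TREE = ("a", ("b", ("d", None, None), None), ("c", None, None))
```

Testing for None at the top of each function plays the role of the "if the subtree exists" conditions in the definitions.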
Example
The node labels of the following binary trees give the order in which the nodes
are visited by each of the traversal algorithms.
[Three diagrams of a binary tree, captioned Preorder, Inorder, and Postorder, with each node labeled by its position in the corresponding traversal.] #
procedure LIST(root):
comment: using inorder traversal, list keys stored in binary tree.
begin
  if LEFTSON(root) ≠ null then LIST(LEFTSON(root));
  print KEY(root);
  if RIGHTSON(root) ≠ null then LIST(RIGHTSON(root))
end
If L is the set of possible node labels of a tree, then each traversal order cor-
responds to a unique word w over the alphabet L for any given tree. In general, it
is not possible to reconstruct the tree given only the word w and the traversal order,
but this reconstruction can be done in certain important cases. In particular, if
a labelled tree represents an algebraic expression, then each internal node is labelled
with an operation, such as +, —, *, and /, and each leaf is labelled with a variable
or a value. For such trees, if the node labels are listed in either preorder or post-
order, the result is a word from which the original algebraic expression can be
reconstructed. This way of representing algebraic expressions is known as paren-
thesis free or Polish notation and is extremely convenient for computer evaluation.
Evaluation is usually done using a pushdown store; a discussion of this topic is
beyond our scope.
Example
Consider the algebraic expression (a — (b + c)) * d and its associated labelled
binary tree:
Preorder traversal results in the word * − a + b c d, and postorder traversal produces
a b c + − d *. Both of these words can be used to reconstruct the original
tree, but the inorder expression a − b + c * d is ambiguous. #
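To illustrate why the postorder word determines the tree, the following Python sketch rebuilds the tree for (a − (b + c)) * d from its postorder word. The stack-based method, the tuple representation (operator, left, right), and the function names are assumptions chosen for the sketch; the text itself does not prescribe a reconstruction procedure. Single-character labels are assumed so that a word can be written as a string.

```python
OPERATORS = {"+", "-", "*", "/"}


def tree_from_postorder(word):
    """Rebuild an expression tree from its postorder word."""
    stack = []
    for symbol in word:
        if symbol in OPERATORS:
            # An operator's two sons are the two most recently completed trees.
            right = stack.pop()
            left = stack.pop()
            stack.append((symbol, left, right))
        else:
            stack.append(symbol)      # a leaf: a variable or a value
    if len(stack) != 1:
        raise ValueError("not a well-formed postorder word")
    return stack[0]


def parenthesized(tree):
    """Write the tree as a fully parenthesized infix expression."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return "(" + parenthesized(left) + " " + op + " " + parenthesized(right) + ")"
```

For example, `parenthesized(tree_from_postorder("abc+-d*"))` yields the fully parenthesized form of the original expression, which shows that no information was lost in the postorder word.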
1. State which of the following digraphs are trees. For those that are not, state why.
[Diagrams of the digraphs for Exercise 1.]
2. For each of the following trees identify the root, the leaves, the height, and all proper
subtrees.
7. (a) Prove that if any arc of a tree is deleted, the resulting digraph is not connected.
(b) Characterize the digraph which results when a single arc is deleted from a tree.
8. Give a recursive definition of the height of a binary tree.
9. Let S be a finite set of k integers. Describe an algorithm to construct a binary search
tree with k nodes, where each node is labelled with a distinct element of the set S.
Your algorithm should produce a tree of height ⌊log₂ k⌋.
10. Prove Theorem 3.2.3.
11. Prove Corollary 3.2.3.
12. (a) Prove that the number of interior nodes of a binary tree of height h > 0 is less
than 2’-!,
(b) Find an upper bound for the number of interior nodes of an n-ary tree of
height h.
13. Consider the following labelled binary tree.
Give the sequence of labels encountered when the tree is traversed in each of the
following orders.
(a) Preorder
(b) Inorder
(c) Postorder
14. Represent the following propositional forms as ordered trees.
(a) (AV B)>C]+(DV A)
(b) (A>B)A [OC V B= 4] (Note that this expression contains a unary
operator.)
15. Construct the labelled binary tree corresponding to the following parenthesis-free
expressions. These expressions were obtained by traversing the trees in the order
given.
(a ——-—abed (preorder)
(b) —a—b—cd (preorder)
(c) abcwdex] + (postorder)
16. Write a recursive procedure to evaluate an algebraic expression represented by
a labelled binary tree. Assume that the leaves of the tree are labelled with integers and
the only operations used are the binary operations +, −, *, and /.
17. Show that inorder traversal of labelled trees representing algebraic expressions may
produce an ambiguous expression; in particular, two trees representing different
expressions can produce the same word when inorder traversal is used.
18. (a) Show that the number of leaves on a complete binary tree is always one greater
than the number of interior nodes of the tree.
(b) Find an expression for the number of leaves on a complete n-ary tree in terms
of the number of interior nodes of the tree.
19. Let T_1 be a complete binary search tree of height h_1 with records stored in both interior
nodes and leaves such that the length of any path from the root of T_1 to a leaf
is either h_1 or h_1 − 1. Let T_2 be a complete binary search tree with records stored
only at the leaves; T_2 is of height h_2 and the length of any path from the root of T_2
to a leaf is either h_2 or h_2 − 1. Suppose both search trees contain n records.
(a) What is the difference in the heights of the trees?
(b) What conclusions can be drawn about the difference in the maximum number
of nodes visited in searching for a record in the two trees?
20. An array A can be used to represent a binary tree as follows:
(i) The root value is stored at A[1].
(ii) For each i such that a value of a tree node is stored at A[i], the value of the left
son of A[i] is stored at A[2i] and the value of the right son of A[i] is stored at
A[2i + 1].
A distinguished value can be used to indicate that the corresponding tree node does
not exist.
(a) How many entries must the array have if the tree is of height h?
(b) Generalize this technique for n-ary trees.
21. Let T be a complete binary tree with n leaves, b_1, b_2, ..., b_n, and let d_i be the
length of the path from the root to leaf b_i, 1 ≤ i ≤ n.
(a) Show that Σ_{i=1}^{n} 2^{-d_i} = 1.
(b) (For students with an understanding of elementary probability.) Interpret the
equality of part (a) in terms of probabilities, and generalize the equality for
complete n-ary trees.
(c) Show max{d_i} ≥ ⌈log n⌉.
Programming Problems
A relation R is reflexive on a set A if xRx for every x ∈ A, i.e., every element
is in the relation R to itself. The digraph of a reflexive relation has a loop on every
node of the digraph. A relation R is irreflexive on A if no element x € A is in the
relation R to itself. The digraph of an irreflexive relation does not have loops on
any nodes. Note that it is possible for a relation R to be neither reflexive nor
irreflexive; the graph of such a relation would have loops on some but not all
nodes.
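For a finite set, the two definitions can be tested directly. The Python sketch below represents a relation as a set of ordered pairs; the function names and the sample relations on {1, 2, 3} are assumptions chosen for illustration.

```python
def is_reflexive(relation, a):
    """True iff xRx for every x in the set a."""
    return all((x, x) in relation for x in a)


def is_irreflexive(relation, a):
    """True iff xRx holds for no x in the set a."""
    return all((x, x) not in relation for x in a)


A = {1, 2, 3}
LESS_OR_EQUAL = {(x, y) for x in A for y in A if x <= y}
LESS_THAN = {(x, y) for x in A for y in A if x < y}
MIXED = {(1, 1), (1, 2)}   # a loop on node 1 only
```

MIXED is neither reflexive nor irreflexive: its digraph has a loop on node 1 but none on nodes 2 and 3.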
Examples
(a) The relation of equality (a = b) is reflexive on any set.
(b) Consider the set of integers I. The relation ≤ is reflexive and not irreflexive,
and the relation < is irreflexive and not reflexive.
(c) Consider the following relations on the set Σ*, where Σ = {a, b}. The relation
“is the same length as” is reflexive and not irreflexive. The relation “is longer
than” is irreflexive and not reflexive. Let R be a relation such that xRy if and only
if some proper prefix of x is a proper suffix of y. Then R is neither reflexive nor
irreflexive, since aaRaa holds but abRab does not. #
Examples
(a) The relation of equality on any set is both symmetric and antisymmetric.
(b) For the set of integers I, the relations ≤ and < are both antisymmetric;
neither is symmetric. The relation “xRy if and only if the absolute values of
x and y are equal” is symmetric and not antisymmetric.
(c) For the set Σ*, the relation “is a substring of” is antisymmetric and not symmetric.
The relation “xRy if and only if x and y have a common nonempty
prefix” is symmetric and not antisymmetric. #
If R is a transitive relation, then whenever xRy and yRz it follows that xRz.
If D is the digraph of a transitive relation, and there are arcs from x to y and from
y to z, then there is an arc from x to z. It follows that if D is the digraph of a transitive
relation R and there is a path of length greater than 0 from x to y, then there is
an arc (a path of length 1) from x to y.
Examples
(a) The equality relation is transitive for all sets.
(b) For the set of integers I, the relations ≤ and < are transitive. The relation
“xRy if and only if x divides y” is also transitive.
(c) For the set Σ*, the relations “is a prefix of,” “is a proper prefix of,” “is a subword
of,” and “is the same length as” are all transitive. #
We conclude this section with examples which list the properties of some
specific relations.
Examples
Consider the set {1, 2,3} and the relations represented by the following
digraphs.
[Digraphs of the relations R_1, R_2, and R_3 on {1, 2, 3}.]
1. List the properties defined in Definition 3.3.1 which hold for the relations represented
by the following digraphs.
[Digraphs (a), (b), and (c) for Exercise 1.]
Reflexive
Irreflexive
Symmetric
Antisymmetric
Transitive
Transcribe each part of Definition 3.3.1 into logical notation. For example, part (a)
becomes
R is reflexive ⇔ ∀x[x ∈ A ⇒ xRx]
(a) Find a nonempty set and a relation on it which is neither reflexive nor ir-
reflexive. Choose the set to be as small as possible. What if the set is permitted
to be empty?
(b) Construct a binary relation on a nonempty set which is neither symmetric nor
antisymmetric. Choose the set to be as small as possible. What if the set is
permitted to be empty?
Consider the set of binary relations over an arbitrary set A. We say a property of
relations is preserved under a particular set operation if applying the operation to the
relation(s) results in a relation with the same property. For example, the reflexive
property is preserved under the binary operation of set union since the union of two
reflexive relations is reflexive. However, the reflexive property is not preserved under
the unary operation of set complement, since the absolute complement of a reflexive
relation on a nonempty set is not a reflexive relation. Complete the following table
with Y (yes) and N (no) according to whether the given property is preserved under
the indicated set operation. For each “no” answer, give a counterexample.
Reflexive
Irreflexive
Symmetric
Antisymmetric
Transitive
7. Sketch graphs of the following relations on the set of real numbers and determine for
each relation which of the properties in Definition 3.3.1 apply.
(a) {<x, y> | x = y}.
(b) {<x, y> | x² − 1 = 0 ∧ y > 0}.
(c) {<x, y> | |x| < 1 ∧ |y| > 0}.
8. (a) State which of the following terms apply to the binary relations represented by
trees: reflexive, irreflexive, symmetric, antisymmetric, transitive.
(b) Does the list of applicable terms completely describe the relations represented
by trees, or are there binary relations which possess these characteristics whose
digraphs are not trees?
Programming Problem
Write a program which takes as input the ordered pairs of a binary relation and determines
which of the properties of Definition 3.3.1 apply.
Examples
(a) If R_1 is the relation “is the brother of” and R_2 is the relation “is the father of,”
then R_1R_2 is the relation “is the paternal uncle of.”
(b) If R_1 is the relation “is the father of,” then R_1R_1 is the relation “is the paternal
grandfather of.”
(c) In the execution of programs written in a high level language, a sequence of
data conversions sometimes occurs. For example, a string of decimal digits
in an arithmetic expression may be converted first to a binary integer representation
and then to a floating point representation. If <x, y> ∈ R_1 implies
that digit string x is converted to binary integer y, and <y, z> ∈ R_2 implies
that binary integer y is converted to floating point number z, then <x, z> ∈
R_1R_2 implies that digit string x is converted to the floating point number z. #
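For finite relations, composition can be computed directly from the pairs. The Python sketch below follows the order used in these examples, in which the first relation is applied first; the function name compose and the people named in the sample data are hypothetical.

```python
def compose(r1, r2):
    """Return the pairs <a, c> for which some b has <a, b> in r1 and <b, c> in r2."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}


# Hypothetical family: Al is the brother of Bob, Bob is the father of Cy,
# and Cy is the father of Dee.
BROTHER_OF = {("Al", "Bob")}
FATHER_OF = {("Bob", "Cy"), ("Cy", "Dee")}

PATERNAL_UNCLE_OF = compose(BROTHER_OF, FATHER_OF)
PATERNAL_GRANDFATHER_OF = compose(FATHER_OF, FATHER_OF)
```

Here Al turns out to be the paternal uncle of Cy, and Bob the paternal grandfather of Dee. Reversing the arguments generally gives a different relation, so the order of composition matters.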
Proof:
(a) <a, c> ∈ R_1(R_2 ∪ R_3) if and only if there exists some b ∈ B such that
<a, b> ∈ R_1 and <b, c> ∈ R_2 ∪ R_3. Furthermore,
∃b[<a, b> ∈ R_1 ∧ <b, c> ∈ R_2 ∪ R_3]
⇔ ∃b[<a, b> ∈ R_1 ∧ (<b, c> ∈ R_2 ∨ <b, c> ∈ R_3)]
⇔ ∃b[(<a, b> ∈ R_1 ∧ <b, c> ∈ R_2) ∨ (<a, b> ∈ R_1 ∧ <b, c> ∈ R_3)]
⇔ ∃b[<a, b> ∈ R_1 ∧ <b, c> ∈ R_2] ∨ ∃b[<a, b> ∈ R_1 ∧ <b, c> ∈ R_3]
⇔ <a, c> ∈ R_1R_2 ∨ <a, c> ∈ R_1R_3
⇔ <a, c> ∈ R_1R_2 ∪ R_1R_3.
We leave the proofs of parts (b)-(d) as exercises. ∎
The above proof is in two parts in order to show containment in both direc-
tions. The theorem can also be proved using a sequence of equivalences as fol-
lows:
Proof:
<a, d> ∈ (R_1R_2)R_3
⇔ ∃c[<a, c> ∈ R_1R_2 ∧ <c, d> ∈ R_3]
⇔ ∃c[∃b[<a, b> ∈ R_1 ∧ <b, c> ∈ R_2] ∧ <c, d> ∈ R_3]
⇔ ∃c∃b[[<a, b> ∈ R_1 ∧ <b, c> ∈ R_2] ∧ <c, d> ∈ R_3]
⇔ ∃c∃b[<a, b> ∈ R_1 ∧ [<b, c> ∈ R_2 ∧ <c, d> ∈ R_3]]
⇔ ∃b∃c[<a, b> ∈ R_1 ∧ [<b, c> ∈ R_2 ∧ <c, d> ∈ R_3]]
⇔ ∃b[<a, b> ∈ R_1 ∧ ∃c[<b, c> ∈ R_2 ∧ <c, d> ∈ R_3]]
⇔ ∃b[<a, b> ∈ R_1 ∧ <b, d> ∈ R_2R_3]
⇔ <a, d> ∈ R_1(R_2R_3). ∎
(b) (R^m)^n = R^(mn)
The proofs are by induction and are left as exercises.
If D is the digraph of a binary relation R on a set A, then <x, y> ∈ R^n if and
only if there is a path of length n from node x to node y.
Examples
Let A = {a, b, c, d} and let R be the relation on A represented by the following
digraph:
[Figures omitted: digraph of R and digraphs of its powers]
Sec. 3.4 COMPOSITION OF RELATIONS 153
We note that R^4 = R^2 and hence R^4R = R^2R, that is, R^5 = R^3. Similarly
R^6 = R^4 = R^2. It follows (by induction) that R^(2n+1) = R^3 and R^(2n) = R^2 for n ≥ 1.
This relationship can be represented by the following digraph where each node
represents R^k for some k and an arc exists from X to Y if X·R = Y.
In the preceding example, not all powers of the relation R were distinct rela-
tions. In fact, if R is a relation on a finite set, this will always be the case, as the
following theorem asserts.
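A small Python sketch makes the repetition of powers concrete. The names `compose` and `first_repeat` are hypothetical, and the relation below is an invented stand-in (the digraph of the example is not reproduced here) chosen so that R^4 = R^2. The search always terminates on a finite set because only finitely many relations on that set exist.

```python
def compose(r1, r2):
    """Return R1R2: all (x, z) with (x, y) in R1 and (y, z) in R2 for some y."""
    return frozenset((x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2)

def first_repeat(r, universe):
    """Return (s, t) with s < t, t as small as possible, and R^s == R^t."""
    powers = [frozenset((x, x) for x in universe)]   # R^0 is equality, E
    while True:
        nxt = compose(powers[-1], r)                 # R^(k+1) = R^k R
        if nxt in powers:
            return powers.index(nxt), len(powers)
        powers.append(nxt)

# Invented relation on A = {a, b, c, d}; not the one drawn in the text.
R = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "c")}
s, t = first_repeat(R, "abcd")
```

For this relation the search stops with s = 2 and t = 4, matching the pattern R^4 = R^2 discussed above.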
If R is a relation on an infinite set A, then there may not exist two integers
s and t such that R^s = R^t. For example, if A = N and <x, y> ∈ R ⇔ y = x + 1,
then <x, z> ∈ R^s ⇔ z = x + s; in this case, all powers of R are distinct relations
on A.
Proof: Parts (a) and (b) require proofs by induction and are left as exercises.
(c) Let q ∈ N. If q < t, then R^q ∈ S by definition of S. Suppose q ≥ t.
Then we can express q in the form s + kp + i, where i < p. By part (b),
it follows that R^q = R^(s+i). Since s + i < t, this establishes that R^q ∈ S. ∎
Find the smallest integers m and n such that m < n and R^m = R^n.
4. Prove that if R is either the empty relation or the universal relation on a set A, then
R^2 = R.
5. Prove or disprove:
(a) If D is the digraph of a relation R and D is connected, then the digraph of R^n is
connected for every n > 0.
(b) If D is the digraph of a relation R and D is strongly connected, then the digraph
of R^n is strongly connected for every n > 0.
6. (a) Prove part (b) of Theorem 3.4.1.
(b) Give examples to show that the containment of parts (b) and (d) may be proper.
7. Prove Theorem 3.4.3.
8. Prove parts (a) and (b) of Theorem 3.4.5.
9. Let R₁ and R₂ be arbitrary relations on a set A.
Prove or disprove the following assertions.
(a) If R₁ and R₂ are reflexive, then R₁R₂ is reflexive.
(b) If R₁ and R₂ are irreflexive, then R₁R₂ is irreflexive.
(c) If R₁ and R₂ are symmetric, then R₁R₂ is symmetric.
(d) If R₁ and R₂ are antisymmetric, then R₁R₂ is antisymmetric.
(e) If R₁ and R₂ are transitive, then R₁R₂ is transitive.
Sec. 3.5 CLOSURE OPERATIONS ON RELATIONS 155
Proof:
(a) If R is reflexive, then R has all the properties given in Definition 3.5.1 for
the relation R’. Hence r(R) = R. Conversely, if r(R) = R, then by prop-
erty (i) of Definition 3.5.1, R is reflexive.
The proofs of (b) and (c) are similar. ∎
Examples
(a) The reflexive closure of the relation < on the integers I is ≤.
(b) The reflexive closure of E is E.
(c) The reflexive closure of ≠ is the universal relation.
(d) The reflexive closure of the empty relation is the relation of equality, E. #
Examples
(a) The converse of the relation < on I is the relation >.
(b) The converse of the relation ⊂ on a collection of sets is the relation ⊃. #
Theorem 3.5.3: Let R, R,, and R, be binary relations from A to B. Then each
of the following holds.
(a) (R^c)^c = R
(b) (R₁ ∪ R₂)^c = R₁^c ∪ R₂^c
(c) (R₁ ∩ R₂)^c = R₁^c ∩ R₂^c
(d) (A × B)^c = B × A
(e) ∅^c = ∅
(f) (R̄)^c = (B × A) − R^c, where R̄ denotes (A × B) − R.
(g) (R₁ − R₂)^c = R₁^c − R₂^c
(h) If A = B, then (R₁R₂)^c = R₂^c R₁^c
(i) R₁ ⊂ R₂ ⇒ R₁^c ⊂ R₂^c
Proof:
(a) ((R^c)^c = R.) Let <x, y> be an arbitrary ordered pair. Then <x, y> ∈ R
⇔ <y, x> ∈ R^c ⇔ <x, y> ∈ (R^c)^c; therefore (R^c)^c = R.
(b) ((R₁ ∪ R₂)^c = R₁^c ∪ R₂^c.)
<x, y> ∈ (R₁ ∪ R₂)^c ⇔ <y, x> ∈ R₁ ∪ R₂
⇔ <y, x> ∈ R₁ ∨ <y, x> ∈ R₂
⇔ <x, y> ∈ R₁^c ∨ <x, y> ∈ R₂^c
⇔ <x, y> ∈ R₁^c ∪ R₂^c.
(f) ((R̄)^c = (B × A) − R^c.) <x, y> ∈ (R̄)^c ⇔ <y, x> ∈ R̄
⇔ <y, x> ∉ R
⇔ <x, y> ∉ R^c
⇔ <x, y> ∈ (B × A) − R^c.
(g) ((R₁ − R₂)^c = R₁^c − R₂^c.) Using the identity R₁ − R₂ = R₁ ∩ R̄₂ together
with parts (c) and (f), we have
(R₁ − R₂)^c = (R₁ ∩ R̄₂)^c = R₁^c ∩ (R̄₂)^c
= R₁^c ∩ ((B × A) − R₂^c)
= R₁^c − R₂^c.
The proofs of the remaining parts are left as exercises. ∎
Theorem 3.5.4: Let R be a binary relation on A. Then R is symmetric if and
only if R = R°.
We leave the proof as an exercise.
<a, b> ∈ R ∪ R^c. If <a, b> ∈ R, then <a, b> ∈ R′ by hypothesis. If <a, b> ∈ R^c,
then <b, a> ∈ R and therefore <b, a> ∈ R′. But R′ is symmetric and therefore
<a, b> ∈ R′. It follows that R ∪ R^c ⊂ R′.
Examples
(a) The symmetric closure of the relation < on the integers I is the relation ≠,
or Ē.
(b) The symmetric closure of ≤ on the integers I is the universal relation.
(c) The symmetric closure of E is E, and of ≠ is ≠. #
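The reflexive and symmetric closures can be computed directly from their characterizations r(R) = R ∪ E and s(R) = R ∪ R^c. The following is a minimal sketch, assuming relations are stored as sets of pairs and using a small finite range of integers in place of I; all function names are hypothetical.

```python
def converse(r):
    """Return the converse R^c = {(y, x) : (x, y) in R}."""
    return {(y, x) for (x, y) in r}

def reflexive_closure(r, universe):
    """r(R) = R united with the equality relation E on the universe."""
    return set(r) | {(x, x) for x in universe}

def symmetric_closure(r):
    """s(R) = R united with its converse."""
    return set(r) | converse(r)

A = list(range(4))                       # finite stand-in for the integers
less = {(x, y) for x in A for y in A if x < y}
less_or_equal = {(x, y) for x in A for y in A if x <= y}
not_equal = {(x, y) for x in A for y in A if x != y}

r_of_less = reflexive_closure(less, A)   # equals less_or_equal
s_of_less = symmetric_closure(less)      # equals not_equal
```

On this finite set the results agree with the examples: the reflexive closure of < is ≤, and the symmetric closure of < is ≠.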
If R is a binary relation on A, then <a, b> ∈ t(R) if and only if there is a
sequence of elements of A, c₀, c₁, ..., c_n, where n ≥ 1, c₀ = a, c_n = b and for
0 ≤ i < n, <c_i, c_(i+1)> ∈ R. If D is the digraph of R, then <a, b> ∈ t(R) if and only
if there is a path of nonzero length from node a to node b.
Examples
(a) Let R be a relation on I such that aRb if and only if b = a + 1. Then t(R) is
the relation <.
(b) Let R be the relation “is the child of.” Then t(R) is the relation “is the
descendant of.”
(c) Let R be the relation ≤ on a set of integers A. Sorting the elements of A
according to R requires finding the smallest relation R′ on A such that R is the
reflexive transitive closure of R′. #
When A is finite with n elements, it follows from Theorems 3.4.4 and 3.4.5 that
t(R) = R ∪ R^2 ∪ ... ∪ R^n.
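For a finite set with n elements, the transitive closure can therefore be computed as the union of the first n powers of R. The sketch below follows that formula literally; it is not meant to be efficient, and the names `compose` and `transitive_closure` are hypothetical.

```python
def compose(r1, r2):
    """Return R1R2: all (x, z) with (x, y) in R1 and (y, z) in R2 for some y."""
    return {(x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2}

def transitive_closure(r, n):
    """Union of R^1, ..., R^n, where n is the number of elements of A."""
    result = set(r)
    power = set(r)
    for _ in range(n - 1):
        power = compose(power, r)    # next power of R
        result |= power
    return result

# Successor relation on {0, 1, 2, 3, 4}: aRb if and only if b = a + 1.
A = range(5)
successor = {(a, a + 1) for a in A if a + 1 in A}
closure = transitive_closure(successor, len(A))
```

Restricted to this finite set, the closure of the successor relation is the relation <, in agreement with the earlier example on the integers.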
Example
Consider the following digraph representing a relation R.
[Figure: digraph of R on the nodes a, b, c, d]
Theorem 3.5.8:
(a) If R is reflexive, then s(R) and t(R) are reflexive.
(b) If R is symmetric, then r(R) and t(R) are symmetric.
(c) If R is transitive, then r(R) is transitive.
The proofs of all parts are straightforward and are left as exercises.
Example
The relation < on the set of integers I can be used to show that in general
st(R) ≠ ts(R). For st(<) = s(<) = ≠ (i.e., st(<) is the inequality relation), while
ts(<) = t(≠) = I × I (i.e., ts(<) is the universal relation). #
The transitive closure and reflexive transitive closure operations are used in
several application areas. The “plus” and “star” notations are used to denote
these closure operations in a way analogous to the use of A^+ and A* to denote
closure operations on a language A.
The plus and star closure operations are often used in studying formal lan-
guages and models of computation as well as application areas such as compiler
design.
Example
Let P = {P₁, P₂, ..., P_n} be the set of programs and subroutines in a
program library. Define the binary relation = over P as follows:
1. Find the reflexive, symmetric, and transitive closures of each of the following.
[Figure: digraphs for parts (a), (b), and (c)]
4. Let R₁ and R₂ be relations on a set A and suppose R₁ ⊃ R₂. Prove each of the
following.
(a) r(R₁) ⊃ r(R₂)
(b) s(R₁) ⊃ s(R₂)
(c) t(R₁) ⊃ t(R₂)
5. Let R₁ and R₂ be relations on A. Prove each of the following.
(a) r(R₁ ∪ R₂) = r(R₁) ∪ r(R₂)
(b) s(R₁ ∪ R₂) = s(R₁) ∪ s(R₂)
(c) t(R₁ ∪ R₂) ⊃ t(R₁) ∪ t(R₂).
Show by counterexample that
(d) t(R₁ ∪ R₂) ≠ t(R₁) ∪ t(R₂).
6. Find a set A with n elements and a relation R on A such that R^1, R^2, ..., R^n are all
distinct. This establishes that the bound given in Theorem 3.5.7 is attainable.
7. Prove Theorem 3.5.8.
(a) Find the incidence matrix for R′ ∪ R″ in terms of M′, M″, and the operations
of matrix addition and multiplication.
(b) Find the incidence matrix for R′R″.
Let M be the incidence matrix for R.
(c) Find the incidence matrix for R*.
(d) Find the incidence matrices for R^+ and R*.
(e) Find the smallest relation R on the set {a, b, c}, for which the incidence matrix
for R^+ is
1 1 1
1 1 1
1 1 1
14. (For students with an understanding of elementary probability.) Consider the fol-
lowing four dice, which we will call A, B, C, and D.
If two dice x and y are chosen and rolled, we say “x beats y” if a higher number
shows on x than y.
(a) For each pair of dice x and y, calculate the probability that x beats y. Present
your results as a two-dimensional array whose entries are probabilities.
Let R denote the binary relation “is more likely to win than” on the set {A, B, C, D}
where R is defined as follows:
xRy ⇔ the probability that x beats y is greater than 1/2.
(b) Give the digraph associated with the relation R.
(c) Find the transitive closure of R.
(d) Is the relation R transitive?
(e) Suppose someone proposes the following game. You may choose whichever
die you like from the set {A, B, C, D}. After your selection, your opponent will
select a die from the remaining three dice. You then roll the two dice; the
winner is the person whose die beats the other. The loser pays the winner $1.
Assuming your moral character is such that this proposal does not make your
skin crawl, would you accept, and why?
Programming Problem
It follows from the preceding definition that a partially ordered set is also
a digraph whose relation is a partial order on the set of nodes. We will use the
symbol ≤ to denote an arbitrary partial order; thus, if R is an unspecified partial
order, we will usually write either a ≤ b or b ≥ a rather than aRb.
Examples
(a) The relation of set containment is a partial order on any collection of subsets
of a set A; that is, ⊂ is a partial order on P(A) and <P(A), ⊂> is a partially
ordered set.
(b) The relation ≤ is a partial order relation on the set of integers.
(c) Let B = {b₁, b₂, ..., b_n} be the set of blocks in a program in a
block-structured language such as ALGOL or PL/I. For all i and j, define b_i ≤ b_j if b_i
is contained in b_j. Then <B, ≤> is a poset.
(d) The relation < is not a partial order on I because it is not reflexive. #
The diagrams we have described for digraphs can be used for partially ordered
sets as well. However, posets are traditionally represented in a more economical
way by poset (or Hasse) diagrams. These diagrams do not explicitly represent all
ordered pairs of the partial order. The edges of a poset diagram for the relation R
represent the smallest relation R′ such that (R′)* = R. Thus, on a poset diagram
all loops are omitted, eliminating explicit representation of the reflexive property.
Furthermore, an arc is not present in a poset diagram if it is implied by the
transitivity of the relation. That is, there is an arc from a to b only if there is no other
element c such that a ≤ c and c ≤ b. Finally, the antisymmetry of a partial order
implies that the only directed cycles in a digraph representation of a poset are the
node loops. By convention, poset diagrams are drawn so that all arcs point upward
and arrowheads are not used. Poset diagrams are more easily grasped than digraph
representations of posets, and we will use them freely.
Examples
(a) The following are alternate diagrammatic representations of a partial order
R on a set S = {a, b, c, d}.
Sec, 3.6
ORDER RELATIONS 165
Note that if the edges of the diagram on the right are directed upward and
the reflexive transitive closure is formed, the result is the digraph on the left.
(b) Consider the binary relation “divides” defined on a set of nonzero integers,
where a divides b if and only if b is an integer multiple of a. If we choose the
set of positive integers from 1 to 12, the resulting poset is represented by the
following diagram.
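The edges of such a poset diagram can be computed from the order itself: a pair <a, b> is an edge when a ≤ b, a ≠ b, and no third element lies between them. The sketch below applies this to “divides” on the integers from 1 to 12; the function name `hasse_edges` is hypothetical.

```python
def hasse_edges(elements, leq):
    """Pairs (a, b) with a below b and no other element c between them."""
    elements = list(elements)
    edges = set()
    for a in elements:
        for b in elements:
            if a == b or not leq(a, b):
                continue
            between = any(
                c != a and c != b and leq(a, c) and leq(c, b) for c in elements
            )
            if not between:
                edges.add((a, b))
    return edges

def divides(a, b):
    """True when b is an integer multiple of a."""
    return b % a == 0

edges = hasse_edges(range(1, 13), divides)
```

The resulting set contains, for instance, (2, 4) and (6, 12), but not (1, 4), since 2 lies between 1 and 4.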
Examples
(a) The relation < is a quasi order on any set of real numbers.
(b) The relation “is a proper subset of” is a quasi order on any collection of sets.
(c) The relation “is a prerequisite for” is a quasi order on any set of college courses.
(d) The transitive closure of the relation “calls” is a quasi order on any collection
of nonrecursive programs and subroutines.
(e) PERT is a method of scheduling tasks to minimize the total time required for
completion of the tasks. Application of the method usually involves the con-
struction of a PERT chart which represents a quasi order on the collection
of tasks to be performed; xRy means that task y cannot be started until task x
is finished. #
The only distinction between quasi orders and partial orders is the equality
relation E. The proof of the following theorem is left as an exercise.
Because of the similarity between quasi orders and partial orders, it is con-
venient to use the same diagrams to represent both kinds of order relations. Thus,
a poset diagram for the partial order ≤ over a set of integers can also be used to
represent the quasi order < over the same set.
If ≤ is a partial order and either a ≤ b or b ≤ a, we say a and b are comparable.
If ≤ is a partial order on A such that every two elements of A are comparable, then
≤ is called a linear order.
Definition 3.6.3: A partial order ≤ on a set A is a linear (or simple, or total)
order if either a ≤ b or b ≤ a for every a, b ∈ A. If ≤ is a linear order on A, then
the ordered pair <A, ≤> is a linearly ordered set, or chain.
Examples
(a) The linearly ordered set
<{1, 2, 3}, {<1, 1>, <2, 2>, <3, 3>, <1, 2>, <1, 3>, <2, 3>}>
is represented by
3
2
1
(b) The linearly ordered set <I, ≤> can be represented by the following
(incomplete) diagram.
[Figure: a vertical chain of integers extending without end in both directions]
(c) Consider the universe of real numbers R. For every real number a, let
S_a = {x | 0 < x < a}, and let S be the collection {S_a | a > 0}. If a < b, then
S_a ⊂ S_b, and consequently <S, ⊂> is a linearly ordered set.
(d) If A is a set with more than one element, then <P(A), ⊂> is not a linearly
ordered set. #
Example
Consider the poset <P({a, b}), ⊂> represented by the following diagram:
[Figure: poset diagram with {a, b} at the top, {a} and {b} in the middle, and ∅ at the bottom]
Theorem 3.6.2: Let <A, ≤> be a poset and B ⊂ A. If a and b are greatest
(least) elements of B, then a = b.
Proof: Suppose a and b are both greatest elements of B. Then a ≤ b and
b ≤ a. It follows from the antisymmetry of ≤ that a = b. The proof for the case
when a and b are least elements of B is similar. ∎
induced well ordering on S is defined as follows: if a, b ∈ S and a is paired with
n₁, and b with n₂, then aRb ⇔ n₁ ≤ n₂.
Example
A well order for the set of integers, I, can be constructed by listing the elements
of N in ascending order and then pairing the elements of I with those in N as fol-
lows:
N:  0   1   2   3   4   5   6  ...
    ↕   ↕   ↕   ↕   ↕   ↕   ↕
I:  0  −1   1  −2   2  −3   3  ...
The relation R implied by the above pairing is
aRb ⇔ |a| < |b| ∨ (|a| = |b| ∧ a ≤ b). #
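The pairing and the induced relation can be checked with a short Python sketch. The function names are hypothetical; `paired_integer` is one closed form for the listing 0, −1, 1, −2, 2, ... and `precedes` transcribes the relation R above.

```python
def paired_integer(n):
    """Integer paired with the natural number n: 0, -1, 1, -2, 2, ..."""
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)

def precedes(a, b):
    """aRb for the induced well order: |a| < |b|, or |a| = |b| and a <= b."""
    return abs(a) < abs(b) or (abs(a) == abs(b) and a <= b)

listing = [paired_integer(n) for n in range(7)]

# Sorting by the key (|a|, a) reproduces the same listing.
by_key = sorted(range(-3, 4), key=lambda a: (abs(a), a))
```

Both `listing` and `by_key` are [0, -1, 1, -2, 2, -3, 3], and `precedes(a, b)` holds exactly when a appears no later than b in that listing.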
We are often interested in the set of integer n-tuples I^n and the set of n-tuples
of natural numbers N^n. The linear ordering ≤ on I or N can be used to induce a
linear ordering on these sets. For example, if n = 2, we can define the ordering on
either I^2 or N^2 as follows:
<a, b> ≤ <c, d> ⇔ a < c ∨ (a = c ∧ b ≤ d).
The relation of “strictly less than” can be defined as
<a, b> < <c, d> ⇔ (<a, b> ≤ <c, d> ∧ <a, b> ≠ <c, d>).
Note that the set <N^2, ≤> is well ordered, but <I^2, ≤> is not.
If a linear order is imposed on the symbols of a finite alphabet Σ, then this
alphabetic ordering can be used to induce two distinct linear orderings of the
elements of Σ*.
Example
Let Σ = {a, b}, and let a precede b in the alphabetic order. Then if x is
any string in Σ*, the immediate successor of x is xa. The immediate predecessor
of xa is x, but there is no immediate predecessor of xb. Moreover, the set
{b, ab, aab, aaab, ...} has no least element, since each string a^m b precedes any
string a^n b if m > n. It follows that the lexicographic order is not a well order. #
In the standard order, every element has an immediate successor, and every
element other than Λ has an immediate predecessor. The least element of any nonempty
set is, among the shortest elements of the set, the one which occurs earliest in the
lexicographic ordering of Σ*. Since such an element exists for any nonempty subset
of Σ*, it follows that the standard ordering of Σ* is a well ordering.
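The difference between the two orderings is easy to see on a small sample of strings. The sketch below assumes Σ = {a, b} with a before b, uses Python's built-in string comparison as the lexicographic order, and uses the key (length, string) for the standard order; the variable names are hypothetical.

```python
from itertools import product

# All strings over {a, b} of length 0, 1, or 2; "" plays the role of Λ.
words = ["".join(p) for n in range(3) for p in product("ab", repeat=n)]

lexicographic = sorted(words)                            # dictionary order
standard = sorted(words, key=lambda w: (len(w), w))      # shorter strings first

# Least element of a set under the standard order.
least = min({"ab", "b", "aab"}, key=lambda w: (len(w), w))
```

Here `lexicographic` is ['', 'a', 'aa', 'ab', 'b', 'ba', 'bb'] while `standard` is ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb'], and `least` is 'b'.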
Example
If Σ = {a, b, c}, and x ∈ Σ*, then the immediate successors of xa, xb, and xc
under standard order are xb, xc, and ya respectively, where y is the successor of x.
The immediate predecessors of xb and xc are xa and xb respectively. If x ≠ Λ,
then the immediate predecessor of xa is yc, where y is the immediate predecessor
of x. The least element of {a^n b | n ∈ N} is b. #
(which relies on the well order ≤ rather than the successor operation) can be
applied, as it can to any well ordered set. Let S be a universe of discourse, let ≤ be
a well ordering of S, and let < denote ≤ − E (i.e., x < y denotes x ≤ y and
x ≠ y). Then the following rule of inference holds:
∀x[∀y[y < x ⇒ P(y)] ⇒ P(x)]
∴ ∀xP(x).
Thus, if we can show that an arbitrary x has property P if every element less than x
has property P, then we can conclude that every element of S has property P. To
show that the conclusion of the rule of inference is valid, suppose we can prove the
premise
Example
Let Σ = {a, b}, let a precede b in the alphabetic ordering, and let ≤ denote the
lexicographic order on Σ*. Then ≤ is a linear ordering but not a well ordering of
Σ*, and the Second Principle using ≤ is not a valid rule of inference. For consider
the following predicate P on the universe Σ*:
is false. It follows that the Second Principle cannot be applied to Σ* using the
lexicographic ordering ≤. #
Example
Rather than treat a specific example we will describe the application of termi-
nation techniques to the nested loop structure which appears below. Assume that
the value of m is a positive integer and is not changed inside either loop, that all
statements which affect the values of i and j are shown, and that the loops do not
contain other loops or branch statements.
for i ← 1 step 1 to m do
begin
    j ← m
    while j > i do
    begin
        j ← j − 1
    end
end
For the outer (for) loop, consider the quantity (m − i). Since we have assumed
m ≥ 1, the initial value of this quantity is in the well ordered set N. Incrementing
i with each traversal of the loop causes the quantity (m − i) to be decremented.
When i > m, the quantity (m − i) is no longer an element of N and the execution
of the loop ceases. Thus, the outer loop will terminate if each execution of the
inner loop terminates. The variable j of the inner (while) loop is initialized to m
and decremented by 1 during each traversal of the loop. Since i ≤ m and execution
of the loop leaves the value of i unchanged, execution of the loop will cease when
j is no longer an element of the well ordered set {i + 1, i + 2, ..., m}. Thus, each
execution of the inner loop will terminate and therefore the outer loop will
terminate. #
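The termination argument can be made executable by asserting, at each traversal, that the relevant quantity stays in its well ordered set and strictly decreases. The following Python transcription of the nested loop is only a sketch; the function name `run` and the step counter are additions for illustration.

```python
def run(m):
    """Execute the nested loop for a positive integer m; return inner-loop steps."""
    steps = 0
    for i in range(1, m + 1):
        assert m - i >= 0                  # outer quantity (m - i) lies in N
        j = m
        while j > i:
            assert i + 1 <= j <= m         # j lies in {i + 1, ..., m}
            previous = j
            j = j - 1
            assert j < previous            # j strictly decreases each traversal
            steps += 1
    return steps

total = run(4)
```

For m = 4 the inner loop body executes 3 + 2 + 1 + 0 = 6 times, so `total` is 6, and none of the assertions fails.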
Examples
(a) Consider the poset <P({a, b}), ⊂> represented by the following poset diagram.
[Figure: poset diagram with {a, b} at the top, {a} and {b} in the middle, and ∅ at the bottom]
(b) Consider the poset <R, ≤>, and let B = [0, 1) = {x | 0 ≤ x < 1}. Then
B has no greatest or maximal elements, but 0 is a least and minimal element.
The set of upper bounds of B is the set {x | x ≥ 1}, and 1 is a least upper
bound. The set of lower bounds of B is {x | x ≤ 0} and 0 is a greatest lower
bound.
(c) Consider the set of integers from 1 to 6 under the partial order “divides.” The
poset diagram is the following.
[Figure: poset diagram with 1 at the bottom; 2, 3, and 5 above 1; 4 above 2; 6 above 2 and 3]
Let B be the entire set {1, 2, 3, 4, 5, 6}. Then, 4, 5, and 6 are all maximal ele-
ments of B, but B has no greatest element. The set B has no upper bounds,
and therefore no least upper bounds. The element 1 is a least element, a minimal
element, a lower bound, and a greatest lower bound of B.
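The claims in example (c) can be verified mechanically. The sketch below encodes the definitions of maximal element, greatest element, least element, and upper bound for a finite poset; the function names are hypothetical, and upper bounds are sought within the set A = {1, ..., 6} itself.

```python
def divides(a, b):
    return b % a == 0

def maximal_elements(B, leq):
    """Elements of B with no different element of B above them."""
    return {b for b in B if not any(c != b and leq(b, c) for c in B)}

def greatest_elements(B, leq):
    """Elements of B that are above every element of B (at most one exists)."""
    return {b for b in B if all(leq(c, b) for c in B)}

def least_elements(B, leq):
    """Elements of B that are below every element of B (at most one exists)."""
    return {b for b in B if all(leq(b, c) for c in B)}

def upper_bounds(A, B, leq):
    """Elements of A that are above every element of B."""
    return {a for a in A if all(leq(b, a) for b in B)}

A = set(range(1, 7))
B = A
```

With these definitions, `maximal_elements(B, divides)` is {4, 5, 6}, `greatest_elements(B, divides)` and `upper_bounds(A, B, divides)` are empty, and `least_elements(B, divides)` is {1}.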
Proof: (a) We will prove the contrapositive, that is, if b is not a maximal
element of B, then b is not a greatest element of B. If b is not maximal,
then there exists an element b′ ∈ B such that b ≠ b′ and b ≤ b′. Then
b′ ≤ b is false, and hence b is not a greatest element of B.
(b) Since B ⊂ A, it is immediate from the definitions that if b is a greatest
element of B, then b is an upper bound for B. If a is an upper bound
for B, then b ≤ a, since b ∈ B. Therefore, b is a least upper bound
of B.
(c) If b ∈ B is an upper bound for B, then b′ ≤ b for all b′ ∈ B. Therefore,
b is a greatest element of B. ∎
Theorem 3.6.5: Let <A, ≤> be a poset and B ⊂ A. If a least upper (greatest
lower) bound for B exists, then it is unique.
The proof is left as an exercise.
1. Fill in the following table describing the characteristics of the given ordered sets.
Use Y for yes and N for no.
<N, <>
<N, ≤>
<I, ≤>
<R, ≤>
<P(N), proper containment>
<P(N), ⊂>
<P({a}), ⊂>
<P(∅), ⊂>
3. State which of the following digraphs represent a quasi-ordered set; a poset; a linearly
ordered set; a well ordered set.
[Figure: digraphs for the parts of this exercise]
<x₁, y₁> T <x₂, y₂> if and only if x₁ ≤ x₂ and y₁ ≤ y₂.
Determine whether each of the following assertions is true or false. Justify your
answer if the statement is false.
(a) T is a partial order.
(b) T is a linear order.
(c) T is a well order.
†(d) Every subset of R × R which has a lower bound has a glb.
(e) If the second condition is eliminated (that is, we only require x₁ ≤ x₂), then the
resulting relation is a partial order.
Programming Problems
1. Write a program which accepts as input a set of ordered pairs and determines if the
relation is a quasi order, partial order, or linear order.
2. Write a program which accepts as input a set of ordered pairs denoting adjacent
nodes of a poset diagram and produces a minimal element of the poset.
3. Write a program to perform a topological sort of a finite poset. Assume the input is
presented as a set of ordered pairs denoting adjacent nodes of a poset diagram. One
technique is to select and list a minimal element of a poset, delete the element listed,
and repeat the process, continuing until all elements are listed. (Ref. Knuth, [1969],
Vol. 1, p. 262.)
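The technique described in Problem 3 can be sketched as follows. This is one possible rendering rather than a reference solution: the function name is hypothetical, each input pair (a, b) is assumed to mean that a lies directly below b in the poset diagram, and ties between minimal elements are broken by taking the smallest.

```python
def topological_sort(elements, diagram_pairs):
    """List the elements so that a precedes b whenever (a, b) is a diagram pair."""
    remaining = set(elements)
    order = []
    while remaining:
        # A minimal element has no remaining element directly below it.
        minimal = [
            x for x in remaining
            if not any(b == x and a in remaining for (a, b) in diagram_pairs)
        ]
        chosen = min(minimal)          # any minimal element would do
        order.append(chosen)
        remaining.remove(chosen)
    return order

# Poset diagram of "divides" on {1, ..., 6}.
pairs = {(1, 2), (1, 3), (1, 5), (2, 4), (2, 6), (3, 6)}
order = topological_sort(range(1, 7), pairs)
```

A finite nonempty poset always has a minimal element, so the list `minimal` is never empty and the loop removes one element per pass.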
Often the elements of a set are treated according to their properties rather than
as individuals. In such a situation, we can ignore all properties which are not of
interest, and treat different elements as “equivalent,” or indistinguishable, unless
they can be differentiated using only the properties which are of interest. The notion
of “equivalence” has three important characteristics:
(i) Every element is equivalent to itself (reflexivity).
(ii) If a is equivalent to b, then b is equivalent to a (symmetry).
Sec. 3.7 - EQUIVALENCE RELATIONS AND PARTITIONS 179
(iii) If a is equivalent to b and b is equivalent to c, then a is equivalent to c
(transitivity).
These properties form the basis of an important class of binary relations on a set.
Examples
(a) The universal relation on any set A is an equivalence relation. If A = {1, 2, 3},
then the digraph of the universal relation on A is a complete digraph with 3 nodes.
(b) The empty relation ∅ is an equivalence relation over the empty set ∅.
However, the empty relation is not an equivalence relation over any nonempty set
because it is not reflexive.
(c) Consider the class of propositional forms over some set of propositional
variables. The relation R defined by R = {<P, Q> | P ⇔ Q}, where P and Q
are propositional forms, is an equivalence relation over this set.
(d) A predicate P with one argument induces a natural equivalence relation ~
over a universe of discourse U. Under this relation, two elements a, b ∈ U
are equivalent if and only if P(a) and P(b) are logically equivalent:
a ~ b ⇔ [P(a) ⇔ P(b)].
(e) The equality relation, E, on any set is an equivalence relation. #
An important class of equivalence relations over the integers (or any subset
of them) consists of the modular equivalences.
Definition 3.7.2: Let k be a positive integer and a, b ∈ I. Then a and b are
equivalent mod k, written
a ≡ b (mod k)
if for some integer n, (a − b) = n·k. The integer k is called the modulus of the
equivalence.
(ii) Symmetry: If a ≡ b (mod k), then there exists some n ∈ I such that
(a − b) = n·k. Then (b − a) = −n·k, and hence b ≡ a (mod k).
(iii) Transitivity: Suppose a ≡ b (mod k) and b ≡ c (mod k). Then there
exist n₁, n₂ ∈ I such that (a − b) = n₁·k and (b − c) = n₂·k. Adding
these equations, we find (a − c) = (n₁ + n₂)·k and
therefore a ≡ c (mod k). ∎
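Equivalence mod k is straightforward to test and to use for grouping. The sketch below checks the definition directly and groups a small set of integers by remainder; the function name `equivalent_mod` is hypothetical, and the sample set is the one used for equivalence mod 3 in the examples of this section.

```python
def equivalent_mod(a, b, k):
    """True when (a - b) = n * k for some integer n."""
    return (a - b) % k == 0

A = [0, 1, 2, 3, 5, 6, 8]
k = 3

# Group the elements of A by remainder; each group is one equivalence class.
classes = {}
for a in A:
    classes.setdefault(a % k, []).append(a)

blocks = sorted(classes.values())
```

Here `blocks` is [[0, 3, 6], [1], [2, 5, 8]]: the elements 0, 3, and 6 are equivalent, as are 2, 5, and 8, while 1 is equivalent only to itself.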
Examples
(a) Let the relation R be equivalence mod 3 on the set A = {0, 1, 2, 3, 5, 6, 8}.
The elements 0, 3 and 6 are equivalent, as are 2, 5 and 8. The digraph of the
relation R on the set A is the following:
The equivalence class [a]_R is nonempty for each a ∈ A since a ∈ [a]_R. If the
equivalence relation R is understood, we will usually write [a] in place of [a]_R.
Examples
(a) Let A = {a, b, c, d} and R be the set
{<a, a>, <a, b>, <b, a>, <b, b>, <c, c>, <c, d>, <d, c>, <d, d>}.
The digraph of <A, R> is
[Figure: digraph of <A, R>]
Let c be an element of [a] ∩ [b]. Then c ∈ [a], so cRa. Similarly, c ∈ [b],
so cRb. Since R is symmetric, it follows that aRc and since R is transitive,
aRb. Now consider an arbitrary element x of [a]. Then xRa and by
transitivity of R, xRb. Hence x ∈ [b], which establishes that [a] ⊂ [b]. A
similar proof establishes that [b] ⊂ [a], and we conclude [b] = [a].
Therefore, if [a] ∩ [b] ≠ ∅ then [a] = [b]. Since [a] and [b] are nonempty,
it follows that either [a] ∩ [b] = ∅ or [a] = [b].
(b) We must show ⋃_{x∈A} [x] = A. We first establish that ⋃_{x∈A} [x] ⊂ A.
Suppose c ∈ ⋃_{x∈A} [x]. Then c ∈ [a] for some a ∈ A, and since [a] ⊂ A,
c ∈ A. Therefore ⋃_{x∈A} [x] ⊂ A. We next establish that A ⊂ ⋃_{x∈A} [x].
Let c ∈ A. Then c ∈ [c] ⊂ ⋃_{x∈A} [x] and therefore A ⊂ ⋃_{x∈A} [x].
r(R) is reflexive,
sr(R) is symmetric and reflexive, and
tsr(R) is transitive, symmetric, and reflexive.
Hence, R’ = tsr(R) is an equivalence relation on A.
(b) Let R″ be any equivalence relation containing R. Then R″ is reflexive
and symmetric so R″ ⊃ R ∪ R^c ∪ E = sr(R). Since R″ is transitive
and contains sr(R), R″ contains tsr(R). ∎
Example
Let A = {a, b, c, d} and let R be represented by the following digraph.
[Figure: digraph of R on the nodes a, b, c, d]
The equivalence classes of tsr(R) are {a, b} and {c, d}. Each equivalence class of the
induced equivalence relation is the set of nodes of a component of the digraph
<A, R>. #
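The construction tsr(R) can be carried out step by step in code. In the sketch below the relation R is an invented stand-in (the digraph of the example is not reproduced here), chosen so that the equivalence classes come out as {a, b} and {c, d}; the function names are hypothetical.

```python
def equivalence_closure(r, universe):
    """Compute tsr(R): reflexive, then symmetric, then transitive closure."""
    rel = set(r) | {(x, x) for x in universe}        # r(R)
    rel |= {(y, x) for (x, y) in rel}                # sr(R)
    while True:                                      # tsr(R)
        new = {(x, z) for (x, y) in rel for (y2, z) in rel if y == y2}
        if new <= rel:
            return rel
        rel |= new

def equivalence_classes(rel, universe):
    """The set of classes [x] = {y : (x, y) in rel} for x in the universe."""
    return {frozenset(y for y in universe if (x, y) in rel) for x in universe}

A = "abcd"
R = {("a", "b"), ("c", "d")}          # hypothetical relation with two components
classes = equivalence_classes(equivalence_closure(R, A), A)
```

For this R, `classes` consists of the two blocks {a, b} and {c, d}, which are the node sets of the two components of the digraph.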
Partitions
Example
(a) The following diagram represents a partition of a set.
The rank of this partition is four. By viewing the diagram alone, there is no way
of determining how many elements are in the set or how many elements are
in each block of the partition, but by definition, no block is permitted to be
empty.
(b) Consider the set of positive integers I^+. Then the sets
S₁ = {x | x ∈ I^+ ∧ x is prime} and S₂ = S̄₁
form a partition of I^+ of rank 2.
(c) Cutting a sheet of paper into pieces results in a partition of the original sheet.
(Each piece is a block of the partition.) This notion can be generalized to the
physical tearing asunder of any object.
(d) The set of tautologies, the set of contingencies, and the set of contradictions
form a partition of rank three of the set of all propositional forms.
Except for the fact that equivalence relations are defined for empty sets and
partitions are not, equivalence relations and partitions are different descriptions
of the same concept. The following theorems establish a natural correspondence
between the partitions and equivalence relations over a nonempty set.
Example
[Figure: digraph of an equivalence relation R on a set A]
Then A/R = {{a}, {b, c}}. The rank of the relation R is 2; the blocks of A/R are {a}
and {b, c}. #
Theorem 3.7.8: Let π be a partition of the (nonempty) set A, and define the
binary relation ~ on A as follows:
a ~ b ⇔ ∃S[S ∈ π ∧ a ∈ S ∧ b ∈ S].
Then ~ is an equivalence relation on A, called the equivalence relation induced
on A by the partition π.
Proof: We must show ~ is reflexive, symmetric, and transitive.
(a) Reflexivity: Since π exhausts A, every element of A is in some element
S of π and therefore a ~ a for every a ∈ A.
(b) Symmetry: Suppose a ~ b. Then there is some S ∈ π such that a ∈ S
and b ∈ S, and therefore b ~ a.
(c) Transitivity: Suppose a ~ b and b ~ c. Then there are elements S₁ ∈ π,
S₂ ∈ π such that a, b ∈ S₁ and b, c ∈ S₂. But since π is a partition,
either S₁ ∩ S₂ = ∅ or S₁ = S₂. Since b ∈ S₁ and b ∈ S₂, S₁ ∩ S₂ ≠ ∅.
Therefore S₁ = S₂, and hence c ∈ S₁. We conclude that a ~ c. ∎
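The induced relation of Theorem 3.7.8 can be built by pairing up the elements of each block. The sketch below does this for the partition {{a, b}, {c}, {d}} of A = {a, b, c, d} and then checks the three defining properties; the function name `induced_relation` is hypothetical.

```python
def induced_relation(partition):
    """All pairs (a, b) such that a and b lie in a common block."""
    return {(a, b) for block in partition for a in block for b in block}

A = {"a", "b", "c", "d"}
pi = [{"a", "b"}, {"c"}, {"d"}]
rel = induced_relation(pi)

reflexive = all((x, x) in rel for x in A)
symmetric = all((y, x) in rel for (x, y) in rel)
transitive = all(
    (x, z) in rel for (x, y) in rel for (y2, z) in rel if y == y2
)
```

The relation has six ordered pairs (four from the block {a, b} and one from each singleton block), and all three checks evaluate to True.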
Example
Let A = {a, b, c, d}, π = {{a, b}, {c}, {d}}. Then the equivalence relation ~
induced by π is represented by the following digraph.
Since set containment is a partial order over any collection of sets, it is a partial
order over any collection of equivalence relations on a set A. A corresponding
partial order of “partition refinement” exists over any set of partitions of A.
If π and π′ are partitions of a set A and π′ refines π, then we can think of the
elements of π′ as having been obtained by “breaking up” the elements of π into
smaller subsets of A.
Examples
(a) Using our diagram representation of partitions, the following illustrates two
partitions such that π′ refines π:
[Figure: partitions π and π′]
We will often compare the sizes of different equivalence relations and different
partitions of a set. A partition π is larger than π′ if π has more blocks than
π′, and an equivalence relation R is larger than R′ if R has more ordered pairs than
R′. It is a confusing fact of life that for any set A, the large partitions of A
correspond to the small equivalence relations and vice versa. To illustrate the point,
consider a set A with n elements. The largest equivalence relation on A is the
universal relation A × A; this relation has n^2 elements. This equivalence relation
induces the partition {A} which has a single block: this is the smallest partition
of A. The size of a partition cannot generally be determined on the basis of the
size of the associated equivalence relation, but the following theorem shows that
if π′ refines (and is therefore at least as large as) π then the equivalence relation
R′ induced by π′ is contained in (and is therefore no greater than) the relation R
induced by π.
Proof: We first show that if π′ refines π then R′ ⊂ R. Suppose aR′b. Then
there is some block S′ of π′ such that a, b ∈ S′. Since π′ refines π, there is a block
S of π such that S′ ⊂ S, and therefore, a, b ∈ S. It follows that aRb and hence
R′ ⊂ R. We next show that if R′ ⊂ R then π′ refines π. Let S′ be a block of π′,
and a ∈ S′. Then S′ = [a]_R′ = {x | xR′a}. But for each x, if xR′a then xRa since
R′ ⊂ R. Therefore, {x | xR′a} ⊂ {x | xRa} and [a]_R′ ⊂ [a]_R. Denote [a]_R by S; then
S is a block of π and S′ ⊂ S, which establishes that π′ refines π. ∎
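The equivalence just proved can be checked on small cases. The sketch below tests refinement block by block and compares the result with containment of the induced relations; the function names are hypothetical, and the four partitions are those of A = {1, 2, 3} induced by A × A, E, and two intermediate relations.

```python
def refines(p_fine, p_coarse):
    """True when every block of p_fine lies inside some block of p_coarse."""
    return all(any(s1 <= s2 for s2 in p_coarse) for s1 in p_fine)

def induced_relation(partition):
    """All pairs (a, b) such that a and b lie in a common block."""
    return {(a, b) for block in partition for a in block for b in block}

top = [{1, 2, 3}]              # partition induced by A x A
p1 = [{1}, {2, 3}]
p2 = [{1, 2}, {3}]
bottom = [{1}, {2}, {3}]       # partition induced by E

partitions = [top, p1, p2, bottom]
agreement = all(
    refines(p, q) == (induced_relation(p) <= induced_relation(q))
    for p in partitions for q in partitions
)
```

Here `agreement` is True: for every pair of these partitions, p refines q exactly when the relation induced by p is contained in the relation induced by q.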
Example
Let A = {1, 2,3} and consider the following equivalence relations on A.
A × A
E = {<1, 1>, <2, 2>, <3, 3>}
R₁ = {<1, 1>, <2, 2>, <3, 3>, <2, 3>, <3, 2>}
R₂ = {<1, 1>, <2, 2>, <3, 3>, <1, 2>, <2, 1>}
The following is a poset diagram of <{A/(A × A), A/E, A/R₁, A/R₂}, refines>.
[Figure: poset diagram with A/(A × A) at the top, A/R₁ and A/R₂ in the middle, and A/E at the bottom] #
Let S be the set of partitions of a nonempty set A. We now define two useful
binary operations on S, called the “sum” and “product.” The sum of two
partitions π₁ and π₂ is the largest partition (the one with the most blocks) that is refined
by both π₁ and π₂. The product of π₁ and π₂ is the smallest partition (the one
with the fewest blocks) that refines both π₁ and π₂.
The following two theorems show that the product of two partitions always
exists and is, in fact, unique.
[Figure: partitions π₁, π₂, and π₁·π₂]
Example
Suppose a sheet of paper is marked with red lines and green lines so that
cutting the paper on the red lines would result in the partition π₁ and cutting it on
the green lines would result in partition π₂. Then cutting it on both the red and
green lines would produce the product partition π₁·π₂. #
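The paper-cutting picture suggests how to compute a product: every piece left after both sets of cuts is the overlap of one block of π₁ with one block of π₂. The sketch below assumes this characterization (the nonempty pairwise intersections of blocks); the function name `partition_product` and the sample partitions are hypothetical.

```python
def partition_product(p1, p2):
    """Nonempty intersections of a block of p1 with a block of p2."""
    return {frozenset(s1 & s2) for s1 in p1 for s2 in p2 if s1 & s2}

def refines(p_fine, p_coarse):
    """True when every block of p_fine lies inside some block of p_coarse."""
    return all(any(s1 <= s2 for s2 in p_coarse) for s1 in p_fine)

p1 = [{1, 2, 3}, {4}]
p2 = [{1, 2}, {3, 4}]
prod = partition_product(p1, p2)
```

For these partitions of {1, 2, 3, 4}, `prod` has the three blocks {1, 2}, {3}, and {4}, and it refines both `p1` and `p2`.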
Condition (ii) of the preceding definition ensures that the sum of π₁ and π₂ will be
the largest partition refined by both π₁ and π₂. The sum of two partitions always
exists and is unique, as we show in the two following theorems.
Sec. 3.7 EQUIVALENCE RELATIONS AND PARTITIONS 189
Then R is an equivalence relation on A, and the partition A/R is a sum of π₁ and π₂.
Proof: R₁ ∪ R₂ is reflexive and symmetric because the operation of set
union preserves these properties. Therefore, by Theorem 3.7.5, R = t(R₁ ∪ R₂) =
tsr(R₁ ∪ R₂) is the smallest equivalence relation which contains R₁ and R₂. Since
R ⊇ R₁ and R ⊇ R₂, both π₁ and π₂ refine A/R. Furthermore, any partition which
is refined by π₁ and π₂ induces an equivalence relation which contains both R₁
and R₂. Since t(R₁ ∪ R₂) is the smallest such equivalence relation, it follows that
A/R refines all such partitions. Therefore, A/R is a sum of π₁ and π₂. ∎
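The sum and product of two partitions can be computed directly from their blocks. The following Python sketch is an illustration only: the names `partition_sum` and `partition_product`, and the representation of a partition as a set of frozensets, are assumptions of the sketch and do not come from the text. The sum is formed by repeatedly merging blocks which share an element, which has the same effect as taking the transitive closure of the union of the two equivalence relations; the product is formed from the nonempty intersections of blocks.

```python
from itertools import product as cartesian


def partition_product(p1, p2):
    """Blocks are the nonempty intersections of a block of p1 with a block of p2."""
    return {b1 & b2 for b1, b2 in cartesian(p1, p2) if b1 & b2}


def partition_sum(p1, p2):
    """Merge blocks of p1 and p2 that overlap until the blocks are disjoint."""
    merged = []
    for block in (set(b) for b in p1 | p2):
        # Absorb every previously merged block that shares an element.
        overlapping = [m for m in merged if m & block]
        for m in overlapping:
            block |= m
            merged.remove(m)
        merged.append(block)
    return {frozenset(b) for b in merged}


# Two partitions of {1, 2, 3, 4}, chosen only for illustration.
p1 = {frozenset({1, 2}), frozenset({3}), frozenset({4})}
p2 = {frozenset({1}), frozenset({2, 3}), frozenset({4})}

print(sorted(sorted(b) for b in partition_product(p1, p2)))  # [[1], [2], [3], [4]]
print(sorted(sorted(b) for b in partition_sum(p1, p2)))      # [[1, 2, 3], [4]]
```

In this instance the product has more blocks than either argument and the sum has fewer, as the definitions require.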
[Figure: partitions π₁, π₂, and their sum π₁ + π₂]
Examples
(a) Suppose a sheet of paper is marked with red lines representing the partition
π₁ and green lines representing the partition π₂. Then cutting the paper on
those lines which are colored both red and green would produce the sum
partition π₁ + π₂.
(b) In an information retrieval system, each “descriptor” induces a partition with
two blocks over the set of documents. If one descriptor is “artificial intelli-
gence,” then the documents will be categorized according to whether or not
this descriptor applies to the document. Suppose ten descriptors are used. If
retrieval is done by specifying a single descriptor, any of ten sets of documents
can be specified. If retrieval can also be done using the negation of a descriptor
(meaning the descriptor is not appropriate), any of twenty sets of documents
can be obtained. If a single use of the connective AND is also permitted, then
one can obtain a set of documents corresponding to any block of a product
partition π₁ · π₂, where π₁ and π₂ are two of the partitions induced by a single
descriptor. A single use of the connective OR will not result in a block of
the sum partition; instead it will produce the union of some blocks of the
product partition. #
1. Prove that the universal relation on any set A is an equivalence relation. What is the
rank of this relation?
2. Prove that the empty relation is an equivalence relation on ∅. What is the rank of
the relation?
3. Suppose A is a finite set with n elements.
(a) How many elements are in the largest equivalence relation on A?
(b) What is the rank of the largest equivalence relation on A?
(c) How many elements are in the smallest equivalence relation on A?
(d) What is the rank of the smallest equivalence relation on A?
4. Suppose A = {a, b, c, d} and π₁ is the following partition of A:
π₁ = {{a, b, c}, {d}}.
(a) List the ordered pairs of the equivalence relation induced by π₁.
(b) Do the same for the partitions
(c) Draw a poset diagram of the poset <{π₁, π₂, π₃}, refines>.
5. Let R and R′ be equivalence relations on a set A. Show by example that R ∪ R′ is
not necessarily an equivalence relation. What properties of an equivalence relation
are violated by your example? Choose the set A to be as small as possible.
6. State whether or not the following binary relations are equivalence relations. If they
are not, state which of the properties of an equivalence relation they violate. All
relations are on the set I. In each case, find the equivalence relation induced by R.
(a) <
(b) <
(c) R={a,b|@>0A5
@<>0A
0)5V
<0)
d) R={ab|@>0Ab6>0V@<0A56<0)}
() R={ab(@<0Ab>0VaG<0 A)}
b<0
(ff) R={ab A | a
b>0V @
Ea A<>
0A0
5<0}
(g) R = {<a, b> | (a > 0 ∧ b > 0) ∨ (a < 0 ∧ b < 0) ∨ (a = b = 0)}
(h) R = {<a, b> | a divides b with 0 remainder}
(i) R = {<a, b> | |a - b| < 10}
(j) R = {<a, b> | ∃x[x ∈ I ∧ 10x < a < b < 10(x + 1)]}
(k) R = {<a, b> | ∃x[x ∈ I ∧ (10x < a < 10(x + 1)) ∧ (10x < b < 10(x + 1))]}
Programming Problem
Write a program which accepts as input the incidence matrix of a relation and
determines if the relation is an equivalence relation.
The text by Deo [1974] is an excellent treatment of graphs with special atten-
tion to problems of interest to computer science; the book by Busacker and Saaty
[1965] is an earlier work of the same nature which considers applications in a wide
variety of areas. Aho, Hopcroft, and Ullman [1974] present and analyze many
algorithms associated with sets, graphs, and trees. Knuth [1969] treats the general
topic of trees and Knuth [1975] analyzes search trees.
4
FUNCTIONS
4.0 INTRODUCTION
194 FUNCTIONS Ch. 4
To define a function we must specify the domain, the codomain, and the
value f(x) for each possible argument x. The notation f: A → B denotes that f
is a function with domain A and codomain B. The values of f(x) are specified by
a set of rules which cover all possible values of x, e.g.,
f: N → N,
f(x) =1 if x is odd,
fo) = + if x is even.
If the domain of the function is finite, the function can be specified explicitly by
giving the values for all possible arguments, e.g.,
g: {1, 2, 3} → {A, B, C},
g(1) = A,
g(2) = C,
g(3) = C,
or by a digraph, e.g.,
[Digraph of g, with arcs from 1 to A, from 2 to C, and from 3 to C]
Examples
(a) Let A = {a, b} and B = {1, 2, 3}. The following digraphs represent functions
from A to B.
[Figure: three digraphs, each representing a function from A to B]
(b) Let A and B be as above. The following digraphs represent relations from A
to B which are not functions.
[Figure: two digraphs of relations from A to B which are not functions]
(c) If A = ∅ and B is any set, then the empty relation is vacuously a function from
A to B. If A ≠ ∅ and B = ∅, then the only relation from A to B is the void
relation; but this relation is not a function from A to B. There are no functions
which have a nonempty domain and an empty codomain. #
Sec. 4.1 BASIC PROPERTIES OF FUNCTIONS 195
For any function f: A → B, Definition 4.1.2 implicitly specifies another function
F, where F: 𝒫(A) → 𝒫(B); that is, F maps subsets of the domain to subsets of
the codomain. For A′ ⊆ A, the set F(A′) is denoted by f(A′). Note that f and F
are not the same function; the domain and codomain of f are the sets A and B
while the domain and codomain of F are the sets 𝒫(A) and 𝒫(B). Thus the function
f maps elements of A to elements of B, while the function F maps subsets of A
to subsets of B. This is illustrated by the following diagram:
In spite of the distinction between F and f, we will adopt the convention of using
f to represent both the original function f and the induced function F. This notation
is usually not ambiguous because the argument usually specifies which function
is intended.
Examples
(a) Suppose f: {0, 1, 2, 3} → {a, b, c} is defined by the following digraph:
[Digraph of f from {0, 1, 2, 3} to {a, b, c}]
#
Binary relations are defined to be equal if they have the same domain and
codomain and are equal sets of ordered pairs. Because functions are relations, the
same definition holds for functions; two functions f and g are equal if and only if
their domains and codomains are equal and for every element a of their domain,
f(a) = g(a).
Since functions are relations, if g is a function from A to B and f is a function
from B to C, a composite relation exists from A to C. Furthermore, the next
theorem shows that this composite relation is itself a function. We will use the
standard notation and represent the composite function by f ∘ g or simply fg.†
Theorem 4.1.1: Let g: A → B and f: B → C be functions. Then the composite
function f ∘ g is a function from A to C, and (f ∘ g)(x) = f(g(x)) for all x ∈ A.
Proof: Since f and g are relations, f ∘ g is a relation from A to C. We must
establish that f ∘ g is also a function, that is, for every a ∈ A, there is a unique
c ∈ C such that <a, c> ∈ f ∘ g.
Since g is a function, for each a ∈ A there is a b ∈ B such that g(a) = b;
since f is a function, for each b ∈ B there is a c ∈ C such that f(b) = c. Because
<a, b> ∈ g and <b, c> ∈ f, it follows that <a, c> ∈ f ∘ g. Furthermore, b was
uniquely determined by the argument a for the function g, and c was uniquely
determined by the argument b for the function f. It follows that <a, c> is the only ordered
pair of the composite f ∘ g with a as the first element. Thus f ∘ g is a function and
(f ∘ g)(a) = c = f(b) = f(g(a)). ∎
Examples
(a) Let g: {0, 1, 2} → {a, b} and f: {a, b} → {A, B, C} be defined by the following
digraphs.
[Figure: digraphs of g and f]
(c) Let g: N → N, where g(x) = 2x for x ∈ N, and let f: N → N, where f(x) =
x/2 if x is even, f(x) = 0 otherwise.
Both f ∘ g and g ∘ f are defined.
fg: N → N,
fg(x) = f(2x) = x.
gf: N → N,
gf(x) = g(x/2) = x if x is even,
gf(x) = g(0) = 0 if x is odd. #
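Example (c) can be checked numerically. The sketch below is only an illustration; the helper name `compose` is an assumption of the sketch. It builds both composites and shows that fg is the identity on N while gf is not.

```python
def g(x):
    return 2 * x


def f(x):
    # x/2 for even x, 0 otherwise
    return x // 2 if x % 2 == 0 else 0


def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


fg = compose(f, g)
gf = compose(g, f)

print([fg(x) for x in range(6)])  # [0, 1, 2, 3, 4, 5]
print([gf(x) for x in range(6)])  # [0, 0, 2, 0, 4, 0]
```

The two outputs differ, which illustrates that composition of functions is not commutative.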
[Figure: diagrams of functions among the sets A, B, C, and D, including the composite f ∘ g]
If f: A — A for some set A, then the function f can be composed with itself
any number of times. The notation used to denote the repeated composition of f
with itself is defined inductively as follows:
1. f⁰(a) = a,
2. fⁿ⁺¹(a) = f(fⁿ(a)), for n ∈ N.
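The inductive definition of repeated composition translates directly into a loop. The following is a minimal sketch; the names `iterate`, `successor`, and `double` are assumptions of the sketch rather than notation from the text.

```python
def iterate(f, n):
    """Return the n-fold composite of f with itself; the 0-fold composite is the identity."""
    def f_n(a):
        for _ in range(n):
            a = f(a)
        return a
    return f_n


def successor(x):
    return x + 1


def double(x):
    return 2 * x


print(iterate(successor, 0)(7))  # 7
print(iterate(successor, 5)(7))  # 12
print(iterate(double, 10)(1))    # 1024
```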
The set of all functions from a set A to a set B is often denoted by B^A; this
notation has some useful properties which will become apparent later. If either of
the sets A or B is the first n natural numbers, {0, 1, 2, ..., n - 1}, then the set is
often represented by the symbol n. For example, the set of all functions from a
set A to {0, 1} is denoted by 2^A and the set of all functions from {0, 1, 2} to a set B
is denoted by B^3. Thus, the notation A^n may denote either the set of n-tuples of
elements of A or the set of all maps from {0, 1, 2, ..., n - 1} to A.† No difficulties
result from this ambiguous use of A^n because there is a natural correspondence
between the two possible meanings; defining this correspondence is an exercise
in the next section.
The domain of a function is often a cartesian product of sets. A function f
†The notation A^n has still a third use in our text. If A is a subset of Σ* for some alphabet
Σ, then A^n is used to denote the set product of A with itself n times (see Definition 2.7.5). The
context will determine the intended meaning of A^n.
with domain
A₁ × A₂ × ... × Aₙ
is said to be a function of n variables. The value of f at <x₁, x₂, ..., xₙ>, where
xᵢ ∈ Aᵢ, will be denoted by f(x₁, x₂, ..., xₙ).
Example
Arithmetic operations such as addition, subtraction and multiplication are
examples of functions of two variables. These functions are commonly represented
by an infix notation; thus the function +(x, y) is denoted by x + y. #
Examples
(a) The length of a word x ∈ Σ* can be inductively defined as a function from
Σ* to N. (The length of x is denoted by ||x||.)
1. (Basis) ||Λ|| = 0.
2. (Induction) If x ∈ Σ* and a ∈ Σ, and ||x|| = n, then ||ax|| = n + 1.
Note that no extremal clause is necessary here; the function has been defined
for the entire domain Σ* because it follows the inductive definition of Σ*
(Definition 2.5.2).
(b) The successor function S: N → N maps each integer n ∈ N into its successor,
n + 1; i.e., S(n) = n + 1. Arithmetic operations on N can be defined induc-
tively using the successor function; we illustrate with a definition of the opera-
tion of addition +: N² → N.
1. (Basis) +(m, 0) = m for m ∈ N.
2. (Induction) +(m, S(n)) = S(+(m, n)) for m, n ∈ N.
(c) The Fibonacci sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, ...
has the property that each term after the second is the sum of the two pre-
ceding terms. This sequence arises in a number of contexts. It can be induc-
tively defined as a function F on N as follows:
1. (Basis) F(0) = 0, and F(1) = 1.
2. (Induction) F(n + 2) = F(n + 1) + F(n) for all n ∈ N. #
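Each of these inductive definitions can be transcribed almost clause for clause into a recursive procedure. The sketch below is an illustration; the names `length`, `add`, and `fib` are assumptions of the sketch, and Python strings stand in for words over an alphabet.

```python
def length(word):
    # Basis: the empty word has length 0.
    if word == "":
        return 0
    # Induction: the length of ax is one more than the length of x.
    return length(word[1:]) + 1


def S(n):
    """The successor function."""
    return n + 1


def add(m, n):
    # Basis: +(m, 0) = m.
    if n == 0:
        return m
    # Induction: +(m, S(k)) = S(+(m, k)), where n = S(k).
    return S(add(m, n - 1))


def fib(n):
    # Basis: F(0) = 0 and F(1) = 1.
    if n < 2:
        return n
    # Induction: F(n) = F(n - 1) + F(n - 2) for n >= 2.
    return fib(n - 1) + fib(n - 2)


print(length("abc"))                # 3
print(add(3, 4))                    # 7
print([fib(n) for n in range(9)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21]
```

The direct transcription of the Fibonacci definition recomputes earlier values many times; it is shown for its correspondence with the definition, not for efficiency.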
In each of the above examples, the value of the function in the induction step
is specified using values of the function for “earlier” arguments. A specification of
f(n) in terms of f(k) for k ≠ n is called a recursion formula, and f is said to be
recursively defined. Not all recursively defined functions are defined inductively.
Example
The “91 function” is defined recursively (but not inductively) as follows:
f: N → N,
f(x) = x - 10 if x > 100,
f(x) = f(f(x + 11)) if x ≤ 100.
This function has the property that f(x) = 91 for all x such that 0 ≤ x ≤ 100;
otherwise, f(x) = x - 10. #
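The stated property of the 91 function can be confirmed by direct evaluation. This is a minimal sketch which follows the recursion formula literally; the name `f91` is an assumption of the sketch.

```python
def f91(x):
    if x > 100:
        return x - 10
    # The argument grows by 11 before the function is applied twice,
    # so the recursion does not proceed toward "earlier" arguments.
    return f91(f91(x + 11))


print(f91(99))                                  # 91
print(all(f91(x) == 91 for x in range(101)))    # True
print(f91(150))                                 # 140
```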
Example
Consider the set of arithmetic expressions E defined inductively as follows:
1. Every digit (0 through 9) is an element of E.
2. If X ∈ E and Y ∈ E, then X - Y ∈ E.
3. The set E is the smallest set which satisfies clauses 1 and 2.
The above definition of E allows the construction of some elements, such as
3 - 4 - 5, in more than one way; in the inductive step one can either let X be 3
and Y be 4 - 5 and then form X - Y, or X can be 3 - 4 and Y can be 5. A function
defined on E following the inductive definition may or may not be well-defined.
The following function f is well-defined because the definition does, in fact,
characterize a function on the elements of E. The function f sums the digits which appear
in an element of E.
f: E → N,
1. If X ∈ E and X is a digit, then f(X) = X.
2. If X ∈ E and Y ∈ E, then f(X - Y) = f(X) + f(Y). Thus
f(3 - 4 - 5) = 12.
The following definition of g does not characterize a function.
g: E → N,
1. If X ∈ E and X is a digit, then g(X) = X.
2. If X ∈ E and Y ∈ E, then g(X - Y) = g(X) - g(Y).
The difficulty stems from the fact that subtraction is not associative, and
consequently there are two possible values of the “function” g for such expressions
as 3 - 4 - 5, namely:
g(3 - 4 - 5) = g(3 - 4) - g(5) = (g(3) - g(4)) - g(5) = (3 - 4) - 5,
g(3 - 4 - 5) = g(3) - g(4 - 5) = g(3) - (g(4) - g(5)) = 3 - (4 - 5).
Thus, g is a relation but not a function and we conclude g is not well-defined.
Note that by using parentheses in the inductive step of the definition of E, the diffi-
culty can be eliminated. The inductive step would then read
2. If X ∈ E and Y ∈ E, then (X - Y) ∈ E. #
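The two constructions of 3 - 4 - 5 can be evaluated mechanically. In the sketch below, a construction of an element of E is represented either as a digit or as a pair (X, Y) standing for X - Y; this representation and the names `left` and `right` are assumptions of the sketch. The function f gives the same value for both constructions, while g does not.

```python
# Two ways of constructing the same string 3 - 4 - 5.
left = ((3, 4), 5)    # X is 3 - 4 and Y is 5
right = (3, (4, 5))   # X is 3 and Y is 4 - 5


def f(e):
    """Sum of the digits: independent of how the expression was constructed."""
    return e if isinstance(e, int) else f(e[0]) + f(e[1])


def g(e):
    """Subtraction applied along the construction: depends on the construction."""
    return e if isinstance(e, int) else g(e[0]) - g(e[1])


print(f(left), f(right))  # 12 12
print(g(left), g(right))  # -6 4
```

With the parenthesized inductive step, each element of E has exactly one construction, so a definition such as that of g assigns it exactly one value.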
Inductively defined functions can often be computed either iteratively or re-
cursively. A program is said to compute a function iteratively if the computation
for most arguments is done by the statements in a program loop. A program is
said to compute the function recursively if the computation is done by a recursive
procedure.
Example
The factorial function can be computed either iteratively or recursively. The
following procedure computes n! iteratively for any n ∈ N; the value returned by
ITERFACT for the argument n is n!.
procedure ITERFACT(n):
begin
p ← 1;
for i ← 1 step 1 until n do p ← p * i;
return p
end
procedure RECURFACT(n):
if n = 0 then return 1 else return n * RECURFACT(n - 1)
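For comparison, the two procedures can be transcribed into Python. This is a sketch only; the lowercase names `iterfact` and `recurfact` are this sketch's own.

```python
def iterfact(n):
    """Compute n! with a program loop."""
    p = 1
    for i in range(1, n + 1):
        p = p * i
    return p


def recurfact(n):
    """Compute n! with a recursive procedure."""
    if n == 0:
        return 1
    return n * recurfact(n - 1)


print(iterfact(5), recurfact(5))  # 120 120
```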
Partial Functions
Definition 4.1.3: Let A and B be sets. A partial function f with domain A and
codomain B is any function from A′ to B, where A′ ⊆ A. For any x ∈ A - A′,
the value of f(x) is said to be undefined.
a total function for emphasis. We will always use the qualifier “partial” when
referring to partial functions; the unqualified term “function” will be reserved to
designate total functions.
The notation and theorems we have developed apply to partial functions in
straightforward ways. For example, if g and f are partial functions from A to B
and B to C respectively, then fg is the partial function from A to C such that fg(x)
is defined if and only if g(x) and f(g(x)) are both defined, and in that case, fg(x) =
f(g(x)). We will not develop all the analogous terms and definitions for partial
functions, although we will occasionally use them when their meaning is clear.
Examples
(a) The operation of taking a square root of a real number is a partial function
from R to R; √x is undefined for x < 0.
(b) The partial function f(x) = 1/x from R to R is undefined for the argument
x=0.
(c) The partial function f(x) = x from R to R is a total function.
(d) Computer programs represent partial functions. The input to a program is the
argument of the partial function, and the output of the program is the value
of the partial function. If the program does not terminate or if it terminates
abnormally (e.g., by attempting to execute an illegal operation such as divi-
sion by 0), then the partial function is undefined for the argument. Using the
output of one program as the input of another corresponds to composition
of the partial functions implemented by the programs. This view of programs
provides a basis (different from the one we described in Section 1.6) for inves-
tigating program correctness. The “meaning” of a program can be defined to
be the partial function it computes, and a program is correct if it computes
the intended partial function. The program will halt for all inputs if the partial
function is total. #
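Partial functions and their composition can be modeled in a program by reserving a special value for "undefined." The sketch below uses Python's `None` for this purpose; that convention, and the names `sqrt_partial`, `reciprocal`, and `compose_partial`, are assumptions of the sketch. The composite is defined at x exactly when the inner function is defined at x and the outer function is defined at the resulting value.

```python
import math


def sqrt_partial(x):
    """Square root as a partial function from R to R; undefined for x < 0."""
    return math.sqrt(x) if x >= 0 else None


def reciprocal(x):
    """1/x as a partial function from R to R; undefined for x = 0."""
    return 1 / x if x != 0 else None


def compose_partial(outer, inner):
    """Composite of two partial functions, with None meaning 'undefined'."""
    def composite(x):
        y = inner(x)
        return None if y is None else outer(y)
    return composite


recip_of_sqrt = compose_partial(reciprocal, sqrt_partial)

print(recip_of_sqrt(4))    # 0.5
print(recip_of_sqrt(0))    # None: sqrt is defined at 0, but 1/0 is not
print(recip_of_sqrt(-1))   # None: sqrt is not defined at -1
```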
1. Determine which of the relations represented by the following digraphs are func-
tions from A = {a, b, c} to B = {0, 1, 2}. For those that are functions, find the image
of the subset {a, b}. For those that are not, state what properties of a function are
not satisfied.
[Figure: digraphs (a) through (f) of relations from A to B]
Prove by induction on n that fₙ(x₁, x₂, ..., xₙ) ≥ 0. You may assume x² ≥ 0 for
any x ∈ R and if x, y ≥ 0 then x + y ≥ 0.
Define f: N² → N as follows.
1. f(0, n) = 1 for all n ∈ N,
2. f(m + 1, n) = f(m, n) · n.
Find an algebraic expression for f and prove by induction that it represents f.
Using addition, inductively define a function f: N² → N such that f(x, y) = x · y.
Write an iterative algorithm and a recursive algorithm to compute the value of
m + n for m, n ∈ N. Your algorithm should use only the successor function,
S(n) = n + 1, and the predecessor partial function P, where P(S(n)) = n and P(0) is
undefined.
10. Write an iterative algorithm and a recursive algorithm to compute m^n for m, n ∈ N.
Assume only the operations of addition and multiplication together with the pre-
decessor partial function.
11. Let f be the “91 function” defined in this section.
(a) Show that f(99) = 91.
(b) Prove that f(x) = 91 for all x from 0 to 100.
12. Consider the following partial functions from R to R:
gx) =-)
h(x) = x²,
k(x) = √x.
For each of the following composite partial functions, characterize the subset of R
for which the partial function is defined, give an algebraic expression for the com-
posite partial function, and characterize the image of the partial function.
(a) eg
(b) Ak
(c) kh
Programming Problems
1. Write both iterative and recursive procedures which, when passed n ∈ N, will return
the nth element of the Fibonacci sequence.
2. Recursion can be used to define functions which grow very fast. Consider the func-
tion defined recursively as follows:
A: N² → N,
A(n, 0) = n + 1,
A(0, m + 1) = A(1, m),
A(n + 1, m + 1) = A(A(n, m + 1), m).
Write a program to evaluate this function for any argument. Investigate the comput-
ing time required to calculate A(0, 0), A(1, 1), A(2, 2),.... Warning: The time and
storage required to compute A(i, i) grow very fast as i increases.
Examples
(a) The following digraphs illustrate the concepts of Definition 4.2.1. For each
function, the domain and codomain are represented as columns of dots on the
left and right sides respectively.
[Figure: three digraphs, labeled “Injective, not surjective,” “Surjective, not injective,” and “Bijective”]
(f) Let [a, b] denote the closed interval of real numbers, [a, b] = {x | a ≤ x ≤ b},
where a < b, and let f: [0, 1] → [a, b], where f(x) = (b - a)x + a. This
function is a bijection.
(g) The empty relation is an injective function from an empty domain to an
arbitrary codomain. If the codomain is also empty, then the function is a bijec-
tion.
(h) The properties of being injective, surjective and bijective all can be interpreted
in terms of the graphs of functions from R to R. Consider the following graphs
of functions.
[Figure: graphs of several functions from R to R, including f(x) = x, f(x) = x³ + 2x², and f(x) = 2^x]
Since these are graphs of functions from R to R, any vertical line will intersect
the graph at exactly one point. If every horizontal line intersects the graph at
least once, then the graph represents a surjective function. Thus, of the above
functions, f(x) = x and f(x) = x³ + 2x² are surjective but the others are not.
If no horizontal line intersects the graph more than once, then the function
is injective. Thus, f(x) = x and f(x) = 2^x are injective but the others are not.
If every horizontal line intersects the graph exactly once, then the function is
bijective; f(x) = x is bijective and the others are not.
(i) Sets of data records, called files, are often stored in tables or vectors; if T
denotes the table, then each record is located at some table address Tᵢ. There
are many ways of assigning a table address Tᵢ to each record. In one method,
a hash function uses a part of each record called the key to compute a storage
address for the record; hash functions are also called key transformations. For
example, a company might use each employee’s social security number as the key
to access the employee’s record. If the company has 400 employees, their
records could be stored in a table with 500 entries by using the first 3 digits of
the social security number mod 500 as the table address. Thus, if f is the hash
function, then
f(136 29 4516) = 136
f(729 00 0345) = 229
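The hash function just described can be sketched in a few lines. The name `table_address` and the representation of a key as a string of digits with embedded blanks are assumptions of the sketch.

```python
def table_address(ssn):
    """First three digits of the social security number, mod 500."""
    digits = ssn.replace(" ", "")
    return int(digits[:3]) % 500


print(table_address("136 29 4516"))  # 136
print(table_address("729 00 0345"))  # 229
# Any two keys which agree in their first three digits collide:
print(table_address("136 00 0000"))  # 136
```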
This hash function would map each social security number into an address Tᵢ
where 0 ≤ Tᵢ ≤ 499. This particular hash function will probably be unsuitable
because it is likely that too many records will be assigned to some table
addresses. More suitable hash functions usually involve the entire key. An
example of such a function for social security numbers would be
Since f is surjective, there is some element b ∈ B such that f(b) = c.
Since g is surjective there is some element a ∈ A such that g(a) = b.
Then fg(a) = f(g(a)) = f(b) = c, and therefore c ∈ fg(A). Since c
was arbitrary, this establishes part (a).
(b) (If f and g are injective, then fg is injective.) Let a, b be elements of A,
and assume a ≠ b. Since g is injective g(a) ≠ g(b). Since f is injective
and g(a) ≠ g(b), it follows that f(g(a)) ≠ f(g(b)). Therefore, a ≠ b
implies fg(a) ≠ fg(b), which establishes part (b).
(c) (If f and g are bijective, then fg is bijective.) Since f and g are bijective,
they are both surjective and injective. From parts (a) and (b), it follows
that fg is both surjective and injective and therefore bijective. ∎
Examples
(a) Let A be the set of negative integers, and define the bijections f and g as
follows.
g: A → I+, where g(x) = -x;
f: I+ → N, where f(x) = x - 1.
Since f and g are bijections, the composite function fg is a bijection and
fg(x) = -x - 1.
(b) We will construct an injection from [0, 1] to (0, 1). Define g: [0, 1] → [0, 1/2]
by g(x) = x/2; and f: [0, 1/2] → (0, 1) by f(x) = x + 1/4. Then fg: [0, 1] → (0, 1)
is the injection fg(x) = x/2 + 1/4. The image of [0, 1] is the interval [1/4, 3/4] which
is contained in the open interval (0, 1). #
The converse of each part of Theorem 4.2.1 is false, but the following
theorem provides a “partial converse” to each of its assertions; its proof is left as
an exercise.
Note that every identity function 1_A is a bijection. The next theorem asserts
that if f: A → B, then the identity function on A is a “right identity” for f and the
identity function on B is a “left identity” for f.
[Diagram: f: A → B together with the identity functions 1_A on A and 1_B on B]
Definition 4.2.4: A permutation on A is a bijective function on A.
Examples
(a) The identity function on a set A is a permutation on A.
(b) The function f: {0, 1, 2} → {0, 1, 2}, where f(0) = 1, f(1) = 0 and f(2) = 2,
is a permutation.
(c) The function f: I → I, where f(x) = x + 3, is a permutation on the inte-
gers. #
Examples
(a) Let f: N → N and f(x) = x + 1. Then f is strictly monotone increasing.
(b) Any constant function on R is both monotone increasing and monotone
decreasing.
(c) The function f: R → R such that f(x) = x² is neither monotone increasing nor
monotone decreasing. #
Sec, 4.2 SPECIAL CLASSES OF FUNCTIONS 209
Inverse Functions
If f is a bijection from A to B, then f consists of a set of ordered pairs with the
property that every element a ∈ A appears exactly once as the first element of a
pair and every element b ∈ B appears exactly once as the second element of a pair.
The converse relation, formed by reversing the ordered pairs of f, is a relation with
the same properties, i.e., the converse of f is a bijection from B to A.
Note that the inverse function f⁻¹ is defined only if f is a bijection.
Theorem 4.2.4: Let f be a bijective function, f: A → B. Then f⁻¹ is a bijec-
tive function and f⁻¹: B → A.
Proof: Consider the sets of ordered pairs corresponding to f and f⁻¹.
f = {<a, b> | a ∈ A ∧ b ∈ B ∧ f(a) = b},
f⁻¹ = {<b, a> | <a, b> ∈ f}.
Since f is surjective, every b ∈ B occurs in an ordered pair <a, b> ∈ f and hence
appears in an ordered pair <b, a> ∈ f⁻¹. Furthermore, since f is injective, for
each b ∈ B there is at most one a ∈ A such that <a, b> ∈ f; hence there is only
one a ∈ A such that <b, a> ∈ f⁻¹. These two statements establish that f⁻¹ is a
function and f⁻¹: B → A.
We leave it as an exercise to show that f⁻¹ is bijective. ∎
The inverse function has the property that it can be composed with the function
f to form an identity function. For if f(a) = b, then f⁻¹(b) = a and it follows that
f⁻¹f(a) = f⁻¹(f(a)) = f⁻¹(b) = a;
therefore, f⁻¹f = 1_A. Similarly,
ff⁻¹(b) = f(f⁻¹(b)) = f(a) = b,
which establishes that ff⁻¹ = 1_B. Note that composing f and f⁻¹ always results
in an identity function but the domain may be either A or B, depending on the
order of the composition.
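For a function on a finite domain, the inverse can be formed exactly as described: by reversing the ordered pairs. The following sketch stores a function as a Python dictionary; the name `inverse` and the particular bijection `f` are assumptions chosen for illustration. It checks both compositions.

```python
def inverse(f, codomain):
    """Reverse the ordered pairs of f; fail if f is not a bijection onto codomain."""
    inv = {b: a for a, b in f.items()}
    if len(inv) != len(f) or set(inv) != set(codomain):
        raise ValueError("f is not a bijection onto the codomain")
    return inv


# A hypothetical bijection from {0, 1, 2} to {a, b, c}.
f = {0: "a", 1: "c", 2: "b"}
f_inv = inverse(f, {"a", "b", "c"})

# Composing in one order gives the identity on the domain of f,
# and in the other order the identity on the codomain of f.
print(all(f_inv[f[a]] == a for a in f))       # True
print(all(f[f_inv[b]] == b for b in f_inv))   # True
```

The check in `inverse` reflects the remark that the inverse function is defined only if f is a bijection: a dictionary which is not injective loses pairs when reversed, and one which is not surjective fails to cover the codomain.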
Example
Let f: {0, 1, 2} — {a, b, c} be defined by the following digraph:
[Figure: digraphs of f and f⁻¹]
These functions can be composed to form the identity functions on {0, 1, 2} and {a, b, c}.
Examples
(a) Consider the function represented by the following digraph:
[Digraph of f]
Then f⁻¹({a}) = {0}, f⁻¹({a, b}) = {0, 1, 2}, f⁻¹({c, d}) = {3}, and f⁻¹({d, e}) = ∅.
Note that f does not have an inverse function.
(b) It is possible for the notation f⁻¹ to be ambiguous. Suppose f: A → B, where
A = {X, Y}, B = {1, {1}} and f(X) = 1, f(Y) = {1}. Then, using the inverse
function of the bijection f,
f⁻¹({1}) = Y.
But, using the induced function from 𝒫(B) to 𝒫(A),
f⁻¹({1}) = {X}. #
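The induced function on subsets used in example (a) can be computed directly. In the sketch below, the name `preimage` is an assumption, and because the digraph is not reproduced here, the arcs in the dictionary `f` are an assumption chosen to agree with the four values stated in example (a).

```python
def preimage(f, subset):
    """Set of all arguments which f maps into the given subset of the codomain."""
    return {a for a, b in f.items() if b in subset}


# Assumed arcs, consistent with the values given in example (a).
f = {0: "a", 1: "b", 2: "b", 3: "c"}

print(preimage(f, {"a"}))                # {0}
print(sorted(preimage(f, {"a", "b"})))   # [0, 1, 2]
print(preimage(f, {"c", "d"}))           # {3}
print(preimage(f, {"d", "e"}))           # set()
```

The preimage of a subset is always defined, even though this f is neither injective nor surjective and so has no inverse function.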
If A ≠ ∅ and f: A → B, then the collection of sets {f⁻¹({b}) | b ∈ B} forms
a partition of A, and the associated equivalence relation is known as the equivalence
relation induced by f. Two elements are equivalent under this relation if the function
f maps them to the same element of B.
Example
Let A = {1, 2, 3, 4}, B = {a, b, c}, and f: A → B.
If f(1) = a, f(2) = b, f(3) = c and f(4) = c, then the equivalence relation on
A induced by f has equivalence classes {1}, {2}, and {3, 4}. #
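The equivalence classes in this example can be obtained by grouping the elements of A according to their values under f. This is a minimal sketch; the name `induced_partition` is an assumption of the sketch.

```python
from collections import defaultdict


def induced_partition(f):
    """Group the arguments of f by their value; each group is one equivalence class."""
    blocks = defaultdict(set)
    for a, b in f.items():
        blocks[b].add(a)
    return {frozenset(block) for block in blocks.values()}


f = {1: "a", 2: "b", 3: "c", 4: "c"}
print(sorted(sorted(block) for block in induced_partition(f)))  # [[1], [2], [3, 4]]
```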
Example
Let A = {1, 2, 3} and let ~ be an equivalence relation on A with equivalence
classes {1, 2} and {3}. Then the canonical map from A to A/~ is the function g
defined as follows:
g: {1, 2, 3} → {{1, 2}, {3}},
g(1) = {1, 2}, g(2) = {1, 2}, g(3) = {3}. #
The following definitions give us additional facilities for creating and modify-
ing functions. The first definition allows us to form a new function by deleting
part of the domain of a given function.
Definition 4.2.9: Let f: A → B, and let A′ be a subset of the domain of
f. The restriction of f to A′ is the function denoted f|A′ and defined as
f|A′: A′ → B,
f|A′(x) = f(x).
The next definition enables us to enlarge the domain of a function.
Definition 4.2.10: Let f: A′ → B, g: A → B, and A ⊇ A′. Then g is an extension
of f to the domain A if g|A′ = f.
Examples
Let f and g be defined by the following digraphs.
[Figure: digraphs of f, with domain {1, 2, 3, 4}, and g, with domain {2, 3, 4}]
Then g = f|{2, 3, 4} and f is an extension of g to the domain {1, 2, 3, 4}. #
The following class of functions provides a way to specify sets using functions.
Definition 4.2.11: Let A be a set. For every set A′ ⊆ A, the characteristic
function (with domain A) of the set A′, denoted χ_A′, is defined as follows:
χ_A′: A → {0, 1},
χ_A′(x) = 1 if x ∈ A′,
χ_A′(x) = 0 if x ∉ A′.
Examples
(a) Let A = {a, b, c} and let A′ = {a}. Then
χ_A′(a) = 1,
χ_A′(b) = 0,
χ_A′(c) = 0.
(b) Let A = [0,1] and 4’ = [4, 1]. The following is a graph of the function
χ_A′.
[Graph of χ_A′] #
†One-Sided Inverse Functions
(c) f has a left and right inverse if and only if f is bijective.
(d) If f is bijective, then the left and right inverses of f are equal.
The following illustration is appropriate to part (a) of the theorem.
[Figure: digraphs of f: A → B and of two left inverses g and h from B to A]
Let A = {a, b} and B = {0, 1, 2}, and let f: A → B. The function f has two distinct
left inverses which we have named g and h. If f is injective, a left inverse may always
be formed by mapping each element f(a) in the image of f back to a and mapping
each element of B which is not in f(A) to some arbitrary element of A. If f were
not injective, there would be two elements a, a′ ∈ A such that a ≠ a′ and f(a) =
f(a′). Thus f would “merge” two elements of A and no left inverse would exist.
Proof of (a): We first establish that if a left inverse exists for f, then f is injec-
tive. Suppose g is a left inverse for f. Then gf = 1_A, which is injective. It follows
from Theorem 4.2.2b that f is injective.
We next use a constructive proof to show that if f is injective, then there exists
a left inverse g. Choose an arbitrary element c ∈ A and define g as follows:
g: B → A,
g(b) = a if b ∈ f(A) and f(a) = b,
g(b) = c if b ∉ f(A).
The function g is well-defined, since exactly one value is specified for each
argument b ∈ B. Furthermore, g is a left inverse of f since if f(a) = b, then gf(a) =
g(f(a)) = g(b) = a.
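The construction in this proof can be carried out mechanically for a function on a finite domain. In the sketch below, the name `left_inverse` and the particular injection `f` from {a, b} to {0, 1, 2} are assumptions chosen for illustration; they are not taken from the figure. Different choices of the arbitrary element c give different left inverses.

```python
def left_inverse(f, codomain, c):
    """Build g: B -> A with g(f(a)) = a, sending every b outside f(A) to c."""
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective, so no left inverse exists")
    g = {b: c for b in codomain}   # default for elements not in the image of f
    for a, b in f.items():
        g[b] = a                   # map each f(a) back to a
    return g


f = {"a": 0, "b": 2}               # a hypothetical injection from {a, b} to {0, 1, 2}
g = left_inverse(f, {0, 1, 2}, "a")
h = left_inverse(f, {0, 1, 2}, "b")

print(all(g[f[a]] == a for a in f))  # True
print(all(h[f[a]] == a for a in f))  # True
print(g == h)                        # False: the two left inverses differ at 1
```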
The following illustration is appropriate to part (b) of the theorem.
[Figure: digraphs of f and of two right inverses g and h]
Proof of (d): Suppose f is bijective with a right inverse h and a left inverse
g; then g ∘ f = 1_A and f ∘ h = 1_B. From Theorem 4.2.3,
g = g ∘ 1_B = g ∘ f ∘ h = 1_A ∘ h = h. ∎
ditions on B and f for which the rank of the equivalence relation induced by fon A
is
(a) 1
(b) 2
(c) A
17. Let R be an equivalence relation on a set A. Under what conditions is the canonical
map g: A — A/R a bijection?
18. Prove Theorem 4.2.6.
19. (a) Prove that if f: A → B is injective and A′ is any subset of A, then f|A′: A′ → B
is an injection.
(b) Suppose f: A′ → B is a surjection. Prove that if g is an extension of f to
A ⊇ A′, then g: A → B is a surjection.
(c) Prove if f: A → B is a surjection, then there exists A′ ⊆ A such that
f|A′: A′ → B is a bijection.
20. Verify the following for the characteristic functions of subsets A and B of C.
(a) Xa) = Xam — Lal).
(b) χ_{A∪B}(x) = χ_A(x) + χ_B(x) - χ_A(x)χ_B(x).
(c) χ_{A∩B}(x) = χ_A(x)χ_B(x).
$21. Determine left and/or right inverses for the following functions when they exist.
Specify the equivalence relation induced on the domain by the function. In each
case, construct the canonical map.
[Figure: digraphs (a) through (e)]
The material in this chapter is classical and treated, at least briefly, in a num-
ber of books. The first two chapters of the text by MacLane and Birkhoff [1967]
will provide a distinct but related development of much of the material of our
Chapters 2, 3, and 4, along with some of the material of our Chapter 7.
5
COUNTING AND ALGORITHM ANALYSIS
5.0 INTRODUCTION
In order to compare, evaluate, and predict, we must often count the objects in a
finite set. For example, one way to compare the cost of applying two algorithms
is to determine, or at least estimate, how many operations each of them executes
when solving a problem. This is often done by counting only certain kinds of
operations which are executed by the algorithms. Thus, the cost of a direct method
for solving sets of simultaneous linear equations can be estimated by counting the
number of multiplications and divisions executed by the algorithm. The cost of
some sorting algorithms can be estimated by counting the number of comparisons
made between data items. The cost of using a particular data structure for a file
can be estimated by determining the average and maximum lengths of searches
for items stored in the data structure. Problems such as these ultimately involve
either counting (exactly or approximately) the elements of a set or enumerating
the elements of a set which have a common property. This chapter first introduces
some basic techniques for counting and enumerating the elements of finite sets;
we then illustrate how these techniques can be applied to the analysis of algorithms.
Sec. 5.1 BASIC COUNTING TECHNIQUES 219
integer n is called the cardinality of A, and we say “A has n elements,” or “n is the
cardinal number of A.” The cardinality of A is denoted by |A|.
Example
Let A = {a, b, c}. Then the cardinal number of A is 3, i.e., |A| = 3, since the
function
f: {0, 1, 2} → A,
f(0) = a, f(1) = b, f(2) = c,
is a bijection from the first three natural numbers to A. #
The special case of the cardinality of the empty set deserves mention. As we
noted in Section 4.2, an “empty” function (consisting of the empty set of ordered
pairs) is an injection from the empty set to any set A, and if A is empty, then this
function is a bijection. Consequently, our definition states that a set A has cardi-
nality 0 if there is a bijection from the first zero natural numbers to A. But the set
consisting of the first zero natural numbers is empty, and a bijection will exist if
and only if A is empty. We conclude that |A| = 0 if and only if A = ∅.
We now introduce a fundamental rule of counting known as the “pigeonhole
principle.” Informally, the pigeonhole principle asserts that if m objects are placed
in n boxes (or pigeonholes) and m > n, then some box will contain more than one
object. This principle, which we will not prove, can be stated more formally as
follows.
Pigeonhole Principle: If A and B are finite sets with |A| = m and |B| = n
and m > n, then no injection exists from A to B.
Theorem 5.1.2: Let A and B be finite sets, and suppose there is a bijection
from A to B. Then |A| = |B|.
220 COUNTING AND ALGORITHM ANALYSIS Ch. 5
Two additional principles are fundamental for counting sets which have been
formed by using the operations of union and cartesian product. We have implicitly
used these principles in earlier chapters, but for the sake of completeness, we will
state them as theorems about the cardinalities of sets; their proofs are left as exer-
cises. The first principle is called the Rule of Sum.
Theorem 5.1.3: If A and B are finite disjoint sets with cardinalities m and n
respectively, then |A ∪ B| = m + n.
The second principle is called the Rule of Product.
Theorem 5.1.4: If A and B are finite sets with cardinalities m and n respectively,
then |A × B| = mn.
Examples
(a) Suppose statement labels in a programming language must be either a single
alphabetic symbol or a single decimal digit. The first set, {A, B, C, ..., Z},
has 26 elements, and the second set, {0, 1, 2, ..., 9}, has ten elements. Because
the two sets are disjoint, the rule of sum can be applied, and we conclude that
there are 26 + 10 = 36 possible statement labels.
(b) A variable name in the programming language BASIC must be either an
alphabetic symbol or an alphabetic symbol followed by a single decimal digit.
If S denotes the set of alphabetic symbols and D denotes the set of digits, there
is a one-to-one correspondence between the variable names and the set
S ∪ (S × D). By the rule of product, there are 26·10 = 260 elements in S × D, and
hence by the rule of sum there are 26 + 260 = 286 possible variable names in BASIC.
(c) Consider the puzzle sometimes called the “four cubes problem.” It involves
four cubes such that each face of every cube is painted one of four colors. The
problem is to stack the cubes in such a way that each vertical side of the stack
contains squares of all four colors.
The order of the cubes in the stack is clearly unimportant, and we do not
wish to distinguish between arrangements which are identical except for rota-
tion. We can count the number of significantly different arrangements as
follows:
1. The first cube can be positioned in any of three different ways because
there are three pairs of faces which can be made the top and bottom
surfaces.
2. For each remaining cube, one of the six faces must be chosen as the
bottom and then one of four possible rotational positions must be chosen.
This gives 24 different ways to position each of the last three cubes in the
stack.
Thus there are 3·24·24·24 = 41,472 different arrangements, making an
exhaustive search costly. For a discussion of how to solve the problem (easily!)
by constructing a graph with 4 nodes and 12 edges, the reader is referred to
Deo [1974], p. 18, or Busacker and Saaty [1965], p. 153. #
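The three counts in the examples above can be checked by direct enumeration. The following is a minimal sketch, assuming the alphabetic symbols are the 26 uppercase letters; the variable names (`letters`, `numerals`, `labels`, `names`) are illustrative only.

```python
from itertools import product
from string import ascii_uppercase, digits

letters = set(ascii_uppercase)   # S: the 26 alphabetic symbols A..Z (assumed uppercase)
numerals = set(digits)           # D: the 10 decimal digits 0..9

# Example (a): a label is a single letter or a single digit.
# The two sets are disjoint, so the rule of sum applies.
assert letters.isdisjoint(numerals)
labels = letters | numerals

# Example (b): a name is a letter, or a letter followed by a digit.
letter_digit = {s + d for s, d in product(letters, numerals)}  # S x D
names = letters | letter_digit                                  # S union (S x D)

# Example (c): 3 placements of the first cube, 24 for each of the other three.
arrangements = 3 * 24 ** 3

print(len(labels), len(letter_digit), len(names), arrangements)
# prints: 36 260 286 41472
```

Enumeration is feasible here only because the sets are tiny; the rules of sum and product give the same numbers without constructing the sets.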
We will now develop several basic counting results, all of which are based
on the rules of sum and product.
Theorem 5.1.5: Let A and B be finite sets with cardinalities m and n respectively.
There are $n^m$ functions from A to B, i.e.,
$|B^A| = |B|^{|A|}$.
Proof: If A = ∅, then the assertion holds since we define $n^0 = 1$ for all
n ∈ N. No functions exist from A to B if B is empty and A is not. If both A and B
are nonempty, then index the elements of A in some arbitrary fashion with the
first m natural numbers: $a_0, a_1, a_2, \ldots, a_{m-1}$. Each element of A can be mapped
to any of n elements of B. Thus, there are n possible values of $f(a_0)$, n possible
values of $f(a_1)$, etc. It follows that there are $n \cdot n \cdot n \cdots n \cdot n$ (m factors), or $n^m$, functions. Hence,
$|B^A| = |B|^{|A|}$. ∎
Example
Assume we wish to represent integers using sequences of n digits, where each
digit is one of b distinct symbols, b ≥ 2. Choosing the symbol set to be
{0, 1, 2, ..., b − 1},
each n digit sequence of symbols can be associated in a natural way with exactly
one function f: {0, 1, 2, ..., n − 1} → {0, 1, 2, ..., b − 1}. Thus, there is a
bijection from the set of all such sequences to $\{0, 1, 2, \ldots, b-1\}^{\{0, 1, 2, \ldots, n-1\}}$.
By Theorem 5.1.5, there are $b^n$ functions from {0, 1, 2, ..., n − 1}
to {0, 1, 2, ..., b − 1}, and therefore we can represent $b^n$ distinct integers. In the case of the standard
positional number notation in base b, where the sequence
$a_{n-1} a_{n-2} a_{n-3} \cdots a_1 a_0$
represents the number
$\sum_{i=0}^{n-1} a_i b^i$,
each sequence of length n represents an integer greater than or equal to 0 and less
than $b^n$. #
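The correspondence between digit sequences and integers can be illustrated for small b and n. This is a sketch only; the helper names `digit_sequences` and `value` are hypothetical, and each sequence is stored with its most significant digit first.

```python
from itertools import product

def digit_sequences(b, n):
    """All sequences (a_{n-1}, ..., a_1, a_0) of n digits drawn from {0, ..., b-1}."""
    return list(product(range(b), repeat=n))

def value(seq, b):
    """Integer represented by seq in base b, i.e. the sum of a_i * b**i."""
    n = len(seq)
    # seq[0] holds the most significant digit a_{n-1}
    return sum(a * b ** (n - 1 - pos) for pos, a in enumerate(seq))

b, n = 3, 4
seqs = digit_sequences(b, n)
values = sorted(value(s, b) for s in seqs)

print(len(seqs) == b ** n)               # one sequence per function: b**n of them
print(values == list(range(b ** n)))     # they represent exactly 0, 1, ..., b**n - 1
```

Both checks print `True` for b = 3 and n = 4: there are 81 sequences, and they represent the integers 0 through 80 without repetition.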
We proved the following assertion inductively in Section 2.5. Here the result
follows as a special case of the preceding theorem.
Corollary 5.1.5: If A is a finite set, there are $2^{|A|}$ distinct subsets of A.
Proof: For each subset A′ ⊆ A, let $\chi_{A'}$ be the characteristic function of A′:
$\chi_{A'}: A \to \{0, 1\}$,
$\chi_{A'}(x) = 1$ if x ∈ A′,
$\chi_{A'}(x) = 0$ otherwise.
For every pair of subsets B, C contained in A, $\chi_B = \chi_C$ if and only if B = C. Hence,
there are as many subsets of A as there are characteristic functions defined on A,
and by Theorem 5.1.5, this number is $2^{|A|}$. ∎
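The proof suggests a direct way to enumerate subsets: generate every function from A to {0, 1} and read off the elements mapped to 1. A minimal sketch follows; the function name is illustrative, and A is assumed to have sortable elements so that the order of the bits is fixed.

```python
from itertools import product

def subsets_via_characteristic_functions(A):
    """Enumerate the subsets of A, one per characteristic function A -> {0, 1}."""
    elems = sorted(A)
    result = []
    for chi in product((0, 1), repeat=len(elems)):   # chi[i] is the value at elems[i]
        result.append(frozenset(x for x, bit in zip(elems, chi) if bit == 1))
    return result

A = {"a", "b", "c"}
subsets = subsets_via_characteristic_functions(A)
print(len(subsets), len(set(subsets)))   # prints: 8 8
```

The two counts agree because distinct characteristic functions yield distinct subsets, which is the bijection used in the proof.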
Now suppose the selection process is one in which each item can be selected
at most once; in this case, the process is said to be a selection without replacement.
The sequence which results from a selection without replacement of r objects from
n objects, where r ≤ n, is called a permutation of n objects taken r at a time. A
permutation of n objects taken r at a time is an r-tuple $\langle a_1, a_2, \ldots, a_r \rangle$ such that
each $a_i$ is one of the n objects and if i ≠ j, then $a_i \ne a_j$. The number of such permutations, denoted P(n, r), is
$P(n, r) = \frac{n!}{(n-r)!}$.
Proof: If r = 0, then P(n, r) = 1 because there is only one empty sequence.
Suppose r > 0. Then there are n possible values for the selection of the first of
r objects from n objects. Since selection is without replacement and one object
has been chosen, there are only n − 1 possible values for the selection of the
second object. Similarly, there are n − i + 1 possible values for the selection of
the ith object for all i, 1 ≤ i ≤ r. By the rule of product, we have
$P(n, r) = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}$. ∎
Examples
(a) Let Σ = {a, b, c, d, e}. Find the number of strings in Σ* of length 3 such that
no symbol is used more than once. This is the number of permutations
of 5 things taken 3 at a time because selection is without replacement, and
P(5, 3) = 5·4·3 = 60.
(b) Find the number of injections from a finite set A to a finite set B. If |A| > |B|,
there are no injections from A to B (this follows from the pigeonhole principle).
If |A| ≤ |B|, then the number of injections is P(|B|, |A|). #
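Both examples can be verified by listing the permutations explicitly. The sketch below uses `itertools.permutations` from the Python standard library; the helper names `P` and `injections` are illustrative, not part of the text.

```python
from itertools import permutations
from math import factorial

def P(n, r):
    """Number of permutations of n objects taken r at a time, for 0 <= r <= n."""
    return factorial(n) // factorial(n - r)

def injections(A, B):
    """All injections from A to B, each returned as a dict."""
    # permutations(B, len(A)) yields nothing when len(A) > len(B),
    # in agreement with the pigeonhole principle.
    return [dict(zip(A, image)) for image in permutations(B, len(A))]

# Example (a): strings of length 3 over {a, b, c, d, e} with no repeated symbol.
strings = ["".join(p) for p in permutations("abcde", 3)]
print(len(strings), P(5, 3))                      # prints: 60 60

# Example (b): injections between small sets.
print(len(injections([0, 1], "xyz")), P(3, 2))    # prints: 6 6
print(len(injections([0, 1, 2], "xy")))           # prints: 0
```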
$\binom{n}{n} = 1$ since there is only one way to choose the entire set of n objects. If
Theorem 5.1.8: Let r, n ∈ N and r ≤ n. The number of combinations of n
things taken r at a time is
$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$.
Proof: An ordered list of r elements can be formed by first choosing r elements
and then ordering them. Consequently, the number of lists of r elements,
P(n, r), is equal to the number of ways of choosing a subset of r elements, $\binom{n}{r}$,
times the number of ways of arranging the r elements in a list, r!. Thus
$P(n, r) = \binom{n}{r} \cdot r!$
and therefore
$\binom{n}{r} = \frac{P(n, r)}{r!} = \frac{n!}{r!\,(n-r)!}$. ∎
Proof: Let A be a finite set with cardinality n. Then the number of distinct
subsets of A with r elements is $\binom{n}{r}$, and the total number of subsets is
$\sum_{r=0}^{n} \binom{n}{r}$.
By Corollary 5.1.5, the number of subsets of A is $2^n$. ∎
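The formula for combinations and the identity just proved can be spot-checked for a small n. This is a sketch, not a proof; `C` and `count_r_subsets` are hypothetical helper names, and `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb, factorial

def C(n, r):
    """n! / (r! (n - r)!), the number of combinations of n things taken r at a time."""
    return factorial(n) // (factorial(r) * factorial(n - r))

def count_r_subsets(n, r):
    """Count the r-element subsets of {0, ..., n-1} by listing them."""
    return sum(1 for _ in combinations(range(n), r))

n = 6
counts = [count_r_subsets(n, r) for r in range(n + 1)]
print(counts)                      # prints: [1, 6, 15, 20, 15, 6, 1]
print(sum(counts) == 2 ** n)       # prints: True
```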
overhead. The total system overhead depends on the order in which the jobs are
run. For example, if two programs both require an ALGOL compiler, running one
program after the other will often eliminate the cost of bringing the compiler into
core the second time. This is the reason for “batch processing” programs written
in a single language. An algorithm to solve the traveling salesman problem would
enable us to specify the sequence of jobs which will minimize the total system
overhead for running the programs.
The set of n cities can be thought of as a complete digraph of n nodes; the values
$c_{ij}$ represent the distances between the nodes. If the triangle inequality holds, then
the shortest route will visit each city other than the starting city only once, and so the only
routes of interest are the simple cycles beginning and ending with the starting city. It follows
that there are (n − 1)! possible routes for the salesman. The most straightforward
way of finding the shortest route would be to list all (n − 1)! cycles and then
calculate the total distance associated with each cycle. Such a process of “complete
enumeration” has the virtue of being easily programmed, but the problems of
using such an algorithm become apparent if we consider an example for which the
number of nodes is not small.
Finding the total distance for a single route will involve n additions. Since there
are (n − 1)! possible routes, the total number of additions is n!. Suppose there are
50 nodes. The value of 50! is approximately $3 \times 10^{64}$. Even assuming a computer
which performs $10^9$ additions per second, it will take more than $10^{47}$ years just to
perform the additions required by the algorithm. #
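Complete enumeration is indeed easy to program. The sketch below tries every cycle through a small, made-up distance matrix (the matrix `dist` and the choice of city 0 as the starting city are assumptions for illustration), and then repeats the running-time estimate for 50 nodes, assuming a rate of 10^9 additions per second and a 365-day year.

```python
from itertools import permutations
from math import factorial

def shortest_tour(dist):
    """Complete enumeration: try every cycle that starts and ends at city 0."""
    n = len(dist)
    best_len, best_tour = None, None
    for rest in permutations(range(1, n)):           # (n - 1)! orderings
        tour = (0,) + rest + (0,)
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if best_len is None or length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

# Hypothetical distances c_ij between four cities (not taken from the text).
dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(shortest_tour(dist))          # prints: (21, (0, 2, 3, 1, 0))

# Cost of the same method for 50 nodes.
additions = factorial(50)
years = additions / 1e9 / (365 * 24 * 3600)
print(f"{additions:.2e} additions, about {years:.1e} years")
```

With four cities only 3! = 6 cycles are examined; the final line shows why the same loop is hopeless for 50.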
The straightforward algorithms for solving the traveling salesman problem are
easily written but impractical for large values of n. This is because the number of
operations required to solve the problem by complete enumeration grows very
fast as the number of nodes increases. In practice, the number of arithmetic
operations can be reduced by eliminating duplications, and the size of the problem can
sometimes be reduced by constraining the set of acceptable solutions or by using
heuristic methods which consider only some of the cycles. For any but small values
of n, one or more of these techniques must be incorporated if an algorithm is to be
economically feasible. Depending on the exact techniques chosen, however, the
resulting algorithm may not be guaranteed to produce the shortest route, but
rather the shortest of all those routes considered by the algorithm.
Decision Trees
Digraphs are often useful for counting and enumeration problems. In model-
ing system behavior, a state graph, or state diagram, is a digraph in which each
node represents one state of a system, and each edge represents a possible transi-
tion from one state to another. Each node of a state graph is labeled with a state
name, and the edges are labeled with the input or action which causes the transi-
tion.
We often wish to consider systems in which every sequence of transitions
causes the system to enter a unique state. If we consider only states which are
accessible from some given initial state, then the state graph is a tree whose root
represents the initial state; such trees are often called decision trees. For some
problems, decision trees provide a convenient way of enumerating the set of
possible histories of a solution procedure. Each internal node of a decision tree
corresponds to a partial solution; each leaf corresponds to a solution. Every internal
node is associated with a test to obtain additional knowledge, and each branch
outward from a node is labeled with a distinct test outcome. Viewed in terms of
its decision tree, execution of a solution procedure corresponds to traversing a
path from the root to a leaf. The length of the path traversed is equal to the number
of tests made by the solution procedure, and the height of the tree is equal to the
maximum number of tests required by any execution of the procedure.
As an illustration of the use of decision trees, suppose we are given eight
coins, exactly one of which is known to be counterfeit and heavier than the others.
We are asked to find the counterfeit coin using only a pan balance to compare the
weights of two sets of coins. Each weighing has three possible outcomes: the left
pan can go down (indicating that the coins in the left pan weigh more than those
in the right), the pans can remain level, or the right pan can go down. For convenience
in describing solution procedures for this problem, we assume the coins
are indexed from 1 to 8.
A binary solution procedure is one in which each test has two possible outcomes.
Figure 5.1.1 is a decision tree for a binary solution procedure to find the
counterfeit coin. In this algorithm, coins 1 through 4 are first weighed against coins
5 through 8 to determine which set of four contains the heavy coin. The set with
the heavy coin is then divided in half and the process repeated. This algorithm,
which requires three weighings to locate the counterfeit coin, is not very efficient,
since the coins are never weighed in a way which permits the pans of the balance to
remain level; thus, one of the possible test outcomes can never occur.
[Fig. 5.1.1: decision tree whose branches are labeled "left pan down" and "right pan down".]
A ternary solution procedure involves tests with as many as three possible
outcomes. Figure 5.1.2 is a decision tree for a ternary solution procedure to find
the counterfeit coin. By exploiting the fact that each weighing can result in any
of three outcomes, this procedure reduces the number of weighings to two. A third
solution procedure, which requires from one to four weighings, is represented by
the decision tree of Fig. 5.1.3.
Efficiency is a prime consideration in algorithm selection but comparisons of
algorithms must be made with respect to the particular problem to be solved. For
example, if a heavy coin is known to exist and is most likely either coin 1 or coin 2,
then the algorithm of Fig. 5.1.3 may be preferred. But in the absence of informa-
tion to the contrary, we commonly assume that all possible outcomes are equally
likely. In this case, we often prefer a procedure in which the maximum number of
steps executed by the algorithm is as small as possible. A minimax procedure is one
which minimizes the maximum number of steps required to solve the problem.
When a solution procedure is represented by a decision tree, the height of the tree
is the maximum number of steps that can be executed. It follows that the height of
the decision tree of a minimax procedure is no greater than the height of a decision
tree for any algorithm which solves the problem. The algorithm represented by
Fig. 5.1.2 is minimax for the counterfeit coin problem in which one of eight coins
is known to be heavy.
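A two-weighing ternary procedure for the eight-coin problem can be written out and tested against all eight possible positions of the heavy coin. The particular weighings used below (coins 1, 2, 3 against 6, 7, 8, followed by one coin against another) are an assumption made for this sketch; they may or may not match the tree of Fig. 5.1.2.

```python
def weigh(weights, left, right):
    """Pan balance: -1 if the left pan goes down, 0 if level, 1 if the right pan goes down."""
    l = sum(weights[i] for i in left)
    r = sum(weights[i] for i in right)
    return -1 if l > r else (1 if r > l else 0)

def find_heavy(weights):
    """Return (index of the heavy coin, number of weighings); coins are indexed 1..8."""
    first = weigh(weights, (1, 2, 3), (6, 7, 8))
    if first == 0:
        # Pans level: the heavy coin is 4 or 5.
        second = weigh(weights, (4,), (5,))
        return (4 if second == -1 else 5), 2
    a, b, c = (1, 2, 3) if first == -1 else (6, 7, 8)
    second = weigh(weights, (a,), (c,))
    if second == -1:
        return a, 2
    if second == 1:
        return c, 2
    return b, 2                       # pans level: the coin left out is heavy

for heavy in range(1, 9):
    weights = {i: 1 for i in range(1, 9)}
    weights[heavy] = 2
    assert find_heavy(weights) == (heavy, 2)
print("every case resolved in 2 weighings")
```

Every path through this procedure has length two, so its decision tree has height two, and all three outcomes of a weighing are used.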
Basic counting techniques can often be used to find bounds on the number
of steps of the minimax solution of a task.
Fig. 5.1.3. Decision tree for using a pan balance to find a counterfeit coin known to be heavy.
Example
Suppose you are given 13 coins where at most one is counterfeit; a counterfeit
coin is either heavier or lighter than a genuine one. Consider the problem of
devising a minimax algorithm to detect the counterfeit coin if one exists, and state
whether it is heavier or lighter. The algorithm should only use a pan balance for
comparisons.
Analysis: To determine a lower bound on the height of a decision tree for the
problem, we begin by arranging the coins in some fixed but arbitrary order. Any
one of 27 conditions may exist. For some i between 1 and 13, the ith coin may be
counterfeit and either heavy or light; hence, if there is a counterfeit coin, then one
of 26 conditions is possible. A 27th condition occurs if no coin is counterfeit. Con-
sequently, the decision tree of a solution procedure must have at least 27 leaves.
Since each weighing will have one of three possible outcomes, k weighings can
yield any of $3^k$ different results, from which we must infer which of 27 conditions
holds. It follows that a minimum value of k can be obtained from the inequality
$3^k \ge 27$.
Thus, we have found a lower bound for the number of weighings necessary: k ≥ 3.
In fact, three weighings will not suffice. This can be shown by considering the
number of cases which must still be distinguished after the initial weighing. If the
initial weighing compares coins 1 through 4 with coins 5 through 8 and the weights
are equal, then there are still 11 conditions which may hold: any of coins 9 through
13 may be heavy, or light, or all may be equal. But there are only nine possible
outcomes for two weighings; thus, two weighings are not sufficient to distinguish
among the remaining eleven possible conditions. It follows that any algorithm
which uses an initial weighing which compares two sets of four coins will require
more than three weighings to distinguish some of the conditions.
Now suppose the initial weighing compares coins 1 through 5 with 6 through
10. If the weights are not equal, then any one of ten conditions may hold since any
of the coins on the light side may be light or any of those on the heavy side may be
heavy. Again, two additional weighings are not sufficient to determine which of
these conditions holds. In a similar way, it can be shown that any other initial
weighing will leave too many conditions to be resolved by the last two weighings.
This establishes that three weighings are not sufficient and therefore the height of
a decision tree for this problem must be at least four. #
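The counting argument of this analysis can be tabulated for every possible first weighing. The sketch below assumes, as in the analysis, that the first weighing compares j coins with j other coins; the function name is illustrative.

```python
def conditions_after_first_weighing(n_coins, j):
    """Worst-case number of conditions still possible after weighing j coins against j."""
    # Pans level: one of the unweighed coins is heavy or light, or no coin is counterfeit.
    balanced = 2 * (n_coins - 2 * j) + 1
    # Pans not level: a coin on the low side is heavy, or a coin on the high side is light.
    unbalanced = 2 * j
    return max(balanced, unbalanced)

n_coins = 13
total_conditions = 2 * n_coins + 1            # 26 counterfeit cases plus "all genuine"

k = 0
while 3 ** k < total_conditions:              # smallest k with 3**k >= 27
    k += 1

worst = {j: conditions_after_first_weighing(n_coins, j)
         for j in range(1, n_coins // 2 + 1)}
print(total_conditions, k)                    # prints: 27 3
print(worst)    # prints: {1: 23, 2: 19, 3: 15, 4: 11, 5: 10, 6: 12}
print(any(w <= 9 for w in worst.values()))    # prints: False
```

Two further weighings distinguish at most 9 conditions, and no choice of j leaves 9 or fewer, which is the reason three weighings cannot suffice.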
If you flip a coin 5 times, how many different ways can you get exactly 1 head?
2 heads? Find a formula for the number of ways of obtaining r heads with n flips
of a coin.
Count the number of digraphs with node set S = {0, 1, 2, ..., n − 1}.
(c) $\binom{n+1}{r} = \binom{n}{r-1} + \binom{n}{r}$
15. Prove Theorem 5.1.9 by induction. (Hint: Use problem 14(c).)
$(b - 1) \sum_{i=0}^{n-1} b^i = b^n - 1$
(b) Interpret this identity in the context of number representation in the base b
using the standard positional notation. (It may help to expand the identity
for b = 10 and n = 4.)
17. Let S be the set of functions $\{0, 1\}^A$ where A is the set of binary n-tuples, $\{0, 1\}^n$. The
set S is called the set of switching functions of n variables.
(a) Specify |S| as a function of n.
(b) A switching function is self-dual if it remains unchanged when all occurrences
of 0’s and 1’s are interchanged in its definition. For example, if n = 2 the
function
f(0, 0) = 0,
f(0, 1) = 1,
f(1, 0) = 0,
f(1, 1) = 1,
is self-dual. Count the number of self-dual switching functions of n variables.
18. Consider a computer in which numbers are represented with p binary digits as
follows. For integer arithmetic, a number is represented using one bit to indicate the
sign and the remaining p — 1 bits represent the magnitude. (This is called a sign-
magnitude representation.) The floating point representation uses m bits to represent
the mantissa of a floating point number and k = p − m bits to represent the exponent,
where m, k ≥ 2. Both the mantissa and the exponent are represented using
a single sign bit and the remaining bits as magnitudes. The exponent specifies a power
of 2, and the floating point representation is normalized, i.e., the exponent is
adjusted so that the radix point is to the left of the digits of the mantissa and the
leading digit of the mantissa is 1 unless the value of the mantissa is 0.
(a) How many distinct integers can be represented in the integer notation? (Note
that there are two distinct representations of 0.)
(b) How many distinct real numbers can be represented in the floating point
notation?
(c) Estimate the number of distinct integers that can be represented in the floating
point notation if m = 24 and k = 8.
(d) Estimate the ratio of integers representable in integer representation to integers
representable in floating point representation if m = 24 and k = 8.
19. (a) You are given 12 apparently identical coins of which at most one may be
counterfeit. A counterfeit coin is always either heavier or lighter than a genuine
coin. Find a minimax algorithm using a pan balance to locate the counterfeit
coin if it exists and determine whether it is heavy or light. Present your algor-
ithm as a decision tree.
(b) You are given 13 apparently identical coins, exactly one of which is counter-
feit and is either heavier or lighter than the others. Find a minimax algorithm
to locate the counterfeit coin.
20. Suppose all of n > 2 coins are of equal weight except for one which is known to be
heavier than the others. Find a lower bound for the number of weighings (using a
pan balance) needed by a minimax algorithm to locate the heavy coin. (You need
not specify an algorithm.)
21. Trees often provide a way of enumerating the set of solutions to problems. For
each of the following classical problems, construct a tree of minimal height which
contains a path which describes a solution. Each node of the tree should be labeled
with a system state and each branch of the tree should correspond to a single action
which changes the state of the system. As you construct the tree, do not include
any new node which has a label which already appears in the tree; thus, no two
nodes of the tree should be labeled with the same system state. The solution with
the minimum number of steps may not be unique; for each problem, count the
number of minimum step solutions.
(a) The Towers of Hanoi. Let A, B, and C denote 3 vertical pegs. Initially, 3 discs
of unequal size are arranged on peg A with the largest disc on the bottom and
the smallest disc on top. The problem is to move all 3 discs from peg A to peg
C. Each move consists of moving a single disc from one peg to another. No disc
may ever be placed on a disc smaller than itself.
(b) Missionaries and Cannibals. Three missionaries and three cannibals are initially
on the south side of a river and wish to cross to the other side. They have a
single boat which holds at most 2 people but can be handled by a single person.
However, if at any point the cannibals outnumber the missionaries on either
shore, a missionary will be devoured. Find a way to transport all the cannibals
and missionaries across the river without losing anyone in the process. You
may assume that missionaries do not eat cannibals.
(c) You are given an eight gallon container filled with water and two empty con-
tainers of capacity 5 and 3 gallons respectively. The containers are not
graduated. Find a way to divide the water into two four gallon quantities.
plexity function of the algorithm. We will refer to either kind of function as simply
a complexity or cost function of the algorithm, but our principal concern will be
with time complexity functions.
In general the cost of obtaining a solution increases with the problem size
n. If the value of n is sufficiently small, then even an inefficient algorithm will not
cost much to run; consequently, the choice of an algorithm for small problems is
not usually critical. For this reason, our concern is with values of n which are
large enough to make some algorithms impractical. In order to compare the
performance of algorithms for relatively large values of n, we will consider the
behavior of their cost functions as n grows large; this is called the asymptotic
behavior of the cost functions. The next definition introduces the fundamental
concept.†
If g asymptotically dominates f, and g(n) ≠ 0, then |f(n)/g(n)| ≤ m for all but
a finite number of values of n, none of which are greater than k. Thus if f and g are
cost functions for algorithms F and G respectively, then for problems of size k or
greater, execution of F will never be more than m times as costly as execution of G.
Examples
(a) Let f(n) = n and $g(n) = -n^3$. Since $|n| \le |-n^3|$ for all n ∈ N, Definition
5.2.1 is satisfied by setting k = 0 and m = 1. Hence, g asymptotically dominates
f. Note that f does not asymptotically dominate g, since regardless of
the choice of m, $|-n^3| > m|n|$ for all n greater than both 1 and m.
(b) Let g be an arbitrary function from N to R, and let f(n) = cg(n), where c ∈ R
and c > 0. Then the functions f and g asymptotically dominate each other
since |f(n)| ≤ c|g(n)| for all n ∈ N and |g(n)| ≤ (1/c)|f(n)| for all n ∈ N.
(c) The functions f(n) = n, g(n) = n + 1/(n + 1), and h(n) = bn + c, where
b, c ∈ R and b > 0, all asymptotically dominate each other. #
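The examples can be explored numerically. Definition 5.2.1 is not reproduced in this section, so the sketch below assumes the form the examples rely on: g asymptotically dominates f if there are constants k ≥ 0 and m ≥ 0 with |f(n)| ≤ m|g(n)| for all n ≥ k. A check over a finite range can refute a proposed pair (k, m) but can never prove domination; the function name and the `upto` bound are illustrative.

```python
def dominates_on_range(dominator, dominated, k, m, upto=10_000):
    """Spot check: |dominated(n)| <= m * |dominator(n)| for all k <= n < upto."""
    return all(abs(dominated(n)) <= m * abs(dominator(n)) for n in range(k, upto))

def f(n):
    return n

def g(n):
    return -n ** 3

def g2(n):
    return n + 1 / (n + 1)

# Example (a): g dominates f with k = 0 and m = 1 ...
print(dominates_on_range(g, f, k=0, m=1))          # prints: True
# ... but f does not dominate g: for any fixed m the check eventually fails.
print(dominates_on_range(f, g, k=0, m=1000))       # prints: False

# Example (c): n and n + 1/(n + 1) dominate each other.
print(dominates_on_range(g2, f, k=0, m=1))         # prints: True
print(dominates_on_range(f, g2, k=1, m=2))         # prints: True
```

The last check needs k = 1 because at n = 0 the value n + 1/(n + 1) is 1 while n itself is 0.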
†Our interest is in applying these notions to functions of discrete variables, and we will treat
the present topic using functions from N to R. However, the definitions and theorems of this
section extend in a natural and straightforward way to functions from R to R.
The binary relation of asymptotic domination will provide a basis for com-
paring complexity functions. If two functions f and g asymptotically dominate
each other, then the associated algorithms will be considered equivalent, and any
differences in cost of execution will be largely ignored. Suppose, on the other hand,
that g asymptotically dominates f but not vice versa, where f and g are the com-
plexity functions of algorithms F and G respectively. Then even if G is speeded up
by some arbitrary factor (through clever programming or a faster machine) so that
the complexity function of the fast version is cg, where c <1, cg will asymp-
totically dominate f but not vice versa. Consequently, for any m > 0, there will exist
an infinite number of arguments n such that cg(n) > mf(n).
Definition 5.2.2: The set of all functions which are asymptotically dominated
by a given function g is denoted by O(g) and read “order g,” or “big-Oh of g.”
If f ∈ O(g), then f is said to be O(g).
Example
(a) Let f(n) = n and $g(n) = n^3$. Then using an argument similar to that in the
previous example, we see that f is O(g) but g is not O(f).
(b) Let f(n) = n and h(n) = 3n. Then f is O(h) and h is O(f).
(c) Let f(n) = n. The following functions from N to R are all members of O(f).
$f_1(n) = k$ for k ∈ R,
$f_2(n) = kn$ for k ∈ R,
$f_3(n) = n + k$ for k ∈ R,
$f_4(n) = n + 1/(n + 1)$. #
The following theorem asserts that f is O(g) if and only if every function asymptotically
dominated by f is also asymptotically dominated by g.
Theorem 5.2.3: Let f and g be functions from N to R. Then f is O(g) if and
only if O(f) ⊆ O(g).
Proof:
(a) (O(f) ⊆ O(g) ⇒ f ∈ O(g).) From Theorem 5.2.2 we know that
f ∈ O(f). Since O(f) ⊆ O(g) it follows that f ∈ O(g).
(b) (f ∈ O(g) ⇒ O(f) ⊆ O(g).) Let h be any element of O(f); then h is
asymptotically dominated by f. Since f ∈ O(g), f is asymptotically
dominated by g. Since the relation of asymptotic domination is transitive
(by Theorem 5.2.1(a)), it follows that h is asymptotically dominated
by g and therefore h is O(g). Since h was chosen to be an arbitrary member
of O(f), it follows that O(f) ⊆ O(g). ∎
The following is an immediate result of the previous theorem; its proof is left as
an exercise.
1. f is O(1). For any algorithm of complexity O(1), there exists some k ∈ N
such that execution of the algorithm will cost r ≤ k regardless of the
value of n. Thus the cost of applying the algorithm can be bounded
independently of the problem size n. Any function which is O(c), where
c ∈ R, is O(1). An algorithm of O(1) complexity is said to have constant
complexity.
Sec. 5.2 _ASYMPTOTIC BEHAVIOR OF FUNCTIONS 237
The following theorem establishes that the classes we have listed are given in order
of increasing complexity.
Theorem 5.2.4: Consider the class F of all functions from N to R. Then for
c ∈ R such that c > 1,
$O(1) \subseteq O(\log n) \subseteq O(n) \subseteq O(n \log n) \subseteq O(n^2) \subseteq O(c^n) \subseteq O(n!)$,
and all containments are proper.
Proof: The proofs that containments are proper are left as exercises. We
will prove the first, second, and fifth containments and leave the others as exercises.
By Theorem 5.2.3, in order to show O(f) ⊆ O(g) it suffices to show that f is O(g).
(a) (O(1) ⊆ O(log n).) Let f(n) = 1 and g(n) = log n.† For all n ≥ 2, 1 ≤ log n
and therefore f is O(g). By Theorem 5.2.3, it follows that O(1) ⊆ O(log n).
(b) (O(log n) ⊆ O(n).) For all n > 0, log n < n, and therefore log n is O(n).
It follows that O(log n) ⊆ O(n).
(c) ($O(n^2) \subseteq O(c^n)$ for c > 1.) We will show that for sufficiently large n,
$n^2 \le c^n$.
Since c > 1 it follows that log c > 0, and hence the above inequality will
hold if
$2 \log n \le n \log c$
or
$\frac{2}{\log c} \le \frac{n}{\log n}$.
The ratio n/log n becomes arbitrarily large as n increases, and therefore
for any c > 1, this inequality can be satisfied by choosing n sufficiently
large. Thus $n^2$ is $O(c^n)$ and hence $O(n^2) \subseteq O(c^n)$. ∎
†Unless explicit statement is made to the contrary, all logarithms in this book are to the base 2.
For ease of exposition, we have defined the concepts of this section only for functions from
N to R. However, because we are concerned with the behavior of functions for large arguments,
the definitions can be extended without difficulty to include partial functions from N to R which
are defined on all but a finite subset of N. Thus, the fact that f(n) = log n and g(n) = n log n are
not defined for the argument n = 0 causes no substantive difficulty, and we will use O(log n), for
example, to denote the set O(g), where g(n) = log n for all n > 0.
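The argument in part (c) says only that a suitable n exists; a short search shows how large it must be for a particular c. The sketch below compares logarithms, as the proof does, and scans a finite range, so the `limit` parameter and the function name `threshold` are assumptions of the illustration rather than part of the proof.

```python
from math import log2

def threshold(c, limit=100_000):
    """Smallest N such that n**2 <= c**n for every n with N <= n < limit."""
    N = limit
    for n in range(limit - 1, 0, -1):
        # Compare 2 log n with n log c instead of forming the huge power c**n.
        if 2 * log2(n) <= n * log2(c):
            N = n
        else:
            break
    return N

print(threshold(2))      # prints: 4   (the inequality fails at n = 3: 9 > 8)
print(threshold(1.1))    # prints: 96
```

The closer c is to 1, the later the exponential overtakes the square, but for every c > 1 it eventually does.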
Complexity functions which involve various powers of n often occur in the analysis
of algorithms. The following theorem and its corollary are important for relating
these sets of functions.
Theorem 5.2.5: Let c, d ∈ R, where 0 < c < d. Then $O(n^c) \subseteq O(n^d)$, and
the containment is proper.
Proof: For each n ≥ 1, if c < d, then $n^c \le n^d$. It follows that $O(n^c) \subseteq O(n^d)$.
To show the containment is proper, we will show that $n^d$ is not $O(n^c)$. If $n^d$
is $O(n^c)$, then for some k such that k ≥ 0, the inequality $|n^d| \le k|n^c|$ holds for
sufficiently large n. If k is chosen to be 0, the inequality does not hold for n ≥ 1.
If k is positive, then n can be chosen large enough that log n > log k/(d − c). But
then $n^{d-c} > k$, and hence $n^d > k n^c$, which contradicts the inequality. Thus $n^d$ is not $O(n^c)$.
Examples
(a) The function f(n) = 1/n + 63 is O(1).
(b) The function f(n) = rn + kn log n is O(n log n).
(c) The function $f(n) = 0.6n^3 + 28n^2 + 31n + 468$ is $O(n^3)$. #
The following theorem establishes that the logarithmic base does not affect
the asymptotic behavior of functions which are O(log n).
Proof: Using the fact that $\log_b (a^x) = x \log_b a$, we observe that
$\log_b n = \log_b (c^{\log_c n}) = \log_c n \cdot \log_b c = k \log_c n$, where $k = \log_b c$.
By application of Theorem 5.2.2 and Corollary 5.2.3 it follows that $O(\log_b n) = O(\log_c n)$. ∎
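The change-of-base step in the proof can be checked numerically: logarithms to two different bases differ only by a constant factor. In this sketch the bases b = 2 and c = 10 are arbitrary choices, and `log_base` is an illustrative helper built on `math.log`.

```python
from math import isclose, log

def log_base(base, x):
    """Logarithm of x to the given base, computed from natural logarithms."""
    return log(x) / log(base)

b, c = 2, 10
k = log_base(b, c)                       # the constant factor log_b c

for n in (2, 10, 1000, 10 ** 6):
    # log_b n equals k * log_c n, up to floating-point rounding
    assert isclose(log_base(b, n), k * log_base(c, n))

print(round(k, 4))                       # prints: 3.3219
```

Because k does not depend on n, each of the two functions is a constant multiple of the other, so they asymptotically dominate each other.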
Theorem 5.2.7: Let b, c ∈ R be constants greater than 1. Then
$O(n \log_b n) = O(n \log_c n)$.
The proof is left as an exercise.
In practice, any algorithm can be executed on small problems, that is, when
n is small enough, but the asymptotic behavior of a complexity function provides
important information about whether it will be feasible to execute an algorithm
for moderate or large values of n. This point is illustrated in Tables 5.2.1 and
5.2.2.
                       Complexity Function
  n       log n    n       n log n       n^2      2^n             n!
  5       3        5       12            25       32              120
  10      4        10      33            10^2     1024            3 × 10^6
  10^2    7        10^2    664           10^4     1.3 × 10^30     *
  10^3    10       10^3    9965          10^6     *               *
  10^4    14       10^4    1.4 × 10^5    10^8     *               *
Comparing algorithms on the basis of their asymptotic behavior is a powerful
and convenient technique, but it must be used with caution. Thus, while we would
expect an O(n) algorithm to be “better” than one which is $O(n^2)$, in fact we cannot
choose between them without more information. For example, suppose that
algorithms F and G have complexity functions f(n) = cn and $g(n) = dn^2$. If the
values of the constants are c = 50 and d = 1, then F is a more attractive algorithm
only if n, the problem size, exceeds 50. Since this value of n may be larger than most
of the problems of interest, it may be that the $O(n^2)$ algorithm is the best choice.
Thus in order to choose between algorithms, it is generally necessary to know the
specific complexity functions and the problem size as well as the asymptotic
behaviors.
By extending the way in which order notation is used, we can characterize
algorithm performance more precisely than is possible with the notation we have
developed thus far. In the extended usage, the notation O(f) is used on the right
side of an equation to denote a member of the set O(f). For example, the assertion
that the algorithm F has asymptotic complexity f, where
f(n) = 1.6n^2 + O(n log n)
is interpreted as meaning f(n) = 1.6n^2 + g(n), where g(n) is a member of
O(n log n). This is a stronger assertion than
f(n) = O(n^2);
the second is implied by the first but not vice versa. Using this extended notation,
the complexity functions of different algorithms can be compared with one another
on the basis of the coefficients of dominating summand functions as well as less
important summands. Thus, for sufficiently large n, an algorithm with a complexity
function f(n) = 1.6n^2 + O(n) will probably be less costly than one whose
complexity function is g(n) = 2n^2 + O(n), which in turn will probably be less
costly than one whose complexity function is h(n) = 2n^2 + O(n log n).
1. Let F be the class of functions from N to R, and let f, g ∈ F. Define the
binary relation ≡ as follows:
f ≡ g if and only if f and g asymptotically dominate each other.
Sec, 5.2 ASYMPTOTIC BEHAVIOR OF FUNCTIONS 241
(a) Show that ≡ is an equivalence relation. (This is part (b) of Theorem 5.2.1.)
(b) Let [f] denote the equivalence class of f under the relation ≡. Show that the
binary relation
[f] ≤ [g] if and only if f is asymptotically dominated by g
is a partial order on the quotient set F/≡.
Give an example of a function in O(1) which is not a constant function.
Find a pair of functions f and g from N to R such that f ∈ O(g) and g ∉ O(f).
Define a function f: N → R to be bounded if there exists some r ∈ R such that for
all n ∈ N, |f(n)| < r. Prove that every bounded function is O(1).
For each of the following pairs of functions, f: N → R and g: N → R, determine if
and how f and g are related in terms of asymptotic domination.
(a) f(n) = 1 for n even,
for n odd.
g(n) = for n even,
= for n odd.
(b) f@ = for n even,
=] for n odd.
g(n) = for n even,
=) for n odd.
(c) f(n) = n.
g(n) = n/100 if n ≠ 10^k for some k,
== 107 1%,2 if n = 1, 10, 100, etc.
(a) Using logical notation, write out the definition of “f does not asymptotically
dominate g.”
(b) Using the assertion of part (a), argue that if f does not asymptotically dominate
g, then for any m there exists an infinite number of arguments n such that
|g(n)| > m|f(n)|.
(c) Determine whether the following assertion is true. “If f does not asymptotically
dominate g, then for all m > 0, if n is sufficiently large, then |g(n)| > m|f(n)|.”
Let f_1 and f_2 be functions such that f_1 is O(g_1) and f_2 is O(g_2).
(a) Prove that if g_1(n) and g_2(n) are nonnegative for all arguments n ∈ N, then
f_1 + f_2 is O(g_1 + g_2).
(b) Prove that f_1 + f_2 may not be O(g_1 + g_2).
Let f and g be functions from N to R, and denote by f · g the product function:
11. Prove the following assertions and show that each of the containments is proper.
(a) O(n) ⊂ O(n log n).
(b) O(n log n) ⊂ O(n^d), for all d > 1.
(c) O(c^n) ⊂ O(n!), for all c > 1.
12. Show that for all integer values of k, n > 0, O(log n) = O(log (n + k)).
13. Prove Corollary 5.2.5.
14. Consider the class of functions F, where
F = {f | f: N → R and f(N) ⊆ N},
i.e., the image of every member of F is a subset of N. Let f and g be members of F.
Prove or disprove the following.
Conjecture: If f and g are O(h), then fg is O(h) (i.e., the set O(h) is closed under
composition of functions).
15. Prove Theorem 5.2.7.
16. Suppose two algorithms F and G have time complexity functions
fi(n) = 528
Si(n) = 3n* logn + logn
AM=F+5 1
> i is O(nk*1),
a
i=Q
The expressions for permutations and combinations developed in Section 5.1 are
the most fundamental tools for counting the elements of finite sets. They often
prove to be inadequate, however, and many problems of computer science require
a different approach. An important alternate approach uses recurrence equations
(often called difference equations or recurrence relations) to define the terms of
a sequence. A formal definition of recurrence equations is difficult because of the
wide variety of forms in which such equations can be written, but the concept is
straightforward. We have already seen an example of a recurrence equation in the
definition of the Fibonacci sequence, where for n > 2, the term a_n is defined by the
recurrence equation
a_n = a_{n-1} + a_{n-2}
Examples
(a) The number of permutations of n objects can be expressed using the following
recurrence system:
P(0) = 1,
P(n) = nP(n − 1), for n > 0.
The correctness of this system can be established as follows:
1. The objects of an empty set can be arranged in a sequence in exactly one
way. Thus, the boundary condition is P(0) = 1.
2. Given n objects, n > 0, we can choose the first object of a sequence in
any of n ways and then arrange the remaining elements in P(n − 1) ways.
Thus, the recurrence equation is P(n) = n·P(n − 1) for n > 0.
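The recurrence system for P(n) can be evaluated by iterating upward from the boundary condition. The following Python sketch is an illustration only; the function name count_permutations is an assumption, not the book's notation.

```python
from math import factorial

def count_permutations(n):
    """Evaluate P(0) = 1, P(n) = n * P(n - 1) for n > 0 by iteration."""
    p = 1                      # boundary condition P(0) = 1
    for k in range(1, n + 1):
        p = k * p              # recurrence equation P(k) = k * P(k - 1)
    return p

# The iterated values agree with n! for small n.
for n in range(8):
    assert count_permutations(n) == factorial(n)
```

Iterating from the boundary condition uses n multiplications to obtain the nth term.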
244 COUNTING AND ALGORITHM ANALYSIS Ch. 5
f(0, k) = 1,
f(h, k) = k·f(h − 1, k) for h > 0.
The system is based on the following arguments.
1. A tree of height 0 has a single node which is a leaf, so f(0, k) = 1. This
gives the boundary condition.
2. A tree of height h > 0 will have the maximum number of leaves if its
root has k sons, each of which is the root of a subtree of height h − 1
with f(h − 1, k) leaves. A tree of height h can therefore have up to
k·f(h − 1, k) leaves.
It can be shown by induction on h that k^h is a solution to this system.
(c) Pascal derived the following recurrence system to evaluate (n choose k), the number of
subsets of k objects in a set of n objects.
The number of injections and bijections from a set S to a set T can easily be
expressed in terms of permutations involving |S| and |T|; these expressions were
given in examples in Section 5.1. The number of surjections from one set to another
is difficult to characterize using only permutations and combinations, but can be
easily expressed using a recurrence system.
Sec. 5.3 RECURRENCE SYSTEMS 245
j. Thus, there are (n choose j) S(m, j) different functions from A to B which have an image
of cardinality j, where j < n. Then the total number of functions from A to B which
are not surjections is Σ_{j=1}^{n-1} (n choose j) S(m, j). Since there are a total of n^m functions from
A to B,
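The counting argument above can be checked numerically. The sketch below assumes that S(m, n) denotes the number of surjections from an m-element set onto an n-element set, with m ≥ 1 and n ≥ 1, and that the argument concludes with S(m, n) = n^m − Σ_{j=1}^{n-1} (n choose j) S(m, j); the passage breaks off before stating this, so the recurrence as coded is an assumption. The function names are illustrative.

```python
from itertools import product
from math import comb

def surjections(m, n):
    """Assumed recurrence: S(m, n) = n**m - sum_{j=1}^{n-1} C(n, j) * S(m, j).

    Intended for m >= 1 and n >= 1.
    """
    not_onto = sum(comb(n, j) * surjections(m, j) for j in range(1, n))
    return n**m - not_onto

def surjections_brute(m, n):
    """Count surjections by enumerating all n**m functions as tuples of images."""
    targets = set(range(n))
    return sum(1 for f in product(range(n), repeat=m) if set(f) == targets)

# Compare the recurrence with exhaustive enumeration on small cases.
for m in range(1, 6):
    for n in range(1, 5):
        assert surjections(m, n) == surjections_brute(m, n)
```

The exhaustive count grows as n^m, so it serves only as a check on small cases.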
It is obvious that a recurrence system can be used to obtain any term of the
associated sequence by iteratively solving the recurrence. Alternatively, it is
sometimes possible to find an expression for the solution which can be evaluated
directly for any argument n to find the value of the nth term.
Examples
The following are examples of solutions to recurrence systems. In each case
the expression can be shown to be a solution by direct substitution. All of these
solutions are unique, but we will not prove this.
(a) The following system describes a function which grows exponentially:
a_0 = k,
a_n = c·a_{n-1} for n > 0.
Qa, = 1,
a_0 = b_0,
a_n = c_n a_{n-1} + b_n,
where the value of the coefficients b_n and c_n may be functions of n. The value of the
general term a_n can be expressed as a sum by adding both sides of the following
sequence of equations, where each equation is obtained by using the recurrence
relation to express a term in the preceding equation.
a_n = c_n a_{n-1} + b_n
c_n a_{n-1} = c_n c_{n-1} a_{n-2} + c_n b_{n-1}
  ⋮
c_n c_{n-1} ··· c_2 a_1 = c_n c_{n-1} ··· c_1 b_0 + c_n c_{n-1} ··· c_2 b_1
Note that the right side of the last equation only involves the coefficients and
boundary conditions of the sequence. Forming separate sums of the left and right
sides of this set of equations and then cancelling common summands yields
a_n = b_n + c_n b_{n-1} + c_n c_{n-1} b_{n-2} + ··· + (Π_{i=0}^{n-2} c_{n-i}) b_1 + (Π_{i=0}^{n-1} c_{n-i}) b_0.
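The summed form can be compared with direct iteration of the recurrence a_0 = b_0, a_n = c_n a_{n-1} + b_n. In this Python sketch the coefficient sequences are passed as lists b and c indexed from 0 (c[0] is unused); the function names and the list representation are assumptions made for illustration.

```python
from math import prod

def iterate(b, c, n):
    """Compute a_n from a_0 = b[0] and a_k = c[k] * a_{k-1} + b[k]."""
    a = b[0]
    for k in range(1, n + 1):
        a = c[k] * a + b[k]
    return a

def summed_form(b, c, n):
    """Compute a_n as the sum over j of (c_n c_{n-1} ... c_{j+1}) * b_j.

    The empty product (j = n) is 1, which gives the leading term b_n.
    """
    return sum(prod(c[j + 1:n + 1]) * b[j] for j in range(n + 1))

b = [2, 3, 5, 7]
c = [0, 2, 3, 4]   # c[0] is a placeholder; the recurrence never uses it
assert iterate(b, c, 3) == summed_form(b, c, 3) == 111
```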
Example
Consider the recurrence system
a_0 = b,
a_n = c·a_{n-1} + b.
c^{n-1} a_1 = c^n a_0 + c^{n-1} b
c^n a_0 = c^n b
Summing the left and right sides and cancelling gives
a_n = b + cb + c^2 b + ··· + c^{n-1} b + c^n b
    = b Σ_{i=0}^{n} c^i
    = b(1 − c^{n+1})/(1 − c)    if c ≠ 1. #
Example
The following procedure returns the sum of the first n entries of an array A.
procedure SUM(n):
begin
    total ← 0;
    for i ← 1 to n step 1 do
        total ← total + A[i];
    return total
end
Lemma 5.3.2a: Let a, b, and c be integers such that a ≥ 1, b > 1, and c > 0,
and let f: N → R be any function whose values obey the recurrence system
f(1) = c,
f(n) = a f(n/b) + c for n = b^k where k > 0.
f(n) = a f(n/b) + c
a f(n/b) = a^2 f(n/b^2) + ac
  ⋮
a^{k-1} f(n/b^{k-1}) = a^k f(n/b^k) + a^{k-1} c,
Summing both sides of the above sequence of equations and cancelling common
summands, and noting that f(n/b^k) = f(1) = c, we have
f(n) = c Σ_{i=0}^{k} a^i.
(a) If a = 1, then f(n) = c(k + 1). But k = log_b n, so f(n) = c(log_b n + 1).
(b) If a ≠ 1, then f(n) = c(a^{k+1} − 1)/(a − 1).
But k = log_b n, and a^{log_b n} = n^{log_b a}. Therefore,
f(n) = c(a·a^{log_b n} − 1)/(a − 1) = c(a·n^{log_b a} − 1)/(a − 1).
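The closed forms derived for Lemma 5.3.2a can be checked against the recurrence itself for arguments that are powers of b. The Python sketch below is illustrative; the names f_rec and f_closed are assumptions.

```python
def f_rec(n, a, b, c):
    """f(1) = c; f(n) = a * f(n / b) + c, for n a power of b."""
    if n == 1:
        return c
    return a * f_rec(n // b, a, b, c) + c

def f_closed(k, a, c):
    """Closed form for n = b**k: c(k + 1) if a = 1, else c(a**(k+1) - 1)/(a - 1)."""
    if a == 1:
        return c * (k + 1)
    # a**(k+1) - 1 is always divisible by a - 1, so integer division is exact.
    return c * (a**(k + 1) - 1) // (a - 1)

for a in (1, 2, 3):
    for b in (2, 3):
        for k in range(7):
            assert f_rec(b**k, a, b, 5) == f_closed(k, a, 5)
```

The closed form depends on n only through k = log_b n, which is why b does not appear as a parameter of f_closed.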
From Lemma 5.3.2a we can determine the asymptotic behavior of the function
f for those arguments which are powers of b. The following definition is a
generalization of the concepts introduced in Section 5.2. This generalization permits
us to discuss the asymptotic behavior of functions on a subset of the domain N.
Definition 5.3.1: Let f and g be functions from N to R, and let S be an
infinite subset of N. Then f is O(g) on S if there exist k ≥ 0 and m ≥ 0 such that
|f(n)| ≤ m|g(n)| for all n ∈ S such that n ≥ k.
Example
Let f: N → R be defined as follows:
f(x) = 1 if x is even,
f(x) = x if x is odd.
Then f is O(1) on the set of even integers, but f is not an O(1) function. #
It is easy to see that if g is O(h) and S ⊆ N, then g is O(h) on S. Moreover, the
properties of asymptotic behavior we have considered extend in a natural way to
asymptotic behavior on S. For example, if c is a constant and f and g are O(h) on
S, then cf and f + g are O(h) on S.
The next lemma is an immediate consequence of Lemma 5.3.2a; its proof is
left as an exercise.
Lemma 5.3.2b: Let a, b, and c be integers such that a ≥ 1, b > 1, and c > 0,
and let f: N → R be a function such that
f(1) = c,
f(n) = a f(n/b) + c for n = b^k where k > 0.
Let S = {b^k | k ∈ N}.
(a) If a = 1, then f is O(log n) on S.
(b) If a ≠ 1, then f is O(n^{log_b a}) on S.
We now use the preceding lemma to characterize the asymptotic behavior for
arguments which are powers of b for a large class of recurrence systems.
Theorem 5.3.2: Let a, b, and c be integers such that a ≥ 1, b > 1, and
c > 0, and let f: N → R be any function such that
f(1) ≤ c,
f(n) ≤ a f(n/b) + c for n = b^k where k > 0.
g(1) = c,
g(n) = a g(n/b) + c for n = b^k where k > 0.
By Lemma 5.3.2b, the function g is O(log n) on S if a = 1 and O(n^{log_b a}) on S if
a ≠ 1. It is easy to show by induction that any function f which satisfies the
following inequalities
f(1) ≤ c,
f(n) ≤ a f(n/b) + c for n = b^k where k > 0,
is bounded by the function g for all arguments which are powers of b, that is,
if n ∈ S, then f(n) ≤ g(n).
We conclude that the function f is O(log n) on S if a = 1 and f is O(n^{log_b a}) on S if
a ≠ 1. ∎
Example
The procedure MAXMIN given in Fig. 5.3.1 applies a divide and conquer
strategy to return the maximum and minimum values of the entries A[i], ..., A[j]
of a vector A. MAXMIN first determines if there is a single entry, i.e., if i = j; in
this case, MAXMIN returns the ordered pair <A[i], A[i]>. If i < j, then MAXMIN
divides the entries into two disjoint subproblems of approximately the same size
and solves each of the subproblems recursively. The solutions to the subproblems
are then used to construct the solution to the original problem. To find the largest
and smallest entries of the array A[1:n], we call MAXMIN(1, n). We define the
f(1) = 0,
f(n) = 2f(n/2) + 2 for n = 2^k where k > 0.
The function f obeys the following inequalities:
f(1) ≤ 2,
f(n) ≤ 2f(n/2) + 2 for n = 2^k where k > 0.
By Theorem 5.3.2 we can conclude that MAXMIN is an O(n) algorithm if n is a power
of 2. #
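The divide and conquer strategy described for MAXMIN can be sketched in Python as follows. Fig. 5.3.1 is not reproduced here, so this is a reconstruction from the prose description rather than the book's procedure; the comparison counter and 0-based indexing are assumptions added for illustration.

```python
def maxmin(a, i, j, counter):
    """Return (max, min) of a[i..j] inclusive; counter[0] tallies comparisons of entries."""
    if i == j:
        return a[i], a[i]              # single entry: no comparisons needed
    mid = (i + j) // 2                 # split into two halves of nearly equal size
    max1, min1 = maxmin(a, i, mid, counter)
    max2, min2 = maxmin(a, mid + 1, j, counter)
    counter[0] += 2                    # one comparison for the max, one for the min
    larger = max1 if max1 >= max2 else max2
    smaller = min1 if min1 <= min2 else min2
    return larger, smaller

data = [3, 8, 1, 6, 7, 2, 5, 4]
count = [0]
assert maxmin(data, 0, len(data) - 1, count) == (8, 1)
# f(1) = 0 and f(n) = 2 f(n/2) + 2 give f(8) = 14.
assert count[0] == 14
```

For n a power of 2 the count satisfies f(n) = 2n − 2, which is consistent with the O(n) conclusion.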
Examples
(a) The procedure MAXMIN, given in Fig. 5.3.1 and discussed in the previous
example, is O(n) for all n = 2^k, and the number of comparisons made by
MAXMIN increases with n. Therefore we can conclude from Theorem 5.3.3
that MAXMIN is an O(n) algorithm for all arguments n ∈ N.
(b) A binary search of a sorted list stored in A[i:j] is given in Fig. 5.3.2. The
procedure determines whether an argument arg is contained in any of the
locations A[i], A[i + 1], ..., A[j]. If so, the procedure returns the index of
the argument in A; otherwise the procedure reports that the argument was
not found. To search array A[1:n] for arg, we call BINSEARCH(arg, 1, n).
The procedure first compares arg with an element near the middle of the list.
If they match, the search is successful and the index of the element is returned.
Otherwise, if arg is less than the element, the search is resumed recursively on
the initial portion of A, and if arg is greater than the element, the search is
continued on the second portion of A.
procedure BINSEARCH(arg, i, j):
begin
    m ← ⌊(i + j)/2⌋;
Fig. 5.3.2. Binary search for arg in the array A[i:j] where i<j
and entries are sorted in increasing order
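The body of the procedure in Fig. 5.3.2 is incomplete in this copy. The Python sketch below follows the prose description of the search given in example (b); it is a reconstruction under that description, not the book's text. The optional probes list, the None return value for "not found", and the 0-based indexing are assumptions.

```python
def binsearch(a, arg, i, j, probes=None):
    """Search the sorted entries a[i..j] (inclusive) for arg.

    Returns the index of arg, or None if arg is not present.
    If a list is passed as probes, each examined index is appended to it.
    """
    if i > j:
        return None                    # empty range: arg was not found
    m = (i + j) // 2                   # element near the middle of the list
    if probes is not None:
        probes.append(m)
    if arg == a[m]:
        return m
    if arg < a[m]:
        return binsearch(a, arg, i, m - 1, probes)   # initial portion
    return binsearch(a, arg, m + 1, j, probes)       # second portion

primes = [2, 3, 5, 7, 11, 13, 17]
assert binsearch(primes, 11, 0, len(primes) - 1) == 4
assert binsearch(primes, 4, 0, len(primes) - 1) is None
```

Each recursive call discards at least half of the remaining entries, so a list of 7 entries is resolved after examining at most 3 elements.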
When the recurrence relation of a divide and conquer algorithm is of the form
f(n) = af(n/b) + c, the constant c represents the cost of splitting the problem
into subproblems plus the cost of combining their solutions to solve the original
problem. Sometimes the cost of splitting the problem, and more often the cost of
combining the solutions of the subproblems, increases with n. We next consider
recurrence relations of the form f(n) = af(n/b) + cn; these recurrence relations
can be applied when the splitting and combining costs grow linearly with n. Using
the techniques and results developed previously, we can prove the following result.
Theorem 5.3.4: Let a, b, and c be integers such that a ≥ 1, b > 1, and
c > 0, and let f: N → R+ be a monotone increasing function such that
f(1) ≤ c,
f(n) ≤ a f(n/b) + cn for n = b^k where k > 0.
(a) If a < b, then f is O(n).
(b) If a = b, then f is O(n log n).
(c) If a > b, then f is O(n^{log_b a}).
Proof: Suppose n = b^k, where k ∈ N and k > 0. Then we can bound
f(n) as follows:
f(n) ≤ a f(n/b) + cn
a f(n/b) ≤ a^2 f(n/b^2) + (a/b) cn
  ⋮
a^{k-1} f(n/b^{k-1}) ≤ a^k f(n/b^k) + (a/b)^{k-1} cn.
Summing both sides of these inequalities and cancelling summands which appear
on both sides, we obtain
It follows from Theorem 5.3.3 that f is O(n log n) in the case that a = b.
(ii) If a ≠ b, then we can apply the identity
(1 − x^{k+1})/(1 − x) = Σ_{j=0}^{k} x^j
to the inequality
f(n) ≤ cn Σ_{j=0}^{k} (a/b)^j
to obtain
f(n) ≤ cn (1 − (a/b)^{k+1})/(1 − a/b)
     = cn (b^{k+1} − a^{k+1})/(b^{k+1} − a·b^k).
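The geometric sum used in this proof can be checked numerically. The Python sketch below takes the extreme case in which both relations of Theorem 5.3.4 hold with equality, and compares f(n) with cn Σ_{j=0}^{k} (a/b)^j for n = b^k; the function names are assumptions, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def f_equal(n, a, b, c):
    """Equality case: f(1) = c, f(n) = a * f(n / b) + c * n for n a power of b."""
    if n == 1:
        return c
    return a * f_equal(n // b, a, b, c) + c * n

def geometric_bound(k, a, b, c):
    """c * n * sum_{j=0}^{k} (a/b)**j, where n = b**k."""
    n = b**k
    ratio = Fraction(a, b)
    return c * n * sum(ratio**j for j in range(k + 1))

for a, b in ((1, 2), (2, 2), (4, 2), (3, 5)):
    for k in range(7):
        assert f_equal(b**k, a, b, 3) == geometric_bound(k, a, b, 3)

# With a = b the sum has k + 1 equal terms, so f(n) = c * n * (log_b n + 1).
assert f_equal(32, 2, 2, 3) == 3 * 32 * 6
```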
Example
Suppose S is an arbitrary sequence of n distinct elements and we wish to build
a binary search tree of minimum height which contains the elements of S as node
values. The following algorithm can be used.
1. Find the median element m of S. (The median is the element of S that would
appear in the [n/2]th position if the sequence S were sorted.) The root of the
tree is assigned the value m.
2. Form two sequences S_1 and S_2 such that S_1 consists of those elements of S
which are less than m and S_2 consists of those elements of S which are greater
than m.
3. Apply this procedure recursively to S_1 to construct the left subtree of the root,
and to S_2 to construct the right subtree.
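The three steps above can be sketched in Python as follows. For brevity this sketch finds the median by sorting, which is not the linear-time median algorithm the analysis relies on; the tuple representation of trees, the choice of which middle position counts as the median, and the helper names are all assumptions.

```python
def build_tree(seq):
    """Build a binary search tree of minimum height from distinct elements.

    A tree is None (empty) or a tuple (left, value, right).
    """
    if not seq:
        return None
    # Step 1: find a median element (by sorting here, for simplicity only).
    m = sorted(seq)[(len(seq) - 1) // 2]
    # Step 2: partition the remaining elements around the median.
    smaller = [x for x in seq if x < m]
    larger = [x for x in seq if x > m]
    # Step 3: build the subtrees recursively.
    return (build_tree(smaller), m, build_tree(larger))

def height(tree):
    """Height of a tree; a single node has height 0 and the empty tree -1."""
    if tree is None:
        return -1
    return 1 + max(height(tree[0]), height(tree[2]))

def inorder(tree):
    """List the node values in left-root-right order."""
    if tree is None:
        return []
    return inorder(tree[0]) + [tree[1]] + inorder(tree[2])

t = build_tree([40, 10, 70, 20, 60, 30, 50])
assert height(t) == 2
assert inorder(t) == [10, 20, 30, 40, 50, 60, 70]
```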
An O(n) algorithm† exists for finding the kth largest (and therefore the [n/2]th
largest, or median) element of any sequence; it follows that there exists some
integer c such that the median of any set with n elements can be found with
no more than cn comparisons. Thus, step 1 can be performed with at most cn
comparisons. After the median m has been found, the sequences S_1 and S_2
can be formed by comparing m with every element a_i ∈ S − {m}; we add a_i
to S_1 if a_i < m and add it to S_2 if a_i > m. Thus step 2 can be accomplished
with n − 1 comparisons. Consequently, we can characterize the number of
comparisons necessary to build the binary search tree from S as follows:
†A careful description of a linear algorithm to find the median of a sequence of elements is
beyond our scope. The reader is referred to Aho, Hopcroft and Ullman [1974], page 97.
1. In each of the following prove that the given expression is a solution for the
recurrence system.
(a) y_0 = 2,
    y_n = 3y_{n-1} for n > 0.
    y_n = 2·3^n.
(b) y_0 = 1,
    y_n = y_{n-1}/n for n > 0.
    y_n = 1/n!.
(c) y_0 = 2,
    y_n = (y_{n-1})^2 for n > 0.
    y_n = 2^(2^n).
Find a solution for each of the following recurrence systems and determine the
asymptotic complexity of the solution. (The symbols a and b denote arbitrary posi-
tive constants.)
(a) xo = 1, (b) xo =a,
Xn == Xq-1 ta fora > 0. Xy = Xy-1 + BP forn> 0.
(c) x, =1, (d) xo = 1,
X_ = 2x,-1 —1 fornm> 1, Xn = (H+ Dx,-1 for n> 0.
(ce) x, =1, (f) xo =9,
Xn == AXq-1 forn> 1. Xn = X_p-1 tn—1 = forn>0.
(g) xo = 3,
Xn = 3x1 +07 for n > 0.
(a) Find a recurrence system to describe the number of moves that must be made
in a Tower of Hanoi problem with n discs, where n > 0. (See problem 5.1.21(a).)
(b) Solve the recurrence system of part (a).
(a) Consider n coplanar straight lines, no two of which are parallel and no three of
which pass through a common point. Find a recurrence system to describe the
number of disjoint areas into which the lines divide the plane. Show that
(n^2 + n + 2)/2 is a solution.
(b) Suppose that n ≥ 3 and exactly three of the lines pass through a common point.
Find a recurrence system for the number of regions into which the lines divide
the plane.
A derangement of n objects is a permutation which leaves none of the objects fixed.
Thus, if f is a derangement function defined on the first n natural numbers, then
f(k) ≠ k for all k < n. Let g(n) be the number of derangements of n objects. Argue the
correctness of the following recursive characterization of g.
g(1) = 0,
g(2) = 1,
g(n) = (n − 1)g(n − 1) + (n − 1)g(n − 2) for n > 2.
(Hint: A derangement either interchanges the first element with another, or it does
not.)
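Before arguing correctness, the recursive characterization of g can be compared with a direct count for small n. This Python sketch is a numerical check only and does not replace the argument the exercise asks for; the function names are assumptions.

```python
from itertools import permutations

def derangements(n):
    """g(1) = 0, g(2) = 1, g(n) = (n-1) g(n-1) + (n-1) g(n-2) for n > 2."""
    if n == 1:
        return 0
    if n == 2:
        return 1
    g_prev2, g_prev1 = 0, 1            # g(1), g(2)
    for k in range(3, n + 1):
        g_prev2, g_prev1 = g_prev1, (k - 1) * g_prev1 + (k - 1) * g_prev2
    return g_prev1

def derangements_brute(n):
    """Count permutations p of 0..n-1 with p[k] != k for every k."""
    return sum(1 for p in permutations(range(n))
               if all(p[k] != k for k in range(n)))

for n in range(1, 8):
    assert derangements(n) == derangements_brute(n)
```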
(a) The total path length of a tree is the sum of the lengths of all simple directed
paths from the root of the tree to a node. Find a recurrence system for the
minimum total path length of a complete n-ary tree of height h.
(b) Find the solution to the recurrence system of part (a).
(c) The external path length of a tree is the sum of the lengths of all simple directed
paths from the root of the tree to a leaf. Find a recurrence system for the
minimum external path length of a complete n-ary tree of height h.
(d) Find the solution to the recurrence system of part (c).
Prove Lemma 5.3.2b.
Let f: N → R be a function which satisfies the following relations where b, c > 0:
f(0) ≤ c,
f(n) ≤ a f(n − 1) + b for n > 0.
If a is a nonnegative real number, describe how the asymptotic behavior of f is
affected by the value of a.
Prove parts (b) and (c) of Theorem 5.3.3.
10. It has been shown (Pohl [1972]) that if a vector A has n entries, then ⌈3n/2 − 2⌉
comparisons suffice to find the largest and smallest entries of A. Modify the procedure
MAXMIN so that it never requires more than ⌈3n/2 − 2⌉ comparisons of elements
of A for all n > 1. (Hint: Handle n = 1 and n = 2 as special cases, and make sure
your algorithm does not divide an array with an even number of entries into two
arrays both of which have an odd number of entries.)
11. (a) Construct a recursive procedure MAX2 to implement a divide and conquer
strategy for finding the largest element in the entries A(i), ..., A(j) of an array
A. Your procedure should divide the array into two approximately equal
subarrays.
(b) State the recurrence system which characterizes the complexity function f for
MAX2 if f(n) is defined to be the number of comparisons made between entries
of an n element array A, where n is a power of 2.
(c) Find the solution of the recurrence system of part (b).
ciated with performing each operation must be given. For example, we may
assume that all arithmetic operations cost the same or we may assume (more
accurately, for most computers) that multiplication is more costly than addition.
Alternatively, we may choose to ignore the cost of some operations. For example,
the cost of applying some sorting algorithms is essentially proportional to the
number of comparisons made between elements of the set being sorted. In the
analysis of such sorting algorithms, it is common to ignore operations such as
assignments, arithmetic operations, and comparisons of loop indices.
In this section we will consider some algorithms and discuss their cost of
execution. In some cases we will also comment on the optimality of these algo-
rithms. Optimality can be discussed in a variety of ways, of which two will be
important to us here. First, we can investigate the absolute optimality of an algo-
rithm with respect to a specified set of operations. If an algorithm is optimal in the
absolute sense, then if the primitive operations are restricted appropriately, no
algorithm can perform the task using fewer operations than the optimal algorithm.
Second, there is the weaker concept of asymptotic optimality. Suppose f is the
complexity function of an algorithm A which solves a specified problem. Then A
is asymptotically optimal if for every other algorithm B that solves the problem,
if the complexity function of B is g, then f is O(g). Thus for sufficiently large
arguments, the value of f is bounded by a multiple of the value of g. Informally, we
say O(f) is a lower bound on the asymptotic complexity of the class of algorithms.
Note that two algorithms with distinct complexity functions can both be
asymptotically optimal. In contrast, if f and g are complexity functions of algorithms for
some problem class, and if f is optimal in the absolute sense, then f(n) ≤ g(n) for
every argument n ∈ N.
Table 5.2.1 describes how the growth of the cost of an algorithm is determined
by its asymptotic behavior. As a rule of thumb, we can say that it is usually feasible
to execute algorithms of O(n) and O(n log n) complexity for fairly large values of n.
Time or space limitations often make it difficult or impossible to execute O(n^2)
and O(n^3) algorithms for even moderate values of n. Exponential algorithms (those
of O(a^n) where a > 1) cannot generally be executed except for small values of n.
We will now analyze several algorithms, characterize their complexity func-
tions, and consider their optimality. We will describe algorithms for finding the
maximum element of a set, algorithms for searching for a specified element in a set,
and algorithms for sorting the elements of a set. All of the algorithms we describe
are based on comparisons; that is, the result of applying the algorithm is determined
by a sequence of comparisons between elements of a set. We will treat the ques-
tion of optimality only for the class of algorithms based on comparisons where the
number of outcomes of any comparison is bounded. (Most algorithms of interest
have either two or three possible outcomes for each comparison, e.g., < and >,
or <, = and >.) Thus, our claims that certain algorithms are asymptotically
optimal depend on our considering only a restricted class of algorithms; the
claims may not hold if we consider algorithms which are not based on comparisons
or algorithms in which the number of outcomes of a comparison is not bounded.
procedure MAX:
begin
    max ← A[1];
    for i ← 2 until n do
        if max < A[i] then max ← A[i]
end
Theorem 5.4.1: Any algorithm to find the maximum element of a set with
n members, n > 0, must make at least n — 1 comparisons.
Proof: Each comparison establishes that one element is not larger than
another. In order to find the maximum element, each of n — 1 elements must be
shown (by means of a comparison) to be no larger than some other element. Hence
n − 1 comparisons are necessary to find the maximum of n elements. ∎
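A linear scan in the style of procedure MAX attains the bound of Theorem 5.4.1. The Python sketch below counts comparisons between entries; the function name and the returned count are assumptions added for illustration.

```python
def find_max(a):
    """Return (largest entry, number of comparisons between entries) for a nonempty list."""
    comparisons = 0
    largest = a[0]
    for x in a[1:]:
        comparisons += 1           # one comparison per remaining entry
        if largest < x:
            largest = x
    return largest, comparisons

# A list of n = 4 entries is handled with exactly n - 1 = 3 comparisons.
assert find_max([3, 9, 2, 7]) == (9, 3)
```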
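[Figure: tournament tree. The leaves are the contestants 1 through 8, internal nodes are labeled with the winner of each match (for example, "best of {1, 2, 3, 4}"), and the root is labeled "the winner".]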
After the winner has been found, the resulting labeled tree provides some help in
finding the second best player, since he must have been one of the three players who
lost to the winner. Thus, only two more matches need be played to find the second
place winner.
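The observation about the second best player can be sketched in Python. This illustration assumes that the number of players is a power of 2 (at least 2), that larger values beat smaller ones, and that all values are distinct; the function name and the bookkeeping of beaten opponents are assumptions.

```python
def best_and_second(players):
    """Return (winner, second best, number of matches played)."""
    matches = 0
    # Each entry pairs a player with the opponents that player has beaten.
    current = [(p, []) for p in players]
    while len(current) > 1:
        following = []
        for k in range(0, len(current), 2):
            (x, x_beaten), (y, y_beaten) = current[k], current[k + 1]
            matches += 1
            if x > y:
                following.append((x, x_beaten + [y]))
            else:
                following.append((y, y_beaten + [x]))
        current = following
    winner, beaten = current[0]
    # The second best player must have lost directly to the winner.
    second = beaten[0]
    for v in beaten[1:]:
        matches += 1
        if v > second:
            second = v
    return winner, second, matches

# Eight contestants: 7 matches find the winner and 2 more find second place.
assert best_and_second([3, 8, 1, 6, 7, 2, 5, 4]) == (8, 7, 9)
```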
The algorithms we have described for finding the largest element of a sequence
have the property that the cost is uniform over all problems of size n. In general,
however, the cost of applying an algorithm to a problem of size n may depend on
the particular problem solved. Consider, for example, sorting a list of n entries.
If all the entries are distinct, then there are n! different permutations of the n
entries and consequently n! different lists with the same set of entries. The cost of
applying a particular sorting algorithm to a list with these n entries will usually
depend on the order in which the entries appear; for example, if the list is nearly
sorted, then the algorithm may have to do very little work. The cost of applying
an algorithm to a problem of size n is usually based on either a worst case or an
average case analysis. A worst case analysis defines the cost of applying an
algorithm to a problem of size n as the maximum cost over all problems of size n. Thus,
if f is a complexity function based on a worst case analysis, then for every problem
of size n, the cost of applying the algorithm is no greater than f(n). In an average
case analysis, a probability distribution is assumed over the set of problems of size
n and the average cost is calculated based on this probability distribution. Such
an analysis often assumes all problems of size n are equally likely; in this case,
the value of f(n) is equal to the sum of the costs of applying the algorithm to all
problems of size n divided by the number of problems of that size. Of the two
kinds of analysis, worst case is usually simpler because it only requires that we
determine how bad things can be and then analyze that single case, whereas an
average case analysis must account for all possible cases and then weight them
appropriately.
Searching Algorithms
The following theorem relates the number of nodes of a balanced binary tree
to its height.
Theorem 5.4.2: The height of a balanced binary tree with n nodes is ⌊log n⌋.
To measure the cost of searching with a binary search tree, we take the number
of records in the file to be the problem size and define the cost of a search as the
number of records examined during the search. By Theorem 5.4.2, a balanced
binary search tree with n nodes is of height h = ⌊log n⌋. Since as many as h + 1
records may be examined in the course of a search, a worst case analysis of a search
in a balanced binary tree yields the complexity function f(n) = ⌊log n⌋ + 1. The
search is therefore an O(log n) algorithm if the search tree is balanced. In fact,
a balanced tree may not be possible if too many records in the file have the same
key, but if all keys are distinct, then a balanced tree can always be constructed.
(A recursive algorithm for constructing a balanced binary search tree was given
in the last example of Section 5.3.)
Many ways of organizing files and searching for records have been developed,
and whether a particular search algorithm is optimal depends on what operations
are permitted and are consistent with the file organization. For search algorithms
which locate records by comparing a search argument with record keys, a search
which uses a balanced tree is asymptotically optimal. This result is established by
the following theorem.
Now consider the average case performance of searches using a binary search
tree. For the purpose of this analysis, we assume that all records are equally likely
to be the object of a search, and that every search is successful. Furthermore, we
assume the binary search tree is balanced. Note that approximately half the nodes
of a balanced binary search tree are leaves, approximately 3/4 of the nodes are either
leaves or one step removed, approximately 7/8 of them are within two steps of a leaf,
etc. Thus, unless n is small, most of the nodes of a binary search tree are nearly
as far from the root as the leaves.
Let C_i be the number of comparisons required to find the ith record stored
in the binary search tree T. The average cost C of a search in T is then
C = (1/n) Σ_{i=1}^{n} C_i.
We can calculate C_i easily if we can determine the length of the path from the
root of the search tree to the ith node; C_i is one greater than the length of this
path, and therefore the sum of the C_i is equal to n plus the sum of the lengths of all
such paths.
has either no sons or two sons. Such a tree has one node a distance 0 from the root,
2 nodes a distance 1 from the root, and in general 2^k nodes a distance k from the
root for all k < h. The total path length of a complete binary tree is therefore no
greater than Σ_{i=0}^{h} i·2^i. From Theorem 2.5.3, it follows that
We now use the bound on L_h found in Theorem 5.4.4 to investigate the average
case performance of a search in a balanced binary search tree for the special case
where all leaves are distance h from the root. A complete binary tree of height h
with all leaves a distance h from the root contains 2^{h+1} − 1 nodes. Recall that the
number of comparisons made in locating any record is 1 plus the length of the
path from the root to the node where the record is stored. Hence, the number of
comparisons necessary to locate each of the n = 2^{h+1} − 1 records exactly once is
L_h + n = (h − 1)2^{h+1} + 2 + 2^{h+1} − 1 = h·2^{h+1} + 1.
If we assume that all searches are successful, and all records are sought with equal
probability, then the average cost of a search is
C = (1/n)(L_h + n) = (h·2^{h+1} + 1)/(2^{h+1} − 1).
But 2^{h+1} ≥ h + 2 for all h ≥ 0. Hence,
C ≤ (h·2^{h+1} + 2^{h+1} − h − 1)/(2^{h+1} − 1) = h + 1.
Moreover,
C ≥ (h·2^{h+1} − h)/(2^{h+1} − 1) = h.
Thus, for this class of binary search trees, the average cost of a search lies between
h and h + 1. Since both h and h + 1 are O(log n), it follows that the average search
cost is O(log n). Note that worst case and average case performances of searches in
a balanced binary search tree have the same asymptotic complexity.
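The bounds on the average search cost can be checked by direct computation for a binary tree of height h in which every level is full. The Python sketch below uses exact rational arithmetic; the function name is an assumption.

```python
from fractions import Fraction

def average_search_cost(h):
    """Average comparisons per successful search when every level 0..h is full.

    A record at distance d from the root costs d + 1 comparisons,
    and there are 2**d records at distance d.
    """
    n = 2**(h + 1) - 1
    total = sum((d + 1) * 2**d for d in range(h + 1))
    return Fraction(total, n)

for h in range(12):
    n = 2**(h + 1) - 1
    cost = average_search_cost(h)
    assert cost == Fraction(h * 2**(h + 1) + 1, n)   # matches the derived total
    assert h <= cost <= h + 1                        # the bounds derived above
```

For h = 1 the three records cost 1, 2, and 2 comparisons, giving an average of 5/3.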
Sorting Algorithms
Proof: A decision tree can be used to represent any sorting algorithm based
on comparisons. Each internal node of the decision tree will be associated with
the comparison of some element x_i with another element x_j. Each possible outcome
of a comparison is represented by an arc from the corresponding internal node.
If the result of comparing x_i with x_j is either x_i < x_j or x_i > x_j, then the decision
tree is binary.† Each leaf of the decision tree must specify a rearrangement of the
sequence which places the elements in sorted order. Since it may be necessary
to apply any one of n! permutations to arrange correctly the n elements of a
sequence S, the decision tree must have at least n! leaves.
The number of comparisons made by an algorithm to specify a particular
permutation is the length of the path from the root to the leaf representing that
permutation. A minimax algorithm to sort n elements is therefore represented by
a tree with at least n! nodes and of height as small as possible. Since a binary
tree of height h has no more than 2^h leaves, the height of the decision tree must be
large enough to satisfy the inequality n! ≤ 2^h. Thus, log(n!) ≤ h. But for n > 0,
n! ≥ (n/2)^{n/2}, so that log(n!) ≥ (n/2) log(n/2) = (1/2)n log n − (1/2)n.
Since h ≥ log(n!), it follows that h ≥ (1/2)n log n − (1/2)n. But h is the largest number
of comparisons required to sort n elements with a decision tree of height h; hence
f(n) = h. Therefore f(n) ≥ (1/2)n log n − n/2 and hence O(f) ⊇ O(n log n). ∎
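The inequality at the heart of this proof can be checked numerically. The Python sketch below computes the smallest height h with n! ≤ 2^h and compares it with (1/2)n log n − n/2, taking logarithms to base 2; the function name is an assumption.

```python
from math import factorial, log2

def min_decision_tree_height(n):
    """Smallest integer h such that n! <= 2**h, computed exactly."""
    # For an integer x >= 1, (x - 1).bit_length() equals the ceiling of log2(x).
    return (factorial(n) - 1).bit_length()

for n in range(1, 200):
    h = min_decision_tree_height(n)
    assert factorial(n) <= 2**h
    assert h >= 0.5 * n * log2(n) - 0.5 * n

# Three elements need at least 3 comparisons in the worst case, four need at least 5.
assert min_decision_tree_height(3) == 3
assert min_decision_tree_height(4) == 5
```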
The preceding theorem establishes that any O(n log n) sorting algorithm is asymp-
totically optimal. Several O(n log n) sorting algorithms are known and we will
present one later in this section, but the most straightforward sorting algorithms
are O(n^2), and we begin by analyzing one of these.
†We leave it as an exercise to show that if the decision tree is ternary with branches labelled
<, >, or =, then O(n log n) is still a lower bound on the worst case asymptotic complexity.
procedure BUBBLE(n):
for j ← 1 step 1 until n − 1 do
    for i ← n − 1 step −1 until j do
        if A[i] > A[i + 1] then interchange A[i] and A[i + 1]
wrong relative order, i.e., if A[i] > A[i + 1], then the entries are interchanged. The
initial pass starts with i = n − 1 and continues until i = 1. At the end of the
first pass, the smallest entry of A has been “bubbled up” into the position A[1]
and need not be considered further. In the second pass, the value of i ranges from
n − 1 to 2; this pass bubbles the smallest entry of A[2] ... A[n] into A[2]. In general,
in the jth pass the index i ranges from n − 1 to j and the jth smallest element of
A is bubbled into A[j]. After the (n − 1)th pass, the values of A[1], A[2], ...,
A[n − 1] are all in place, and consequently the largest entry of A has been moved
to A[n].
To analyze the bubble sort, we first observe that there are n − 1 passes, and
the jth pass makes n − j comparisons. The total number of comparisons is therefore
$\sum_{j=1}^{n-1} (n - j) = n(n - 1)/2 = n^2/2 - n/2$. It follows from Theorem 5.2.5
that the bubble sort is an $O(n^2)$ algorithm.
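The comparison count can be confirmed with a small Python sketch of the procedure. The name bubble_sort, the 0-based indexing, and the returned comparison counter are assumptions made for illustration; the loop structure follows the passes described above.

```python
def bubble_sort(a):
    """Return (sorted copy of a, number of comparisons made)."""
    a = list(a)
    n = len(a)
    comparisons = 0
    for j in range(n - 1):                  # pass j + 1 of the text
        # 0-based index i runs from n - 2 down to j.
        for i in range(n - 2, j - 1, -1):
            comparisons += 1
            if a[i] > a[i + 1]:             # wrong relative order
                a[i], a[i + 1] = a[i + 1], a[i]
    return a, comparisons


if __name__ == "__main__":
    data = [5, 2, 4, 1, 3]
    result, count = bubble_sort(data)
    n = len(data)
    print(result, count, n * (n - 1) // 2)
```

The count depends only on n, not on the initial order of the entries, which is why the worst case and the best case coincide for this version of the algorithm.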
Alternatively, the complexity function of a bubble sort can be characterized
with a recurrence system. The boundary condition is obtained by noting that no
comparisons are necessary for a list with one entry. For the recurrence relation, we
observe that if a list has n entries, where n > 1, then n − 1 comparisons are used
to move the smallest entry into place and this process leaves a list of n − 1 entries
to be sorted. Thus, the recurrence system is
$$T(1) = 0,$$
$$T(n) = T(n - 1) + n - 1 \quad\text{for } n > 1,$$
which has the solution $n(n - 1)/2$.
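One way to see the solution is to unroll the recurrence until the boundary condition is reached; the following derivation is a sketch of that step.

```latex
\begin{align*}
T(n) &= T(n-1) + (n-1) \\
     &= T(n-2) + (n-2) + (n-1) \\
     &\;\;\vdots \\
     &= T(1) + 1 + 2 + \cdots + (n-1) \\
     &= 0 + \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}.
\end{align*}
```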
We have remarked that O(n log n) sorting algorithms exist. Since the bubble
sort is an $O(n^2)$ algorithm and O(n log n) is properly contained in $O(n^2)$, it follows
that the bubble sort is not asymptotically optimal. Nevertheless, this sorting
algorithm is commonly used where the value of n is not too large and programming
effort is to be kept to a minimum. A modified version is also useful if only the first
k entries of the sorted list are to be found; in this case only k passes need be made.
The bubble sort has the additional virtue that it requires almost no space in addition
to that used to contain the input vector.
The bubble sort operates by successively reducing the problem; each pass
reduces the size of the unsorted portion of the vector by 1. Sequential search of
a list of length n is similar; each comparison either finds the record sought or
reduces the problem size by 1. If the record sought is not found at step i of a
sequential search, then a problem of size n − i must be solved. Compare this with
a binary search: if the ith comparison of a binary search does not locate the record,
the problem is reduced to one of approximate size $n/2^i$. Two subproblems are
defined at each step of a binary search, but each subproblem is only about half as
big as the original problem, and only one of them needs to be solved. Because the
subproblems are approximately equal in size, the algorithm is said to be “balanced.”
An algorithm is balanced if for some k, 0 < k < 1, the algorithm breaks a
problem of size n (where n is sufficiently large) into a collection of subproblems,
none of which is greater than size kn. In contrast, an algorithm may reduce a prob-
lem of size n to one of size n — p where p is a fixed integer; such an algorithm is
not balanced. Thus, bubble sort and sequential search are not balanced algorithms
because they reduce a problem of size n to one of size n — 1. Binary search, on the
other hand, is balanced because it changes a problem of size n into one of size n/2.
Moreover, the binary search of Fig. 5.3.2 would remain balanced even if m were
assigned the value of $\lfloor (i + j)/r \rfloor$ for some r > 2 rather than $\lfloor (i + j)/2 \rfloor$. Such a
“skewed” binary search would still have O(log n) complexity, but it would not be
as efficient as the usual binary search. In general, the most efficient algorithms are
those which are balanced, and among the balanced algorithms the most efficient
are those which break a problem into subproblems of approximately equal size.
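The effect of a skewed split can be observed experimentally. The sketch below is not the procedure of Fig. 5.3.2; it is a hypothetical search that probes the position lo + ⌊(hi − lo)/r⌋, which is one reasonable reading of a "skewed" split and reduces to the usual midpoint when r = 2. The names search and worst_case are assumptions.

```python
def search(a, target, r=2):
    """Search sorted list a for target; return (index or -1, probes made)."""
    lo, hi = 0, len(a) - 1
    probes = 0
    while lo <= hi:
        m = lo + (hi - lo) // r      # r = 2 gives the usual midpoint
        probes += 1
        if a[m] == target:
            return m, probes
        if a[m] < target:
            lo = m + 1               # continue in the larger right part
        else:
            hi = m - 1               # continue in the smaller left part
    return -1, probes


def worst_case(n, r):
    """Largest number of probes over all successful searches of size n."""
    a = list(range(n))
    return max(search(a, t, r)[1] for t in a)


if __name__ == "__main__":
    for r in (2, 3, 4):
        print(r, worst_case(1023, r))
```

For every fixed r the worst case still grows logarithmically in n, but the constant factor is larger than for r = 2, since the larger subproblem keeps roughly a fraction (r − 1)/r of the entries.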
We will now describe a sorting algorithm which implements a balanced divide
and conquer strategy; then we will show the algorithm is asymptotically optimal by
proving that the complexity function of the algorithm is O(n log n).
The next theorem shows that any algorithm to merge two lists of lengths m
and n requires m + n — 1 comparisons for some pairs of lists.
Theorem 5.4.7: Let A be an algorithm which merges two sorted lists on the
basis of comparisons between list entries. There exist an infinite number of values
of $m, n \in N$ and lists of lengths m and n respectively such that the algorithm A
requires at least m + n − 1 comparisons to merge the lists.
Proof: It suffices to treat the case m = n. Let LIST1 and LIST2 be lists of
length m such that for all i, LIST1[i] < LIST2[i] < LIST1[i + 1]. Then the merged
output LIST must be constructed by selecting elements from the lists alternately.
If we represent the original lists by the following pair of digraphs,
LIST1: $a_1 \to a_2 \to a_3 \to \cdots \to a_{m-1} \to a_m$
LIST2: $b_1 \to b_2 \to b_3 \to \cdots \to b_{m-1} \to b_m$
Each edge of the digraph of LIST represents the result of a single comparison.
If any comparison is not made, the resulting partial subdigraph is consistent with
more than one ordering. Since a merging algorithm must be able to produce any
of the orderings consistent with such a partial subdigraph, all the comparisons
must be made. Because the digraph has 2m − 1 edges, it follows that m + n − 1
comparisons are necessary. ∎
There exist values of m and n such that fewer than m + n − 1 comparisons will
suffice to merge two sorted lists. For example, if n = 1, then merging can be done
by inserting the single element of LIST2 in the sorted list LIST1; this requires only
$\lceil \log(m + 1) \rceil$ comparisons using binary search. But the preceding theorem shows
that for some values of m and n, m + n − 1 comparisons are necessary.
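The two extremes can be seen by counting comparisons in a straightforward two-pointer merge. The Python sketch below is an illustrative assumption and is not the procedure MERGE of the text; it returns the merged list together with the number of comparisons made.

```python
def merge(list1, list2):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    out = []
    i = j = comparisons = 0
    while i < len(list1) and j < len(list2):
        comparisons += 1
        if list1[i] <= list2[j]:
            out.append(list1[i])
            i += 1
        else:
            out.append(list2[j])
            j += 1
    # One list is exhausted; the rest is copied without comparisons.
    out.extend(list1[i:])
    out.extend(list2[j:])
    return out, comparisons


if __name__ == "__main__":
    # Interleaved lists as in the proof: LIST1[i] < LIST2[i] < LIST1[i + 1].
    print(merge([1, 3, 5, 7], [2, 4, 6, 8]))   # m + n - 1 = 7 comparisons
    # Every entry of the first list precedes every entry of the second.
    print(merge([1, 2, 3, 4], [5, 6, 7, 8]))   # only 4 comparisons
```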
The procedure MERGESORT, which uses the procedure MERGE as a sub-
routine, is given in Fig. 5.4.4. The next theorem establishes that the worst case
behavior of MERGESORT is asymptotically optimal.
procedure MERGESORT(LIST):
if LENGTH(LIST) ≤ 1 then return LIST
else
begin
    k ← LENGTH(LIST);
    set LIST1 to LIST[1] ... LIST[⌊k/2⌋];
    set LIST2 to LIST[⌊k/2⌋ + 1] ... LIST[k];
    return MERGE(MERGESORT(LIST1), MERGESORT(LIST2))
end
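A runnable Python sketch of the same divide and conquer scheme is given below. The helper _merge is a hypothetical stand-in for the procedure MERGE, and both functions return a comparison count so that the total can be compared with n log n; these names and the counting are assumptions made for illustration.

```python
import math


def _merge(left, right):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    out, i, j, comparisons = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out, comparisons


def merge_sort(lst):
    """Return (sorted copy of lst, total comparisons made)."""
    if len(lst) <= 1:
        return list(lst), 0
    k = len(lst)
    list1, c1 = merge_sort(lst[:k // 2])     # LIST[1] ... LIST[floor(k/2)]
    list2, c2 = merge_sort(lst[k // 2:])     # the remaining entries
    merged, c3 = _merge(list1, list2)
    return merged, c1 + c2 + c3


if __name__ == "__main__":
    data = [9, 4, 7, 1, 8, 2, 6, 3, 5, 0, 15, 12, 11, 14, 13, 10]
    result, count = merge_sort(data)
    n = len(data)
    print(result == sorted(data), count, n * math.log2(n))
```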
Proof: We will apply Theorem 5.3.4, which requires that we characterize the
number of comparisons by a recurrence system in the form
$$f(1) \le c,$$
$$f(n) \le a f(n/b) + cn \quad\text{for } n = b^k \text{ where } k > 0.$$
The preceding theorem together with Theorem 5.4.5 shows that the worst
case behavior of MERGESORT is asymptotically optimal. A number of other
O(n log n) sorting algorithms are known, including heapsort, which has a worst
case behavior of O(n log n) and quicksort, which has an average case behavior of
O(n log n) but a worst case behavior of $O(n^2)$. A careful treatment of these
algorithms is beyond our scope; the reader is referred to Aho, Hopcroft and Ullman
[1974].
1. Construct a binary decision tree of minimum height for finding the maximum of four
elements. Prove that your tree is of minimum height.
2. It can be shown that using comparisons to find the largest and second largest elements
of a sequence of length n requires $n + \lceil \log n \rceil - 2$ comparisons, where the outcome
of each comparison is <, =, or >. Describe an algorithm which accomplishes the
task with this number of comparisons.
4. (a) Find an expression for the minimum total path length of a balanced binary
tree of height h.
(b) Estimate the cost of an average search in a balanced binary search tree of
height h which has minimum total path length. Assume all searches are
successful.
(c) Use Theorem 5.4.3 to find the average case asymptotic complexity of a search
in a balanced binary search tree.
In this section it was shown that the worst case asymptotic complexity of a search
in a balanced binary search tree is O(log n). Characterize the asymptotic behavior
of the worst case performance for the set of all binary search trees with n nodes; i.e.,
what happens if we drop the restriction that the tree is balanced?
Find the worst case asymptotic complexity of a ternary tree search as described in
Section 3.2. Assume the ternary search tree is balanced.
Theorem 5.4.5 was proved using the assumption that a comparison of two elements
resulted in one of two possible outcomes: either $x_i < x_j$ or $x_i > x_j$. Show that if
three outcomes are permitted (i.e., $x_i < x_j$, $x_i > x_j$, or $x_i = x_j$) the result still holds.
Consider the algorithm for sorting by interchange given in Fig. 5.4.5. The input is
procedure SORT(n):
for i ← 1 until n − 1 do
begin
    comment: find minimum entry in A[i: n].
    min ← A[i];
    position ← i;
    for j ← i + 1 until n do
        if A[j] < min then
        begin
            min ← A[j];
            position ← j
        end;
    comment: interchange minimum entry with A[i].
    A[position] ← A[i];
    A[i] ← min
end
Fig. 5.4.5 Sorting the array A[1: n] by interchange
the number of entries in the vector A; when the algorithm terminates, the entries of
A are sorted in nondecreasing order. The algorithm makes a sequence of n − 1
passes with the ith pass finding the smallest entry in A[i: n] and interchanging it
with A[i]. Prior to the ith pass, the first i − 1 entries are in place. Let f(n) be the
number of comparisons made in sorting a vector with n entries. Find the asymptotic
behavior of f.
10. The final example of Section 5.3 described how a binary search tree can be con-
structed from an unsorted sequence of length n using O(n log n) comparisons.
Prove that it is not possible to accomplish this task with O(n) comparisons. (Hint:
Show that if this task could be accomplished with O(n) comparisons, then we could
devise an O(n) algorithm to sort by comparisons.)
11. Let A[0: n] be a vector of coefficients and consider the problem of evaluating the
polynomial
$$P_n(z) = \sum_{i=0}^{n} A[i]\, z^i$$
for an arbitrary real argument z. Define the time complexity function f of an algorithm
to evaluate $P_n(z)$ as the function such that f(n) is the maximum number of
multiplications required to evaluate $P_n(z)$ for any vector A[0: n].
(a) Find the asymptotic complexity of the following algorithms for evaluating
$P_n(z)$.
(i) (This algorithm is known as Horner’s method and is known to use a minimal
number of multiplications.)
procedure HORNER:
begin
    value ← A[n];
    for i ← n − 1 step −1 until 0 do
        value ← (value * z) + A[i]
end
(ii) procedure TWO:
begin
    power ← 1;
    value ← A[0];
    for i ← 1 until n do
    begin
        power ← power * z;
        value ← value + (A[i] * power)
    end
end
(iii) procedure THREE:
begin
    value ← 0;
    for i ← 0 until n do
    begin
        summand ← A[i];
        for j ← 1 until i do summand ← summand * z;
        value ← value + summand
    end
end
(b) Suppose it is known that A[i] = 0 for all odd i. Construct an algorithm to take
advantage of this restriction and analyze its asymptotic complexity.
12. (a) Using the programming language of this text, write a recursive procedure to
perform a sequential search on an array A[i: j], where i < j.
(b) Write a recursive procedure to implement the interchange sort of Figure 5.4.5.
(Note: It would be poor practice to implement either of these algorithms
recursively. The exercise will illustrate, however, that they can be viewed as
examples of unbalanced divide and conquer algorithms.)
INFINITE SETS
6.0 INTRODUCTION
Many interesting and important sets are not finite; two obvious examples are the
set of natural numbers and the set of all ALGOL programs. But even with these
sets, we will never have to treat more than a finite number of the individual ele-
ments. For example, it should suffice to be able to answer questions about all
ALGOL programs with fewer than, say, $10^{100}$ symbols; there is no need to find a
way to answer the same questions for all ALGOL programs. It can therefore be
argued that we are only interested in a finite number of ALGOL programs. Then
why should the computer scientist be interested in infinite sets? In fact, treating
infinite sets is often easier and more useful than dealing with the finite subset in
which we are interested. Many infinite sets of interest are inductively defined;
investigations of such sets tend to produce results about the entire infinite set and
often provide insight into the structure of the set and its elements.
As with finite sets, we are often interested in the size, or cardinality, of an
infinite set. Cardinality arguments, based on principles similar to the pigeonhole
principle, can be used to establish important results. For example, we will use
cardinality arguments to show that there exist tasks which cannot be performed by
any computer. This is demonstrated by showing that there are more tasks than
there are programs; it follows immediately that some tasks cannot be performed
by any of the programs. This technique will be used to show that there exist real
numbers which cannot be computed by any computer program, even if a computer
of unlimited storage and speed is assumed to exist.
Finite sets can be distinguished from infinite sets using either of two definitions.
We will present both definitions and illustrate their use.
276 ~= INFINITE SETS Ch. 6
Definition 6.1.1a: A set A is finite with cardinality $n \in N$ if there is a bijection
from the set $\{0, 1, \ldots, n - 1\}$ to A. A set is infinite if it is not finite.
To prove a set A is infinite by using Definition 6.1.1a, one must establish that
no bijection exists from $\{0, 1, \ldots, n - 1\}$ to A for any n. Because it is necessary
to rule out an infinite number of possibilities, such a proof can be quite difficult.
For this reason, it is often useful to use the following alternate definitions of
finite and infinite sets.
Definition 6.1.1a states explicitly how to recognize a finite set and then says
that everything else is infinite; Definition 6.1.1b does just the reverse. It is usually
most convenient to use the first definition to show that a set is finite, and the second
to show that a set is infinite. Definitions 6.1.1a and 6.1.1b can be shown to be
equivalent by using the Axiom of Choice.† In our discussions we will use whichever
definition is most convenient.
Using Definition 6.1.1a, we can give a shorter proof for Theorem 6.1.1 than
the one given previously.
†Axiom of Choice: If C is a collection of nonempty sets, then there exists a set T such that T
has as elements exactly one x from each set $S \in C$.
Conceptually this principle allows us to choose an arbitrary element from any nonempty set,
and in fact make an infinity of such choices. This seemingly reasonable assertion has some
discomforting implications. The interested reader is referred to Wilder [1965] for a discussion of the
Axiom of Choice and a proof of the equivalence of Definitions 6.1.1a and 6.1.1b.
Sec. 6.1 FINITE AND INFINITE SETS 277
Examples
(a) The set of real numbers, R, is infinite. We use Definition 6.1.1b and the
following map:
$f: R \to R$,
$f(x) = x + 1$ if $x \ge 0$,
$f(x) = x$ if $x < 0$.
Then f is an injection and $f(R) = \{x \mid x \in R \land x \notin [0, 1)\}$.
(b) Let $\Sigma = \{a, b\}$. Then $\Sigma^*$ is infinite. Let $f: \Sigma^* \to \Sigma^*$ be defined by f(x) = ax.
Then f is an injection and the image of f is the proper subset of $\Sigma^*$ which contains
all strings beginning with the letter a.
(c) The closed interval, [0, 1], is infinite. The function $f: [0, 1] \to [0, 1]$ defined by
f(x) = x/2 is an injection whose image is the proper subset [0, 1/2]. #
Example
Let A denote the set of ALGOL programs which never halt. We will show the
set A is infinite by constructing an infinite subset $A' \subseteq A$ of programs which never
halt.
begin
label: go to label
end
go to label;
The next theorem shows that the property of a set being infinite is preserved
under certain set operations.
$f(x) = \{x\}$.
Then f is an injection and it follows from Theorem 6.1.3 that $\mathcal{P}(A)$ is
infinite.
(c) Since $B \ne \emptyset$, we can choose some element $b \in B$, and define the map
$f: A \to A \times B$,
$f(x) = \langle x, b \rangle$.
Since A is infinite and f is injective, it follows from Theorem 6.1.3 that
$A \times B$ is infinite. ∎
This section has introduced the notion of infinite set and the use of injections
to show that sets are not finite. We are accustomed to dealing with finite sets, where
Sec. 6.2 COUNTABLE AND UNCOUNTABLE SETS 279
any injection from a set to itself is also a surjection. In contrast, infinite sets do
not have this property and we have used this fact to distinguish between the classes
of finite and infinite sets. In a later section, injections will play a crucial role; we
will use them to determine when two infinite sets are the same size, as well as to
establish when one infinite set is “larger” than another.
Examples
(a) $|I+| = \aleph_0$.
The function $f: N \to I+$ defined by f(x) = x + 1 is a bijection.
(b) $|I| = \aleph_0$.
The function $f: N \to I$ defined by f(x) = x/2 if x is even, f(x) = −(x + 1)/2 if x
is odd, is a bijection. #
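The map of example (b) can be tried out on an initial segment of N. In the Python sketch below, f follows the definition above, while f_inv is a hypothetical inverse written for illustration; checking that the two undo each other on a finite range is evidence for, not a proof of, bijectivity.

```python
def f(x: int) -> int:
    """Map a natural number to an integer: 0, -1, 1, -2, 2, ..."""
    if x % 2 == 0:
        return x // 2
    return -((x + 1) // 2)


def f_inv(y: int) -> int:
    """Assumed inverse of f: nonnegative y come from even x, negative y from odd x."""
    return 2 * y if y >= 0 else -2 * y - 1


if __name__ == "__main__":
    print([f(x) for x in range(7)])
    print(all(f_inv(f(x)) == x for x in range(1000)))
```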
We say a set can be enumerated if its elements can be listed. The list may be
finite or infinite, and repetitions may occur, that is, not all entries of the list need
be distinct. If a list enumerates the set A, then every entry in the list is an element of
A and every element of A appears as an entry of the list. These concepts can be
formalized as follows.
†$\aleph$ (aleph) is the first letter of the Hebrew alphabet. This notation was introduced by Cantor.
Examples
(a) If $A = \emptyset$, there is only one enumeration of A; it is the empty function.
(b) If A = {a, b, c}, then <a, b, a, c> and <b, c, a> are both finite enumerations
of A, the first with repetitions and the second without.
(c) Let A be the set of even natural numbers. Then
<0, 2, 4, ...> and
<2, 0, 6, 4, 10, 8, ...>
are both enumerations of A. (The second enumeration function is
f(n) = 2(n + 1) if n is even and f(n) = 2(n − 1) if n is odd.) #
Examples
(a) The set $\Sigma^*$ is countably infinite for any finite alphabet $\Sigma$. This can be shown by
exhibiting the elements of $\Sigma^*$ in standard order (Definition 3.6.7). If $\Sigma = \{a, b\}$
and a precedes b in the alphabetic order of $\Sigma$, then the enumeration of $\Sigma^*$ in
standard order is
<Λ, a, b, aa, ab, ba, bb, aaa, aab, ...>
Note that if $|\Sigma| > 1$, then $\Sigma^*$ cannot be enumerated in lexicographic order.
(b) The set of positive rational numbers Q+ is countably infinite. Clearly Q+ is
not finite, since the natural numbers N can be mapped injectively to a proper
subset of Q+. We will show Q+ is countable by exhibiting an enumeration
with repetitions. The order of the enumeration is specified by the directed path
of the following array.
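[Figure: an infinite array whose columns are labelled by NUMERATOR 1, 2, 3, 4, 5, ... and whose rows are labelled by denominator 1, 2, 3, 4, 5, ...; the entry in column m and row n is m/n, so the first row reads 1/1, 2/1, 3/1, 4/1, 5/1, ... A directed path starting at 1/1 winds back and forth along successive diagonals and passes through every entry.]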
Since this enumeration will include every integer ratio m/n, it is an enumeration
of Q+, and therefore Q+ is countably infinite. The enumeration is with repetitions,
e.g., 1/2 and 2/4 denote the same element of Q+. From Theorem 6.2.1,
it follows that there is a bijection from N to Q+. #
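Both enumerations can be produced mechanically. The Python generators below are a sketch under stated assumptions: standard_order lists strings by length and then alphabetically, and the rational enumeration walks each diagonal m + n = s in one direction rather than winding back and forth as the directed path of the array does. Either traversal visits every ratio m/n; the generator names are illustrative.

```python
from fractions import Fraction
from itertools import count, islice, product


def standard_order(alphabet):
    """Yield all strings over alphabet: shorter first, then alphabetically."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def ratios_with_repetition():
    """Yield every pair (m, n) of positive integers, diagonal by diagonal."""
    for s in count(2):               # s = m + n is constant on a diagonal
        for m in range(1, s):
            yield m, s - m


def positive_rationals():
    """Yield each positive rational exactly once by skipping repetitions."""
    seen = set()
    for m, n in ratios_with_repetition():
        q = Fraction(m, n)
        if q not in seen:
            seen.add(q)
            yield q


if __name__ == "__main__":
    print(list(islice(standard_order("ab"), 9)))
    print([str(q) for q in islice(positive_rationals(), 10)])
```

Skipping values already seen turns the enumeration with repetitions into one without, which is the idea behind obtaining a bijection from N to Q+.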
Choose $a_0$ from A
Choose $a_1$ from $A - \{a_0\}$
Choose $a_2$ from $A - \{a_0, a_1\}$
Choose $a_3$ from $A - \{a_0, a_1, a_2\}$
Each of the sets $A - \{a_0, a_1, a_2, \ldots, a_n\}$ is infinite. If this were not so, then
A would be equal to the union of the two finite sets $A - \{a_0, a_1, \ldots, a_n\}$ and
$\{a_0, a_1, \ldots, a_n\}$. But the union of two finite sets is a finite set and A is infinite.
Therefore each set $A - \{a_0, a_1, a_2, \ldots, a_n\}$ is infinite and we can select a new
element $a_{n+1}$. Thus we can construct an infinite sequence $\langle a_0, a_1, a_2, \ldots \rangle$ without
repetitions; the elements of this sequence comprise a countably infinite subset
of A. ∎
Like the finite sets, countable sets are closed under certain set operations. The
following theorems list the principal results.
Examples
The preceding theorem can be used to show that each of the following sets is
countably infinite.
(a) $I^n = \{\langle x_1, x_2, \ldots, x_n \rangle \mid x_i \in I\}$ (the set of n-tuples with integer components).
(b) $Q^n = \{\langle x_1, x_2, \ldots, x_n \rangle \mid x_i \in Q\}$.
(c) The set of all nth degree polynomials with rational coefficients.
(d) The set of all polynomials with rational coefficients.
(e) The set of all n × m matrices with rational components.
(f) The set of all matrices of arbitrary finite dimension with rational components. #
As our definitions have suggested, not all infinite sets are countably infinite.
The next theorem establishes that we need another infinite cardinal number.
Theorem 6.2.5: The subset of real numbers, [0, 1], is not countably infinite.
Proof: Recall that [0, 1] denotes the set $\{x \mid x \in R \land 0 \le x \le 1\}$. Each
$x \in [0, 1]$ can be represented by an infinite decimal expansion:
$$x = .x_0 x_1 x_2 x_3 \ldots$$
where each $x_i$ is a decimal digit. Using this representation requires some care,
since the representation is not unique; for example:
.5000... = .4999...†
We will show that no function from N to [0, 1] is surjective. This will establish that
no enumeration exists for [0, 1].
†To show that .4999... is an alternative representation of .5, let x denote .4999... Then
10x = 4.999...,
100x = 49.999...,
and 100x − 10x = 45. It follows that x = .5.
Let $f: N \to [0, 1]$ be an arbitrary function from the natural numbers to the
set [0, 1]. Arrange the elements f(0), f(1), ..., in a vertical array, using a decimal
representation for each value f(x). The resulting array appears as follows:
$$\begin{array}{ll}
f(0): & .x_{00} x_{01} x_{02} \ldots \\
f(1): & .x_{10} x_{11} x_{12} \ldots \\
\vdots & \\
f(n): & .x_{n0} x_{n1} x_{n2} \ldots \\
\vdots &
\end{array}$$
where $x_{ni}$ is the ith digit in the decimal expansion of f(n). We now specify a real
number $y \in [0, 1]$ as follows: $y = .y_0 y_1 y_2 \ldots$, where
$$y_i = \begin{cases} 1 & \text{if } x_{ii} \ne 1, \\ 2 & \text{if } x_{ii} = 1. \end{cases}$$
The number y is determined by the digits on the diagonal of the array. Clearly,
$y \in [0, 1]$. However, y differs from each f(n) in at least one digit of the expansion
(namely, the nth digit). Hence, $y \ne f(n)$ for any n, and we conclude that the map
$f: N \to [0, 1]$ is not a surjection. Therefore, f is not an enumeration of [0, 1]. Since
the map f was arbitrary, this establishes that $|[0, 1]| \ne \aleph_0$. ∎
The preceding theorem and proof are due to Cantor. The proof technique is
sometimes called the “Cantor diagonal technique” or simply “diagonalization.”
Essentially, this technique begins with an infinite list such that each element on
the list has an infinite description. It then produces an object distinct from each
element of the list. This technique has many variations and is applied extensively
in the theory of computability.
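The diagonal construction can be carried out on any finite prefix of a list. In the Python sketch below, a listed number is modelled as a function from digit positions to digits; the sample listing `listed` is an arbitrary assumption chosen only to have something to diagonalize against, and the rule for y (1 unless the diagonal digit is 1, in which case 2) is the one used in the proof of Theorem 6.2.5.

```python
def diagonal_digits(f, k):
    """First k digits of y, where f(n)(i) is the ith digit of the nth listed number."""
    return [1 if f(n)(n) != 1 else 2 for n in range(k)]


def listed(n):
    """Hypothetical listing: digit i of the nth number is (n * i + 1) mod 10."""
    return lambda i: (n * i + 1) % 10


if __name__ == "__main__":
    k = 12
    y = diagonal_digits(listed, k)
    print(y)
    # y differs from the nth listed number in its nth digit.
    print(all(y[n] != listed(n)(n) for n in range(k)))
```

Because y uses only the digits 1 and 2, it has a unique decimal representation, so the non-uniqueness of expansions ending in 000... or 999... causes no difficulty.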
$w_0 \quad w_1 \quad w_2 \quad \cdots$
by letting the ith row represent the characteristic function of $A_i$. Then $a_{ij} = \chi_{A_i}(w_j)$;
that is, $a_{ij} = 1$ if $w_j \in A_i$, otherwise $a_{ij} = 0$. Now define a language L by traversing
the diagonal elements of the array and including in L exactly those elements which
are not in their respective subsets:
$$L = \{w_i \mid a_{ii} = 0, i \in N\} = \{w_i \mid w_i \notin A_i, i \in N\}.$$
By construction, $L \ne A_i$ for any $i \in N$; that is, L does not appear in the enumeration.
But $L \in \mathcal{P}(\Sigma^*)$. Therefore, $\langle A_0, A_1, A_2, \ldots \rangle$ is not an enumeration of $\mathcal{P}(\Sigma^*)$.
Since the enumeration was an arbitrary enumeration of any nonempty subset of
$\mathcal{P}(\Sigma^*)$, it follows that no enumeration of the entire set $\mathcal{P}(\Sigma^*)$ exists. ∎
The sets [0, 1] and $\mathcal{P}(\Sigma^*)$ are examples of sets which are infinite but not
countably infinite. In the next section we will develop tools for showing that [0, 1] and
$\mathcal{P}(\Sigma^*)$ have the same cardinality. We choose [0, 1] to be the “standard set” for this
cardinality and make the following definition.
The choice of c is based on the fact that the set [0, 1] is often called a con-
tinuum.
Examples
(a) |[a, b]| = c where [a, b] is any closed interval in R with a < b. This is established
by noting that f(x) = (b − a)x + a is a bijection from [0, 1] to [a, b].
(b) |(0, 1)| = |[0, 1]|. These two sets differ only in their containment of the end
points of the interval; in order to construct a bijection from [0, 1] to (0, 1) we
must find an image for 0 and 1 in the interval (0, 1) while keeping the
map surjective. Define the set A to be {0, 1, 1/2, 1/3, ..., 1/n, ...}. Define the
map f as follows:
$f: [0, 1] \to (0, 1)$,
$f(0) = 1/2$,
$f(1/n) = 1/(n + 2)$ for $n \ge 1$,
$f(x) = x$ for $x \in [0, 1] - A$.
[Figure: a number line marking the points 0, ..., 1/5, 1/4, 1/3, 1/2, 1 of the set A.]
$$g(x) = \frac{1/2 - x}{x(1 - x)}$$
Since f of the preceding example is a bijection from [0, 1] to (0, 1), and g is
a bijection from (0, 1) to R, the composite function gf is a bijection from
[0, 1] to R. Hence, |R| = c. #
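The two maps can be explored with exact rational arithmetic. The Python sketch below assumes that f sends 0 to 1/2, sends 1/n to 1/(n + 2) for n ≥ 1, and fixes every other point of [0, 1], and that g(x) = (1/2 − x)/(x(1 − x)); it can only sample rational arguments, so it illustrates the maps rather than proving that they are bijections.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    """Assumed map from [0, 1] into (0, 1): shift the points 0, 1, 1/2, 1/3, ..."""
    if x == 0:
        return Fraction(1, 2)
    if x.numerator == 1:                 # x = 1/n in lowest terms, n >= 1
        return Fraction(1, x.denominator + 2)
    return x                             # every other point is fixed


def g(x: Fraction) -> Fraction:
    """Map from (0, 1) to R; unbounded near both end points."""
    return (Fraction(1, 2) - x) / (x * (1 - x))


if __name__ == "__main__":
    sample = sorted({Fraction(p, q) for q in range(1, 13) for p in range(q + 1)})
    images = [f(x) for x in sample]
    print(all(0 < y < 1 for y in images))        # images lie in (0, 1)
    print(len(set(images)) == len(sample))       # no two sample points collide
    print([str(g(f(x))) for x in sample[:5]])    # the composite gf
```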
Show that each of the following sets has cardinality c by constructing a bijection
from [0, 1] to the set.
(a) (a, b), where a < b and $a, b \in R$.
(b) $\{x \mid x \in R \land x > 0\}$.
(c) $\{\langle x, y \rangle \mid x, y \in R \land x^2 + y^2 = 1\}$.
Let |A| = c, |B| = c, $|D| = \aleph_0$, |E| = n > 0, where A, B, D, and E are disjoint.
Prove each of the following.
(a) $|A \cup B| = c$.
(b) $|A \cup D| = c$.
(c) $|D \times E| = \aleph_0$.
Try to find a set S such that $|\mathcal{P}(S)| = \aleph_0$. If you do not succeed, describe the
difficulties encountered.
(a) In Theorem 6.2.5, suppose we use a binary expansion for f(i) and define the
digits of y in the obvious way:
7. Joe Cool, a student at Silo Tech, has suggested the following proof that no bijection
exists from N to N. Assume f is a bijection from N to N, with $f(k) = i_k$.
For each $i_k$, construct a number in [0, 1] by reversing the digits of $i_k$ and putting
a decimal point to the left. For example, if $i_k = 123$, the number constructed becomes
.321000...
This defines a map g from N to [0, 1] which is injective, e.g., g(123) = .321000...
Apply the Cantor diagonal technique to the array
$g \circ f(0) = .x_{00} x_{01} \ldots$
$g \circ f(1) = .x_{10} x_{11} x_{12} \ldots$
to construct the number $y \in [0, 1]$. Now reverse the digits of y and put the
decimal point to the right. The result is a number which does not appear in the
list f(0), f(1), ..., which contradicts the assertion that f is surjective. Hence, no
bijection can exist from N to N.
Should we promote Joe to full professor or suggest he find a job as a COBOL
programmer (assuming the two are mutually exclusive)?
The preceding sections introduced the finite cardinal numbers, the cardinal number
$\aleph_0$ for a countable infinity, and the cardinal number c for some sets of an uncountable
infinity. In each case, the cardinality of a set A was established by constructing
a bijection from a standard set to A. This allows us to show that two sets have
the same cardinality, but so far, we have not defined an order relation which will
enable us to assert that one set is larger than another. In this section, we develop
the order relations $\le$ and $<$ on cardinal numbers and show that they have properties
similar to the usual order relations over the real numbers. The following
definition formalizes the concept of two sets having the same cardinality even when
a standard set has not been specified.
Definition 6.3.1: Let A and B be sets. Then, A and B are equipotent or have
the same cardinality, denoted by |A| = |B|, if there is a bijection from A to B.
Example
Let E be the set of positive even integers. Then, |I+| = |E| because the function
Sec. 6.3 COMPARISON OF CARDINAL NUMBERS 289
$f: I+ \to E$,
f(x) = 2x
is a bijection from I+ to E. #
It follows from the preceding theorem that to show a set S has cardinality $\alpha$,
it suffices to choose any set S′ which we know has cardinality $\alpha$ and establish the
existence of a bijection from S to S′ or from S′ to S. In general, we choose the set
S′ to make the proof as easy as possible.
We now consider order relations on sets of cardinal numbers. Our goal is to
be able to compare the sizes of sets. For example, our intuition tells us that sets with
cardinality c are “larger” than countable sets. Before we formally define the order
relation for arbitrary collections of sets, we make the following observations
concerning finite sets and their cardinal numbers.
Let A and B be finite sets with | A| = n, | B| = m.
(a) If there exists an injection from A to B, then $n \le m$.
(b) If there exists a bijection from A to B, then n = m.
(c) If there exists an injection from A to B, but no bijection exists, then
n < m.
These relationships between functions and cardinalities can be extended in a
natural way to apply to arbitrary sets.
We have chosen to use the notation $\le$ and $<$ because the order relations we
have just defined have the properties which we usually associate with these symbols.
However, the proofs that the properties hold are, in some cases, lengthy and
intricate. The following two theorems establish some of these properties, but their
proofs are too involved to be presented here. The first theorem, called the Law of
Trichotomy, asserts that any two sets can be compared using either the relation
< or =.
Theorem 6.3.2 (Zermelo): Let A and B be sets. Then exactly one of the
three following conditions holds:
(a) |A| < |B|,
(b) |B| < |A|, or
(c) |A| = |B|.
The second theorem asserts that the relation $\le$ is antisymmetric.
The preceding theorem often provides a powerful mechanism for showing that
two sets have the same cardinality. If we can construct an injection $f: A \to B$,
thus establishing that $|A| \le |B|$, and another injection $g: B \to A$ to establish that
$|B| \le |A|$, then we can conclude that |A| = |B|. Note that f and g need not be
surjective. Thus Theorem 6.3.3 allows us to conclude that a bijection exists from
A to B on the basis of injections from A to B and B to A. It is often easier to con-
struct two such injections than a single bijection.
Theorem 6.3.4: Let S be a set of cardinal numbers. The order relation $\le$ on
S is a linear order. The order relation $<$ on S is a quasi order.
Examples
(a) We show |(0, 1)| = |[0, 1]| by exhibiting an injection from each set to the
other as follows:
(i) $f: (0, 1) \to [0, 1]$,
f(x) = x.
(ii) $g: [0, 1] \to (0, 1)$,
g(x) = 1/4 + x/2.
(b) $|\mathcal{P}(N)| = c$.
(i) We show that $|\mathcal{P}(N)| \le c$ by constructing an injection as follows:
$g: \mathcal{P}(N) \to [0, 1]$.
For every subset $S \subseteq N$, g maps S to a real fraction,
$g(S) = .x_0 x_1 x_2 \ldots$,
where the fraction is expressed in binary representation and
$x_{2j} = 0$ for j = 0, 1, 2, ...,
$x_{2j+1} = 1$ for $j \in S$, and
$x_{2j+1} = 0$ for $j \notin S$;
e.g., $g(\emptyset) = 0$,
g(N) = .01010101...,
g({1, 3, 5}) = .00 01 00 01 00 01 ...
(Note that we cannot use (in place of g) the function g′ such that g′(S) is
the binary fraction $.x_0 x_1 x_2 \ldots$, where $x_j = 1$ if $j \in S$ and $x_j = 0$ if
e.g., $f(0) = \emptyset$,
f(1) = f(.1111...) = N,
f(.101010000...) = {0, 2, 4}.
Then f is an injection. (Note that f is not a surjection. For example, if
.1000... is chosen as the representation of 1/2 rather than .0111...,
then the set {0} will be in the image of f but the set $\{n \mid n \in N \land n > 0\}$
will not.)
It follows from the Cantor-Schröder-Bernstein theorem that $|\mathcal{P}(N)| = c$.
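The encoding of subsets as binary fractions can be made concrete for a finite prefix of digits. The Python sketch below assumes the digit rule $x_{2j} = 0$ and $x_{2j+1} = 1$ exactly when j belongs to S; the function names and the truncation parameter k are illustrative.

```python
from fractions import Fraction


def g_digits(s, k):
    """First 2k binary digits of g(S): a 0, then the membership bit of j, for j < k."""
    digits = []
    for j in range(k):
        digits.append(0)                     # x_{2j} is always 0
        digits.append(1 if j in s else 0)    # x_{2j+1} records whether j is in S
    return digits


def g_value(s, k):
    """Exact value of the binary fraction formed by the first 2k digits."""
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(g_digits(s, k)))


if __name__ == "__main__":
    print(g_digits({1, 3, 5}, 6))
    print(g_value({0}, 4), g_value({1, 2, 3}, 4))
```

Because every even-numbered digit is 0, no image ends in an unbroken run of 1s, so two different subsets can never be encoded as two binary representations of the same real number.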
The relationship between the finite cardinal numbers, $\aleph_0$, and c is established
by the following theorem.
there was no bijection from N to A, so $|A| \ne |N|$. It follows that |A| < |N|, i.e.,
$|A| < \aleph_0$.
We next observe that the map
$f: N \to [0, 1]$,
SQ) =
is an injection from N to [0, 1]; hence $|N| \le |[0, 1]|$. In Theorem 6.2.5, we showed
that $|N| \ne |[0, 1]|$. It follows that |N| < |[0, 1]|, i.e., $\aleph_0 < c$. ∎
Example
Define a number $x \in (0, 1)$ to be computable if and only if there is an ALGOL
(or PL/I, or FORTRAN, etc.) program P which, when given any nonnegative
integer i as an input, will halt after producing, as its only output, the ith digit of
the decimal expansion of x. The time required for the computation can be arbitrarily
large but must be finite. Thus, the number $x = .x_0 x_1 x_2 \ldots$ is computable in the
sense that the program P can be used to determine x to an arbitrary precision, or to
produce any digit of the expansion of x. A number $x \in (0, 1)$ is noncomputable
if it is not computable. The following procedure computes the digits of the repeating
decimal .514141414...
procedure COMP(i):
if i = 1 then return 5
else
if i mod 2 = 0 then return 1
else return 4
We now show that there exist noncomputable numbers in the open interval
(0, 1). The proof uses a cardinality argument and is nonconstructive. The following
sets will be used:
Σ, the ALGOL character set,
A, the set of all ALGOL programs,
C, the set of ALGOL programs which compute some number in (0, 1),
S, the numbers in (0, 1) which are computed by some ALGOL program.
Since Σ is a finite set, the set of nonempty strings over the alphabet Σ has cardinality
ℵ_0, i.e., |Σ+| = ℵ_0. Since any ALGOL program is a finite string over Σ,
|A| ≤ |Σ+|.
Since C is a proper subset of A, |C| ≤ |A|. Any program P can compute the digits
of at most one element of S, but different programs might compute the digits of the
same number. It follows that |S| ≤ |C|. Thus, we have
|S| ≤ |C| ≤ |A| ≤ ℵ_0.
But in Section 6.2, we showed that |(0, 1)| = c, and in Theorem 6.3.5 we showed
ℵ_0 < c. Hence |S| < |(0, 1)|, i.e., some of the numbers in (0, 1) are not computable. #
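The step |Σ+| = ℵ_0 rests on the fact that the nonempty strings over a finite alphabet can be listed one after another. The sketch below illustrates one such listing (shortest strings first, then alphabetical); the two-letter alphabet and the function names are illustrative assumptions, not part of the text.

```python
from itertools import count, islice, product


def nonempty_strings(alphabet):
    """Yield every nonempty string over a finite alphabet, shortest first."""
    symbols = sorted(alphabet)
    for length in count(1):
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


def index_of(target, alphabet):
    """Position of target in the listing, i.e. an injection from strings into N.

    Assumes target is a nonempty string over the alphabet; otherwise the
    search never terminates.
    """
    for n, s in enumerate(nonempty_strings(alphabet)):
        if s == target:
            return n


first_six = list(islice(nonempty_strings("ab"), 6))
```

Every string appears at exactly one finite position, so the listing is a bijection between N and the set of nonempty strings.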
We will show that g is not surjective and hence not bijective. The function g maps
each element of A to a subset of A; an element x may or may not be in the subset
g(x). The set S ⊆ A is defined as follows:
S = {x | x ∉ g(x)}.
Now S is a subset of A, but g(a) ≠ S for any a ∈ A. For if g(a) = S, then
a ∈ S ⇔ a ∈ {x | x ∉ g(x)}   by definition of S,
      ⇔ a ∉ g(a)             by application of the predicate which defines S,
      ⇔ a ∉ S                by the assumption that g(a) = S.
Since this is a contradiction, the assumption that g(a) = S is false. Since a was
arbitrary, it follows that g is not surjective; and hence, not bijective. Since g
was an arbitrary function, this establishes that no bijection exists and therefore
|A| ≠ |P(A)|. ∎
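For a small finite set the diagonal argument can be checked exhaustively. The sketch below, with helper names of our own choosing, builds S = {x | x ∉ g(x)} for every function g from A to P(A) and confirms that S is never a value of g. It illustrates the construction only; the proof above covers infinite sets as well.

```python
from itertools import combinations, product


def powerset(a):
    """All subsets of a, as frozensets."""
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(a, g):
    """S = {x in a | x not in g(x)}, where g maps each element to a subset."""
    return frozenset(x for x in a if x not in g[x])


def no_map_hits_diagonal(a):
    """Try every g: a -> P(a) and confirm S is never in the image of g."""
    items = sorted(a)
    subsets = powerset(a)
    for images in product(subsets, repeat=len(items)):
        g = dict(zip(items, images))
        if diagonal_set(a, g) in g.values():
            return False
    return True
```

For A = {0, 1, 2} there are 8^3 = 512 candidate functions, and `no_map_hits_diagonal({0, 1, 2})` examines all of them.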
Using the previous theorem, we can construct a countably infinite set of infinite
cardinal numbers, each of which is smaller than the one which follows:
|N| < |P(N)| < |P(P(N))| < ···
Problems: Section 6.3
Prove that if there exists a surjection from A to B, then |B| ≤ |A|.
Find the cardinality of each of the following sets. Prove your assertion.
Previous sections have described the cardinal numbers as well as the order relations
< and ≤. We can now define an arithmetic for cardinal numbers. The arithmetic
is a generalization of the familiar finite arithmetic and includes the operations
of addition, multiplication, and exponentiation.
We will present some of the fundamental properties of cardinal arithmetic
but will prove only a few of our assertions. In some cases, proofs are most naturally
given using ordinal numbers, which we have not developed but which include the
cardinal numbers as a proper subset. Consequently, although we quote a set of
theorems intended to illustrate the characteristics of the arithmetic, in many cases
the proofs are beyond the scope of this text and will be omitted.
Definition 6.4.1: Let a and b be cardinal numbers and let A and B be disjoint
sets such that |A| = a and |B| = b. The sum of a and b is defined to be
a + b = |A ∪ B|.
The following is easily proven using the preceding definition and the proper-
ties of set union.
The following theorem asserts that the order relations < and ≤ are preserved
by the operation of addition.
The following theorem illustrates one way in which arithmetic involving infi-
nite cardinal numbers differs from the familiar arithmetic.
We will not prove the theorem; however, the special cases of a = ℵ_0 and a = c
follow from our previous work.
Example
We show that c + ℵ_0 = c. Let A = {x | x ∈ R and x > 1}, and let
B = {1/(n + 2) | n ∈ N}. Then |A| = c, |B| = ℵ_0 and A ∩ B = ∅. Furthermore,
A ∪ B ⊆ R; hence, |A ∪ B| ≤ c. But |A| = c, so |A ∪ B| ≥ c. Hence |A ∪ B| =
c + ℵ_0 = c. #
Definition 6.4.2: Let a and b be cardinal numbers, and let A and B be sets
such that |A| = a and |B| = b. Then the product of a and b, denoted a·b or simply
ab, is defined as follows:
a·b = |A × B|.
The proof of the following theorem is left as an exercise.
We will not prove the general statement of the theorem, but the special cases
where a = c and b = ℵ_0 can be shown on the basis of our earlier work.
Sec. 6.4 CARDINAL ARITHMETIC 297
Example
We show that ℵ_0·c = c. Let A = N and B = (0, 1); then |A| = ℵ_0 and
|B| = c. We must show |A × B| = c. Define a function f from A × B to the
positive real numbers:
f: A × B → {x | x ∈ R+},
f(n, x) = n + x.
Then f is injective, and since |R+| = c, it follows that |A × B| ≤ c. Furthermore,
the map
g: (0, 1) → A × B,
g(x) = <0, x>,
is injective and establishes that c ≤ |A × B|. Hence |A × B| = c. #
Definition 6.4.3: Let a and b be cardinal numbers, and let A and B be sets
such that |A| = a and |B| = b. Then a to the power b, denoted a^b, is defined as
a^b = |A^B|.
BP) = <f la f lo
which is also an injection. (It is easy to show that β = α^{-1}.) Thus
| ABV? | < | A? 4 A” |,
Exponentiation preserves the order relations < and ≤ in the expected fashion.
Once again, the proof of part (b) is beyond our scope. We leave the proof of part (a)
as an exercise.
1. Determine the values of the following expressions. The letter n denotes an arbitrary
member of N.
(a) n + ℵ_0 (b) n + c (c) ℵ_0 + ℵ_0
(d) e+e (e) No (f) n-c
(g) No-No (h) cc (i) O¥
(j) Is (k) 2% @
(m) Nb (n) NF (o) ©?
(p) ¢3 (q) ¢ + (o-e + 3%)
Find the cardinality of each of the following sets.
(a) R ∪ R^2
(b) S × Σ* where |S| = n for n ∈ N.
(c) The set of all m × n matrices with components in R.
(d) The set of all n component vectors with integer components.
(e) The set of all functions from Σ* to N.
(f) The set of all functions from I × I to I.
(g) The set of m × n matrices with rational components.
Prove Theorem 6.4.1.
We have not defined an operation of subtraction for cardinal numbers. Show that
the following definition is unsatisfactory because the operation is not well defined.
“Definition”: Let A and B be sets such that |A| = a, |B| = b, and B ⊆ A. Then
a − b = |A − B|.
Let a, b, and d be cardinal numbers.
(a) Prove that if a ≤ b, then a + d ≤ b + d.
(b) Show by counterexample that a < b does not imply that a + d < b + d.
(c) Prove that if a ≤ b, then ad ≤ bd.
(d) Show that a < b does not imply ad < bd.
Prove Theorem 6.4.4.
Halmos [1960] develops the ordinal and cardinal numbers, along with their
arithmetics. More extensive treatments of these topics are given in the books by
Stoll [1963] and Suppes [1960]. Cohen [1966] discusses the role of the continuum
hypothesis in set theory. Vilenkin [1968] presents many of the concepts of this
chapter in an informal and entertaining way.
7
ALGEBRAS
7.0 INTRODUCTION
In Chapter 0, mathematical models were described as consisting of three compo-
nents: a phenomenon or process of the real world which we wish to investigate,
a mathematical structure, and a description of the way in which the mathematical
structure represents the real world process. To be useful, a mathematical model
must have a structure whose operations and relations reflect the real world in a
satisfactory way. Choosing a mathematical structure therefore requires understand-
ing how properties can be characterized mathematically and how some properties
imply others. A familiarity with the concepts of mathematical structures will
facilitate the understanding of abstract characterizations of new models and pro-
vide a basis for the construction of new models.
The mathematical structure of a model is often presented implicitly; in this
case there is no precise specification of the mathematical structure being used.
This usually causes no difficulty, because in most cases the structure is a familiar
one and an obvious choice. In this chapter, however, it will be useful to specify in
detail each mathematical structure we consider. In addition, we will develop a
few basic properties of some of these structures, emphasizing those properties which
are useful for the models which interest us.
The mathematical structures we will investigate are algebras, sometimes called
algebraic systems or algebraic structures, and their study is often referred to as
“modern algebra.” These structures have been used in computer science for such
purposes as to describe the functions computable by classes of machines, to inves-
tigate the complexity of arithmetic computations, to characterize abstract data
structures, and as a basis for programming language semantics. Unfortunately, the
formalisms used in various applications are often quite different from one another,
although the fundamental concepts and techniques are the same. We will develop
only some of the most basic topics of this area, but at the end of the chapter we
will describe ways in which they can be augmented to treat various applications.
Sec. 7.1 THE STRUCTURE OF ALGEBRAS = 301
Examples
(a) The integers with the binary operation of addition and the constant 0 can be
described as an algebra in the following way.
1. The carrier is the set I = {..., −3, −2, −1, 0, 1, 2, ...}.
2. There is a single operation, addition (denoted “+”), from I^2 to I.
3. The element 0 is a constant.
Alternatively, this algebra can be presented as the triple <I, +, 0>.
(b) The real numbers R with addition, multiplication and unary minus can be
described as an algebra as follows:
1. The carrier is R, the set of real numbers.
2. There are two operations (“+” and “·”) from R^2 to R and one (“−”)
from R to R.
3. The elements 0 and 1 are constants.
This algebra can be denoted by <R, +, ·, −, 0, 1>. #
The two examples above are of specific and familiar structures. To specify them
precisely, we would present them as n-tuples by stating, for example, “Let
†Note that the carrier of an algebra may be empty and the operations and constants may not
all be distinct. Our examples, however, will have nonempty carriers, and the operations and con-
stants will generally be distinct.
302. ALGEBRAS Ch. 7
Examples
(a) The algebras <N, ·, 0> and <I, −, 0> have the same signature, since each has
a single binary operation and a single constant.
(b) The structures <R, +, ·, 1, 0> and <P(S), ∪, ∩, S, ∅> have the same signature.
(c) The algebras <I, +, 0> and <I, +> do not have the same signature because
the number of constants is not the same. #
Two algebras can have the same signature but not be related in any substantive
way. In order to prove useful theorems about classes of algebras, we generally
need to consider properties in addition to those implied by signature. We will
treat only properties specified by axioms, where each axiom is an equation written
in terms of the elements of the carrier and the operations of the algebra. A set of
axioms, together with a signature, specifies a class of algebras called a variety;
algebras which have the same signature and which obey the same set of axioms are
said to be of the same variety. Investigations of algebras are generally concerned
with particular varieties; the theorems that are proved are based on the axioms of
the variety, and the results hold for all algebras in the given variety.
Examples
(a) Consider the variety of algebras with the same signature as <I, +, 0> and the
following axioms:
(i) x + y = y + x,
(ii) (x + y) + z = x + (y + z),
(iii) x + 0 = x.
Then <R, +, 0>, <P(S), ∪, ∅>, <P(S), ∩, S>, and <I, ·, 1> are all members of this
variety, as is <Σ*, concatenation, Λ> when Σ contains a single symbol (concatenation
is not commutative over a larger alphabet), and theorems proved about this variety
will hold for these specific algebras.
(b) Consider the variety of algebras with the same signature as <R, +, ·, −, 0, 1>
(where “−” is a unary operation) and the following axioms:
(i) x + y = y + x,
(ii) x·y = y·x,
(iii) (x + y) + z = x + (y + z),
(iv) (x·y)·z = x·(y·z),
(v) x·(y + z) = (x·y) + (x·z),
(vi) x + (−x) = 0,
(vii) x + 0 = x,
(viii) x·1 = x.
Then <I, +, ·, −, 0, 1> and <Q, +, ·, −, 0, 1> are algebras of the same
variety, but <P(S), ∪, ∩, ~, ∅, S>, where ~ denotes set complementation, is
not because axiom (vi) does not hold for this algebra.
(c) Consider the variety of algebras with the signature <S, o, c> (where o is a
binary operation and c is a constant) and the following axioms:
a o c = a,
c o a = a.
Any theorems we prove for this variety will hold for the algebras <I, +, 0>,
<R, ·, 1> and <Σ*, concatenation, Λ>. Not all these theorems will hold for the
algebra <I, −, 0> (where “−” denotes subtraction), because 0 − 1 = −1 ≠ 1, thus
violating the second axiom. #
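Axioms of this kind can be spot-checked mechanically on a finite sample of the carrier. The sketch below is only a sanity check under that assumption: a failed check refutes an axiom, but a passed check on a sample does not prove it for an infinite carrier. The helper names and the sample sets are ours.

```python
from itertools import product


def satisfies_identity_axioms(elements, op, c):
    """Check a o c = a and c o a = a for every sampled element a."""
    return all(op(a, c) == a and op(c, a) == a for a in elements)


def is_commutative(elements, op):
    """Check x o y = y o x on every sampled pair."""
    return all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))


ints = range(-3, 4)            # a small sample of I
words = ["", "a", "b", "ab"]   # a small sample of strings, "" playing Λ


def add(x, y):
    return x + y               # integer addition, or concatenation on strings


def sub(x, y):
    return x - y
```

On these samples `satisfies_identity_axioms(ints, sub, 0)` fails because 0 − 1 = −1, while concatenation passes the identity axioms but fails `is_commutative`.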
For the remainder of this chapter, rather than deal with algebras with arbitrary
signatures, we will usually treat an arbitrary algebra such as A = <S, o, Δ, k>,
where o is a binary operation, Δ is a unary operation, and k denotes a constant.
This will simplify the presentation by eliminating the need to treat arbitrary num-
bers of operations and constants and arbitrary arities of operations, but the defini-
tions and concepts can be extended to include algebras with other signatures as well.
Before we introduce the concept of a subalgebra, we must first define the notion
of a set of elements being closed under an operation.
Examples
(a) Consider the set of natural numbers N, and let T' = {x | 0 ≤ x ≤ 10}. The
set T' is not closed with respect to the operation +, since 7 + 7 = 14 and
14 ∉ T'. However, T' is closed with respect to the operation max, where the
operation is defined as max(x, y) = x if x ≥ y, otherwise max(x, y) = y.
(b) Since each operation of an algebra with carrier S is defined as a function
from S^n to S, it follows that the carrier of an algebra is closed under all its
operations. #
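For a finite set, closure under a binary operation can be tested directly by trying every pair. A minimal sketch, with `is_closed` as a name of our own choosing:

```python
from itertools import product


def is_closed(subset, op):
    """True if op(x, y) stays inside subset for all x, y in subset."""
    members = set(subset)
    return all(op(x, y) in members for x, y in product(members, repeat=2))


def add(x, y):
    return x + y


T = range(0, 11)  # T' = {x | 0 <= x <= 10}
```

Here `is_closed(T, add)` is false (7 + 7 = 14 is a witness), while `is_closed(T, max)` is true.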
Definition 7.1.2: Let A = <S, o, Δ, k> and A' = <S', o', Δ', k'> be algebras.
Then A' is a subalgebra of A if
(i) S' ⊆ S;
(ii) a o' b = a o b for all a, b ∈ S';
(iii) Δ'a = Δa for all a ∈ S';
(iv) k' = k.
Examples
(a) Let E denote the set of even integers. Then <E, +, 0> is a subalgebra of
<I, +, 0>.
(b) Let · denote multiplication. Then <[0, 1], ·> is a subalgebra of <R, ·>.
(c) If M denotes the set of odd integers, then <M, ·, 1> is a subalgebra of <I, ·, 1>.
But <M, +> is not a subalgebra of <I, +> because the odd integers are not
closed under addition; e.g., 1 + 1 = 2. #
When no confusion can result, the operation may not be specified, and we will
speak of an identity, or an identity element, and a zero, or a zero element.
Examples
(a) The algebra <I, ·, 1, 0>, where · denotes multiplication, has an identity 1 and
a zero 0.
(b) The algebra <I, +> has an identity 0 but no zero element.
(c) The algebra <N, max> has an identity 0 but no zero element.
(d) The algebra <N, min> has a zero element 0 but no identity element.
(e) Let T be the set of integers between m and n, where m ≤ n and both m and n
are included in T. Then <T, max> is an algebra with an identity m and a zero n.
(f) Consider the algebra <R, +, ·>. The element 0 is an identity for +, but there
are no zeroes for this operation. The element 1 is an identity and 0 is a zero for
the operation ·. #
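On a finite carrier, identities and zeroes can be found by brute force. The following sketch uses helper names of our own and takes m = 3, n = 7 purely as sample values for the interval T.

```python
def identities(elements, op):
    """Elements e with e o x = x o e = x for all x in the carrier."""
    xs = list(elements)
    return [e for e in xs if all(op(e, x) == x and op(x, e) == x for x in xs)]


def zeroes(elements, op):
    """Elements z with z o x = x o z = z for all x in the carrier."""
    xs = list(elements)
    return [z for z in xs if all(op(z, x) == z and op(x, z) == z for x in xs)]


T = range(3, 8)  # the integers between m = 3 and n = 7, inclusive
```

For max the identity is the smallest element and the zero is the largest; for min the roles are exchanged.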
Identities and zeros are sometimes called two-sided identities and two-sided
zeroes since they have the same effect when used on either the right or left. In
contrast, the following definitions characterize one-sided identities and one-sided
zeroes.
Example
Let A = <S, o> where S = {a, b, c} and o is a binary operation defined by the
following operation table. (The entry in the row labeled x and the column labeled
y is the value of x o y.)
o | a b c
--+------
a | a b b
b | a b c
c | a b a
Then both a and b are right zeroes but neither is a left zero. The operation o is
neither associative nor commutative. #
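The claims about this table can be confirmed by enumeration. The sketch below stores the table as a nested dictionary (our own representation) and tests each property over all elements.

```python
from itertools import product

S = ["a", "b", "c"]
# TABLE[x][y] is x o y, copied row by row from the operation table.
TABLE = {
    "a": {"a": "a", "b": "b", "c": "b"},
    "b": {"a": "a", "b": "b", "c": "c"},
    "c": {"a": "a", "b": "b", "c": "a"},
}


def op(x, y):
    return TABLE[x][y]


right_zeroes = [z for z in S if all(op(x, z) == z for x in S)]
left_zeroes = [z for z in S if all(op(z, x) == z for x in S)]
commutative = all(op(x, y) == op(y, x) for x, y in product(S, repeat=2))
associative = all(op(op(x, y), z) == op(x, op(y, z))
                  for x, y, z in product(S, repeat=3))
```

One witness against associativity is (a o c) o c = b o c = c, whereas a o (c o c) = a o a = a.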
The following theorems establish the most useful properties of identities and
zeroes.
Theorem 7.1.2: Let o be a binary operation on S with left zero 0_l and right
zero 0_r. Then 0_l = 0_r, and this element is a two-sided zero.
The proof is similar to that of Theorem 7.1.1. The above theorems have the
following immediate consequence:
Examples
(a) The algebra <I, +> has an identity 0 and every element x ∈ I has an inverse
with respect to the operation +; the inverse of x is denoted −x:
x + (−x) = 0.
(b) The algebra <N, +> has an identity 0 which is the only element that has an
inverse.
(c) In the algebra <I, ·>, only 1 and −1 have inverses, but in <R, ·> all elements
except the zero element 0 have an inverse.
(d) Let T be the set of integers between m and n, where m ≤ n and m and n are
included in T. Then <T, max> has an identity m, but only m has an inverse.
(e) Consider the set F of all functions on a set A under the operation of function
composition. Then 1_A is an identity. By Theorem 4.2.8, every surjection has a
right inverse, every injection has a left inverse, and every bijection has a
two-sided inverse. Note that one-sided inverses may not be unique.
(f) Let N_k be the first k natural numbers, where k > 0:
N_k = {0, 1, 2, ..., k − 1}.
Define +_k to be addition mod k; for every x, y ∈ N_k,
x +_k y = x + y       if x + y < k,
        = x + y − k   if x + y ≥ k.
Then +_k is an associative binary operation with an identity 0. Every element
of N_k has an inverse; the inverse of 0 is 0 and the inverse of every nonzero
element x is k − x.
(g) Let N_k be the first k natural numbers, where k ≥ 2, and define multiplication
mod k as follows:
x ·_k y = z, where z ∈ N_k and xy − z = nk for some n ∈ N.
Then 1 is an identity for the operation. An element x ∈ N_k has an inverse in
N_k only if x and k have no nontrivial divisors in common, i.e., only if x and
k are relatively prime. #
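Both kinds of inverse in N_k can be tabulated for a given k. The sketch below uses function names of our own; `units` lists the elements relatively prime to k so that the two descriptions in example (g) can be compared.

```python
from math import gcd


def additive_inverses(k):
    """Map each x in N_k to its inverse under addition mod k."""
    return {x: (k - x) % k for x in range(k)}


def multiplicative_inverses(k):
    """Map each invertible x in N_k to its inverse under multiplication mod k."""
    return {x: y for x in range(k) for y in range(k) if (x * y) % k == 1}


def units(k):
    """Elements of N_k that are relatively prime to k."""
    return [x for x in range(k) if gcd(x, k) == 1]
```

For k = 12, for instance, the elements with multiplicative inverses are exactly those returned by `units(12)`.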
Theorem 7.1.3: If an element has both a left and a right inverse with respect
to an associative operation, then the left and right inverse elements are equal.
Proof: Let 1 be an identity for the operation o, and let x be an element with
a left inverse w and a right inverse y. Then
w o x = x o y = 1.
By associativity of the operation o, it follows that
w = w o 1 = w o (x o y) = (w o x) o y = 1 o y = y. ∎
sum (+) | product (·) | difference (−) | abs difference (|x − y|) | max | min | unary minus (−) | absval (|x|)
(a) I
(b) N
(c) {x[/0O<x<10}
(dq) {x|-Sxx<3}
fe) {x|-10<x*<0}
(f) {2x|xe
3. Let the universe be the real numbers R. Fill in the following table with Y (yes) or N
(no) according to whether the binary operations listed in the top row have the prop-
erties listed in the leftmost column.
(a) associative
(b) commutative
(c) identity exists
(d) zero exists
6. Consider the algebras <{a, b, c, d}, o> and <{a, b, c}, o>, where in each case o is defined
by one of the following operation tables:
(a) (b)
Find examples of algebras with a single binary operation which have the properties
listed below. In each case, choose your algebra to have a nonempty carrier as small
as possible if such an algebra exists. Your answer can be given as an operation table.
(a) An identity element exists.
(b) A zero element exists.
(c) An identity element and a zero element exist.
(d) The carrier has more than one element. Both an identity element and a zero
element exist.
(e) An identity exists but not a zero.
(f) A zero exists but not an identity.
(g) The operation is not commutative.
(h) The operation is not associative.
(i) A left zero exists which is not a right zero.
(j) A right identity exists which is not a left identity.
(k) An identity exists and every element has an inverse.
(1) The carrier has more than one element. An identity exists, and every element
has a left inverse, but no element other than the identity has a right inverse.
Describe a variety with the signature <S, o>, where o is a binary operation, such
that for every algebra <T, ·> of this variety, if V ⊆ T, then <V, ·> is a subalgebra of
<T, ·>.
Programming Problem
Many algebraic varieties are useful in various areas of computer science. We will
consider only four of the most important varieties: semigroups, monoids, groups,
and Boolean algebras. Semigroups and monoids find application in formal lan-
guages and automata theory, groups are used in automata and coding theory,
and Boolean algebras for many aspects of information processing as well as in
switching theory. The utility of the structures is not limited to these areas, however;
all of them are used in many other areas of investigation. In this section we will
develop some of the properties of these varieties.
Semigroups
The following deceptively simple structure has been extensively studied, and
a rich theory has emerged.
The preceding definition establishes that the variety of semigroups consists of all
algebras with a single binary operation which satisfies the axiom of associativity:
a o (b o c) = (a o b) o c.
From Definition 7.1.2, it follows that if <S, o> is a semigroup and T is a subset of
S such that T is closed with respect to o, then <T, o> is a subalgebra of <S, o>;
we call <T, o> a subsemigroup of <S, o>. The use of the term “subsemigroup” to
denote a subalgebra of a semigroup is justified by the following theorem.
Examples
(a) Let k ≥ 0 and S_k be the set of integers greater than or equal to k; S_k =
{x | x ∈ I ∧ x ≥ k}. Then <S_k, +> is a semigroup, where + denotes ordinary
addition, since the operation is associative and S_k is closed with respect to +.
Note that if k < 0, the set S_k is not closed under the operation of addition
and <S_k, +> is not an algebra.
(b) The algebras <I, −> and <R+, /> are not semigroups because the operations
of subtraction and division are not associative.
(c) If · denotes the operation of multiplication, the algebras <[0, 1], ·>, <(0, 1), ·>,
and <N, ·> are all semigroups. Moreover, they are all subsemigroups of
<R, ·>.
(d) Let Σ denote a finite nonempty alphabet. Then <Σ*, concatenation> and
<Σ+, concatenation> are semigroups.
(e) Let S = {a, b} and define the operation o so that both a and b are right zeroes:
a o a = b o a = a,
a o b = b o b = b.
The operation o on S is associative, since for any x, y, z ∈ S,
x o (y o z) = x o z = z = y o z = (x o y) o z.
The algebra <S, o> is a semigroup, called the right zero semigroup of two
elements.
(f) The algebras <S, max> and <S, min> are semigroups for any set S of real
numbers.
(g) Let R be a binary relation on a set S. Then <{R^n | n ∈ N}, composition> is
a semigroup. #
Monoids
Examples
(a) The algebra <R, +, 0> is a monoid because + is associative and 0 is an identity
element for +. Both <I, +, 0> and <N, +, 0> are submonoids of <R, +, 0>.
(b) The algebras <I, ·, 1>, <N, ·, 1>, <I+, ·, 1> and <R, ·, 1> are all monoids.
(c) The algebras <I, ·, 0> and <I, +, 1> are not monoids because in each case the
constant is not an identity for the specified operation.
(d) If Σ is a finite nonempty alphabet, then <Σ*, concatenation, Λ> is a monoid.
If X ⊆ Σ* then <X*, concatenation, Λ> is a submonoid of
<Σ*, concatenation, Λ>.
Sec. 7.2 SOME VARIETIES OF ALGEBRAS = 311
(e) Let S be any subset of the real numbers which contains a lower bound, i.e.,
there is some m ∈ S such that m ≤ x for all x ∈ S. Then <S, max, m> is a
monoid. Similarly, if S contains an upper bound n, then x ∈ S ⇒ x ≤ n and
<S, min, n> is a monoid.
(f) The systems <N_k, +_k, 0> and <N_k, ·_k, 1> are monoids, where
N_k = {0, 1, 2, ..., k − 1}
and the operations +_k and ·_k are addition and multiplication mod k. #
If <S, o, 1> is a monoid, then <S, o> is a semigroup; this is sometimes expressed
by the assertion that “every monoid is a semigroup.” On the other hand, some
semigroups, such as <N, +>, have an identity, and some, such as <I+, +>, do
not. A semigroup <S, o> can always be converted into a monoid by “adjoining”
(i.e., adding) a new element whose behavior is defined to be that of an identity for
the operation o. Suppose 1 is an element not in S. (If necessary we can relabel the
elements of S so that 1 ∉ S.) We can extend the operation o to S ∪ {1} so that
for all x ∈ S ∪ {1}, x o 1 = 1 o x = x. Then <S ∪ {1}, o, 1> is a monoid. This
process is called “adjoining an identity” to the semigroup <S, o>. Note that even
if c was an identity of <S, o>, it will not be one for the monoid <S ∪ {1}, o, 1>,
since
c o 1 = 1 o c = c ≠ 1.
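Adjoining an identity is easy to express as a wrapper around an existing operation. In the sketch below the fresh element is a Python sentinel object named `ONE` and the wrapper is called `adjoin_identity`; both names are our own, chosen only for illustration.

```python
ONE = object()  # a fresh element, guaranteed not to belong to S


def adjoin_identity(op):
    """Extend a binary operation on S to S with ONE added, ONE acting as identity."""
    def extended(x, y):
        if x is ONE:
            return y
        if y is ONE:
            return x
        return op(x, y)
    return extended


# Ordinary addition extended by the adjoined identity.
plus = adjoin_identity(lambda x, y: x + y)
```

Note that `plus(0, ONE)` is 0 rather than `ONE`, so the old identity 0 of <N, +> is no longer an identity in the extended structure, as observed above.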
Groups
Definition 7.2.3: A group is an algebra with signature <S, o, ¯, 1> such that
o is an associative binary operation on S, the constant 1 is a two-sided identity for
the operation o, and ¯ is a unary operation defined over the carrier such that for
all x ∈ S, x̄ is an inverse for x with respect to o.
Examples
(a) The algebra <I, +, −, 0> is a group, where + denotes addition and − denotes
unary minus. If K denotes the set of all multiples of a given k ∈ N, then
<K, +, −, 0> is a subgroup of <I, +, −, 0>.
(b) The algebra <Q+, ·, ^{-1}, 1> is a group, where · denotes multiplication, and ^{-1}
denotes the unary operation of taking the reciprocal of a rational number.
(c) Let A be any set and let P denote the set of permutations on A. Then P is the
set of bijective functions from A to A. The structure <P, o, ^{-1}, 1_A> is a group,
where o denotes composition of functions, and f^{-1} is the inverse function of f.
(d) The operations max and min cannot generally be used as the binary operation
of a group because an inverse operation cannot be defined if the carrier has
more than one element.
(e) The algebras <N_k, +_k, ¯, 0> are groups, if we define x̄ = k − x for x ≠ 0
and 0̄ = 0.
(f) The algebras <N_k, ·_k, ¯, 1> are not groups because the element 0 ∈ N_k has
no inverse. #
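The permutation group of example (c) can be built explicitly for a small set. The sketch below takes A = {0, 1, 2} as an illustrative choice and represents a permutation as a tuple whose i-th entry is the image of i; the representation and the names are assumptions of ours.

```python
from itertools import permutations

A = (0, 1, 2)
# A permutation p is stored as a tuple with p[i] the image of i.
P = list(permutations(A))
IDENTITY = A  # the identity function 1_A maps each i to itself


def compose(f, g):
    """Composition f o g, defined by (f o g)(i) = f(g(i))."""
    return tuple(f[g[i]] for i in A)


def inverse(f):
    """The permutation that undoes f."""
    inv = [0] * len(A)
    for i in A:
        inv[f[i]] = i
    return tuple(inv)
```

With these definitions the three group requirements (associativity, identity, inverses) can each be confirmed by looping over the six permutations.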
Boolean Algebras
(where + and · are binary operations and ¯ is a unary operation called
complementation) and the following axioms hold. (We write ab for a·b.)
(i) a + b = b + a
(ii) ab = ba                       commutative laws
(iii) (a + b) + c = a + (b + c)
(iv) (ab)c = a(bc)                 associative laws
(v) a(b + c) = ab + ac
(vi) a + (bc) = (a + b)(a + c)     distributive laws
(vii) a + 0 = a                    0 is an identity for +
(viii) a1 = a                      1 is an identity for ·
(ix) a + ā = 1
(x) aā = 0                         properties of the complement
Less formally, we can say that a Boolean algebra has two commutative, associative
binary operations + and · which distribute over each other, together with a
single unary operation ¯. The constants 0 and 1 are identities for + and · respectively,
and for every element a, a + ā = 1 and a·ā = 0.
If <S, +, ·, ¯, 0, 1> is a Boolean algebra and T is a subset of S which is closed
under the operations +, ·, and ¯, and 0, 1 ∈ T, then <T, +, ·, ¯, 0, 1> is a
subalgebra of <S, +, ·, ¯, 0, 1> called a Boolean subalgebra. A Boolean subalgebra is
a Boolean algebra.
Examples
(a) It can be shown that if the carrier of a Boolean algebra is finite and has more
than one element, then the cardinality of the carrier is an even integer. The
following operation tables describe the operations of a Boolean algebra with
carrier {0, 1}.
+ | 0 1      · | 0 1      x | x̄
0 | 0 1      0 | 0 0      0 | 1
1 | 1 1      1 | 0 1      1 | 0
Note that these operations are similar to the operations ∨, ∧ and ¬ defined
for truth values in Chapter 1.
(b) Let A be any set and let ~ denote the operation of set complementation relative
to A. Then <P(A), ∪, ∩, ~, ∅, A> is a Boolean algebra. This is an example
of a Boolean set algebra. The carrier of a Boolean set algebra need not be
a power set; it can be any collection of sets which is closed under union,
intersection and complement relative to some universal set.
(c) Let S be the set of positive divisors of 30; S = {1, 2, 3, 5, 6, 10, 15, 30}. Let
x_1 + x_2 denote the least common multiple of x_1 and x_2; let · denote the greatest
common divisor and x̄ denote the number 30/x. Then <S, +, ·, ¯, 1, 30>
is a Boolean algebra. #
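Example (c) can be verified by machine. The sketch below checks the identity, complement, and distributive axioms over all of S, taking the complement axioms in their standard form (a + ā equals the constant playing the role of 1, and a·ā equals the constant playing the role of 0). The function names are ours.

```python
from math import gcd

S = [1, 2, 3, 5, 6, 10, 15, 30]
ZERO, ONE = 1, 30  # the constants playing the roles of 0 and 1


def plus(x, y):
    """x + y is the least common multiple of x and y."""
    return x * y // gcd(x, y)


def times(x, y):
    """x . y is the greatest common divisor of x and y."""
    return gcd(x, y)


def comp(x):
    """The complement of x is 30/x."""
    return 30 // x


def axioms_hold():
    """Check the identity, complement, and distributive axioms on all of S."""
    for a in S:
        if plus(a, ZERO) != a or times(a, ONE) != a:
            return False
        if plus(a, comp(a)) != ONE or times(a, comp(a)) != ZERO:
            return False
        for b in S:
            for c in S:
                if times(a, plus(b, c)) != plus(times(a, b), times(a, c)):
                    return False
                if plus(a, times(b, c)) != times(plus(a, b), plus(a, c)):
                    return False
    return True
```

The complement axioms hold here because 30 has no repeated prime factor, so x and 30/x never share a divisor other than 1.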
1. Construct a semigroup using the operation max which has a zero but no identity.
2. Let S_k = {x | x ∈ I ∧ x ≥ k} where k ≥ 0. Show that <S_k, +> is a subsemigroup
of <I, +>.
3. Construct a monoid using the operation max which has no zero and an infinite
carrier.
4. Let E denote the even natural numbers; E = {0, 2, 4, ...}. Show that <E, +, 0> is
a submonoid of <N, +, 0>.
5. Show that every subalgebra of a monoid is a monoid.
6. Construct a group using max as the binary operation.
7. Let E denote the even integers; E = {0, −2, 2, −4, 4, ...}. Show that <E, +, −, 0>
is a subgroup of <I, +, −, 0>, where the symbol − denotes unary minus.
8. Construct tables for the operations of addition and inverse for the group
<N_k, +_k, ¯, 0>
where k = 5.
9. Show that if o is a binary operation on T and 0 is a zero element with respect to the
binary operation o, then T cannot be made the carrier of a group unless T = {0}.
10. Prove that if <S, o, ¯, 1> is a group, then for every a ∈ S,
(a) if x ≠ y, then a o x ≠ a o y. Similarly, if x ≠ y, then x o a ≠ y o a.
(b) a o S = S = S o a.
(c) (ā)¯ = a (the inverse of the inverse of a is a).
11. Show that if <S, o, ¯, 1> is a group and T is a nonempty subset of S such that
∀x ∀y [x, y ∈ T ⇒ x o ȳ ∈ T],
then <T, o, ¯, 1> is a subgroup of <S, o, ¯, 1>.
12. For each of the following digraphs, let R be the binary relation represented by the
digraph, and let S = {R^n | n ∈ I+} be the carrier of an algebra in which composition
of relations is the binary operation. In each case, determine whether the algebra can
be presented as a semigroup, monoid, or group, and state the cardinality of the
carrier.
[Digraphs (a) through (e)]
Sec. 7.3 HOMOMORPHISMS 315
13. (a) State necessary and sufficient conditions on a binary relation R so that the set
{R^n | n ∈ N} can be made the carrier of a monoid with the operation of
composition.
(b) State necessary and sufficient conditions on a binary relation R so that the set
{R^n | n ∈ I+} can be made the carrier of a monoid using the operation of
composition.
(c) State necessary and sufficient conditions on a binary relation R on a finite set
so that the set {R^n | n ∈ I+} can be made the carrier of a group with the binary
operation of composition.
14. Let S be a set. Show that <{S, ∅}, ∪, ∩, ~, ∅, S> is a Boolean subalgebra of
<P(S), ∪, ∩, ~, ∅, S>.
15. Consider the following questions to determine when a Boolean algebra can be
constructed from the set of integers between and including m and n where m ≤ n,
using the operations of max (for +) and min (for ·).
(a) Do the operations max and min satisfy Axioms 1-4 of Definition 7.2.4?
(b) Do the operations max and min satisfy Axioms 5 and 6?
(c) What would be the constants if Axioms 7 and 8 are to be satisfied?
(d) Can an inverse operation be defined which satisfies Axioms 9 and 10? (Hint:
Your answer should be expressed as a function of the size of the carrier, i.e.,
of n − m + 1.)
16. Consider a computer which uses words of k bits to represent nonnegative integers
in binary notation. The only operation is addition. When overflow occurs, the high
order bits are lost.
(a) What algebraic variety would be most appropriate to model addition in the
machine? How big is the carrier?
(b) Suppose overflow causes the result to be set to the largest representable number.
What algebraic variety would best model addition in this case?
7.3 HOMOMORPHISMS
We wish to find ways of characterizing the structural similarities of two algebras
A and A'. Clearly one possibility is for A' to “look just like” A, that is, for A' to
be simply a relabeled version of A. Then A and A' must have the same signature,
the carriers of A and A' must have the same cardinality, and the operations and
constants of the two algebras must have the same properties. If two algebras are
similar in the sense we have described, then the similarity can be established by
exhibiting a bijection from the carrier of A to that of A' such that the function
describes how A' can be viewed simply as a relabelling of A. The concept is made
precise in the following definition.
Definition 7.3.1: The algebras A = <S, o, Δ, k> and A' = <S', o', Δ', k'> are
isomorphic if there exists a bijection h such that
(i) h: S → S';
(ii) h(a o b) = h(a) o' h(b);
(iii) h(Δ(a)) = Δ'(h(a));
(iv) h(k) = k'.
The map h is called an isomorphism from A to A', and A' is said to be an isomorphic
image of A under the map h.
Examples
(a) Let E denote the set of even integers; E = {..., −4, −2, 0, 2, 4, ...}. Then
the algebras <I, +, 0> and <E, +, 0> are isomorphic. This is established by
showing the map
f: I → E,
f(x) = 2x,
is an isomorphism, that is, by showing the conditions of Definition 7.3.1
are satisfied:
1. The function f is clearly bijective.
2. For any integers x and y, f(x + y) = 2(x + y)
= 2x + 2y
= f(x) + f(y).
3. f(0) = 2·0 = 0.
(b) Let R+ denote the set of positive real numbers. Then <R+, ·, 1> is isomorphic
to <R, +, 0> and the map
h: R+ → R,
h(x) = log x,
is an isomorphism. To show this, we first establish that h is a bijection from
R+ to R. The function h is surjective because for any y ∈ R, the equation log x = y
always has a solution x = 2^y with x > 0. Because the log function is monotone
increasing, h is injective. Hence, h is bijective and condition (i) of Definition 7.3.1 is
satisfied. Furthermore,
h(x·y) = log(x·y) = log x + log y = h(x) + h(y),
and h(1) = log 1 = 0.
(c) The semigroups <N, +> and <I+, ·> are not isomorphic. We establish this
using a proof by contradiction. Suppose h is an isomorphism from <N, +> to
<I+, ·>. There are infinitely many prime numbers in I+. Since h is a surjection
from N to I+, there must be some x ∈ N where x > 2 and some prime
number p, where p > 3, such that h(x) = p. If h is an isomorphism from
<N, +> to <I+, ·>, then
(i) p = h(x) = h(x + 0) = h(x)·h(0), and
(ii) p = h(x) = h((x − 1) + 1) = h(x − 1)·h(1).
But since p is a prime number, the only factors of p are p and 1. Therefore,
by (i), either h(x) = 1 or h(0) = 1, and by (ii), either h(1) = 1 or h(x − 1) = 1.
Since 0 < 1 < x − 1 < x, it follows that 1 is the image of at least two elements
under the function h. We conclude that h is not a bijection and therefore
not an isomorphism. #
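The operation-preserving conditions of Definition 7.3.1 can be spot-checked numerically for examples (a) and (b). The following is only a sketch: the helper name `preserves_binary_op` is our own, the sample values are arbitrary, and a finite sample can refute condition (ii) but can never prove it. Bijectivity must still be argued as in the text.

```python
import math

def preserves_binary_op(h, op, op_prime, samples, eq=lambda u, v: u == v):
    """Check h(a op b) == h(a) op' h(b) for every pair drawn from samples."""
    return all(eq(h(op(a, b)), op_prime(h(a), h(b)))
               for a in samples for b in samples)

plus = lambda a, b: a + b
times = lambda a, b: a * b

# Example (a): f(x) = 2x from <I, +, 0> to <E, +, 0>.
f = lambda x: 2 * x
ints = range(-20, 21)
ok_a = preserves_binary_op(f, plus, plus, ints) and f(0) == 0

# Example (b): h(x) = log2(x) from <R+, ., 1> to <R, +, 0>.
# Floating-point results are compared with a small tolerance.
h = math.log2
reals = [0.25, 0.5, 1.0, 3.0, 10.0]
close = lambda u, v: math.isclose(u, v, abs_tol=1e-9)
ok_b = preserves_binary_op(h, times, plus, reals, close) and h(1.0) == 0.0
```

Both flags come out true on these samples, which is consistent with (but weaker than) the proofs given in the examples.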
Definition 7.3.2: Let A = <S, o, Δ, k> and A’ = <S’, o’, Δ’, k’> be algebras
with the same signature, and let h be a function such that
(i) h: S → S’;
(ii) h(a o b) = h(a) o’ h(b);
(iii) h(Δ(a)) = Δ’(h(a));
(iv) h(k) = k’;
then h is a homomorphism from A to A’.†
Figure 7.3.1 depicts how two algebras can be related by a homomorphism.
†There is a rich terminology associated with homomorphisms. Let h be a homomorphism
from A to A’. If h is injective, then h is a monomorphism and if h is surjective, then h is an
epimorphism. If A = A’, then h is an endomorphism; if A = A’ and h is an isomorphism, then h is an
automorphism. We will not use this terminology.
[Figure 7.3.2]
Examples
(a) For k ∈ I, the map f_k: I → I defined by f_k(x) = kx is a homomorphism from
<I, +, 0> to <I, +, 0>. If k ≠ 0, then f_k is injective. If k = 1 or k = −1, then
f_k is bijective and therefore an isomorphism.
(b) Let f: R → R where f(x) = 2^x. Then f is an injective homomorphism (but
not an isomorphism) from <R, +, 0> to <R, ·, 1>.
(c) Let f: N → N_k, where f(x) = x mod k. Then f is a surjective homomorphism
from <N, +, 0> to <N_k, +_k, 0>.
(d) Let Σ be a finite nonempty alphabet, and let ||x|| denote the length of a
string x ∈ Σ*. Then the map h: Σ* → N,
h(x) = ||x||,
is a homomorphism from <Σ*, concatenation, Λ> to <N, +, 0>. If Σ is a singleton
set, then h is an isomorphism. #
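Examples (c) and (d) lend themselves to a small experiment. The sketch below assumes k = 5 and the two-letter alphabet "ab" purely for illustration; all variable names are ours. As before, checking finitely many cases illustrates the homomorphism property without proving it.

```python
from itertools import product

k = 5
f = lambda x: x % k                    # example (c): f maps N onto N_k
add_k = lambda a, b: (a + b) % k       # the operation +_k

mod_ok = all(f(a + b) == add_k(f(a), f(b))
             for a in range(50) for b in range(50))
surjective = {f(x) for x in range(50)} == set(range(k))

# Example (d): string length maps concatenation to addition,
# and the empty string to 0.
sigma = "ab"
words = ["".join(w) for n in range(4) for w in product(sigma, repeat=n)]
len_ok = all(len(u + v) == len(u) + len(v) for u in words for v in words)
empty_ok = len("") == 0

# With more than one letter the length map is not injective.
not_injective = len("ab") == len("ba")
```

With a one-letter alphabet each length corresponds to exactly one string, which is why the text observes that h is then an isomorphism.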
Examples
(a) Let h be a homomorphism from the monoid <I, +, 0> to <I, +, 0> defined by
h: I → I,
h(x) = 3x.
The homomorphic image of <I, +, 0> under h is the monoid <{3n | n ∈ I}, +, 0>,
which is a submonoid of <I, +, 0>.
(b) Define the map h: R → R as h(x) = 2^x. Then h is a homomorphism from the
monoid A = <R, +, 0> to the monoid A’ = <R, ·, 1>. The image of A under
h is the submonoid <R+, ·, 1>.
A(x) = nx.
Then / is a homomorphism from the semigroup <.S;, +> to a subsemigroup
of KS, D>.
(e) Let N_k = {0, 1, 2, ..., k − 1}, where k > 1, and let p ∈ N. The map
h: N → N_k,
h(x) = y where y = px mod k,
is a homomorphism to a submonoid of <N_k, +_k, 0>.
(f) For the universe of integers I and some k ∈ N, define x ~ y if and only if
x ≡ y mod k. Let I/~ be the quotient set; then [x] = {y | y ≡ x mod k}. Define
the operation + on I/~ as [x] + [y] = [x + y], and unary minus as −[x] =
[−x]. Then <I/~, +, −, [0]> is a group and
h: I → I/~,
h(x) = [x],
is a homomorphism from <I, +, −, 0> to <I/~, +, −, [0]>. #
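The homomorphic image in example (e) can be computed directly for small parameters. In this sketch the function names and the choice p = 4, k = 6 are our own assumptions; `bound` is just a sample size large enough to cover one full period of px mod k.

```python
def image_of(p, k, bound=200):
    """Image of h(x) = p*x mod k over the sample 0..bound-1."""
    return {(p * x) % k for x in range(bound)}

def is_submonoid(subset, k):
    """True if subset contains 0 and is closed under addition mod k."""
    return 0 in subset and all((a + b) % k in subset
                               for a in subset for b in subset)

img = image_of(p=4, k=6)          # the multiples of 4 reduced mod 6
closed = is_submonoid(img, 6)
```

Here the image is {0, 2, 4}, a proper submonoid of <N_6, +_6, 0>; choosing p = 5 instead makes h surjective.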
1. (a) Show that two algebras cannot be isomorphic if their carriers have different
cardinalities.
(b) Give an example to show that two algebras with the same signature may not be
isomorphic even though their carriers have the same cardinality.
2. Prove Theorem 7.3.1 for algebras with signature <S, o, k>, where o is a binary
operation and k ∈ S.
3. Suppose h is a homomorphism from <S, o> to <S’, o’>, where o and o’ are binary
operations.
(a) Show that if 1 ∈ S is an identity with respect to the operation o, then some
element 1’ ∈ S’ is an identity with respect to o’ for the subalgebra <h(S), o’>.
(b) Show that an identity for <h(S), o’> may not be an identity for <S’, o’>.
(c) Show that if 0 ∈ S is a zero with respect to o, then some element 0’ ∈ S’
is a zero for the subalgebra <h(S), o’> and h(0) = 0’.
(d) Show that a zero for <h(S), o’> may not be a zero for <S’, o’>.
4. (a) Show that there are exactly i homomorphisms from <N_i, +_i, 0> to itself.
(b) Describe the set of all homomorphisms from <N, +, 0> to <N_i, +_i, 0>.
(c) Describe the set of all homomorphisms from <N_2, +_2, 0> to <N_3, +_3, 0>.
5. Prove parts (b), (c), and (d) of Theorem 7.3.3.
6. Most computers represent numbers with binary sequences of a fixed length. Only
a finite set of numbers can be represented exactly, and “arithmetic overflow” occurs
when the result of a computation is larger than any of the numbers which can be
represented. Consider the following strategies for treating arithmetic overflow. For
simplicity, we will treat only the natural numbers and the operation of addition. For
each of the following functions f, determine whether f is a homomorphism from
<N, +, 0> to the specified algebra <S, ⊕, 0>, where S is the set of binary sequences of
length k. In each case, the operation ⊕ is based on binary addition and is described
by means of examples. In the illustrative examples given below, we use k = 3.
(a) The k bits represent the least significant digits of the binary representation
of each natural number. The operation ⊕ is the usual binary addition
except that if overflow occurs, the leading digits are lost. Thus, f(3) = 011,
f(6) = 110, and f(9) = f(3 + 6) = 011 ⊕ 110 = 001 = f(8n + 1) for all n ∈ N.
(b) If n < 2^k, then f(n) is the k digit binary representation of n. If n ≥ 2^k, then f(n)
is represented by the k digit binary representation of 2^k − 1. Thus f(3) = 011,
f(6) = 110, and f(9) = f(3 + 6) = 011 ⊕ 110 = 111 = f(x) for all x ≥ 7.
(c) One bit is reserved for an indication that overflow has occurred. (We will use 0
for no overflow, 1 for overflow, and use the leftmost bit as the overflow indicator.)
For all numbers less than 2^(k−1), the numbers are represented in their
k − 1 digit binary representation and the overflow bit is set to 0. If n ≥ 2^(k−1),
then f(n) consists of the digit 1 followed by the k − 1 least significant digits of
the binary representation of n; e.g., if k = 3, then f(12) = 100. Thus f(3) = 011,
f(2) = 010, and f(3 + 2) = 011 ⊕ 010 = 101 = f(4n + 1) for all n ∈ N.
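For readers who wish to experiment with strategies (a) and (b), the sketch below models each one with k = 3. The function names are ours, and the definition of ⊕ for strategy (b) (clamping the sum) is one reading of the examples rather than something the problem states outright. Strategy (c) is left out because its operation ⊕ is not fully determined by the examples. The helper `respects_addition` performs only a finite check.

```python
K = 3
LIMIT = 2 ** K

def f_wrap(n):
    """Strategy (a): keep the K least significant bits."""
    return n % LIMIT

def add_wrap(x, y):
    """Binary addition in which leading digits are lost on overflow."""
    return (x + y) % LIMIT

def f_sat(n):
    """Strategy (b): clamp to the largest representable value."""
    return min(n, LIMIT - 1)

def add_sat(x, y):
    """Assumed reading of the operation for (b): clamp the sum."""
    return min(x + y, LIMIT - 1)

def bits(v):
    """Render a value as a K-digit binary sequence."""
    return format(v, f"0{K}b")

def respects_addition(f, plus, bound=40):
    """Finite check of f(a + b) == f(a) (+) f(b) for all a, b < bound."""
    return all(f(a + b) == plus(f(a), f(b))
               for a in range(bound) for b in range(bound))
```

For instance, `bits(add_wrap(f_wrap(3), f_wrap(6)))` reproduces the sequence 001 from part (a), and `bits(add_sat(f_sat(3), f_sat(6)))` reproduces 111 from part (b).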
7. Let A = <S, o, k> and A’ = <S’, o’, k’> and let h be a homomorphism from A to A’.
Show that if <T, o’, k’> is a subalgebra of A’, then <h^(−1)(T), o, k> is a subalgebra of A.
8. Let Σ be a finite alphabet, and consider the monoid <Σ*, concatenation, Λ>. This is
sometimes called the free monoid generated by Σ. The free monoid has the following
important property:
Let <S, o, 1> be an arbitrary monoid. For any map h: Σ → S, there is a unique
extension of h to a homomorphism h*: Σ* → S.
Prove this property.
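The extension h* in Problem 8 can be pictured as applying h letter by letter and combining the results with the monoid operation. The sketch below illustrates that construction; it is not a proof of existence or uniqueness. The function `extend` and the letter weights are hypothetical, and the target monoid is taken to be <N, +, 0>.

```python
from functools import reduce

def extend(h, op, identity):
    """Return h*, which folds h over the letters of a word."""
    def h_star(word):
        return reduce(op, (h(letter) for letter in word), identity)
    return h_star

# Hypothetical letter map into the monoid <N, +, 0>.
weights = {"a": 1, "b": 10}
h_star = extend(weights.__getitem__, lambda x, y: x + y, 0)
```

By construction h* sends the empty string to the identity and agrees with h on single letters; for example, `h_star("ab")` is 11 and `h_star("abba")` equals `h_star("ab") + h_star("ba")`.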
−a ~ −b
Because ~ is an equivalence relation which is preserved under these operations,
we say ~ is a congruence relation with respect to the binary operations of +, −,
·, and the unary operation −.
Rather than define congruence relations for operations of arbitrary arities,
we will restrict our formal definition to the algebra A = <S, o, Δ>, where o is a
binary operation and Δ is a unary operation. We will usually write ab for a o b.
Informally, we will speak of a relation ~ on a set S as a congruence relation with
respect to the operation o if ~ is a congruence relation on the algebra <S, o>. A
relation ~ is a congruence relation on an algebra A with carrier S if and only if ~
is a congruence on S with respect to each of the operations of A.
Examples
(a) Equality is a congruence relation on any algebra.
(b) Consider the integers I together with the operation of addition. The equivalence
relation ~ of “equivalence mod k” for some given k ∈ N is a congruence
P2\. P.,
A( q ) q?
and define p/q ~ r/s ⇔ ps = rq as before. Clearly if a = b then Δ(a) = Δ(b);
but a ~ b does not imply Δ(a) ~ Δ(b), e.g., Δ(1/2) ≁ Δ(2/4). Thus, ~ is
not preserved by the operation Δ, and consequently ~ is not a congruence
relation on <F, Δ>. #
The following theorem gives another characterization of a congruence relation
with respect to a binary operation.
Theorem 7.4.2: Let A = <S, o, Δ> be an algebra with a binary operation o
and a unary operation Δ, and let h be a homomorphism from A to A’ = <S’, o’, Δ’>.
Then the equivalence relation over S induced by h is a congruence relation on the
algebra A.
Proof: Two elements a, b ∈ S are equivalent under the relation induced by
h if and only if h(a) = h(b). To show this is a congruence relation on A we must
show
(i) if a ~ b, then Δa ~ Δb, and
(ii) if a ~ b and c ~ d, then a o c ~ b o d.
(i) If a ~ b, then h(a) = h(b), and therefore Δ’h(a) = Δ’h(b). But since h is a
homomorphism, h(Δa) = Δ’h(a) and h(Δb) = Δ’h(b). Therefore h(Δa) =
h(Δb), and hence Δa ~ Δb. This establishes that ~ is a congruence relation
with respect to the unary operation Δ.
(ii) If a ~ b and c ~ d, then h(a) = h(b) and h(c) = h(d). Therefore
h(a) o’ h(c) = h(b) o’ h(d).
Since h is a homomorphism, h(a o c) = h(a) o’ h(c) and h(b o d) = h(b) o’ h(d);
hence h(a o c) = h(b o d). It follows that a o c ~ b o d, thus establishing that
~ is a congruence relation with respect to the binary operation o.
Thus ~ is a congruence relation on the algebra A.
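Conditions (i) and (ii) of the proof can be tested mechanically on a finite sample. The sketch below does so for the relation induced by h(x) = x mod 4 on the integers with addition and unary minus; the helper `is_congruence`, the sample range, and the contrasting relation |a| = |b| are our own illustrative choices. A passing check on a sample is evidence only, whereas a failing check is a genuine counterexample.

```python
def is_congruence(related, op, unary, elements):
    """Check conditions (i) and (ii) on a finite collection of elements."""
    unary_ok = all(related(unary(a), unary(b))
                   for a in elements for b in elements if related(a, b))
    binary_ok = all(related(op(a, c), op(b, d))
                    for a in elements for b in elements if related(a, b)
                    for c in elements for d in elements if related(c, d))
    return unary_ok and binary_ok

k = 4
h = lambda x: x % k                       # a homomorphism on <I, +, ->
induced = lambda a, b: h(a) == h(b)       # a ~ b iff h(a) = h(b)
plus = lambda a, b: a + b
neg = lambda a: -a
sample = range(-8, 9)

kernel_ok = is_congruence(induced, plus, neg, sample)

# For contrast: |a| = |b| is an equivalence relation that addition
# does not preserve, since 1 ~ -1 and 1 ~ 1 but 2 is not related to 0.
same_abs = lambda a, b: abs(a) == abs(b)
abs_ok = is_congruence(same_abs, plus, neg, sample)
```

The first check succeeds and the second fails, in line with the theorem: the relation |a| = |b| is not induced by any homomorphism of <I, +, −>.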
Example
Consider the homomorphism h from the algebra <Σ*, concatenation, Λ>
to <N, +, 0> defined by h(x) = ||x||. The equivalence relation ~ induced by h is
the following:
w ~ v ⇔ h(w) = h(v) ⇔ ||w|| = ||v||.
Since h is a homomorphism, the equivalence relation w ~ v ⇔ ||w|| = ||v|| is a
congruence relation on Σ* with respect to concatenation. It follows that if ||w|| =
||x|| and ||y|| = ||z||, then ||wy|| = ||xz||.
Problems: Section 7.4
occurrence of “—” represents the unary minus. (Note that you must show that ~ is
an equivalence relation.)
2. For an arbitrary monoid A = <S, o, 1>, show that equality and the universal relation
S × S are both congruence relations on A.
3. Consider the algebra A = <I, +>. For each of the following binary relations on I,
prove or disprove that the relation is a congruence relation on A.
(a) x ~ y ⇔ (x < 0 ∧ y < 0) ∨ (x ≥ 0 ∧ y ≥ 0)
(b) x ~ y ⇔ |x − y| < 10
(c) x ~ y ⇔ (x = y = 0) ∨ (x ≠ 0 ∧ y ≠ 0)
(qd) x~yoxby.
4. Let k be a natural number. Describe the class of all congruence relations on an
algebra of the form <{0, 1, 2, ..., k}, max>.
5. An ideal of a semigroup A = <S, o> is a subset K of the carrier S such that if x ∈ K
and y ∈ S, then x o y ∈ K and y o x ∈ K. For an arbitrary ideal K, define the
equivalence relation ~ over the carrier S as follows:
x ~ y ⇔ [(x, y ∈ K) ∨ (x = y)].
(a) Show that if A has a zero element 0, then 0 ∈ K.
(b) Show that if A has an identity element 1 and S ≠ K, then 1 ∉ K.
(c) Show that ~ is a congruence relation on A.
6. Find an infinite ideal of the semigroup <I, ·>. (The definition of an ideal is given in
the preceding problem.)
7. Let A = <S, o, Δ> be an algebra, where o is a binary operation and Δ is a unary
operation. Show that if ~ and ≈ are both congruence relations on A, then the intersection
of ~ and ≈ is also a congruence relation on A.
8. State the conditions for ~ to be a congruence relation on <S, □>, where □ is a
ternary operation on S. Denote the result of the operation □ on the operands a, b, c
by □(a, b, c).
9. Let Σ be the alphabet of a programming language, where Σ contains the two symbols
end and continue (note that the keywords of a language are often treated simply
as special symbols). Using · to denote concatenation, we define the relation ~ on
<Σ*, concatenation> to be the smallest congruence relation such that for any x ∈ Σ*,
x·continue ~ continue·x ~ x
end·x ~ end
[x]·[y] = [xy].
Show that with respect to this operation,
(c) [continue] is an identity on Σ*/~, and
(d) [end] is a left zero, but not a right zero.
For U, V ⊂ Σ*/~, let U·V denote the set {[xy] | [x] ∈ U and [y] ∈ V}. We define a
matrix product on these sets in the usual way, but using union for + and set product
for multiplication; thus
| U11  U12 |   | V11  V12 |   | (U11·V11) ∪ (U12·V21)   (U11·V12) ∪ (U12·V22) |
| U21  U22 | · | V21  V22 | = | (U21·V11) ∪ (U22·V21)   (U21·V12) ∪ (U22·V22) |
(e) Find an n × n matrix which is a left identity for this matrix operation (each entry
of the matrix will be either [end] or [continue]).
(f) Is this left identity also a right identity?
There are several ways of combining algebras to build new ones. We will discuss
two methods in this section.
Quotient Algebras
We first treat the topic of quotient algebras. Recall that if ~ is an equivalence
relation over a set S, then [x] denotes the equivalence class of x ∈ S.
Definition 7.5.1: Let A = <S, o, Δ, k> be an algebra with a binary operation
o, a unary operation Δ, and a constant k, and let ~ be a congruence relation on
A. The quotient algebra of A with respect to the relation ~, denoted by A/~, is
the algebra <S/~, o’, Δ’, [k]>, where
(i) S/~ is the quotient set of S under the relation ~ (the elements of S/~
are the equivalence classes of the relation ~),
(ii) for all [a], [b] ∈ S/~, [a] o’ [b] = [a o b] and Δ’[a] = [Δa],
(iii) [k] is the equivalence class of k under ~.
The operations and constants of a quotient algebra retain many of the properties
of the original algebra. For example, if the operation o is commutative,
then o’ is as well, since
[a] o’ [b] = [a o b] = [b o a] = [b] o’ [a].
Example
Let F be the set of fractions as defined in Section 7.4 and consider the algebra
A = <F, +, −, −>. If ~ is the relation p/q ~ r/s ⇔ ps = rq, then ~ is a congruence
relation on the algebra A. The carrier F/~ of the quotient algebra A/~
is the set of rational numbers Q. #
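For a finite carrier, the construction of Definition 7.5.1 can be carried out explicitly. The sketch below is an illustration under our own assumptions: the function `quotient_algebra` and the example <N_6, +_6> with a ~ b iff a ≡ b mod 3 do not come from the text. The induced operation picks an arbitrary representative from each class, which is legitimate only because ~ is a congruence relation.

```python
def quotient_algebra(elements, related, op):
    """Build the classes of S/~ and the induced operation on classes."""
    groups = []
    for x in elements:
        for g in groups:
            if related(x, next(iter(g))):
                g.add(x)
                break
        else:
            groups.append({x})
    classes = [frozenset(g) for g in groups]

    def cls(x):
        """Return the equivalence class [x]."""
        return next(c for c in classes if x in c)

    def op_prime(a_class, b_class):
        """[a] o' [b] = [a o b], using any representatives a and b."""
        return cls(op(next(iter(a_class)), next(iter(b_class))))

    return classes, cls, op_prime

# Hypothetical example: addition mod 6, with a ~ b iff a = b (mod 3).
N6 = range(6)
classes, cls, op_prime = quotient_algebra(
    N6, lambda a, b: a % 3 == b % 3, lambda a, b: (a + b) % 6)
```

The quotient has the three classes {0, 3}, {1, 4} and {2, 5}, and the induced operation inherits commutativity from addition mod 6, as the remark before the example predicts.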
Product Algebras
Definition 7.5.2: Let A’ = <S’, o’, Δ’, k’> and A’’ = <S’’, o’’, Δ’’, k’’> be
algebras, where o’ and o’’ are binary operations and Δ’ and Δ’’ are unary operations.
The direct product of A’ with A’’ is the algebra
A’ × A’’ = <S’ × S’’, o, Δ, <k’, k’’>>,
where <a, c> o <b, d> = <a o’ b, c o’’ d> and Δ<a, c> = <Δ’a, Δ’’c> for all <a, c>,
<b, d> in S’ × S’’. The algebra A’ × A’’ is also called the product algebra.
If the direct product of two algebras is defined, then the product algebra has
the same signature as the operands. If both operand algebras are of the same
variety, then the variety of the product algebra will be the same as that of the
operands; for example, the direct product of two semigroups is a semigroup.
Examples
(a) Let A = <N, +, 0> and A’ = <N, +, 0>. Then A × A’ = <N², +, <0, 0>>;
the operation + of the product algebra is defined by the equation
<a, c> + <b, d> = <a + b, c + d>.
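The componentwise operation of a product algebra is easy to express in code. The following sketch mirrors example (a); the name `direct_product` is our own, and ordered pairs are represented as Python tuples.

```python
def direct_product(op1, op2):
    """Componentwise operation: <a, c> o <b, d> = <a o' b, c o'' d>."""
    return lambda x, y: (op1(x[0], y[0]), op2(x[1], y[1]))

add = lambda a, b: a + b
pair_add = direct_product(add, add)   # the operation + on N^2 in example (a)
zero = (0, 0)                         # the constant <0, 0>
```

For example, `pair_add((1, 2), (3, 4))` yields `(4, 6)`, and `zero` acts as an identity because 0 is an identity in each component.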
In this chapter we have only covered some of the most basic and well-understood
topics of the field usually referred to as universal algebra. It is possible to extend the
concepts we have described in many interesting and important ways. For example,
relational algebras permit relations on the carrier to occur in the signature of the
algebra; in our treatment, relations could only be included indirectly, e.g., by choosing
a partially ordered set as the carrier. Another extension would be to relax the
requirement that the operations of an algebra be defined for all possible operands. For
example, in the formulation we have presented, <R, /> is not an algebra because
the operation of division is not defined if the divisor is 0. Permitting operations to
be defined only for some of the possible operands gives another kind of mathe-
matical structure called a partial algebra. We can also extend the concept of algebra
to that of a many-sorted algebra; this is a mathematical system in which elements
from various sets (rather than a single carrier) can occur as operands, and not all
operations need be defined for all operands. Thus, we could use one set to represent
the integers, another set to represent the floating point numbers, and a third set to
represent truth values. In such an algebra, arithmetic operations are defined on
the sets of numbers, and Boolean operations are defined on truth values. The
ceiling and floor functions are unary operations which map the real numbers to
the integers. A relation such as “<” can be represented as an operation which has
numbers as operands and whose result is a truth value. Extensions such as these
are currently being applied to a number of problem areas of computer science,
the most notable of which is the semantics of programming languages.
h(x) = nx.
Let ~ be the congruence relation on A induced by h. Describe the quotient algebra
A/~.
2. Let h be a homomorphism from A = <S, o, Δ, k> to A’ = <S’, o’, Δ’, k’>, and let ~
be the equivalence relation induced on S by h:
Let A = <{1, 2, 3}, max, 1> and A’ = <{5, 6}, min, 6>. Specify the product algebra
A x A’ by constructing an operation table and identifying the constants.
Let A’ = <S’, o’, Δ’, 1’> and A’’ = <S’’, o’’, Δ’’, 1’’> where o’ and o’’ are binary
operations and Δ’ and Δ’’ unary operations, and consider the product algebra
A’ × A’’ = <S’ × S’’, o, Δ, <1’, 1’’>>.
(a) Show that if the binary operations of A’ and A’’ are commutative, then the binary
operation of the product algebra is commutative.
(b) Show that if the binary operations of A’ and A’’ are associative, then the binary
operation of the product algebra is associative.
(c) Show that if the constants of A’ and A’’ are identity elements with respect to
their binary operations, then the constant of the product algebra is an identity
with respect to its binary operation.
(d) Show that if the constants of the algebras A’ and A’’ are zeroes with respect to
their binary operations, then the constant of the product algebra is a zero with
respect to its binary operation.
(e) Show that if A’ and A’’ are groups, then the product algebra is a group.
Let A and A’ be algebras with nonempty carriers and define the relation ~ over a
product algebra A × A’ as follows:
<w, x> ~ <y, z> ⇔ w = y.
(a) Determine when ~ is a congruence relation on A × A’.
(b) Show that if the relation ~ defined above is a congruence relation, then
(A × A’)/~ is isomorphic to A.
Let A_j = <N_j, +_j, 0> where N_j = {0, 1, 2, ..., j − 1} and +_j denotes addition
mod j.
(a) Show that A_2 × A_3 is isomorphic to A_6.
(b) Describe the set of congruence relations on A_2 × A_2.
(c) Describe the set of congruence relations on A_m, where m ∈ I+.
The programs of this text have been written in an informal programming lan-
guage based on ALGOL 60. Because our principal concern is the clear and unam-
biguous description of algorithms, we have used the ALGOL 60 framework
whenever it has been convenient, but abandoned it when doing so resulted in a
more easily understood algorithm description. This has resulted in a language with
the following properties:
1. Simple data types include integers, real numbers, and character strings.
Complex data types include whatever is convenient for treating the prob-
lem, including arrays, lists, graphs, edges and nodes. The data type of a
program variable will be evident from the context; we will not include
formal declarations in the programs. Similarly, the scopes of variables
will be clear from the context.
2. The operations used in the language include the arithmetic operations,
the floor and ceiling functions, and concatenation of character strings.
When convenient we will also use other operations, requiring only that
they be clear and unambiguous.
3. The conditions of the language (used in conditional and iteration state-
ments) include all propositions whose truth values can be established at
the appropriate time during program execution.
Appendix THE PROGRAMMING LANGUAGE
enable the reader to understand the programs of the text with a minimum of
effort.
statementk
end
4. while condition do statement                              the while statement
5. for variable ← initial-value [step step-size]
   until final-value do statement                            the for statement
6. procedure procedure-name [(list of parameters)]:
   statement                                                 the procedure definition statement
7. return [expression]                                       the return statement
8. procedure-name (list of arguments)                        the procedure call statement
9. comment: character string                                 the comment statement
10. other statements
if condition then statement1 [else statement2]
begin
    statement1;
    statement2;
    ...
    statementk
end
if x < y then
    begin
        temp ← x;
        x ← y;
        y ← temp
    end
begin
    nfact ← 1;
    while n > 1 do
        begin
            nfact ← nfact * n;
            n ← n − 1
        end
end
begin
    variable ← initial-value;
    while variable ≤ final-value do
        begin
            statement;
            variable ← variable + step-size
        end
end
begin
    variable ← initial-value;
    while variable ≥ final-value do
        begin
            statement;
            variable ← variable + step-size
        end
end
The following statement assigns the value 0 to all array entries A[i]
for 0 ≤ i ≤ n and i an even number:
for i ← 0 step 2 until n do A[i] ← 0
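The two while-loop expansions above describe the for statement with a positive and with a negative step size. A Python rendering may make the correspondence concrete. The function `algol_for` is our own illustrative name, the step size is assumed to be nonzero, and the array is modelled as a Python list indexed from 0.

```python
def algol_for(initial, final, body, step=1):
    """Run body(v) for v = initial, initial + step, ... up to final inclusive.
    A positive step counts upward and a negative step counts downward;
    step is assumed to be nonzero."""
    v = initial
    if step > 0:
        while v <= final:
            body(v)
            v += step
    else:
        while v >= final:
            body(v)
            v += step

n = 7
A = [1] * (n + 1)

def clear(i):
    A[i] = 0

# for i <- 0 step 2 until n do A[i] <- 0
algol_for(0, n, clear, step=2)

# A downward loop: for i <- 5 step -2 until 1 do ...
visited = []
algol_for(5, 1, visited.append, step=-2)
```

After the first call the even-indexed entries of A are 0 and the odd-indexed entries are unchanged; the second call visits 5, 3 and 1 in that order.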
ANSWERS TO SELECTED PROBLEMS
Section 1.1
Suppose P is false. Then P ⇒ Q is true for any proposition Q. If we know P ⇒ Q
is true and accept the false hypothesis P as true, then we can infer the truth of Q
from the truth table of ⇒. Since Q is arbitrary, Q may or may not be true.
The only noncommutative operator is ⇒.
The only nonassociative operator is ⇒.
(b) No.
<-PVQ
Section 1.2
1. ∀x ∀y ∃z S(x, y, z)
∀x[¬L(x, 0)] or ¬∃x[L(x, 0)]
True
False
P(x, y) denotes x + y = 0
All integers greater than 10.
The universe contains only 3.
PO, 0) A PQ, 1)
[P(0, 0) ∨ P(0, 1)] ∧ [P(1, 0) ∨ P(1, 1)]
P(x) denotes x = x + 1.
∀x ∀y ∃z P(x, y, z)
∀x P(x, 0, x)
Section 1.3
1.
(a) ∀y[E(y, 1) ⇒ ∀x P(x, y, x)]
(d) Vx[P(, x, 6) <> E(x, 2)]
(g) ∀x ∀y[[G(x, y) ∧ G(y, x)] ⇒ E(x, y)]
(h) Vx Vy Vz[[GQ, x) A GQ, z)] > Vu Well P(x, z, u) A PCy, z, v)] => Gu, v)]]
(a) Every arithmetic assertion which is provable is true.
(d) If z = x V y and z is provable, then x is provable or y is provable.
(a) If P(x) denotes “x is prime” and E(x) denotes “x is even”, then ∃!x[P(x) ∧ E(x)].
(c) T(x): x is a train
C(x): x is a car
F(x, y): x is faster than y
∀x[T(x) ⇒ ∃y[C(y) ∧ F(x, y)]]
(e) Let R denote “it rains tomorrow” and W(x) denote “x will get wet.”
R ⇒ ∃x[W(x)]
∀x P(x) ⇔ ¬∃x ¬P(x)
∃x P(x) ⇔ ¬∀x ¬P(x)
∃!x P(x) ⇔ ∃x[P(x) ∧ ∀y[P(y) ⇒ y = x]]
(a) True.
(b) False. Consider the universe consisting of 0 and 1, and let P(x) denote “x = 0”
and Q(x) denote “x = 1.”
(Refer to Tables 1.1.1 and 1.1.2.)
(a) ∃x[P(x) ∧ Q(x)] ⇔ [P(0) ∧ Q(0)] ∨ [P(1) ∧ Q(1)]   (expansion)
⇔ [[P(0) ∧ Q(0)] ∨ P(1)] ∧ [[P(0) ∧ Q(0)] ∨ Q(1)]   (distributivity)
⇔ [[P(0) ∨ P(1)] ∧ [Q(0) ∨ P(1)]] ∧ [[P(0) ∨ Q(1)] ∧ [Q(0) ∨ Q(1)]]   (distributivity)
⇒ [P(0) ∨ P(1)] ∧ [Q(0) ∨ Q(1)]   (simplification)
Moreover, for this universe,
[P(0) ∨ P(1)] ∧ [Q(0) ∨ Q(1)] ⇔ ∃xP(x) ∧ ∃xQ(x).
(b) Let P(x) denote “x = 0” and Q(x) denote “x = 1”.
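Over the two-element universe {0, 1} both parts of this answer can be confirmed by brute force: there are only sixteen ways to assign truth values to P(0), P(1), Q(0) and Q(1). The sketch below enumerates them; the function name `check` is ours.

```python
from itertools import product

def check():
    """Test both directions over all truth assignments on the universe {0, 1}."""
    implication_holds, converse_holds = True, True
    for p0, p1, q0, q1 in product([False, True], repeat=4):
        lhs = (p0 and q0) or (p1 and q1)     # Ex[P(x) and Q(x)]
        rhs = (p0 or p1) and (q0 or q1)      # ExP(x) and ExQ(x)
        implication_holds &= (not lhs) or rhs
        converse_holds &= (not rhs) or lhs
    return implication_holds, converse_holds
```

The call `check()` returns `(True, False)`: the implication of part (a) holds for every assignment, while the converse fails, for instance under the interpretation given in part (b).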
Section 1.4
1. (a) F: I’m fat.
T: I’m thin.
F ∨ T
¬T
∴ F   Disjunctive syllogism
Conclusion: I’m fat.
(b) R: I run.
B: I get out of breath.
R ⇒ B
¬B
∴ ¬R   Modus tollens
Conclusion: I didn’t run.
(c) B: The butler did it.
H: His hands are dirty.
B ⇒ H
H
The only conclusions are the hypotheses.
(e) I am not happy and my program does not run. (By modus tollens and conjunction.)
(f) All trigonometric functions are continuous functions. (Universal instantiation,
hypothetical syllogism and universal generalization.)
(i) Let A(x) denote “x is good for the auto industry.”
Let C(x) denote “x is good for the country.”
Let Y(x) denote “x is good for you.”
Let b denote the constant of “you buying an expensive car.”
The given hypotheses are:
∀x[A(x) ⇒ C(x)]
∀x[C(x) ⇒ Y(x)]
A(b)
Then by universal instantiation
A(b) ⇒ C(b)
C(b) ⇒ Y(b)
By modus ponens
C(b).
And again by modus ponens
Y(b)
and by conjunction
C(b) ∧ Y(b)
Conclusion: It is good for you and the country for you to buy an expensive car.
3. (a) I: IBM will take over the copier market.
X: Xerox will take over the copier market.
R: RCA returns to the computer market.
We wish to show
¬(I ∨ X)
R ⇒ I
∴ ¬R
¬∃x[T(x) ∧ ¬P(x)]
∃x[P(x) ∧ C(x)]
∴ ¬∀x[T(x) ⇒ ¬C(x)]
The argument is invalid. For a different interpretation consider a universe consisting
of a (round, glass) marble and a (round, rubber) ball. Define the predicates
as follows:
T(x) denotes “x is a marble.”
P(x) denotes “x is a round object.”
C(x) denotes “x is made of rubber.”
7. (b) The third step, which asserts
3x{P(x) \ 7O(Q)) > Ix 7 PQ) A ax 7909]
is fallacious, although
∃x[A(x) ∧ B(x)] ⇒ [∃x A(x) ∧ ∃x B(x)]
Section 1.5
Section 1.6
2. Any program which does not halt is a solution (see Definition 1.6.1). The following
program is correct, because if it halts (it won’t), the final assertion false will be true.
AI: true
while true do x ← 1
AF: false
3. (a) (i) using forward construction:
AI: true
x ← 1
AF: ∃y[true ∧ x = 1]
But AF is equivalent to the assertion “x = 1”, hence
AI: true
x ← 1
AF: x = 1
is correct.
Similarly
AI: x = 1
y ← 2
AF: ∃z[x = 1 ∧ y = 2]
is correct and equivalent to
AI: x = 1
y ← 2
AF: x = 1 ∧ y = 2
(This can be done either with truth tables or by using the identities of Tables 1.1.1
and 1.1.2.) Applying a rule of consequence, it follows that (ii) is true. Thus, the
if-then rule can be applied and we conclude that the program segment is correct.
6. procedure ZERO
AI: n > 0
begin
    i ← 1;
    A1: n > 0 ∧ i = 1
    A2: i ≤ n + 1 ∧ ∀j[1 ≤ j < i ⇒ V[j] = 0]
    while i ≤ n do
        A3: i ≤ n ∧ ∀j[1 ≤ j < i ⇒ V[j] = 0]
        begin
            V[i] ← 0;
            A4: i ≤ n ∧ ∀j[1 ≤ j ≤ i ⇒ V[j] = 0]
            i ← i + 1
            A2: i ≤ n + 1 ∧ ∀j[1 ≤ j < i ⇒ V[j] = 0]
        end
    AF: ∀j[1 ≤ j ≤ n ⇒ V[j] = 0]
end
AI{i ← 1}A1. We note that A1 ⇒ A2 since n > 0 ∧ i = 1 ⇒ i ≤ n + 1,
and the assertion 1 ≤ j < i is false for all j.
By a rule of consequence, this establishes that AI{i ← 1}A2.
We next establish that the hypothesis of the rule of iteration holds, that is,
(A2 ∧ i ≤ n){V[i] ← 0; i ← i + 1}A2.
We note that A2 ∧ i ≤ n ⇔ A3. Hence, it suffices to show
A3{V[i] ← 0; i ← i + 1}A2.
We will first prove A3{V[i] ← 0}A4 and then A4{i ← i + 1}A2. The assertion
A3{V[i] ← 0}A4 follows immediately from an application of the Alternative Axiom
of Assignment. Applying the same Axiom, we find
[i + 1 ≤ n + 1 ∧ ∀j[1 ≤ j < i + 1 ⇒ V[j] = 0]]{i ← i + 1}A2.
Since the assertion on the left is equivalent to A4, it follows that A4{i ← i + 1}A2.
This establishes that the rule of iteration holds, and we conclude that
A2{while i ≤ n do begin V[i] ← 0; i ← i + 1 end}[A2 ∧ ¬(i ≤ n)].
But
[A2 ∧ ¬(i ≤ n)] ⇒ [i = n + 1 ∧ ∀j[1 ≤ j < i ⇒ V[j] = 0]] ⇒ AF.
It follows by the rule of composition that the procedure ZERO is correct with respect
to AI and AF.
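The assertions used in this proof can also be checked at run time, which is a useful sanity test even though it is no substitute for the proof. The sketch below is a Python rendering of ZERO under our own conventions: the list entry V[0] is left unused so that indices match the 1-based array of the text, and the helper `a2` is a name we introduce for the loop invariant A2.

```python
def zero(V, n):
    """Set V[1..n] to 0, checking AI, A2, A4 and AF as the program runs.
    V[0] is unused so that indices agree with the text."""
    assert n > 0                                            # AI
    i = 1

    def a2():
        """The invariant A2: i <= n + 1 and V[1..i-1] are all 0."""
        return i <= n + 1 and all(V[j] == 0 for j in range(1, i))

    assert a2()                                             # A2 on entry
    while i <= n:
        V[i] = 0
        assert i <= n and all(V[j] == 0 for j in range(1, i + 1))   # A4
        i = i + 1
        assert a2()                                         # A2 restored
    assert all(V[j] == 0 for j in range(1, n + 1))          # AF
    return V
```

Calling `zero([None, 5, 6, 7], 3)` returns `[None, 0, 0, 0]` without tripping any assertion.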
The procedure SNEAKY illustrates one of the problems associated with constructing
initial and final assertions for a procedure. For example, suppose a procedure is
intended to sort the entries of a list, but the final assertion of the procedure specifies
merely that the entries are in nondecreasing order. Then a procedure which assigns
the same value to each entry of the list will be correct with respect to the final
assertion. Thus it is necessary to specify not only that the entries are in order, but that the
final list can be obtained by rearranging the entries of the original list.
In practice, however, there is often some sacrifice of precision in order to make
the proof of correctness more manageable; thus, the initial and final assertions we
gave for PRODUCT would be considered acceptable by some, with the understanding
that SNEAKY would not be constructed by virtue of our understanding of the
problem.
If desired, the initial and final assertions for PRODUCT can be changed so that
SNEAKY is no longer formally correct. This is done with auxiliary variables as
follows:
AI: a ≥ 0 ∧ a = a’ ∧ b = b’
AF: y = a’ * b’.
Since the values of a’ and b’ are not affected by program execution, the appropriate
value of y will be guaranteed.
9. (a) AI: n ≥ 1 ∧ ∃i[V[i] = arg]
begin
    index ← 1;
    A1: n ≥ 1 ∧ index = 1 ∧ ∃i[V[i] = arg]
    A2: ∀j[1 ≤ j < index ⇒ V[j] ≠ arg] ∧ ∃i[V[i] = arg]
    while V[index] ≠ arg do
        A3: ∀j[1 ≤ j ≤ index ⇒ V[j] ≠ arg] ∧ ∃i[V[i] = arg]
        index ← index + 1
        A2: ∀j[1 ≤ j < index ⇒ V[j] ≠ arg] ∧ ∃i[V[i] = arg]
    AF: (V[index] = arg) ∧ ∀j[1 ≤ j < index ⇒ V[j] ≠ arg]
end
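The annotated search program translates almost line for line into Python with run-time assertions. This is a sketch under our own conventions: the function name `find` is ours, V[0] is unused so that indices match the 1-based array of the text, and the conjunct ∃i[V[i] = arg] is checked once on entry rather than at every assertion.

```python
def find(V, n, arg):
    """Return the least index in 1..n with V[index] == arg.
    V[0] is unused; arg is assumed to occur somewhere in V[1..n]."""
    assert n >= 1 and any(V[i] == arg for i in range(1, n + 1))     # AI
    index = 1
    while V[index] != arg:
        assert all(V[j] != arg for j in range(1, index + 1))        # A3
        index = index + 1
        assert all(V[j] != arg for j in range(1, index))            # A2
    assert V[index] == arg                                          # AF, part 1
    assert all(V[j] != arg for j in range(1, index))                # AF, part 2
    return index
```

For example, `find([None, 4, 9, 9, 2], 4, 9)` returns 2, the position of the first occurrence of 9.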
Section 2.1
1. (a) {0,1, 2, 3, 4}
(c) {George Washington}
2. (a) If the universe of discourse is I, then the set is
{x | 0 < x ∧ x < 100}.
(b) If the universe is I, then the set is
{x | ∃y[x = 2y + 1]}.
4. A=G= $
= {x| x is¢ ,
even}, B
and = F =E
C = {1, 2, 3}.
Section 2.2
1. If he shaves himself, he will break his vow not to shave anyone who shaves himself.
Therefore he must find someone else to shave him. Since only a barber can shave
someone else and he is the town’s only barber, he must leave town to be shaved.
(a) If the assertion “heterological is heterological” is true, then heterological applies
to itself, and is therefore homological; thus the assertion is false and we have a
contradiction. On the other hand, if the assertion is false, then heterological is
not heterological, i.e., heterological does not apply to itself. It follows that
heterological is heterological, and therefore the assertion is true. This is another
contradiction. Therefore the assertion is neither true nor false.
Section 2.3
Section 2.4
2. A ∪ B ∪ C = (A − (B ∪ C)) ∪ (B − C) ∪ C
3. A proof of part (b) of Theorem 2.4.1 can be obtained by replacing all occurrences
of ∪ with ∩, and ∨ with ∧ in the proof of part (a). A proof of part (d) can be
obtained from that of part (c) in the same way.
(a) Assume C ⊂ A and C ⊂ B. Then
∀x[x ∈ C ⇒ x ∈ A] ∧ ∀x[x ∈ C ⇒ x ∈ B]
is true. Since ∀ distributes over ∧, this is equivalent to
∀x[(x ∈ C ⇒ x ∈ A) ∧ (x ∈ C ⇒ x ∈ B)]
which is equivalent to
∀x[x ∈ C ⇒ [x ∈ A ∧ x ∈ B]].
Hence
∀x[x ∈ C ⇒ x ∈ A ∩ B],
and therefore
C ⊂ A ∩ B. ∎
Hence, A ∪ (B − A) = A ∪ B. ∎
A ∪ (A ∩ B) = A. ∎
(c) A − B = {x | x ∈ A ∧ x ∉ B}
= {x | x ∈ A ∧ x ∈ U ∧ x ∉ B}
= {x | x ∈ A ∧ x ∈ U − B}
= {x | x ∈ A ∧ x ∈ B̄}
= A ∩ B̄. ∎
11. (a)
Us=6 QS= 4.
(c) ws = fa, 5}; £2 S= 9.
12. (b) (AN BHO O(AN BNC) =ANBN(ENC)
=(AN Bnd
=,
Section 2.5
1. (a) Basis: The digits 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 (i.e., all decimal digits) are in
the set.
Induction: If x is in the set and d is a decimal digit then xd is in the set.
Extremal: An object is in the set if and only if it can be constructed from a
finite number of applications of clauses 1 and 2.
(c) Basis: 0 is in S.
Induction:
(i) If x ∈ S, then 1x ∈ S.
(ii) If (x ∈ S ∧ x ≠ 0), then x0 ∈ S.
(iii) If (x ∈ S ∧ y ∈ S ∧ x ≠ 0), then xy ∈ S.
Extremal: as in part (a).
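An inductive definition such as the one in part (a) can be read as a recipe for generating the set: start from the basis and apply the induction clause repeatedly. The sketch below builds a finite approximation after a fixed number of rounds; the function `generate` and the choice of two rounds are our own.

```python
def generate(basis, step, rounds):
    """Close the basis under step for a fixed number of rounds."""
    current = set(basis)
    for _ in range(rounds):
        current |= {y for x in current for y in step(x)}
    return current

digits = "0123456789"

# Basis: every decimal digit.  Induction: x in the set gives xd for each digit d.
numerals = generate(digits, lambda x: (x + d for d in digits), rounds=2)
```

After two rounds the set holds every digit string of length 1, 2 or 3. The extremal clause corresponds to the fact that nothing else, such as the empty string, ever enters the set.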
2. (a) procedure MULT(a, b):
    if b = 0 then return 0 else return MULT(a, b − 1) + a
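The recursive procedure MULT carries over directly to Python. This sketch assumes, as the inductive definition of multiplication does, that b is a natural number; a negative b would recurse without terminating.

```python
def mult(a, b):
    """Product of natural numbers: a*0 = 0 and a*b = a*(b - 1) + a."""
    if b == 0:
        return 0
    return mult(a, b - 1) + a
```

For example, `mult(6, 7)` unwinds seven recursive calls and returns 42.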
(Σ_{i=1}^{n} i)² = Σ_{i=1}^{n} i³
But
(Σ_{i=1}^{1} i)² = 1² = 1
and
Σ_{i=1}^{1} i³ = 1³ = 1.
Hence, the assertion holds for n = 1.
Induction: Assume the assertion holds for arbitrary n ≥ 1, i.e.,
(Σ_{i=1}^{n} i)² = Σ_{i=1}^{n} i³.
Then
(Σ_{i=1}^{n+1} i)² = (Σ_{i=1}^{n} i)² + 2(n + 1) Σ_{i=1}^{n} i + (n + 1)²
= Σ_{i=1}^{n} i³ + 2(n + 1)·n(n + 1)/2 + (n + 1)²
= Σ_{i=1}^{n} i³ + (n + 1)³
= Σ_{i=1}^{n+1} i³. ∎
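The identity established by this induction, that the square of 1 + 2 + ... + n equals the sum of the first n cubes, is easy to confirm numerically for small n. The function names below are ours, and the check covers only the sampled values of n.

```python
def square_of_sum(n):
    """(1 + 2 + ... + n) squared."""
    return sum(range(1, n + 1)) ** 2

def sum_of_cubes(n):
    """1^3 + 2^3 + ... + n^3."""
    return sum(i ** 3 for i in range(1, n + 1))

agree = all(square_of_sum(n) == sum_of_cubes(n) for n in range(1, 200))
```

For n = 3 both sides equal 36, and the two functions agree for every n in the tested range.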
5. Let m be an arbitrary integer in N. We prove ∀n[(a^m)^n = a^(mn)] by induction.
Basis: Suppose n = 0. Then by Definition 2.5.4, (a^m)^0 = 1 and a^(m·0) = a^0 = 1.
This establishes the basis step.
Induction: The induction hypothesis is “Assume the assertion holds for arbitrary
n”, i.e., (a^m)^n = a^(mn). Then
(a^m)^(n+1) = (a^m)^n · a^m     by Definition 2.5.4
= a^(mn) · a^m     Induction hypothesis
= a^(mn+m)     Theorem 2.5.5
= a^(m(n+1))     distributivity of multiplication. ∎
Σ_{i=0}^{n} (2i + 1) = 2 Σ_{i=0}^{n} i + Σ_{i=0}^{n} 1 = n(n + 1) + (n + 1)
= n² + 2n + 1 = (n + 1)². ∎
Note that a proof by induction is not required.
(d) Basis: For n = 0, we have 1 + 2n = 1 and 3^n = 1. Therefore, 1 + 2n ≤ 3^n
for n = 0.
Induction: Assume 1 + 2n ≤ 3^n for arbitrary n. The inequality 1 ≤ 3^n holds
for all n; hence
2 ≤ 2·3^n,
and therefore,
3^n + 2 ≤ 3^n + 2·3^n = 3·3^n = 3^(n+1).
By the induction hypothesis, 1 + 2n ≤ 3^n,
so
1 + 2n + 2 ≤ 3^(n+1)
and
1 + 2(n + 1) ≤ 3^(n+1). ∎
7. (a) Casel: fr =1, then r! =1 for alli N, and hence tr! = (n + 1).
i=0
Case 2: Suppose r + 1. We prove the assertion by induction.
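Basis: For n = 0,
$$\frac{r^{0+1} - 1}{r - 1} = \frac{r - 1}{r - 1} = 1 = \sum_{i=0}^{0} r^i.$$
Therefore, the assertion is true for n = 0.
Induction: Assume the assertion is true for arbitrary n. Then
$$\sum_{i=0}^{n+1} r^i = \sum_{i=0}^{n} r^i + r^{n+1}$$
$$= \frac{r^{n+1} - 1}{r - 1} + r^{n+1}$$
$$= \frac{r^{n+1} - 1}{r - 1} + \frac{r^{n+2} - r^{n+1}}{r - 1}$$
$$= \frac{r^{n+2} - 1}{r - 1}.$$ ∎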
Basis: The sum of the interior angles of a triangle is 180° = (3 − 2)180°. Hence,
the assertion is true for n = 3.
Induction: Assume the assertion holds for an arbitrary convex polygon with
n ≥ 3 sides, and consider a convex polygon C with n + 1 sides. The polygon C can
be divided into a triangle T and a polygon P of n sides by connecting two non-adjacent
vertices. The sum of the interior angles of C is equal to the sum of the interior
angles of P and T. Since P has n sides, we can apply the induction hypothesis to
conclude that the sum of the interior angles of P is (n − 2)180°. By the basis step,
the sum of the interior angles of T is 180°. Therefore, the sum of the interior angles
of C is
(n − 2)180° + 180° = ((n + 1) − 2)180°.
This establishes the assertion for all n ≥ 3. ∎
10. The induction step of the proof is fallacious. In particular, if n = 1 or n = 2, it is
not true that the set S contains two nonequal subsets of n people which must overlap.
Section 2.6
1. (a) The empty set is a model of axioms (b) through (e). (This postulate plays the
same role as the basis step in an inductive definition.)
(b) The “infinite rooted binary tree” example of this section suffices as an example
which satisfies all postulates but (b).
(c) The set {0} where 0' = 0 satisfies all the postulates but (c).
(d) Let S = {0, 1, 2} where 0' = 1, 1' = 2, and 2' = 1. Then S satisfies all postulates
but (d).
(e) Let S = {0, $x_1, x_2, \ldots, y_1, y_2, \ldots$}
where 0' = $x_1$ and $x_i' = x_{i+1}$ for i ∈ I+,
$y_i' = y_{i+1}$ for i ∈ I+.
Then S satisfies all postulates but (e).
2. (a) We show
∀p ∀q ∀r[(p + q) + r = p + (q + r)].
Let p and q be arbitrary natural numbers. We establish
∀r[(p + q) + r = p + (q + r)]
by induction.
Basis: Let r = 0. Then by the basis step of the definition with m = p + q
we have
(p + q) + 0 = p + q.
Also by the basis step with m = q we have
p + (q + 0) = p + q.
Hence
(p + q) + r = p + (q + r)
if r = 0.
Induction: By the inductive step of the definition of addition,
p + (q + r') = p + (q + r)'
= (p + (q + r))'
= ((p + q) + r)' (Induction Hypothesis)
= (p + q) + r'.
Thus the assertion holds for all r ∈ N. ∎
Section 2.7
1. (a) $A^2$ = {Λ, a, aa}.
(e) B* = {$(ab)^n$ | n ≥ 0} = {Λ, ab, abab, ...}.
2. (b) $A^m A^n = A^{m+n}$ for all m, n ≥ 0.
Proof: Let m be an arbitrary integer. We show $\forall n[A^m A^n = A^{m+n}]$ by induction
on n.
Basis: n = 0.
$A^m A^0 = A^m\{\Lambda\}$
$= A^m$
$= A^{m+0}$
(b) By Theorem 2.7.3a, A* = {Λ} ∪ A⁺. Hence A* = A⁺ if and only if {Λ} ⊆ A⁺;
i.e., Λ ∈ $A^n$ for some n ∈ I+. If Λ ∈ A, then Λ ∈ $A^1$ and therefore A* = A⁺.
Conversely, if A* = A⁺, then Λ ∈ $A^n$ for some n ∈ I+. But if Λ ∈ $A^n$ it follows that
Λ ∈ A.
(c) We apply parts (a) and (b) of this problem. Since Λ ∈ A*,
(A*)⁺ = (A*)* = A*. ∎
10. We establish that A*B is a solution by substituting A*B for X on the right and
showing that the remaining occurrence of X is equal to A*B.
X = A(A*B) ∪ B Substitution
= (AA*)B ∪ B Associativity
= A⁺B ∪ {Λ}B 2.7.3h and 2.7.1b
= (A⁺ ∪ {Λ})B 2.7.1f
= A*B
11. If X = XA ∪ B and Λ ∉ A, then X = BA* is the unique solution. The proof
is essentially the same as that of Theorem 2.7.4.
13. (a) (1) Xt = AX, U BX,
Section 3.1
6. (a) ∅, {<1>}, {<2>}, {<3>}, {<1>, <2>}, {<1>, <3>}, {<2>, <3>}, {<1>, <2>, <3>}
(b) 903) == 29
Case 2. If a ≠ b, then <a, b> = {{a}, {a, b}}. Then <a, b> = <c, d> only if
{c} = {a} and {c, d} = {a, b}. But if {c} = {a}, then c = a, and therefore, since
a ≠ b, {c, d} = {a, d} = {a, b}; hence d = b. ∎
(b) Under the given definition the ordered triples <1, 2, 1> and <1, 1, 2> are equal,
but they are not equal according to Definition 3.1.1.
Section 3.2
Suppose there is a directed path which is not simple from a node a to a node b in
the tree. Because the path is not simple, it contains a cycle of length ≥ 1. Since there
is a directed path from the root r to a there must be at least two distinct directed
paths from r to b, one of which contains a cycle and one of which does not. But
this contradicts Theorem 3.2.1. Hence, every directed path is a simple path. ∎
From Theorem 3.2.1, there is a directed path from the root r to a and from r to b.
It follows that there is at least one undirected path from a to b. Now suppose
<$c_0, c_1, \ldots, c_m$> and <$d_0, d_1, \ldots, d_n$> are distinct simple undirected paths from
a to b. Then a = $c_0 = d_0$ and b = $c_m = d_n$. Let i be the least integer such that
$c_k = d_k$ for all k ≤ i, but $c_{i+1} \ne d_{i+1}$. Note that since the paths are distinct, i exists
and 0 ≤ i ≤ m − 2. Let j be the least index such that j > i and $c_j = d_r$ for some
r > i. Since $c_m = d_n$, j exists, j ≤ m, and either j ≠ i + 1 or r ≠ i + 1. By the choice
of j, there is no $c_s$, i < s < j, which is equal to any $d_t$, i < t < r. Hence the path
<$c_i, c_{i+1}, \ldots, c_j, d_{r-1}, d_{r-2}, \ldots, d_i$> is an undirected simple cycle of length greater
than 2, contradicting Theorem 3.2.2. Hence, if a ≠ b, then there is at most one
simple path from a to b. ∎
Basis: If n = 1, then the only node is the root. Since there are no loops on nodes
there are 0 = n − 1 arcs. Hence the assertion holds for trees with 1 node.
Induction: Suppose the assertion is true for all trees with n nodes, n ≥ 1. Let T
be a tree with n + 1 nodes. Then T has at least one node a with outdegree 0 and
indegree 1; a is a leaf. Consider the tree T' formed by deleting the node a and its
incident arc from the tree T. Then T' has n nodes and by the induction hypothesis,
T' has n − 1 arcs. But T has one more node and one more arc than T'; hence T
has n + 1 nodes and n arcs. This establishes the induction step and completes the
proof. ∎
The recursive procedure given below uses a procedure MEDIAN which returns
the median value of a finite set of integers. The median of a finite set of integers S
is the element x ∈ S such that either the number of elements of S less than x is
equal to the number of elements of S greater than x (if S has an odd number of
elements), or the number of elements less than x is one more than the number of
elements greater than x (if S has an even number of elements).
procedure CONSTRUCT_TREE(S):
comment: Construct a binary search tree whose node values are the elements of the set S.
if S = ∅ then return ∅
else
begin
m ← MEDIAN(S);
S1 ← {x | x ∈ S ∧ x < m};
S2 ← {x | x ∈ S ∧ x > m};
construct the tree T such that
(a) the root r of T is labelled m
(b) the left subtree of r is CONSTRUCT_TREE(S1)
(c) the right subtree of r is CONSTRUCT_TREE(S2);
return T
end
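The procedure above can be sketched in Python as follows. This is an illustrative rendering, not the text's own code: it assumes a tree is represented as `None` (empty) or a tuple `(left, value, right)`, and the helper `inorder` is added only to inspect the result.

```python
def median(s):
    """Median as defined above; for an even-sized set, the upper middle element."""
    ordered = sorted(s)
    return ordered[len(ordered) // 2]

def construct_tree(s):
    """Return None for the empty set, else a tuple (left, value, right)."""
    if not s:
        return None
    m = median(s)
    s1 = {x for x in s if x < m}   # values for the left subtree
    s2 = {x for x in s if x > m}   # values for the right subtree
    return (construct_tree(s1), m, construct_tree(s2))

def inorder(tree):
    """List the node values of a tree in inorder (assumed helper for checking)."""
    if tree is None:
        return []
    left, value, right = tree
    return inorder(left) + [value] + inorder(right)
```

Because the median splits S as evenly as possible at every level, the resulting search tree is balanced, and an inorder traversal returns the elements of S in increasing order.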
10. We first show that the bounds are attainable. Consider a tree in which each interior
node has a single descendant. Then the tree has a single leaf, and for each integer
d such that 0 ≤ d ≤ h, there is a single node a distance d from the root. In such a
tree, n = h + 1, so the lower bound is attainable.
Now consider a tree in which each internal node has two descendants, and all
leaves are a distance h from the root. Then the number of nodes in the tree is
$$\sum_{i=0}^{h} 2^i = 2^{h+1} - 1,$$
so the upper bound is attainable. In general, for each d with 0 ≤ d ≤ h there is at
least one node and at most $2^d$ nodes at distance d from the root; hence
$$h + 1 = \sum_{d=0}^{h} 1 \le n \le \sum_{d=0}^{h} 2^d = 2^{h+1} - 1.$$
19. (a) The height of 7, is one greater than the height of 7).
(b) No more than A, records are examined in a search in 7,, and no more than
h, + 1 records are examined in a search in 73.
21. (a) Since T is complete, every node has either no sons or two sons. If the root has
no sons then n = 1 and $d_1 = 0$. Hence
$$\sum_{i=1}^{n} 2^{-d_i} = 2^0 = 1.$$
Now assume the assertion is true for all complete binary trees with n leaves,
n ∈ I+, and let T' be a complete binary tree with n + 1 leaves. Then T' can be
constructed from some complete binary tree T with leaves $b_1, b_2, \ldots, b_n$ by
adding two sons to a leaf $b_k$ of T, 1 ≤ k ≤ n. Let these leaves be $b'_k$ and $b'_{k+1}$,
and associate the remaining leaves of T with those of T' in the natural way:
if 1 ≤ m < k, then $b_m$ corresponds to $b'_m$, and if k + 1 ≤ m ≤ n, then $b_m$ corresponds
to $b'_{m+1}$. Then for 1 ≤ m < k, $d'_m = d_m$, and for k + 1 ≤ m ≤ n,
(b) Suppose we begin at the root of a tree T and follow a path from the root to a
leaf. If at each node in the path it is equally likely that we turn left or right,
then the probability of travelling any particular path of length m is $2^{-m}$, and
consequently the probability of reaching node $b_i$ is $2^{-d_i}$. The sum of these
probabilities must be 1.
Let T be a complete k-ary tree with n leaves $b_1, b_2, \ldots, b_n$, and let $d_i$ be the
length of the path from the root to leaf $b_i$, 1 ≤ i ≤ n. Then
$$\sum_{i=1}^{n} k^{-d_i} = 1.$$
(c) The height h of the tree is the maximum path length of the $d_i$; that is,
h = max {$d_i$}. The maximum number of leaves n of a binary tree of height h
is $2^h$. Hence,
$n \le 2^{\max \{d_i\}}$
and therefore
log n ≤ max {$d_i$}.
Since max {$d_i$} is an integer,
⌈log n⌉ ≤ max {$d_i$}. ∎
Section 3.3
Reflexive      N  Y  Y  N  Y  N
Irreflexive    Y  N  N  Y  N  N
Symmetric      Y  Y  Y  N  N  N
Antisymmetric  Y  N  Y  Y  Y  N
Transitive     Y  Y  Y  Y  Y  Y
6.             Union  Intersection  Relative    Absolute
                                    Complement  Complement
Reflexive      Y      Y             N           N
Irreflexive    Y      Y             Y           N
Symmetric      Y      Y             Y           Y
Antisymmetric  N      Y             Y           N
Transitive     N      Y             N           N
Section 3.4
Section 3.5
<y, x> € Ris x,y € R, => <x, y> € R, <> <y, > € Rj.
(c) Since $R_1 \cup R_2 \supseteq R_1$, it follows from part (ii) of Definition 3.5.1 that
$t(R_1 \cup R_2) \supseteq R_1$. Since $t(R_1 \cup R_2)$ is transitive, by part (iii) of Definition
3.5.1, $t(R_1 \cup R_2) \supseteq t(R_1)$. Similarly, $t(R_1 \cup R_2) \supseteq t(R_2)$, and hence
$t(R_1 \cup R_2) \supseteq t(R_1) \cup t(R_2)$. ∎
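The containment just proved can be strict. The sketch below computes transitive closures of small relations, represented as sets of pairs, and exhibits one such case; the function name and the sample relations `r1` and `r2` are assumptions chosen for illustration.

```python
def transitive_closure(r):
    """Smallest transitive relation containing r (r is a set of pairs)."""
    closure = set(r)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # Whenever a -> b and b -> d are present, a -> d must be too.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

r1 = {(1, 2)}
r2 = {(2, 3)}
lhs = transitive_closure(r1 | r2)                      # t(R1 ∪ R2)
rhs = transitive_closure(r1) | transitive_closure(r2)  # t(R1) ∪ t(R2)
```

Here `lhs` contains the pair (1, 3), which arises only after the two relations are combined, so `rhs` is a proper subset of `lhs`.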
(a) By hypothesis, R is reflexive and therefore R ⊇ E. By definition, s(R) ⊇ R,
and hence s(R) ⊇ E. Therefore s(R) = s(R) ∪ E, which establishes that s(R)
is reflexive. A similar proof establishes that t(R) is reflexive. ∎
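(c) Since R is transitive, R = t(R). To show r(R) is transitive, it suffices to show
that tr(R) = r(R) as follows:
tr(R) = t(R ∪ E)
$= \bigcup_{n=1}^{\infty} (R \cup E)^n$
It is easy to show by induction that
$(R \cup E)^n = E \cup \bigcup_{i=1}^{n} R^i$;
we leave this to the reader. Thus
$tr(R) = \bigcup_{n=1}^{\infty} \Bigl(E \cup \bigcup_{i=1}^{n} R^i\Bigr) = E \cup \bigcup_{i=1}^{\infty} R^i = E \cup t(R) = E \cup R = r(R)$. ∎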
8. (a) The digraph of t(R) has two components, one the complete digraph on {a, b, c}
and the other the complete digraph on {d, e, f, g, h}.
10. (a) (R⁺)⁺ = t(t(R)). But t(R) is transitive and hence by Theorem 3.5.1, t(t(R)) =
t(R).
(b) R = {<A, B>, <B, C>, <C, A>, <C, D>, <D, A>}.
(c) The transitive closure is the universal relation on the set {A, B, C, D}.
(d) In most games of this sort, the relation “is more likely to win than” is transitive.
But in this game, R ≠ t(R); it follows that the relation is not transitive. If the
relation were transitive, there would be a best die.
(e) If you wanted to make money, the proposed game would be a poor vehicle
because no matter which die you pick, your opponent can choose one which
will beat yours 2/3 of the time. Note that this would not be possible if the rela-
tion of part (b) were transitive.
Section 3.6
[Table: Y/N entries for the relations <N, <>, <N, ≥>, <I, ≥>, <R, ≥>, <P(N), proper containment>, <P(N), ⊆>, <P({a}), ⊆> and <P(∅), ⊆>; column headings and entries illegible.]
2. (a) Since R is a quasi order, R is transitive and irreflexive (and hence, antisym-
metric). By Theorem 3.5.8(c), r(R) is transitive and by definition of reflexive
closure, r(R) is reflexive. It remains to show that r(R) is antisymmetric. We
False. The antisymmetry condition fails for any pair of integers <x, −x>, where
x ≠ 0.
(a) Suppose R is a quasi order. Then R is irreflexive and transitive. Since
<x, x> ∉ R for any x, it follows that <x, x> ∉ $R^c$ for any x. Hence $R^c$ is
irreflexive. To show $R^c$ is transitive, consider any <x, y> ∈ $R^c$ and <y, z> ∈ $R^c$.
Then <y, x> ∈ R and <z, y> ∈ R, and by the transitivity of R, <z, x> ∈ R.
Hence <x, z> ∈ $R^c$, which establishes that $R^c$ is transitive and therefore a quasi
order. ∎
All of the assertions are true.
(a) (only if) If R is a quasi order, then R is irreflexive and transitive. Since R is
transitive, by Theorem 3.5.1, R = t(R) = $R^+$. Now suppose <x, y> ∈ R.
If <x, y> ∈ $R^c$, then <y, x> ∈ R, and since R is transitive, it follows that
<x, x> ∈ R, violating the irreflexivity of R. Thus, if <x, y> ∈ R, then
<x, y> ∉ $R^c$ and hence R ∩ $R^c$ = ∅.
(if) We must show that if R ∩ $R^c$ = ∅ and R = $R^+$, then R is irreflexive and
transitive. Clearly R must be irreflexive, since if <x, x> ∈ R, then <x, x> ∈ $R^c$
and R ∩ $R^c$ ≠ ∅. Moreover, if R = $R^+$, then R = t(R) and it follows that
R is transitive. ∎
11. The procedure PRODUCT has a single loop and will terminate if this loop is
traversed only a finite number of times. The loop will be traversed so long as i < a,
that is, so long as a − i > 0. Since both a and i are integer variables, this means that
the loop will be traversed if a − i is a member of the well-ordered set I+. By the initial
assumption, a ≥ 0, and the first assignment statement initializes i to 0. If a = 0,
then a − i ∉ I+ and the loop is not traversed at all. If a > 0, then the loop is
traversed causing i to be incremented and a − i to be decreased in value. Since I+
is well-ordered, the value of a − i will not be a member of I+ after a finite number
of executions of the loop, causing the while loop to terminate. ∎
13. Proof: Suppose a and b are least upper bounds of B. Then by definition of lub,
a ≤ b and b ≤ a. Since ≤ is antisymmetric it follows that a = b. A similar proof
holds if a and b are glbs. ∎
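The termination argument can be made concrete by tracking the quantity a − i as a loop variant. The text of PRODUCT is not reproduced in this answer, so the loop below is an assumed reconstruction (multiplication by repeated addition); only the test `i < a`, the initialization of i to 0, and the increment of i are taken from the discussion above.

```python
def product(a: int, b: int) -> int:
    """Compute a * b by repeated addition, checking the loop variant a - i."""
    assert a >= 0, "initial assumption: a >= 0"
    i = 0
    p = 0
    while i < a:
        variant = a - i
        assert variant > 0          # a - i is in I+ whenever the body runs
        p += b                      # assumed loop body: accumulate b
        i += 1
        assert a - i < variant      # the variant strictly decreases
    return p
```

Since the variant is a positive integer that decreases on every pass, the body can run at most a times.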
14. (a) True.
(b) False. (Consider <1, −1> and <−1, 1>.)
(c) False. (Because T is not a linear order.)
(d) True.
(e) False. (T is not antisymmetric.)
17. (a) Let <S, ≤> be a poset, and let B be a finite subset of S. If B does not have a
minimal element, then for each $x_i$ ∈ B we can find some $x_{i+1}$ ∈ B such that
$x_i > x_{i+1}$, that is, $x_i \ge x_{i+1}$ and $x_i \ne x_{i+1}$. It follows that we can construct
an infinite sequence of strictly decreasing values of B:
$x_i > x_{i+1} > x_{i+2} > \cdots$
But B is finite, so in any infinite sequence of values of B, some value must be
repeated. By antisymmetry, it follows that all intervening values in the sequence
must be equal, contradicting the condition that $x_i \ne x_{i+1}$. Thus any strictly
decreasing sequence of elements of B must terminate, and it follows that B
must have a minimal element. The proof that B has a maximal element is
similar. ∎
Section 3.7
10. A/R = {[0], [1], [2], [3], [4], [5]}, where [k] = {y | y = 6i + k for some i ∈ I}.
11. (a) Maybe. (Yes, if 7, = 2,; otherwise, no.)
(c) “Maybe. (Yes, if 2, A 2, = ; otherwise, no.)
12. (a) No. (Let A = {a} and R, = {<a, a>}.)
(c) Yes.
13. n.
15. By definition
<x, y> ∈ $R_j$ ⇔ x − y = cj for some c ∈ I,
<x, y> ∈ $R_k$ ⇔ x − y = dk for some d ∈ I.
(a) (only if) Suppose I/$R_k$ refines I/$R_j$; then $R_k \subseteq R_j$. The pair <k, 0> ∈ $R_k$ and
hence <k, 0> ∈ $R_j$; therefore
k − 0 = 1·k = cj for some c ∈ I.
It follows that k is an integral multiple of j.
(if) If k = rj for some r ∈ I, then
<x, y> ∈ $R_k$ ⇔ (x − y) = ck for some c ∈ I
⇒ (x − y) = crj for some c, r ∈ I
⇔ b ∈ B'.
Hence, B = B'. Since the blocks of π and π' exhaust A, it follows that π = π'.
(if) Suppose R induces π and π induces a (possibly different) equivalence relation
R'. Then for any a, b ∈ A,
aRb ⇔ $[a]_R = [b]_R$
⇔ a, b ∈ $[a]_R$
⇔ ∃B[B ∈ π ∧ a ∈ B ∧ b ∈ B]
⇔ aR'b.
Hence, R = R'. ∎
19. Suppose π and π' are sum partitions of $\pi_1$ and $\pi_2$. Then by Definition 3.7.8, π and
π' refine each other. By Theorem 3.7.11, the relation “refines” is antisymmetric and
hence π = π'. ∎
20. (a) Part (i) of Definition 3.7.7 establishes that $\pi_1 \cdot \pi_2$ is a lower bound of the set
{$\pi_1$, $\pi_2$} under the relation “refines.” Part (ii) of the same definition asserts that
$\pi_1 \cdot \pi_2$ is the greatest lower bound. ∎
Section 4.1
Hence, the assertion is true for all m ∈ N. Since n was arbitrary the result follows
by Universal Generalization. ∎
9. The procedure SUM1 is a recursive algorithm:
procedure SUM1(m, n):
if n = 0 then return m
else return S(SUM1(m, P(n)))
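A Python sketch of SUM1 follows. It assumes, as the names suggest, that S is the successor function and P the predecessor function on the natural numbers; neither definition appears in this answer, so both helpers are labeled assumptions.

```python
def S(n: int) -> int:
    """Successor (assumed meaning of S)."""
    return n + 1

def P(n: int) -> int:
    """Predecessor of a positive natural number (assumed meaning of P)."""
    return n - 1

def sum1(m: int, n: int) -> int:
    """Add n to m by applying the successor n times."""
    if n == 0:
        return m
    return S(sum1(m, P(n)))
```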
= f(100)
= f(f(100 + 11))
= f(f(111))
= f(111 − 10)
= f(101)
= 101 − 10 = 91
= f(99)
= 91 by part (a).
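The chain of equalities above can be reproduced mechanically. The definition of f is not shown in this excerpt; the sketch below assumes it is f(x) = x − 10 for x > 100 and f(x) = f(f(x + 11)) otherwise, which is consistent with the steps f(101) = 101 − 10 and f(100) = f(f(100 + 11)).

```python
def f(x: int) -> int:
    """Assumed definition: x - 10 above 100, otherwise f applied twice to x + 11."""
    if x > 100:
        return x - 10
    return f(f(x + 11))
```

Under this assumption, `f(x)` evaluates to 91 for every integer x ≤ 101, matching the value derived in the answer.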
(ii) Now let x < 90 and let k be the smallest integer such that
90 ≤ x + 11k ≤ 100.
Then f(x) = f(f(x + 11))
= f(f(f(x + 2·11)))
= $f^{k+1}(x + 11k)$
where 90 ≤ x + 11k ≤ 100 and k ≥ 1. By part (i), it follows that
f(x + 11k) = 91; hence
Section 4.2
f: {0, 1, 2, ..., n − 1} → A,
f(i) = $a_i$ for 0 ≤ i < n;
then <$a_0, a_1, \ldots, a_{n-1}$> is the image of the function f under g. Hence g is surjective.
It follows that g is bijective. ∎
(a) ‘mon
Define f as follows:
f: A → P(A),
f(a) = {a} for all a ∈ A.
Then f is injective, since if
f(a) = f(b), then {a} = {b}, which implies a = b. ∎
(a) f(0) = a, f(1) = b, f(2) = c.
(b) f(x) = 2x, x ∈ (0, 1).
(c) f(n) = 2n for n ≥ 0
= −(2n + 1) for n < 0.
(a) Suppose g: A → B and f: B → C, and let c be an arbitrary element of C.
Since fg is surjective, there is some element a ∈ A such that fg(a) = c. But by
Theorem 4.1.1, fg(a) = f(g(a)), where g(a) ∈ B. Thus c is the image of an
element of B under f. Since c was arbitrary, it follows that f is surjective. ∎
10. (a) Since f and g are monotone increasing, if x ≤ y, then f(x) ≤ f(y), and
g(x) ≤ g(y). Hence, if x ≤ y then
(f + g)(x) = f(x) + g(x) ≤ f(y) + g(y) = (f + g)(y),
and it follows that f + g is monotone increasing. ∎
(c) Let f(x) = g(x) = x. Then f and g are monotone increasing, but the product
function f·g(x) = f(x)·g(x) = $x^2$ is not a monotone increasing function on
R. ∎
11. (a) Let y be an arbitrary element such that y ∈ f(A) − f(C). Then f(x) = y for
some x ∈ A, but for every z ∈ C, y ≠ f(z). Hence x ∈ A − C, and since
y = f(x), this implies that y ∈ f(A − C). Since y was arbitrary, this establishes
that f(A) − f(C) ⊆ f(A − C). ∎
12. (a) Suppose y ∈ $f(f^{-1}(B'))$. Then there is an x in $f^{-1}(B')$ such that f(x) = y. Since
x ∈ $f^{-1}(B')$, it follows that f(x) ∈ B'. Hence y ∈ B'; therefore $f(f^{-1}(B'))$ ⊆ B'.
(b) By part (a), $f(f^{-1}(B'))$ ⊆ B'. Suppose y ∈ B'. Since f is surjective, there is an
x ∈ $f^{-1}(B')$ such that f(x) = y. Since x ∈ $f^{-1}(B')$, it follows that f(x) is in
$f(f^{-1}(B'))$. Hence, y ∈ $f(f^{-1}(B'))$; therefore B' ⊆ $f(f^{-1}(B'))$.
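Both parts can be illustrated on finite functions. In the sketch below a function is represented as a Python dict from domain elements to values; the helper names and the sample functions `f` and `g` are assumptions made for illustration.

```python
def image(func, xs):
    """Image of the set xs under func (a dict)."""
    return {func[x] for x in xs}

def preimage(func, ys):
    """Inverse image of the set ys under func (a dict)."""
    return {x for x in func if func[x] in ys}

# f is not surjective onto {'a', 'b', 'c'}: nothing maps to 'c'.
f = {1: 'a', 2: 'a', 3: 'b'}
# g is surjective onto {'a', 'b', 'c'}.
g = {1: 'a', 2: 'b', 3: 'c'}

b_prime = {'b', 'c'}
through_f = image(f, preimage(f, b_prime))   # proper subset of b_prime
through_g = image(g, preimage(g, b_prime))   # equal to b_prime
```

For the non-surjective `f` the containment of part (a) is strict, while for the surjective `g` equality holds as in part (b).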
14. By Theorem 4.2.4, since f is bijective, $f^{-1}$ is bijective. Hence $(f^{-1})^{-1}$ is defined and
equal to the converse relation of $f^{-1}$. But $f^{-1}$ is the converse of f, so by Theorem
3.5.3a, $(f^{-1})^{-1} = f$. ∎
17. The relation R is the equality relation E on A.
19. (a) Let $x_1, x_2$ ∈ A' and suppose $x_1 \ne x_2$. Since f is injective, $f(x_1) \ne f(x_2)$.
But $f|_{A'}(x_1) = f(x_1)$ and $f|_{A'}(x_2) = f(x_2)$. Hence, $f|_{A'}(x_1) \ne f|_{A'}(x_2)$. It follows
that $f|_{A'}$ is injective. ∎
20. (a) Suppose x ∈ A − B. Then x ∈ A and x ∉ B. Hence $\chi_A(x) = 1$ and $\chi_B(x) = 0$.
In this case $\chi_A(x)[1 - \chi_B(x)] = 1 = \chi_{A-B}(x)$. Now suppose x ∉ A − B; then
$x \in \overline{A - B} = \overline{A} \cup B$. It follows that $\chi_A(x) = 0$ or $\chi_B(x) = 1$, and therefore
either $\chi_A(x) = 0$ or $1 - \chi_B(x) = 0$. Hence $\chi_A(x)[1 - \chi_B(x)] = 0 = \chi_{A-B}(x)$.
21. (a) The function has one left and one right inverse and they are both equal to the
inverse function. The equivalence relation induced by the function is the equality
relation. The canonical map g is defined by g(x) = {x}.
(b) Since the function is neither injective nor surjective, it has no left or right
inverses. The equivalence relation induced by the function is the universal
relation. The canonical map g is defined by
g(x) = {a, b, c} for x ∈ {a, b, c}.
Section 5.1
1. (a) $3^4$ = 81.
(c) The c can occur in any of the last three positions in the string. Once the
position of c is specified, either of two letters can occur in each of the other three
positions. Thus there are $3 \cdot 2^3$ = 24 such strings.
A binary relation from A to B is a subset of A × B. There are $2^{|A \times B|} = 2^{|A| \cdot |B|} = 2^{mn}$
such subsets.
There are 16 binary sequences of length 4. Representation for the sequence of digits
0,1, 2,...,9 can be chosen in P(16, 10) ways.
(a) $(\tfrac{1}{2}) \cdot 2^n = 2^{n-1}$. This can be proved by induction.
(b) $2^{n-1}$.
11. Let |A| = m, |B| = n, and let f: A → B be a bijection. Note that f is an injection from
A to B. Then by the pigeonhole principle, m ≤ n, for if m > n, no injection from
A to B exists. Since f is a bijection, the inverse function $f^{-1}$ exists and is an injection
from B to A; this implies, by the pigeonhole principle, that n ≤ m. Hence m = n,
i.e., |A| = |B|. ∎
12. (a) If B ⊆ A, then A ∪ B = A; hence |A ∪ B| = |A|.
(b) It is easy to show that A = (A − B) ∪ (A ∩ B) and that A − B and A ∩ B
are disjoint. It follows from Theorem 5.1.3 that |A| = |A − B| + |A ∩ B|. ∎
$$\sum_{i=0}^{n+1} \binom{n+1}{i} = \sum_{i=1}^{n+1} \binom{n}{i-1} + \sum_{i=0}^{n} \binom{n}{i}$$
$$= 2^n + 2^n \quad \text{by the induction hypothesis}$$
$$= 2^{n+1}.$$ ∎
Therefore, in base b notation, $(b - 1)[1 + 10 + 10^2 + \cdots + 10^n] = 10^{n+1} - 1$.
If b = 6 and n = 3, then (in base 6 positional notation),
$5[1 + 10 + 10^2 + 10^3] = 5555 = 10000 - 1$.
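The base 6 instance can be checked by converting the ordinary integer values to positional notation. The following sketch assumes 2 ≤ b ≤ 10 so that every digit is a single decimal character; the helper names are ours.

```python
def to_base(value: int, b: int) -> str:
    """Positional notation of a non-negative integer in base b (2 <= b <= 10)."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(str(value % b))
        value //= b
    return "".join(reversed(digits))

def repunit_sum(b: int, n: int) -> int:
    """Value of 1 + 10 + 10^2 + ... + 10^n when the numerals are read in base b."""
    return sum(b ** i for i in range(n + 1))
```

With b = 6 and n = 3, `to_base(5 * repunit_sum(6, 3), 6)` gives the numeral 5555, one less than 10000 in base 6.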
18. There are $2^n$ distinct bit patterns. Since +0 and −0 are distinct representations
of the integer 0, only $2^n - 1$ distinct integers can be represented.
(b) We first count the number of nonzero real numbers that can be represented.
The leading digit of the mantissa of a nonzero number must be a 1. The other
m − 1 bits of the mantissa (including its sign bit) can each be chosen in one of
two ways; hence there are $2^{m-1}$ choices for the mantissa of a nonzero number.
By part (a), the exponent can represent any of $2^e - 1$ distinct values. Since
distinct pairs of mantissas and exponents denote distinct real numbers, the
rule of product applies. Thus there are $(2^e - 1)(2^{m-1})$ distinct nonzero
representable real numbers, and therefore one can represent $(2^e - 1)(2^{m-1}) + 1$
distinct real numbers.
(c) We first restrict ourselves to the nonnegative integers. Every integer n, where
0 ≤ n < $2^{23}$, can be represented by choosing an appropriate mantissa and
exponent. Above this range, not every integer can be represented (e.g. $2^{23} + 1$
is not representable). But all such integers greater than $2^{23}$ require an exponent
with a value greater than or equal to 24, and every configuration with an
exponent this large represents an integer. Hence there are about $2^{23}(2^7 - 24)$
integers greater than $2^{23}$ which are representable, making a total of
$2^{23} + 2^{30} - 24 \cdot 2^{23}$
or
$2^{30} - 23 \cdot 2^{23}$
positive integers. Taking negative integers into account gives a total of about
$2(2^{30} - 23 \cdot 2^{23}) \approx 1.76 \cdot 10^9$
distinct representable integers.
(d) About $2^{24}$ integers can be represented in integer notation. Using the results
of part (c), the ratio is about 1 to 100.
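The arithmetic in parts (c) and (d) can be verified directly. The constants 23, 7 and 24 below are taken from the answer as given; the word format they describe is not stated in this excerpt, so the sketch checks only the arithmetic, not the machine model.

```python
def representable_integer_estimate() -> int:
    """Estimate from part (c): representable integers in floating-point notation."""
    small = 2 ** 23                     # every integer n with 0 <= n < 2^23
    large = 2 ** 23 * (2 ** 7 - 24)     # configurations with exponent >= 24
    positive = small + large
    assert positive == 2 ** 30 - 23 * 2 ** 23
    return 2 * positive                 # allow for negative integers

total = representable_integer_estimate()
ratio = total / 2 ** 24                 # compare with about 2^24 integers, part (d)
```

The total is 1,761,607,680, and the ratio to $2^{24}$ is exactly 105, in line with the "about 1 to 100" stated in part (d).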
Section 5.2
1. The proof of part (a) of Theorem 5.2.1, given in the text, establishes that the
relation “g asymptotically dominates f” is reflexive and transitive. It follows
that the relation = is reflexive and transitive. Moreover, by the symmetry
of the roles of f and g in the definition of =, it follows that = is symmetric and
therefore an equivalence relation. ∎
10. (a) We show that log n ∉ O(1). Suppose to the contrary that log n ∈ O(1); then
there must be some k, m ≥ 0 such that if n ≥ k, then log n ≤ m·1 = m.
But if n > $2^m$, then log n > m; thus log n is not asymptotically dominated by
g(n) = 1 and hence log n is not O(1). By Theorem 5.2.4, O(1) ⊆ O(log n) and
it follows from the above argument that the containment is proper. ∎
(c) We show that if d > 1, then $d^n \notin O(n^2)$. Suppose $d^n$ is $O(n^2)$. Then there exist
k, m ≥ 0 such that if n ≥ k, then $d^n \le mn^2$. Then for these values of n,
n log d ≤ log m + 2 log n, and for n > 1,
$$\frac{n}{\log n} \le \frac{2}{\log d} + \frac{\log m}{(\log n)(\log d)}.$$
But the ratio on the left grows arbitrarily large as n increases, whereas the
first summand on the right is a constant, and the second term decreases as n
increases. Thus the inequality can be violated by choosing n sufficiently large.
We conclude that $d^n \notin O(n^2)$. From this result and Theorem 5.2.4, we conclude
that the containment is proper. ∎
11. Let K and n be arbitrary positive integers such that K ≥ ⌈c⌉ and
n ≥ max($c^K$, K). Then
n! = n(n − 1)(n − 2)...(K + 1)K(K − 1)...2·1.
Since K ≥ ⌈c⌉,
$(n - 1)(n - 2) \ldots (K + 1)K \ge c^{n-K}$.
Since n ≥ $c^K$,
$n! \ge c^K c^{n-K} = c^n$.
Hence n! ≥ $c^n$ if n is sufficiently large, and therefore $O(c^n)$ ⊆ O(n!).
To show the containment is proper, it suffices to show that for any m > 0,
the value of n can be chosen large enough that n! > $mc^n$. Without loss of
generality, we can assume m > 1. We showed above that if n is chosen large enough,
n! ≥ $(mc)^n$. But for n ≥ 2, $(mc)^n > mc^n$; hence n! > $mc^n$ for n sufficiently
large. It follows that n! is not $O(c^n)$. ∎
13. If P is a polynomial of degree k, then $P(n) = a_0 + a_1 n + a_2 n^2 + \cdots + a_k n^k$,
where $a_k \ne 0$. By Theorems 5.2.5, 5.2.2(b), and 5.2.3, $a_i n^i \in O(n^k)$ for each i,
0 ≤ i ≤ k. It follows from Theorem 5.2.2(c) that P is $O(n^k)$. ∎
16. Algorithm F takes less time than G to execute if and only if 10 <2” < 50.
17. hoho-hofhohohehohoh
18. The conjecture is true and can be proved by induction on k.
Basis: If k = 0, then
$$\sum_{i=0}^{n} i^k = \sum_{i=0}^{n} 1 = n + 1 \in O(n).$$
Induction: We assume $\sum_{i=0}^{n} i^k \in O(n^{k+1})$ for some arbitrary k. Then there exist
$$\sum_{i=0}^{n} i^k \le M(n^{k+1}),$$
It follows that
Section 5.3
(a) Let $x_h$ denote the minimum total path length of a complete n-ary tree of height
h. Each internal node of such a tree has n sons. The total path length for a tree
of height 0 is 0; thus,
$x_0 = 0$.
Suppose T' is a complete n-ary tree of height h with minimum total path length.
Then a complete n-ary tree of height h + 1 of minimal total path length is
constructed by adding n sons to some node a of T' where a is distance h from
the root of T'. Then the path length from the root to each son of a is h + 1;
thus
$x_{h+1} = x_h + n(h + 1)$, where h ≥ 0.
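The recurrence $x_0 = 0$, $x_{h+1} = x_h + n(h + 1)$ can be unrolled by a short loop. The sketch below iterates it and can be compared with the closed form $n \cdot h(h+1)/2$, which follows by summing n(1 + 2 + ... + h); the function name is an assumption.

```python
def min_total_path_length(n: int, h: int) -> int:
    """Iterate x_{k+1} = x_k + n(k + 1) starting from x_0 = 0."""
    x = 0
    for level in range(h):
        x = x + n * (level + 1)
    return x
```

For a binary tree (n = 2) of height 3 this gives 2(1 + 2 + 3) = 12.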
some
k ∈ N,
$r \le b^k \le m < b^{k+1}$.
Because f is monotone increasing,
$= Kb^d(b^k)^d$
$\le Kb^d(m)^d$.
Therefore, $f(m) \le Kb^d(m)^d$ if m is greater than a power of b which is at least
as great as r. It follows that f is $O(n^d)$. ∎
11. (a) procedure MAX2(i, j):
if i = j then return A[i]
else
begin
comment: Divide A into two subarrays of approximately equal size.
m ← ⌊(i + j)/2⌋;
qt
(b) f(1) = 0
f(n) = 2f(n/2) + 1 for
n = $2^k$ where k ≥ 1.
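The recurrence in part (b) counts comparisons in a divide-and-conquer maximum. Since the listing of MAX2 is cut off in this excerpt, the Python sketch below is an assumed completion: it splits A[i..j] at the midpoint, recurses on both halves, and compares the two results. It returns the comparison count alongside the maximum so that the recurrence can be observed.

```python
def max2(a, i, j):
    """Return (maximum of a[i..j], number of element comparisons used)."""
    if i == j:
        return a[i], 0                      # f(1) = 0
    m = (i + j) // 2                        # split into two nearly equal halves
    left, c1 = max2(a, i, m)
    right, c2 = max2(a, m + 1, j)
    best = left if left >= right else right
    return best, c1 + c2 + 1                # f(n) = 2 f(n/2) + 1 when n = 2^k
```

For an array of n = 8 elements the count is 7, in agreement with the solution f(n) = n − 1 of the recurrence.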
(c) Suppose n = $2^k$. Then by a proof similar to that for Lemma 5.3.2a it follows
that
m = ⌊(i + j)/2⌋. The node values of the left subtree of the root are contained in
A[i : m − 1], and those of the right subtree of the root are in A[m + 1 : j]. If
i < m, then the node value of the left son of the root is stored in
A[⌊(i + m − 1)/2⌋]. If i = m, then the root has no left son. If m < j, then the
node value of the right son of the root is stored in A[⌊(m + j + 1)/2⌋]. If m = j,
no right son exists.
Section 5.4
Since h is an integer and log n lies properly between h and h + 1, it follows that
h = ⌊log n⌋. ∎
7. Let T be a balanced ternary search tree with n nodes and height h. Then T is complete
and
$$\sum_{i=0}^{h-1} 3^i + 3 \le n \le \sum_{i=0}^{h} 3^i$$
$$\frac{3^h - 1}{2} + 3 \le n \le \frac{3^{h+1} - 1}{2}$$
$$3^h + 5 \le 2n \le 3^{h+1} - 1$$
$$3^h < 2n < 3^{h+1}$$
$$h < \log_3 (2n) < h + 1$$
Hence, $h = \lfloor \log_3 (2n) \rfloor$, and it follows that the worst case complexity of a search
in a ternary search tree is O(log n).
10. Suppose an O(n) algorithm exists for constructing a binary search tree T from an
unsorted list of n elements. Then traversing the tree T in inorder (using the LIST
procedure of Fig. 3.2.3) produces the list in sorted order. Since the traversal
algorithm requires no comparisons between elements of the list, the entire sorting
procedure would require O(n) comparisons. But by Theorem 5.4.5, if f is the worst
case complexity function of an algorithm for sorting by comparisons, then
O(n log n) ⊆ O(f). Since O(n log n) ⊄ O(n), the supposition that a binary search
tree can be constructed in O(n) time leads to a contradiction of Theorem 5.4.5.
12. (a) procedure SEQSEARCH(arg, i, j):
if arg = A[i] then return i
else
if i = j then return “not found”
else return SEQSEARCH(arg, i + 1, j)
(b) procedure RECSORT(i, j):
if i = j then return
else
begin
comment: find minimum entry in list.
min ← A[i];
position ← i;
for k ← i + 1 until j do
if A[k] < min then
begin
min ← A[k];
position ← k
end;
comment: interchange minimum with A[i].
A[position] ← A[i];
A[i] ← min;
comment: sort remainder of the list.
call RECSORT(i + 1, j)
end
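Both procedures translate directly into Python. The sketch below is illustrative: it passes the array explicitly instead of using a global A, uses 0-based indices, and returns `None` in place of “not found”; these conventions are assumptions and not part of the original answer.

```python
def seqsearch(a, arg, i, j):
    """Recursive sequential search for arg in a[i..j]; index or None."""
    if a[i] == arg:
        return i
    if i == j:
        return None
    return seqsearch(a, arg, i + 1, j)

def recsort(a, i, j):
    """Recursive selection sort of a[i..j] in place."""
    if i == j:
        return
    # Find the position of the minimum entry in a[i..j].
    position = i
    for k in range(i + 1, j + 1):
        if a[k] < a[position]:
            position = k
    # Interchange the minimum with a[i], then sort the remainder.
    a[i], a[position] = a[position], a[i]
    recsort(a, i + 1, j)
```

For example, sorting `[5, 2, 9, 1, 5, 6]` with `recsort(data, 0, 5)` leaves `[1, 2, 5, 5, 6, 9]`, after which `seqsearch(data, 9, 0, 5)` returns index 5.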
Section 6.1
1. Assume [0, 1] is finite with cardinality n. Then for some n ∈ N, there is a bijection
f: {0, 1, ..., n − 1} to [0, 1]. We show there is a real number z ∈ [0, 1], such that
f(m) ≠ z for any m ∈ {0, 1, ..., n − 1}.
Suppose f(0) = $x_0$,
f(1) = $x_1$,
f(n − 1) = $x_{n-1}$.
4. Suppose f: A → B is an injection and A is infinite. To show B is infinite we construct
an injection g: B → B such that g(B) is a proper subset of B.
Since f is injective from A to B, f is bijective from A to f(A); thus an inverse
function $f^{-1}$ exists which is a bijection from f(A) to A. (Note that we are using $f^{-1}$
to denote a function from f(A) to A rather than from B to A.) Moreover, since A
is infinite, there is an injection h: A → A such that h(A) is a proper subset of A.
We define the function g: B → B as follows:
g(x) = x if x ∈ B − f(A).
g(x) = $f \circ h \circ f^{-1}(x)$ if x ∈ f(A).
Then g is an injection from B to itself, and $g(f(A)) = fhf^{-1}(f(A)) = fh(A)$. Since h(A) is
properly contained in A and f is an injection, fh(A) is properly contained in f(A). It
follows that g(B) ≠ B and hence B is infinite. ∎
Proof of Theorem 6.1.4(d): It suffices to construct an injection from A to $A^B$. Define
f: A → $A^B$ as follows:
f(x) = g where g(b) = x for all b in B;
that is, f(x) is the constant function g: B → A such that g(b) = x. Clearly, f is an
injection. Since A is infinite, it follows that $A^B$ is infinite by Theorem 6.1.3. ∎
Section 6.2
(c) For each n ∈ N, let ñ denote the sequence of digits of the binary representation
of n in reverse order. Let <$w_0, w_1, w_2, \ldots$> be an enumeration without
repetitions of Σ*. Then define f: N → P({a, b}*) as follows
f(n) = {$w_i$ | the (i + 1)th digit of ñ is 1, where i ≥ 0}.
For example, if the enumeration of Σ* is in standard order,
<Λ, a, b, aa, ab, ba, bb, aaa, aab, ...>
then
f(0) = ∅,
f(1) = {Λ},
f(2) = {a},
f(3) = {Λ, a},
f(4) = {b},
etc.
The function f is a bijection from N to the set of finite subsets of Σ*.
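The map f can be computed by reading the binary digits of n from the low-order end while stepping through the standard-order enumeration. In the sketch below the empty Python string stands for Λ, and the generator `standard_order` is an assumed helper that lists strings over {a, b} by length and then alphabetically.

```python
from itertools import count, product

def standard_order(alphabet="ab"):
    """Yield all strings over the alphabet: shortest first, then alphabetical."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

def f(n: int) -> set:
    """Finite set of strings w_i such that bit i of n (low-order first) is 1."""
    result = set()
    words = standard_order()
    i = 0
    while n >> i:                 # stop once no higher 1 bits remain
        w = next(words)           # w is w_i in the enumeration
        if (n >> i) & 1:
            result.add(w)
        i += 1
    return result
```

This reproduces the values listed above: `f(3)` is the set containing the empty string and "a", and `f(4)` is the set containing "b".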
2. (b) Define f: [0, 1] → [0, 1) by
f(1) = 1/2,
f(1/2) = 1/3,
f(1/n) = 1/(n + 1) for n ∈ I+,
f(x) = x for x ≠ 1/n.
Then f is a bijection from [0, 1] to [0, 1). Now let g: [0, 1) → [0, ∞) be defined
by g(x) = x/(1 − x). Then gf: [0, 1] → [0, ∞) is a bijection.
3. Let $f_A$ be a bijection from [0, 1] to A, $f_B$ be a bijection of [0, 1] to B, $f_D$ be a bijection
from N to D, and $f_E$ be a bijection from {0, 1, 2, ..., n − 1} to E.
(a) Let g;:[0, 4) — (0, 1],
gi(x) =1fa#—2) ifx = 1/nforn> 2 wheren € N,
2,(x) = 2x otherwise;
(c) Let <$d_0, d_1, d_2, \ldots$> and <$e_0, e_1, \ldots, e_{n-1}$> be enumerations without repetitions
of D and E respectively. Define a function f as follows:
f: N → D × E,
f(k) = <$d_{\lfloor k/n \rfloor}$, $e_{k \bmod n}$>
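The index arithmetic behind this map is ordinary division with remainder. The sketch below works with index pairs (⌊k/n⌋, k mod n) in place of the elements themselves; `pair` and `unpair` are names chosen here for illustration.

```python
def pair(k: int, n: int) -> tuple:
    """Indices (i, j) of the pair <d_i, e_j> assigned to k, where |E| = n."""
    return (k // n, k % n)

def unpair(i: int, j: int, n: int) -> int:
    """Inverse of pair: the unique k with pair(k, n) == (i, j)."""
    return i * n + j
```

Because every k has exactly one quotient and remainder on division by n, `unpair` inverts `pair`, which is the reason f is a bijection from N to D × E.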
would produce the number .011111..., which is different from every represen-
tation in the list but denotes a number equal to the first item on the list.
7. The digits of y form an infinite string which has a left end but no right end. Reversing
the digits results in a string which has a right end but no left end, i.e., this string is
not a member of Σ*, where Σ is the set of decimal digits. Since only strings in Σ*
represent elements of N, the result of the diagonalization is not an element of N.
Section 6.3
(ii) Let a and b be elements of S and suppose a ≤ b and b ≤ a. It follows from
Theorem 6.3.3 that a = b, and hence ≤ is antisymmetric.
(iii) Let a, b and c be elements of S and assume a ≤ b and b ≤ c. Let A, B
and C be sets with cardinalities a, b and c respectively. Since a ≤ b,
an injection f exists from A to B. Since b ≤ c, an injection g exists from
B to C. Let h be the composite function h = gf, where gf: A → C. Then,
by Theorem 4.2.1(b), h is injective and therefore a ≤ c. It follows that
≤ is transitive.
To show that ≤ is a linear order, we need to show that any two
elements of S are comparable, i.e., either a ≤ b or b ≤ a. By Theorem
6.3.2, for any a, b ∈ S, a < b, a = b, or b < a. By Definition 6.3.2, if
a < b, then a ≤ b; if a = b, then a ≤ b, and if b < a, then b ≤ a. Hence
a and b are comparable and therefore ≤ is a linear order. ∎
10. (a) |Q| = ℵ₀. We show this by noting that
Q = {0} ∪ Q+ ∪ Q−
where Q+ is the set of positive rationals and Q− is the set of negative rationals.
Clearly |Q+| = |Q−| and therefore Q is the union of three countable sets.
Hence, by Theorem 6.2.3, Q is countable, i.e., |Q| ≤ ℵ₀. Since there is an
injection from Q+ to Q and |Q+| = ℵ₀, it follows that ℵ₀ ≤ |Q|. Therefore, by
Theorem 6.3.3, |Q| = ℵ₀.
(b) |[0, 1] × [0, 1]| = c.
(i) The function f: [0, 1] → [0, 1] × [0, 1] defined by f(x) = <x, 0> is an
injection. Therefore, c = |[0, 1]| ≤ |[0, 1] × [0, 1]|.
(ii) Let x = .x_0x_1x_2... and y = .y_0y_1y_2... be the decimal expansions of
x, y ∈ [0, 1], where we choose a representation which does not terminate
in an infinite sequence of 9's. (Thus, .50000... is acceptable, but .4999...
is not. This ensures that each x ∈ [0, 1] will have a unique representation.)
Define g as follows:
Section 6.4
1. (a) ℵ₀
(c) 0 if n = 0; ℵ₀ if n ≥ 1.
@ 0
2. (a) c
(b) ℵ₀
3. Let α, β, and δ be cardinal numbers of the sets A, B and C respectively and assume
A, B and C are pairwise disjoint. Then
α + β = |A ∪ B|
= |B ∪ A| by commutativity of set union
= β + α,
so addition of cardinal numbers is commutative. Moreover,
α + (β + δ) = |A| + |B ∪ C|
= |A ∪ (B ∪ C)|
= |(A ∪ B) ∪ C| by associativity of set union
= |A ∪ B| + |C|
= (α + β) + δ;
hence addition of cardinal numbers is associative. ∎
4. Although we have not proved it, the result of the operations of addition, multiplication
and exponentiation of cardinal numbers is independent of the sets chosen as
representatives for the cardinal numbers, i.e., if |A| = |B|, |C| = |D| and
A ∩ C = B ∩ D = ∅, then
|A| + |C| = |B| + |D|.
This is not the case with the operation of subtraction proposed in the problem. For
example, let A = B = C = N and let D be the set of even integers. Then |A| = |B|
and |C| = |D|, but |A − C| = 0 ≠ ℵ₀ = |B − D|.
5. (a) Let A, B, and D be sets such that |A| = a, |B| = b, |D| = d, and
A ∩ D = B ∩ D = ∅.
Since a ≤ b, there exists an injection f: A → B. Define g as follows:
g: A ∪ D → B ∪ D,
g(x) = f(x) if x ∈ A,
     = x if x ∈ D.
Then g is an injection from A ∪ D to B ∪ D; hence |A ∪ D| ≤ |B ∪ D|.
Since A ∩ D = B ∩ D = ∅, it follows that a + d ≤ b + d. ∎
(b) Let a = n, b = n + 1, and d = ℵ₀. Then a < b but a + d = ℵ₀ = b + d.
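The construction of g in part (a) can be illustrated on finite sets. The following is a minimal sketch; the sets A, B, D and the injection f are hypothetical sample data, and the function name `extend_injection` is ours.

```python
def extend_injection(f, A, D):
    """Return g on A ∪ D with g(x) = f[x] for x in A and g(x) = x for x in D."""
    assert not (A & D), "A and D must be disjoint"
    g = {x: f[x] for x in A}
    g.update({x: x for x in D})
    return g

# Hypothetical sample data with A ∩ D = B ∩ D = ∅.
A = {1, 2}
B = {'p', 'q', 'r'}
D = {10, 11}
f = {1: 'p', 2: 'r'}          # an injection from A to B

g = extend_injection(f, A, D)
is_injective = len(set(g.values())) == len(g)
lands_in_B_union_D = set(g.values()) <= (B | D)
print(g, is_injective, lands_in_B_union_D)
```

Injectivity of g depends on B ∩ D = ∅: without it, some f(x) could coincide with an element of D that g maps to itself.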
8. The set {0, 1}^N has cardinality 2^ℵ₀. Since this is the set of characteristic functions of
subsets of N, it follows that |𝒫(N)| = 2^ℵ₀. In (b) of the examples immediately preceding
Theorem 6.3.5, we showed |𝒫(N)| = c; hence 2^ℵ₀ = c. In (c) of the same examples,
we showed |N^N| = c. But |N^N| = ℵ₀^ℵ₀. Since for every n ≥ 2, 2 ≤ n ≤ ℵ₀, it
follows from Theorem 6.4.8 that
Section 7.1
2. unary
+ — [x—y| max min |x|
(a) Y Y Y Y Y Y Y Y
(b) Y Y N Y Y Y N Y
(c) N N N Y Y Y N Y
(d) N N N N Y Y Y Y
(e) N N N N Y Y N N
(f) Y Y Y Y Y Y Y
0_1 = 0_1 ∘ 0_2 = 0_2. ∎
6. (a) This algebra is just a presentation of the integers {0, 1, 2, 3} under addition mod 4.
The operation is commutative and associative. The element a is an identity.
All elements have inverses (because the element a appears in every row of the
operation table). No zero element exists (because no row (column) has entries
which are all equal to the row (column) label).
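The properties claimed in 6(a) can be checked mechanically from an operation table. The sketch below assumes the labelling a, b, c, d ↔ 0, 1, 2, 3 (the text states only that a is the identity); the helper names are ours.

```python
labels = "abcd"   # assumed labelling: a, b, c, d stand for 0, 1, 2, 3

# Operation table for addition mod 4, written with the letter labels.
op = {(x, y): labels[(i + j) % 4]
      for i, x in enumerate(labels)
      for j, y in enumerate(labels)}

def find_identity(elems, op):
    """Return an element e with e∘x = x∘e = x for all x, or None."""
    for e in elems:
        if all(op[e, x] == x and op[x, e] == x for x in elems):
            return e
    return None

def find_zero(elems, op):
    """Return an element z with z∘x = x∘z = z for all x, or None."""
    for z in elems:
        if all(op[z, x] == z and op[x, z] == z for x in elems):
            return z
    return None

e = find_identity(labels, op)
commutative = all(op[x, y] == op[y, x] for x in labels for y in labels)
associative = all(op[op[x, y], z] == op[x, op[y, z]]
                  for x in labels for y in labels for z in labels)
all_invertible = all(any(op[x, y] == e for y in labels) for x in labels)
print(e, find_zero(labels, op), commutative, associative, all_invertible)
```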
7. (a) a
ae
(d) a b
a a b
b b b
(f) a b
a aa
(h) a b
a a b
b aa
Section 7.2
1. Let T_k = {x | x ∈ R and x ≤ k}. Then k is a zero element of <T_k, max>, but no
identity element exists.
2. We must show that <S_k, +> is a subalgebra of <I, +>. By definition of S_k it follows
that S_k ⊆ I. Furthermore, since k > 0, the set S_k is closed under addition, i.e.
if x > k and y > k, then x + y > k;
therefore <S_k, +> is an algebra. It follows that <S_k, +> is a subalgebra of <I, +>
and hence a subsemigroup of <I, +>. ∎
5. Let <T, ∘', 1'> be a subalgebra of a monoid <S, ∘, 1>. Then T ⊆ S, 1' = 1, and
a ∘' b = a ∘ b for all a, b ∈ T. The operation ∘ is associative on S; hence the
operation ∘' is associative on T since
(a ∘' b) ∘' c = (a ∘ b) ∘ c = a ∘ (b ∘ c) = a ∘' (b ∘' c).
Moreover, 1' is an identity with respect to ∘', since
1' ∘' x = 1 ∘ x = x.
Therefore <T, ∘', 1'> is a monoid. ∎
8. (Operation table over the elements 0, 1, 2, 3, 4; entries illegible in the source.)
a ∘ x = a ∘ y ⇒ ā ∘ (a ∘ x) = ā ∘ (a ∘ y)
⇒ (ā ∘ a) ∘ x = (ā ∘ a) ∘ y
⇒ 1 ∘ x = 1 ∘ y
⇒ x = y.
In the same way, we can show that if x ∘ a = y ∘ a, then x = y. ∎
(b) By definition, a ∘ S = {a ∘ x | x ∈ S}. Since S is closed under ∘, a ∘ S ⊆ S.
Now suppose y is an arbitrary element of S. Then for some x ∈ S, namely
x = ā ∘ y,
a ∘ x = a ∘ (ā ∘ y) = (a ∘ ā) ∘ y = 1 ∘ y = y;
hence a ∘ S ⊇ S. Therefore a ∘ S = S. Similarly, one can show that
S = S ∘ a. ∎
(c) Let x be the inverse of ā; then
ā ∘ x = x ∘ ā = 1,
and
x = 1 ∘ x = (a ∘ ā) ∘ x = a ∘ (ā ∘ x) = a ∘ 1 = a. ∎
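The three group properties proved above (cancellation, a ∘ S = S, and the inverse of the inverse) can be spot-checked on a small finite group. The sketch below uses the nonzero integers mod 7 under multiplication, a group chosen by us for illustration and not taken from the text.

```python
# Hypothetical example group: {1, ..., 6} under multiplication mod 7.
S = set(range(1, 7))

def op(x, y):
    return (x * y) % 7

def inverse(a):
    """Find the inverse of a by exhaustive search over the carrier."""
    return next(x for x in S if op(a, x) == 1 and op(x, a) == 1)

# (a) Left cancellation: a∘x = a∘y implies x = y.
left_cancel = all(x == y
                  for a in S for x in S for y in S
                  if op(a, x) == op(a, y))

# (b) a∘S = S for every a in S.
coset_equal = all({op(a, x) for x in S} == S for a in S)

# (c) The inverse of the inverse of a is a.
double_inverse = all(inverse(inverse(a)) == a for a in S)

print(left_cancel, coset_equal, double_inverse)
```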
12. Variety Cardinality
a group 1
b semigroup 1
c semigroup 2
d group 4
e semigroup 3
13. (b) The algebra <{R^n | n ∈ I+}, composition, R^k> is a monoid if and only if
R^k R^j = R^j for all positive j. This holds if and only if R^k R = R. Thus a necessary
and sufficient condition is that there exist a k such that R^(k+1) = R. (Note
that there need not be a k such that R^k = R^0: an example is
R = {<a, b>, <b, a>, <c, a>}.)
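The example relation in the note can be verified directly by computing its powers. A minimal sketch, with `compose` as our own helper name:

```python
def compose(r, s):
    """Relational composition: <x, z> whenever <x, y> is in r and <y, z> is in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

R = {('a', 'b'), ('b', 'a'), ('c', 'a')}
R2 = compose(R, R)
R3 = compose(R2, R)
R0 = {(x, x) for x in 'abc'}      # the identity relation on {a, b, c}

print(sorted(R2))
print(R3 == R, R2 == R0)
```

Here R^3 = R, so k = 2 satisfies R^(k+1) = R and R^2 serves as the identity of the monoid, yet R^2 differs from R^0 because it contains <c, b> in place of <c, c>.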
16. (a) Since k binary digits are used to represent each representable integer, the
carrier has 2^k elements. The variety is a group, because 0 is an additive identity
and if 2^k − x is added to any representable integer x, the result will be 0.
(b) The carrier still has 2^k elements, but the variety is a monoid with identity
element 0. For every representable x and y, the operation ⊕ of the monoid is
defined by x ⊕ y = min(x + y, 2^k − 1).
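The contrast between the two kinds of k-bit addition can be seen by brute force for a small k. This sketch assumes that part (a) refers to addition modulo 2^k; the choice k = 3 and the function names are ours.

```python
K = 3
M = 2 ** K                 # number of representable integers
carrier = range(M)

def add_wrap(x, y):
    """Assumed operation of part (a): addition modulo 2^k."""
    return (x + y) % M

def add_sat(x, y):
    """Operation of part (b): x ⊕ y = min(x + y, 2^k − 1)."""
    return min(x + y, M - 1)

# In (a), 2^k − x (taken mod 2^k so that x = 0 is covered) is an inverse of x.
wrap_has_inverses = all(add_wrap(x, (M - x) % M) == 0 for x in carrier)

# In (b), ⊕ is associative with identity 0, but inverses are missing.
sat_associative = all(add_sat(add_sat(x, y), z) == add_sat(x, add_sat(y, z))
                      for x in carrier for y in carrier for z in carrier)
sat_identity = all(add_sat(0, x) == x == add_sat(x, 0) for x in carrier)
sat_has_inverses = all(any(add_sat(x, y) == 0 for y in carrier) for x in carrier)

print(wrap_has_inverses, sat_associative, sat_identity, sat_has_inverses)
```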
Section 7.3
1. (a) An isomorphism is a bijective map from one carrier to another; if the carriers
of two algebras have different cardinalities, then no bijections exist from one to
the other.
(b) Let A_1 = <{a, b}, ∘> and A_2 = <{c, d}, □>, where ∘ and □ have the operation
tables (entries illegible in the source)
= h⁻¹(c) ∘ h⁻¹(d),
and
h⁻¹(k_2) = h⁻¹(h(k_1)) = h⁻¹h(k_1) = k_1.
Thus h⁻¹ is an isomorphism from A_2 to A_1, which establishes that A_2 ~ A_1
and that ~ is symmetric.
(iii) Suppose A_1 ~ A_2 and A_2 ~ A_3; and let h be an isomorphism from A_1 to A_2
and g be an isomorphism from A_2 to A_3. We show that gh is an isomorphism
from A_1 to A_3:
gh(a ∘ b) = g(h(a ∘ b))
= g(h(a) □ h(b))
= gh(a) △ gh(b).
Moreover,
gh(k_1) = g(k_2) = k_3.
It follows that A_1 ~ A_3 and that ~ is transitive. ∎
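The transitivity argument can be illustrated with three small algebras of our own choosing: the integers mod 2 under addition, {1, −1} under multiplication, and the truth values under exclusive or. The maps h and g below are hypothetical examples; the check confirms that their composite gh is again an isomorphism.

```python
def is_isomorphism(h, dom, op1, op2, codomain):
    """Check that h is a bijection dom -> codomain with h(a op1 b) = h(a) op2 h(b)."""
    values = list(h[a] for a in dom)
    bijective = len(set(values)) == len(values) and set(values) == set(codomain)
    preserves = all(h[op1(a, b)] == op2(h[a], h[b]) for a in dom for b in dom)
    return bijective and preserves

A1 = [0, 1]
A2 = [1, -1]
A3 = [False, True]

def circ(x, y):      # operation of A1: addition mod 2
    return (x + y) % 2

def box(x, y):       # operation of A2: multiplication
    return x * y

def tri(x, y):       # operation of A3: exclusive or
    return x != y

h = {0: 1, 1: -1}             # isomorphism from A1 to A2
g = {1: False, -1: True}      # isomorphism from A2 to A3
gh = {a: g[h[a]] for a in A1} # composite map from A1 to A3

print(is_isomorphism(h, A1, circ, box, A2),
      is_isomorphism(g, A2, box, tri, A3),
      is_isomorphism(gh, A1, circ, tri, A3))
```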
5. (Proof of Theorem 7.3.3b) Let A = <S, ∘, 1> be a monoid and A' = <S', ∘', 1'>
(note that A' need not be a monoid). The same proof given in the text for part (a)
of the Theorem establishes that the operation ∘' is closed and associative over the set
h(S). To show that 1' is an identity with respect to ∘' for the set h(S), we note that
h(1) = 1' since h is a homomorphism from A to A'. Then for any x ∈ h(S), there is
some a ∈ S such that h(a) = x, and
1' ∘' x = h(1) ∘' h(a) = h(1 ∘ a) = h(a) = x.
Thus 1' is an identity for the set h(S) and hence <h(S), ∘', 1'> is a monoid. ∎
6. (a) The function f: N → S is defined by
f(n) = n mod 2^k
and is a homomorphism since
f(a + b) = (a + b) mod 2^k = (a mod 2^k + b mod 2^k) mod 2^k
= a mod 2^k ⊕ b mod 2^k
= f(a) ⊕ f(b).
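The homomorphism property can be spot-checked numerically. The sketch assumes that ⊕ here denotes addition modulo 2^k on S = {0, ..., 2^k − 1}; the value k = 4 and the test range are arbitrary choices of ours.

```python
K = 4
M = 2 ** K

def f(n):
    """The map f(n) = n mod 2^k from N to S."""
    return n % M

def oplus(x, y):
    """Assumed operation on S: addition modulo 2^k."""
    return (x + y) % M

homomorphic = all(f(a + b) == oplus(f(a), f(b))
                  for a in range(100) for b in range(100))
print(homomorphic)
```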
Section 7.4
Then
a + c ~ b + c ⇔ p/q + t/u ~ r/s + t/u
⇔ (pu + tq)/qu ~ (ru + ts)/su
⇔ (pu + tq)su = (ru + ts)qu.
But since ps = rq, (pu + tq)su = psuu + tqsu = rquu + tqsu = (ru + ts)qu. Hence
a + c ~ b + c. Moreover, since + is commutative, a ~ b ⇒ c + a ~ c + b. To show
that a ~ b implies a − c ~ b − c, the preceding proof can be altered by replacing
each occurrence of + by −. To prove that a ~ b implies that c − a ~ c − b, however,
p/q ~ r/s ⇒ ps = rq ⇒ −ps = −rq ⇒ (−p)/q ~ (−r)/s ⇒ −(p/q) ~ −(r/s).
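The claim that ~ respects addition and negation of fractions can be tested exhaustively on a small sample. In this sketch a fraction p/q is modelled as a pair (p, q) with q > 0; the representation, the sample range, and the function names are our assumptions.

```python
def equiv(x, y):
    """p/q ~ r/s if and only if ps = rq."""
    (p, q), (r, s) = x, y
    return p * s == r * q

def add(x, y):
    """p/q + t/u = (pu + tq)/qu, without reducing to lowest terms."""
    (p, q), (t, u) = x, y
    return (p * u + t * q, q * u)

def neg(x):
    """−(p/q) = (−p)/q."""
    return (-x[0], x[1])

# Sample of unreduced fractions with small numerators and denominators.
fracs = [(p, q) for p in range(-3, 4) for q in range(1, 4)]

add_respects = all(equiv(add(a, c), add(b, c))
                   for a in fracs for b in fracs if equiv(a, b)
                   for c in fracs)
neg_respects = all(equiv(neg(a), neg(b))
                   for a in fracs for b in fracs if equiv(a, b))
print(add_respects, neg_respects)
```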
3. (a) This is not a congruence relation, since
1 ~ −2 but −1 + 1 ≁ −2 + 1.
(b) This is not a congruence relation because it is not an equivalence relation; it is
reflexive and symmetric, but not transitive.
4. Let ~ be any equivalence relation over {0, 1, 2, ..., k} such that every equivalence
class of ~ is a sequence of successive integers:
a ~ b ⇒ ∀x[a ≤ x ≤ b ⇒ a ~ x].
Then ~ is a congruence relation on the algebra <{0, 1, 2, ..., k}, max>.
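A brute-force check makes the role of the "successive integers" condition visible. The sketch below compares two hypothetical partitions of {0, ..., 5}: one whose blocks are intervals and one whose blocks are not.

```python
def is_congruence(classes, op, carrier):
    """Check that a ~ b implies op(a, c) ~ op(b, c) and op(c, a) ~ op(c, b)."""
    block_of = {x: i for i, block in enumerate(classes) for x in block}
    return all(block_of[op(a, c)] == block_of[op(b, c)]
               and block_of[op(c, a)] == block_of[op(c, b)]
               for a in carrier for b in carrier if block_of[a] == block_of[b]
               for c in carrier)

carrier = range(6)
intervals = [[0, 1], [2], [3, 4, 5]]   # every class is a run of successive integers
scattered = [[0, 2], [1], [3, 4, 5]]   # the class {0, 2} skips 1

print(is_congruence(intervals, max, carrier))
print(is_congruence(scattered, max, carrier))
```

The second partition fails because 0 ~ 2 while max(0, 1) = 1 and max(2, 1) = 2 lie in different classes.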
5. (a) Since K is an ideal, K ∘ 0 ⊆ K. But by the properties of a zero element,
K ∘ 0 = {0}. Therefore, {0} ⊆ K; i.e., 0 ∈ K.
6. The set of multiples of any integer k is an ideal of <I, ·>.
The relation ~ is a congruence relation on <S, □> if ~ is an equivalence relation
and for all a, b, c, d ∈ S, if a ~ b, then
(i) □(a, c, d) ~ □(b, c, d)
(ii) □(c, a, d) ~ □(c, b, d)
(iii) □(c, d, a) ~ □(c, d, b).
From these conditions we can show the following (which can be used as an alternative
definition):
An equivalence relation ~ is a congruence relation over <S, □> if and only if
for all elements a, a', b, b', c, c' ∈ S,
Section 7.5
2. Define the map f from A/~ to <h(S), ∘', △', k'> as follows:
an analogous proof can be used to show that <1, 1'> is a right identity. It follows that
the product algebra of two monoids is a monoid. J
6. (a) Always.
(b) This is easily shown by establishing that the function
h: A → (A × A)/~,
h(x) = [x] = {<x, y>},
is an isomorphism.
7. (a) This can be shown by constructing the operation tables of the two algebras
and showing that they are identical except for notation. In particular, the
map f such that f(<0, 0>) = 0, f(<1, 1>) = 1, f(<0, 2>) = 2, f(<1, 0>) = 3,
f(<0, 1>) = 4 and f(<1, 2>) = 5 is an isomorphism from A_2 × A_3 to A_6.
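Instead of comparing operation tables by hand, the map f can be checked by exhaustive computation. This sketch assumes that A_n denotes the integers {0, ..., n − 1} under addition mod n, which is consistent with the listed values of f but is not stated in this answer.

```python
# The map f as listed above, from pairs in A_2 × A_3 to elements of A_6.
f = {(0, 0): 0, (1, 1): 1, (0, 2): 2, (1, 0): 3, (0, 1): 4, (1, 2): 5}

def prod_add(x, y):
    """Componentwise addition in the product algebra (mod 2, mod 3)."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

bijective = sorted(f.values()) == list(range(6))
homomorphic = all(f[prod_add(x, y)] == (f[x] + f[y]) % 6
                  for x in f for y in f)
print(bijective, homomorphic)
```

Under this assumption f is the inverse of the map n ↦ <n mod 2, n mod 3>, which is why the two tables agree up to notation.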