Basic Set Theory
Paul L. Bailey
Preface
Chapter I. Symbolic Logic
1. Propositions
2. Logical Operators
3. Tautologies and Contradictions
4. Generation of Operators
5. Exercises
Chapter II. Sets
1. Sets and Elements
2. Subsets and Quantifiers
3. Set Operations
4. Cartesian Product
5. Numbers
6. Exercises
Chapter III. Functions
1. Functions
2. Images and Preimages
3. Composition of Functions
4. Restrictions and Bijections
5. Exercises
Chapter IV. Collections
1. Collections of Sets
2. Collections of Functions
3. Power Sets
4. Partitions
5. Exercises
Chapter V. Relations
1. Relations
2. Partial Orders and Total Orders
3. Equivalence Relations
4. Equivalence Classes
5. Partitions induced by Equivalence Relations
6. Partitions induced by Functions
7. Functions defined on Partitions
8. Canonical Functions
9. Exercises
CHAPTER I
Symbolic Logic
1. Propositions
A proposition is a statement which is either true or false, although we may not
know which. Propositions are denoted by lowercase letters such as p, q or r. The
truth or falsity of the proposition is called its truth value, and the two possible
truth values are labeled T for TRUE and F for FALSE. The truth value of the
proposition p is denoted V(p).
For example, the statement “The sun rises in the east” is a proposition, and if
we wish to label this statement p, we write
p = “The sun rises in the east”.
Similarly, we may write
q = “The sun rises in the west”.
In this case, V(p) = T and V(q) = F.
2. Logical Operators
Propositions may be modified and combined by the use of logical operators,
which take one or more propositions and create a new one which has its own truth
value. The resultant truth value is uniquely determined by the proposition(s) op-
erated upon and the operator(s) used. Operators which accept one input are called
unary operators, and operators which accept two inputs are called binary operators.
The behavior of each logical operator is determined by a truth table. The truth
table lists all possible combinations of the truth values of the inputs, and states the
operator’s output for each combination of inputs.
The simplest useful logical operator is the negation operator NOT (¬), which
operates on a single proposition and reverses its truth value. Thus
¬(“Pigs are mammals”) = “Pigs are not mammals”.
The action that NOT has on the truth value of a proposition is defined by its
truth table, which lists the possible truth values of a proposition p side by side with
the truth value of ¬p:
p ¬p
T F
F T
Table 1. NOT Truth Table
p q p∧q
T T T
T F F
F T F
F F F
Table 2. AND Truth Table
p q p∨q
T T T
T F T
F T T
F F F
Table 3. OR Truth Table
Thus if we let p and q be as above and we assume that pigs are mammals who
cannot fly, we have V(p) = T, V(q) = F, V(p ∧ q) = F and V(p ∨ q) = T.
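The three tables above can also be generated mechanically. The following Python sketch is only an illustration: the name truth_table and the use of Python's True and False in place of T and F are assumptions of the sketch, not notation from the text.

```python
from itertools import product

def truth_table(op, arity):
    """List (inputs, output) rows for a logical operator, with T listed before F."""
    return [(row, op(*row)) for row in product([True, False], repeat=arity)]

# Hypothetical Python stand-ins for the operators NOT, AND, and OR.
NOT = lambda p: not p
AND = lambda p, q: p and q
OR = lambda p, q: p or q

for name, op, arity in [("NOT", NOT, 1), ("AND", AND, 2), ("OR", OR, 2)]:
    print(name)
    for row, out in truth_table(op, arity):
        # Print each row in the same T/F layout used by the tables in the text.
        print(" ".join("T" if v else "F" for v in (*row, out)))
```

The rows are listed in the same order as in Tables 1 through 3, so the printed output can be compared with them line by line.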
At this point we adopt the convention that the NOT operator “binds
tighter” than any other operator; that is, it takes precedence in the order of operations
and applies only to the object on its immediate right. Thus ¬p ∧ q means
(¬p) ∧ q as opposed to ¬(p ∧ q). We are now ready for our first theorem.
Theorem I.2. (DeMorgan’s Laws) For any two propositions p and q we have
(1) V(¬(p ∨ q)) = V(¬p ∧ ¬q);
(2) V(¬(p ∧ q)) = V(¬p ∨ ¬q).
Proof. The proofs of these assertions are truth tables in which each step is ex-
panded, and the columns corresponding to either side of the equalities above are
compared.
p q p∨q ¬(p ∨ q) ¬p ¬q ¬p ∧ ¬q
T T T F F F F
T F T F F T F
F T T F T F F
F F F T T T T
p q p∧q ¬(p ∧ q) ¬p ¬q ¬p ∨ ¬q
T T T F F F F
T F F T F T T
F T F T T F T
F F F T T T T
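The comparison of columns carried out in this proof can be sketched in Python by running through all four assignments of truth values. The helper name same_truth_value is an assumption of this sketch.

```python
from itertools import product

def same_truth_value(lhs, rhs, arity=2):
    """True when lhs and rhs agree on every assignment of truth values."""
    return all(lhs(*row) == rhs(*row)
               for row in product([True, False], repeat=arity))

# (1) V(not (p or q)) = V(not p and not q)
first_law = same_truth_value(lambda p, q: not (p or q),
                             lambda p, q: (not p) and (not q))
# (2) V(not (p and q)) = V(not p or not q)
second_law = same_truth_value(lambda p, q: not (p and q),
                              lambda p, q: (not p) or (not q))
print(first_law, second_law)
```

Such a check is no more than the truth table proof carried out by machine; it is feasible because a composite of n atoms has only 2^n rows.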
If propositions are linked together to form new propositions via logical opera-
tors, the result may be called a composite proposition. Propositions which are not
presented as composites are known as atomic propositions, or atoms. It is critical
to realize that the propositional calculus we are developing cannot tell us anything
about the truth or falsity of atoms. However, if we know the truth value of atoms
prior to applying the propositional calculus to some composite of them, it will tell
us the truth value of that composite.
The proof of DeMorgan’s Laws points out that even complicated composites
have corresponding truth tables which relate the possible truth values of potentially
unknown propositions to the truth value of the composite. In particular, suppose
we do not know the truth values of p and q, and we let r = ¬(p∧q) and s = ¬p∨¬q.
Then V(r) = V(s) regardless of the meaning of p and q.
Corollary I.3. The disjunction operator OR may be defined in terms of the nega-
tion operator NOT and the conjunction operator AND as
V(a ∨ b) = V(¬(¬a ∧ ¬b)).
Proof. Apply Assertion I.1 to DeMorgan’s First Law (take the NOT of both sides).
We may think of the NOT operator as distributing into the AND operator, but
when it does so it changes AND to OR. An analogous statement applies to the OR
operator. However, we do have actual distributivity of AND over OR and of OR
over AND.
Theorem I.4. (Distributive Laws) For any three propositions p, q, and r we have
(1) V((p ∨ q) ∧ r) = V((p ∧ r) ∨ (q ∧ r));
(2) V((p ∧ q) ∨ r) = V((p ∨ r) ∧ (q ∨ r)).
Proof. The tables tell the story.
p q r p∨q (p ∨ q) ∧ r p∧r q∧r (p ∧ r) ∨ (q ∧ r)
T T T T T T T T
T T F T F F F F
T F T T T T F T
T F F T F F F F
F T T T T F T T
F T F T F F F F
F F T F F F F F
F F F F F F F F
p q r p∧q (p ∧ q) ∨ r p∨r q∨r (p ∨ r) ∧ (q ∨ r)
T T T T T T T T
T T F T T T T T
T F T F T T T T
T F F F F T F F
F T T F T T T T
F T F F F F T F
F F T F T T T T
F F F F F F F F
Intuitively we realize that AND and OR are commutative operators, which is
to say that p ∧ q means the same thing as q ∧ p and p ∨ q is just another way of
saying q ∨ p. Thus we are content when we notice that our truth tables agree. It is
also easily verified that AND and OR are associative operators, and we leave it to
the reader to verify this.
Assertion I.5. (Commutativity Laws) For any two propositions p and q we have
(1) V(p ∧ q) = V(q ∧ p);
(2) V(p ∨ q) = V(q ∨ p).
Assertion I.6. (Associativity Laws) For any propositions p, q, and r we have
(1) V((p ∧ q) ∧ r) = V(p ∧ (q ∧ r));
(2) V((p ∨ q) ∨ r) = V(p ∨ (q ∨ r)).
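For readers who prefer to let a machine fill in the tables, Assertions I.5 and I.6 can be checked by brute force. This is a minimal sketch; the helper holds_for_all and the dictionary laws are names invented for the illustration.

```python
from itertools import product

def holds_for_all(law, arity):
    """True when the law evaluates to True under every assignment of truth values."""
    return all(law(*row) for row in product([True, False], repeat=arity))

laws = {
    "AND commutative": holds_for_all(lambda p, q: (p and q) == (q and p), 2),
    "OR commutative": holds_for_all(lambda p, q: (p or q) == (q or p), 2),
    "AND associative": holds_for_all(
        lambda p, q, r: ((p and q) and r) == (p and (q and r)), 3),
    "OR associative": holds_for_all(
        lambda p, q, r: ((p or q) or r) == (p or (q or r)), 3),
}
for name, ok in laws.items():
    print(name, ok)
```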
Commutativity and associativity do not hold for all of the commonly used
logical operators. This brings us to the implication operator IMP (⇒), where we
read p ⇒ q as “p implies q” or as “if p, then q”. We have a name for the components
of an implication: p is called the hypothesis and q is called the conclusion. One
may be surprised by the truth table of this logical operator the first time it is
encountered:
p q p⇒q
T T T
T F F
F T T
F F T
Table 4. IMP Truth Table
A false proposition implies anything one wishes it to imply. Thus the proposition
“If pigs fly, then the earth is flat” is true whether or not the earth is indeed
flat. Just to get our feet wet with the implication operator, we assert the following,
which may be verified directly from the truth tables.
Assertion I.7. If p and q are propositions, then
(1) p ⇒ (p ∨ q);
(2) (p ∧ q) ⇒ p.
Theorem I.8. The implication operator IMP may be built from the negation operator
NOT and the conjunction operator AND, since
V(p ⇒ q) = V(¬(p ∧ ¬q)).
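The theorem can be confirmed by comparing the formula against Table 4 row by row. In the sketch below, IMP_TABLE is a transcription of Table 4 and imp_from_not_and is a hypothetical name for the right-hand side of the theorem.

```python
# Table 4, transcribed row by row: (p, q) -> value of p => q.
IMP_TABLE = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

def imp_from_not_and(p, q):
    """The composite not (p and not q), built only from NOT and AND."""
    return not (p and (not q))

matches = all(imp_from_not_and(p, q) == value
              for (p, q), value in IMP_TABLE.items())
print(matches)
```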
At this point you may be asking why we chose for p ⇒ q to be true even when p
and q are both false. The other choices in the truth table for implication are easily
justified by common sense, but why this one? The answer lies in the truth table
for the equivalence operator and the theorem which follows it, a theorem which we
very much want to be true and which depends on this choice.
The equivalence operator IFF (⇔) signifies logical equivalence, so that p ⇔ q
is read “p is logically equivalent to q” or “p if and only if q”. This is the operator
that answers the question “do p and q have the same truth value?”
p q p⇔q
T T T
T F F
F T F
F F T
Table 5. IFF Truth Table
Any two tautologies may be combined via the AND operator to form another
tautology. Indeed, the tautology
(p ∨ ¬p) ∧ ¬(p ∧ ¬p),
which states that either p is true or ¬p is true, but not both, is often considered the
basis of Western logic. Notice that the “but not both” part may be derived from
the p ∨ ¬p part by an application of DeMorgan’s Law.
Examples of contradictions:
(1) p ∧ ¬p
(2) p ⇔ ¬p
(3) (p ⇔ (p ∧ q)) ∧ ¬(p ⇒ q)
Similarly, any two contradictions may be combined via the OR operator to
form another contradiction (they may also be combined via the AND operator to
form another contradiction, but this is a weaker statement).
Examples of indeterminate propositions:
(1) (p ⇒ q) ⇔ (p ∧ q)
(2) (p ∨ ¬q) ⇒ (p ∧ ¬p)
(3) p ⇒ q
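Sorting a composite proposition into one of these three classes amounts to inspecting the last column of its truth table. The following sketch does this in Python; the function names classify and implies are assumptions, and implication is encoded as ¬p ∨ q, which agrees with Table 4.

```python
from itertools import product

def implies(p, q):
    """Implication, encoded as (not p) or q."""
    return (not p) or q

def classify(prop, arity):
    """Return 'tautology', 'contradiction', or 'indeterminate' for a composite."""
    values = {prop(*row) for row in product([True, False], repeat=arity)}
    if values == {True}:
        return "tautology"
    if values == {False}:
        return "contradiction"
    return "indeterminate"

print(classify(lambda p: (p or not p) and not (p and not p), 1))
print(classify(lambda p: p and not p, 1))
print(classify(implies, 2))
```

The three propositions classified here are taken from the text: the tautology (p ∨ ¬p) ∧ ¬(p ∧ ¬p), the contradiction p ∧ ¬p, and the indeterminate proposition p ⇒ q.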
In a certain sense, mathematics is the process of discovering tautologies. How-
ever, the superstructure of most theorems is of the indeterminate form p ⇒ q. Why,
then, is it difficult to prove theorems? It may seem that one simply needs to deter-
mine the truth values of p and q and verify the truth or falsity of the theorem with
a glance at the truth table for implication. This is far from the case; an implication
is a description of the relationship between p and q, and not of their individual
truth values. In fact, proving an implication involves verifying that all four rows of
the truth table for implication are satisfied (although such proofs rarely take this
explicit form).
Now we turn to a pair of constructions which are critically important for aspir-
ing mathematicians to grasp. Suppose that p and q are propositions, and consider
the implication p ⇒ q. The converse of this implication is the proposition q ⇒ p,
whereas its contrapositive is the proposition ¬q ⇒ ¬p.
Assertion I.11. The contrapositive of an implication is logically equivalent to it.
The converse of an implication is logically independent of it.
Proof. To explore the logical relations between any two propositions a and b, we
construct the truth table of a ⇔ b. If this truth table contains nothing but T’s in
the last column, then a and b are logically equivalent. If this truth table contains
nothing but F’s in the last column, then a and b are logically incompatible. If this
truth table contains some T’s and some F’s in its last column, then a and b are
logically independent. We leave it as an exercise to determine what a and b should
be in these cases and to complete the proof.
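The procedure described in this proof is mechanical, and a short Python sketch may make it concrete. The names relation, implication, contrapositive, and converse are chosen for the illustration, and implication is again encoded as ¬p ∨ q; the sketch is a check, not a substitute for the written proof requested in the exercises.

```python
from itertools import product

def implies(p, q):
    """Implication, encoded as (not p) or q."""
    return (not p) or q

def relation(a, b, arity=2):
    """Classify a and b by the last column of the truth table of a <=> b."""
    column = [a(*row) == b(*row)
              for row in product([True, False], repeat=arity)]
    if all(column):
        return "equivalent"
    if not any(column):
        return "incompatible"
    return "independent"

implication = lambda p, q: implies(p, q)
contrapositive = lambda p, q: implies(not q, not p)
converse = lambda p, q: implies(q, p)

print(relation(implication, contrapositive))
print(relation(implication, converse))
```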
An example is in order. Let p be the proposition “The egg falls fifty feet onto
cement” and q be the proposition “The egg breaks”. Additionally, we assume that
when an egg falls fifty feet onto cement, then it breaks, so that we are assuming
that p ⇒ q is true. Now it is clear that if the egg is not broken, it could not have
fallen fifty feet onto cement. This is nothing more than the claim ¬q ⇒ ¬p. On
the other hand, it is possible to break an egg without dropping it fifty feet onto
cement; just because it is broken, we may not accurately conclude that it did drop
fifty feet onto cement. So the converse q ⇒ p is not necessarily true.
It is intuitively clear that the converse of an implication is not logically equiv-
alent to the implication, and yet when immersed in the abstract world of mathe-
matics, surrounded by definitions and related ideas which have not previously been
contemplated, the distinction between an implication and its converse may seem to
blur. Thus it is a good idea to keep in mind “the converse is not necessarily true”
(even when the implication is).
On the other hand, many proofs depend on the contrapositive. It is often easier
to prove that ¬q ⇒ ¬p than p ⇒ q; but if we can prove that ¬q ⇒ ¬p, we get
p ⇒ q for free.
A related idea is that of proof by contradiction. Here we wish to prove some
proposition a, where a may or may not be in the form of an implication. The
roundabout method of proof by contradiction assumes that ¬a is true, and arrives
at a conclusion which is a proposition known to always be false, in other words,
a contradiction. Thus the assumption that led to the contradiction (¬a) must be
false, proving that a is true. This technique is invaluable in group theory and
topology.
Often one finds proofs that masquerade as proofs by contradiction but are
actually proofs by contrapositive. That is, one wishes to prove that p ⇒ q, and so
assumes that p ∧ ¬q is true, and arrives at a contradiction, without ever using the
assumption p. This is not the preferred method.
4. Generation of Operators
In this section we introduce primitive logical operators which do not arise in
ordinary language but which, nonetheless, arise from definitional truth tables which
differ from those we have already encountered. These are XOR, NOR, and NAND.
The exclusion operator XOR (↕) stands for exclusive OR and means a or b, but
not both.
a b a↕b
T T F
T F T
F T T
F F F
Table 6. XOR Truth Table
The joint denial operator NOR (↑) means “neither a nor b”.
a b a↑b
T T F
T F F
F T F
F F T
Table 7. NOR Truth Table
a b a↓b
T T F
T F T
F T T
F F T
Table 8. NAND Truth Table
Theorem I.16. The operators NOT, AND, OR, IMP, IFF, XOR, and NOR may
be derived from NAND.
Proof. It suffices to show that NOT and AND may be written in terms of NAND.
The definition of NAND and a glance at the truth tables gives us that
(1) ¬a ⇔ (a ↓ a)
(2) (a ∧ b) ⇔ ¬(a ↓ b)
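The two identities in this proof, together with Corollary I.3 for OR, can be written out as a small Python sketch in which NAND is the only primitive. The function names nand, not_, and_, and or_ are assumptions of the sketch.

```python
from itertools import product

def nand(a, b):
    """The only primitive: false exactly when both inputs are true (Table 8)."""
    return not (a and b)

def not_(a):
    return nand(a, a)                # identity (1) in the proof

def and_(a, b):
    return not_(nand(a, b))          # identity (2) in the proof

def or_(a, b):
    return nand(not_(a), not_(b))    # Corollary I.3 rewritten with NAND

rows = list(product([True, False], repeat=2))
agrees = all(and_(a, b) == (a and b) and or_(a, b) == (a or b)
             for a, b in rows)
print(agrees, not_(True), not_(False))
```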
There are four possible logical operators of a single proposition, and we have
only discussed the identity operator (V) and NOT. There are also the constant
operators whose value is always T or F. Notice that a constant operator cannot be
generated from NOT because NOT NOT is the identity, NOT NOT NOT is NOT,
etc. We use this fact in our final theorem.
Theorem I.17. The operators NOR and NAND are the only binary operators
which are sufficient by themselves to generate NOT, AND, OR, IMP, IFF, XOR,
NOR, and NAND.
Proof. In order for a generic binary operator GEN (⊔) to generate NOT, a ⊔ b must
be false when both a and b are true, for otherwise we can never achieve anything
but true in the first row of a truth table of a composite proposition whose only
operator is GEN. Similarly, a ⊔ b must be true whenever both a and b are false.
Thus we have a partial truth table for GEN.
p q p⊔q
T T F
T F V1
F T V2
F F T
Now suppose that GEN is not a commutative operator. If V1 = T and V2 = F,
then (p ⊔ q) ⇔ ¬(q) is a tautology, and if V1 = F and V2 = T, then (p ⊔ q) ⇔ ¬(p)
is a tautology. In either case, GEN may be constructed from NOT. However, NOT
cannot generate a constant operator of a single atom such as p∧¬p, which is always
false, and thus NOT cannot generate AND.
Thus for GEN to generate the other logical operators, it must be commutative
so that V1 = V2 = V. If V = T, then GEN is NAND, and if V = F, then GEN
is NOR.
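The theorem can also be confirmed by exhaustive search, since there are only sixteen binary operators. The sketch below is an independent check and not the argument of the proof: for each operator it computes every truth table on two atoms p and q that can be built using that operator alone, and reports the operators for which all sixteen tables are reached. The names generated and sole_generators are assumptions.

```python
from itertools import product

# Row order used throughout: (T, T), (T, F), (F, T), (F, F).
ROWS = [(True, True), (True, False), (False, True), (False, False)]

def generated(op):
    """All truth tables on atoms p, q obtainable by composing op with itself."""
    p = tuple(a for a, b in ROWS)
    q = tuple(b for a, b in ROWS)
    found = {p, q}
    while True:
        # Apply op row by row to every pair of tables found so far.
        new = {tuple(op[(x, y)] for x, y in zip(f, g))
               for f in found for g in found}
        if new <= found:
            return found
        found |= new

def sole_generators():
    """Output columns of the binary operators that generate all 16 tables."""
    result = []
    for outputs in product([True, False], repeat=4):
        op = dict(zip(ROWS, outputs))
        if len(generated(op)) == 16:
            result.append(outputs)
    return result

print(sole_generators())
```

Each reported tuple is the output column of an operator in the row order TT, TF, FT, FF, so it can be compared directly with Tables 7 and 8.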
There are sixteen possible truth tables resulting from combinations of two
propositions, and we have only mentioned seven of them. The reader is welcome
to explore the possibilities inherent in the others.
5. Exercises
Exercise I.1. Determine the truth table of the following composite propositions
and state whether they are tautologies, contradictions, or indeterminate.
(a) (p ∨ q) ⇒ (p ∧ q)
(b) (p ∧ q) ∨ (p ⇒ q)
(c) (p ⇒ q) ⇒ p
(d) p ⇒ (q ⇒ p)
(e) (p ⇒ q) ⇒ q
(f) p ⇒ (q ⇒ p)
(g) (p ⇒ q) ⇒ r
(h) p ⇒ (q ⇒ r)
(i) ((p ⇒ q) ∧ (q ⇒ r)) ⇒ (p ⇒ r)
(j) (p ∧ q) ⇔ (p ↕ q)
(k) (p ↓ q) ⇒ (p ∨ q)
Exercise I.2. Complete the proof of Assertion I.11.
Exercise I.3. Write a logically equivalent statement using NOT, AND, and OR.
(a) ¬(p ⇒ q)
(b) (p ⇒ q) ⇒ r
Exercise I.4. Use truth tables to prove the following assertions.
(a) (a ↕ b) ⇔ ¬(a ⇔ b)
(b) (a ↑ b) ⇔ ¬(a ∨ b)
(c) (a ↓ b) ⇔ ¬(a ∧ b)
Exercise I.5. Show that the logical operators NOT and OR are sufficient to gen-
erate AND, IMP, IFF, XOR, NOR, and NAND.
Exercise I.6. Develop the truth tables for logical operators of one proposition
other than NOT. You should get three of these, and you will see that they may
reasonably be called identity, constant truth, and constant falsehood.
Exercise I.7. Develop the truth tables for logical operators of two propositions
other than AND, OR, IMP, IFF, XOR, NOR, and NAND. You should get nine
of these. Give these new operators names. Relate them to the operators of one
proposition identity, constant truth, constant falsehood, and negation. Relate them
to the operators of two propositions AND, OR, IMP, IFF, XOR, NOR, and NAND.
CHAPTER II
Sets
3. Set Operations
Let A and B be subsets of some “universal set” U and define the following set
operations:
Intersection: A ∩ B = {x ∈ U | x ∈ A ∧ x ∈ B}
Union: A ∪ B = {x ∈ U | x ∈ A ∨ x ∈ B}
Complement: A ∖ B = {x ∈ U | x ∈ A ∧ x ∉ B}
The pictures which correspond to these operations are called Venn diagrams.
Example II.1. Let A = {1, 3, 5, 7, 9}, B = {1, 2, 3, 4, 5}. Then A ∩ B = {1, 3, 5},
A ∪ B = {1, 2, 3, 4, 5, 7, 9}, A ∖ B = {7, 9}, and B ∖ A = {2, 4}.
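Example II.1 can be reproduced with Python's built-in set type, whose operators &, |, and - play the roles of intersection, union, and complement. The variable names are chosen for this sketch.

```python
A = {1, 3, 5, 7, 9}
B = {1, 2, 3, 4, 5}

intersection = A & B   # elements in both A and B
union = A | B          # elements in A or in B
a_minus_b = A - B      # elements of A that are not in B
b_minus_a = B - A      # elements of B that are not in A

print(sorted(intersection), sorted(union), sorted(a_minus_b), sorted(b_minus_a))
```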
Example II.2. Let A and B be two distinct nonparallel lines in a plane. We may
consider A and B as sets of points. Their intersection is a single point, their union
is crossing lines, and the complement A ∖ B is A minus the point
of intersection.
If A ∩ B = ∅, we say that A and B are disjoint.
Example II.3. A sphere is the set of points in space equidistant from a given
point, called its center; the common distance to the center is called the radius of
the sphere. Thus a sphere is the surface of a solid ball.
Take two points in space such that the distance between them is 10, and imagine
two spheres centered at these points. Let one of the spheres have radius 5. If the
radius of the other sphere is less than 5 or greater than 15, then the spheres are
disjoint. If the radius of the other sphere is exactly 5 or 15, the intersection is a
single point. If the radius of the other sphere is between 5 and 15, the spheres
intersect in a circle.
The following properties are sometimes useful in proofs:
• A = A ∪ A = A ∩ A
• ∅ ∩ A = ∅
• ∅ ∪ A = A
• A ⊂ B ⇔ A ∩ B = A
• A ⊂ B ⇔ A ∪ B = B
As an example, we prove one of these properties.
4. Cartesian Product
Let a and b be elements. The ordered pair of a and b is denoted (a, b) and is
defined as
(a, b) = {{a}, {a, b}}.
This is the technical definition; think about how it relates to the intuitive approach
below.
Intuitively, if a and b are elements, the ordered pair with first coordinate a and
second coordinate b is something like a set containing a and b, but in such a way
that the order matters. We denote this ordered pair by (a, b) and declare that it
has the following “defining property”:
(a, b) = (c, d) ⇔ (a = c) ∧ (b = d).
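One way to relate the technical definition to the defining property is to test it on a small set of elements. In the sketch below, ordered_pair is a hypothetical helper that encodes {{a}, {a, b}} with Python frozensets, and the defining property is checked for every choice of a, b, c, d from a three-element set.

```python
from itertools import product

def ordered_pair(a, b):
    """Encode (a, b) as the set {{a}, {a, b}} using hashable frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

elements = [0, 1, 2]
# (a, b) = (c, d) should hold exactly when a = c and b = d.
property_holds = all(
    (ordered_pair(a, b) == ordered_pair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(elements, repeat=4)
)
print(property_holds)
```

A finite check of this kind is evidence rather than proof, but it does exercise the delicate case a = b, where the encoding collapses to {{a}}.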
The cartesian product of the sets A and B is denoted A × B and is defined
to be the set of all ordered pairs whose first coordinate is in A and whose second
coordinate is in B:
A × B = {(a, b) | a ∈ A, b ∈ B}.
Example II.6. Let A = {1, 3, 5} and let B = {1, 4}. Then
A × B = {(1, 1), (1, 4), (3, 1), (3, 4), (5, 1), (5, 4)}.
In particular, this set contains 6 elements.
In general, if A contains m elements and B contains n elements, where m and
n are natural numbers, then A × B contains mn elements.
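The product in Example II.6 and the count mn can be reproduced with itertools.product from the Python standard library. The wrapper name cartesian_product is an assumption of this sketch.

```python
from itertools import product

def cartesian_product(A, B):
    """The set of all ordered pairs (a, b) with a in A and b in B."""
    return {(a, b) for a, b in product(A, B)}

A = {1, 3, 5}
B = {1, 4}
AxB = cartesian_product(A, B)
print(sorted(AxB), len(AxB) == len(A) * len(B))
```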
Similarly, we have ordered triples (a, b, c), with a “defining property”
(a, b, c) = (d, e, f) ⇔ (a = d) ∧ (b = e) ∧ (c = f).
Then we declare the cartesian product of three sets to be
A × B × C = {(a, b, c) | a ∈ A, b ∈ B, c ∈ C}.
By sleight of hand which we will not discuss at this point, one may show that
it is possible to “identify” the ordered pair ((a, b), c) with the ordered pair (a, (b, c)),
so that (A × B) × C is identified with A × (B × C), and that both of these are
“identified” with A × B × C. This forces a kind of associativity on the operation
of cartesian product.
We continue with ordered n-tuples and the cartesian product of n sets, for any
natural number n. If A is a set, the cartesian product of A with itself n times is
denoted A^n. For example, A^2 = A × A and A^3 = A × A × A. The entries of an
ordered n-tuple in such a cartesian product are called coordinates.
We have the following properties:
• (A ∪ B) × C = (A × C) ∪ (B × C);
• (A ∩ B) × C = (A × C) ∩ (B × C);
• A × (B ∪ C) = (A × B) ∪ (A × C);
• A × (B ∩ C) = (A × B) ∩ (A × C);
• (A ∩ B) × (C ∩ D) = (A × C) ∩ (B × D).
5. Numbers
Later, we will formally develop some of the standard number systems. For the
time being, we use these familiar sets only in examples. Since they are useful for
intuition into general set constructions, at this time we specify the standard names
for the common sets of numbers.
The following sets of numbers are standard:
Natural Numbers: N = {0, 1, 2, 3, . . . }
Integers: Z = {. . . , −2, −1, 0, 1, 2, . . . }
Rational Numbers: Q = {p/q | p, q ∈ Z, q ≠ 0}
Real Numbers: R = {Gaps in Q}
Complex Numbers: C = {a + ib | a, b ∈ R and i^2 = −1}
We view N ⊂ Z ⊂ Q ⊂ R ⊂ C.
The following standard notation gives subsets of the real numbers, called in-
tervals:
• [a, b] = {x ∈ R | a ≤ x ≤ b} (closed)
• (a, b) = {x ∈ R | a < x < b} (open)
• [a, b) = {x ∈ R | a ≤ x < b}
• (a, b] = {x ∈ R | a < x ≤ b}
• (−∞, b] = {x ∈ R | x ≤ b} (closed)
• (−∞, b) = {x ∈ R | x < b} (open)
• [a, ∞) = {x ∈ R | a ≤ x} (closed)
• (a, ∞) = {x ∈ R | a < x} (open)
Example II.8. Let A = [1, 5] be the closed interval of real numbers between 1
and 5 and let B = (10, 16) be the open interval of real numbers between 10 and 16.
Let C = A ∪ B. Let N be the set of natural numbers. How many elements are in
C ∩ N?
Solution. The set C ∩ N is the set of natural numbers between 1 and 5 inclusive
and between 10 and 16 exclusive. Thus C ∩ N = {1, 2, 3, 4, 5, 11, 12, 13, 14, 15}.
Therefore C ∩ N has 10 elements.
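The count can be confirmed by filtering natural numbers through the membership conditions for A and B. In this sketch the predicate in_C is a hypothetical name, and the search stops at 20 because C contains no number greater than 16.

```python
def in_C(x):
    """Membership in C = [1, 5] ∪ (10, 16)."""
    return 1 <= x <= 5 or 10 < x < 16

# Natural numbers beyond 16 cannot lie in C, so a bounded search suffices.
C_and_N = [n for n in range(0, 21) if in_C(n)]
print(C_and_N, len(C_and_N))
```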
The first three of our standard sets of numbers, N, Z, and Q, have an algebraic
nature; they are the minimum sets of numbers which allow us to add and multiply
(N), subtract (Z), and divide (Q).
The real numbers are the geometric completion of the rational numbers, con-
structed from the rational numbers by filling in the gaps. For example, the sequence
{1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, . . . }
consists of rational numbers but converges to √2, which is not a rational number.
The rational number line has “holes” where the irrational numbers belong, and
for this reason it does not model the synthetic notion of a line as well as the real
numbers.
We think of a point as zero-dimensional space. A set which represents zero-
dimensional space is {0}. A line is one-dimensional space, and is represented by R.
A plane is two-dimensional space, and is represented by R^2, the set of all ordered
pairs of real numbers. Three-dimensional space is represented by R^3, the set of all
ordered triples of real numbers.
The complex numbers are the algebraic closure of the real numbers, and were
developed from the real numbers so that all polynomials may be factored.
Example II.9. Let A = [1, 3], B = [3, 8], and C = (0, 3) be intervals of real
numbers. The set A × B × C forms a rectangular box in R^3, which is closed on its sides (it
contains its boundary there) but open on the top and bottom (it does not contain
its boundary there). How many elements are in (A × B × C) ∩ (Z × Z × Z)?
Solution. By generalizing a previous proposition, we have
(A × B × C) ∩ (Z × Z × Z) = (A ∩ Z) × (B ∩ Z) × (C ∩ Z).
Now A ∩ Z = {1, 2, 3}, B ∩ Z = {3, 4, 5, 6, 7, 8}, and C ∩ Z = {1, 2}. Thus
(A × B × C) ∩ (Z × Z × Z) has 3 · 6 · 2 = 36 elements.
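The count in this solution can be checked by listing the integer points directly. The helper integers_in below is a hypothetical function written for this sketch; it returns the integers in an interval, keeping an endpoint only when that end is closed.

```python
from itertools import product

def integers_in(lo, hi, lo_closed, hi_closed):
    """Integers in the interval from lo to hi; an endpoint counts only if closed."""
    candidates = range(int(lo) - 1, int(hi) + 2)
    return [n for n in candidates
            if (lo < n or (lo_closed and n == lo))
            and (n < hi or (hi_closed and n == hi))]

A_Z = integers_in(1, 3, True, True)      # A = [1, 3]
B_Z = integers_in(3, 8, True, True)      # B = [3, 8]
C_Z = integers_in(0, 3, False, False)    # C = (0, 3)

# Points of A × B × C all of whose coordinates are integers.
lattice_points = list(product(A_Z, B_Z, C_Z))
print(A_Z, B_Z, C_Z, len(lattice_points))
```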
Warning II.1. The notation for ordered pair (a, b) is the same as the standard
notation for open interval of real numbers, but its meaning is entirely different.
This is standard, and you must decide from the context which meaning is intended.
6. Exercises
Exercise II.1. Let A, E, O, P, and S be the following subsets of N:
• A = {n ∈ N | n < 25};
• E = {n ∈ A | n is even};
• O = {n ∈ A | n is odd};
• P = {n ∈ A | n is prime};
• S = {n ∈ A | n is a square};
Compute the following sets:
(a) (E ∩ P) ∪ S;
(b) (E ∩ S) ∪ (P ∖ O);
(c) P × S;
(d) (O ∩ S) × (E ∩ S).
Exercise II.2. In each case, draw a Venn diagram representing the situation:
(a) A ∖ (B ∪ C) = (A ∖ B) ∩ (A ∖ C);
(b) (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C);
(c) (A ∖ B) ∖ C = A ∖ (B ∪ C).
Exercise II.3. Let A and B be subsets of a set U. The symmetric difference of A
and B, denoted A △ B, is the set of points in U which are in either A or B but not
in both.
(a) Draw a Venn diagram describing A △ B.
(b) Find two set expressions which could be used to define A △ B, and justify your
answer.
(c) Choose one of your expressions above as a formal definition, and use it to prove
that symmetric difference is commutative and associative. Your proof here may
use the fact that intersection and union are commutative and associative without
proving these facts.
In the next two exercises, you should read “show that” to mean “give a formal
proof that”.
Exercise II.4. Let A, B, and C be sets. Show that
(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
Exercise II.5. Let A, B, and C be sets. Show that
(A ∪ B) × C = (A × C) ∪ (B × C).
CHAPTER III
Functions
1. Functions
Let A and B be sets. A function from A to B is a subset f ⊂ A × B of the
cartesian product of A with B such that for every a ∈ A there exists a unique b ∈ B
such that (a, b) ∈ f. This is the technical definition; think about how it relates to the
intuitive approach below.
Intuitively, a function from a set A to a set B is an assignment of every element
in A to some element in B. Another way of describing this is that we think of a
function as a kind of vehicle, something which sends each element of A to an element
of B. If we think of elements as the nouns of set theory and sets as the adjectives
(an element has a property if it is in the set of things with that property), then we
may think of functions as the verbs.
There are many familiar examples of functions from the set of real numbers into
itself, for example, sin, cos, log, and so forth. It is essential in mathematics, and
extremely useful as a way of thinking in general, to expand our view of functions
so that they can send elements from any set to any other set.
Let f be a function from A to B. If a ∈ A, the element of B to which a is
assigned by f is denoted f (a); in other words, the place in B to which a is sent by
f is denoted f (a). We declare that a function must satisfy the following “defining
property”:
∀a ∈ A ∃!b ∈ B such that f(a) = b.
In words, for every element a in A there exists a unique element b in B such that
a is sent to b by f .
If f is a function from A to B, this fact is denoted
f : A → B.
We say that f maps A into B, and that f is a function on A. For this reason,
functions are sometimes called maps or mappings. If f (a) = b, we say that a is
mapped to b by f. We may indicate this by writing a ↦ b.
Two functions f : A → B and g : A → B are considered equal if they act the
same way on every element of A:
f = g ⇔ (a ∈ A ⇒ f (a) = g(a)).
Thus to show that two functions f and g are equal, select an arbitrary element
a ∈ A and show that f (a) = g(a).
If A is sufficiently small, we may explicitly describe the function by listing the
elements of A and where they go; for example, if A = {1, 2, 3} and B = R, a
perfectly good function is described by {1 ↦ 23.432, 2 ↦ π, 3 ↦ √593}.
However, if A is large, the functions which are easiest to understand are those
which are specified by some rule or algorithm. The common functions of single
variable calculus are of this nature.
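For finite sets, the technical definition of a function as a subset of A × B can be tested directly. The following sketch assumes the function is given as a Python set of pairs; the name is_function and the sample sets are invented for the illustration.

```python
def is_function(pairs, A, B):
    """Check that pairs is a subset of A × B in which every a in A
    occurs as a first coordinate exactly once."""
    if any(a not in A or b not in B for a, b in pairs):
        return False
    return all(sum(1 for (x, _) in pairs if x == a) == 1 for a in A)

A = {1, 2, 3}
B = {"x", "y"}

f = {(1, "x"), (2, "x"), (3, "y")}                    # a function from A to B
two_values = {(1, "x"), (1, "y"), (2, "x"), (3, "y")}  # 1 is sent to two places
undefined_at_3 = {(1, "x"), (2, "y")}                  # 3 is sent nowhere

print(is_function(f, A, B),
      is_function(two_values, A, B),
      is_function(undefined_at_3, A, B))
```

The two failing examples correspond to the two halves of "there exists a unique b": uniqueness fails in the first and existence fails in the second.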
Example III.1. Let R be the set of real numbers. The following are all functions
from R into R:
• f (x) = 0;
• f (x) = x;
• f(x) = x^3 + 3x + 17;
• f (x) = sin(x);
• f (x) = exp(x).
The following are functions from the set of positive real numbers into R:
• f(x) = √x;
• f (x) = log(x).
Note that tan(x) is not a function from R into R, because it is not defined at (for
example) the point π/2 ∈ R.
Some functions are constructed from existing functions by specifying cases.
Example III.2. Let R be the set of real numbers. Define f : R → R by
f(x) = 0 if x < 0,
f(x) = x^3 if x ≥ 0.
The reader familiar with calculus may ask himself whether or not the first, second,
and third derivatives exist and are continuous for this function.
Example III.3. Let X be a set and let A ⊂ X. The characteristic function of A
in X is a function χ_A : X → {0, 1} defined by
χ_A(x) = 0 if x ∉ A,
χ_A(x) = 1 if x ∈ A.
In particular, let X = [0, 1] ⊂ R be the closed unit interval and let A = Q ∩ X be
the set of rational numbers in this interval. Think about the graph of the function
χ_A.
Example III.4. Suppose we designed a computer system that records information
on patients in a hospital. Each patient is assigned a number upon admission, which
is just the next available number, starting with zero. We create a program which
allows the user to type a working diagnosis of 60 characters or less for this patient,
and file this information under the patient number. We only allow the user to type
capital letters, spaces, commas, and periods in this diagnosis. The file may be
viewed as a function
DIAG(patient number) = “patient diagnosis”;
here, DIAG : N → B, where B is the set of all possible strings of allowed characters
with length less than or equal to 60 which can be typed on a computer keyboard.
The size of B is 29^60 (why?).
3. Composition of Functions
Let A, B, and C be sets and let f : A → B and g : B → C. The composition
of f and g is the function
g ◦ f : A → C
given by
(g ◦ f)(a) = g(f(a)).
The domain of g ◦ f is A and the codomain is C. The range of g ◦ f is the
image under g of the image under f of the domain of f .
Example III.11. Let A be the set of living things on earth, B the set of species,
and C be the set of positive real numbers. Let f : A → B assign each living thing
to its species, and let g : B → C assign each species to its average mass. Then g ◦ f
guesses the mass of a living thing.
If f and g are injective, then g ◦ f is injective. If f and g are surjective, then
g ◦ f is surjective. For example, we prove the first of these statements.
Proposition III.12. Let A, B and C be sets and let f : A → B and g : B → C
be injective functions. Then g ◦ f is injective.
Proof. To show that a function is injective, we select two elements in the domain
and assume that they are sent to the same place; it then suffices to show that they
were originally the same element.
Let h = g ◦ f. Let a_1, a_2 ∈ A and suppose that h(a_1) = h(a_2) = c. Let
b_1 = f(a_1) and let b_2 = f(a_2). Since h(a) = g(f(a)) for each a ∈ A, we have
g(f(a_1)) = g(b_1) and g(f(a_2)) = g(b_2). Thus g(b_1) = g(b_2) = c. Since g is injective,
it follows that b_1 = b_2 by the definition of injectivity. Since f is injective, it follows
that a_1 = a_2, again by definition.
Example III.13. Let f : R → R be given by f(x) = x^2 and let g : R → R be given
by g(x) = sin x. Then g ◦ f : R → R is given by (g ◦ f)(x) = sin(x^2) and f ◦ g : R → R
is given by (f ◦ g)(x) = sin^2 x.
This example demonstrates that composition of functions is not a commutative
operation. However, the next proposition tells us that composition of functions is
associative.
Proposition III.14. Let A, B, C, and D be sets and let f : A → B, g : B → C,
and h : C → D be functions. Then h ◦ (g ◦ f ) = (h ◦ g) ◦ f .
Proof. To show that two functions are equal, it suffices to show that they act the
same way on an arbitrary element of the domain.
Let a ∈ A. Then
(h ◦ (g ◦ f))(a) = h((g ◦ f)(a)) = h(g(f(a))) = (h ◦ g)(f(a)) = ((h ◦ g) ◦ f)(a).
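Composition is easy to experiment with on a computer. The sketch below uses the functions of Example III.13; the helper compose and the third function h(x) = x + 1 are assumptions introduced only for the illustration.

```python
import math

def compose(g, f):
    """Return the function g ◦ f, which sends a to g(f(a))."""
    return lambda a: g(f(a))

f = lambda x: x ** 2     # f(x) = x^2
g = math.sin             # g(x) = sin x
h = lambda x: x + 1      # hypothetical third function for the associativity check

g_after_f = compose(g, f)    # x -> sin(x^2)
f_after_g = compose(f, g)    # x -> (sin x)^2

left = compose(h, compose(g, f))     # h ◦ (g ◦ f)
right = compose(compose(h, g), f)    # (h ◦ g) ◦ f

x = 2.0
print(g_after_f(x), f_after_g(x), left(x) == right(x))
```

At x = 2 the two orders of composition give different values, while the two groupings of the triple composite agree; the latter is a spot check at sample points, not a proof.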
5. Exercises
Exercise III.1. Let P be the set of people who ever lived. Which of the following
are functions from P to P ?
(a) {(a, b) ∈ P × P | b is a father of a}
(b) {(a, b) ∈ P × P | a is a father of b}
(c) {(a, b) ∈ P × P | b is a grandmother of a}
(d) {(a, b) ∈ P × P | b is a youngest son of the paternal grandmother of a}
(e) {(a, b) ∈ P × P | b is a youngest son of the maternal grandmother of a}
Exercise III.2. Let N be the set of natural numbers and let Z be the integers.
Find examples of functions f : Z → N such that:
(a) f is bijective;
(b) f is injective but not surjective;
(c) f is surjective but not injective;
(d) f is neither injective nor surjective.
Exercise III.3. Let N be the set of natural numbers. Let A be the subset of N given
by [50, 70] ∩ N, where [50, 70] is the closed interval of real numbers between 50
and 70.
Define a function f : N → N by f(n) = 3n. Note that A is a subset of both the domain
and the codomain of f.
(a) Find the image f[A].
(b) Find the preimage f^{-1}[A].
(c) Show that f is injective.
(d) Show that f is not surjective.
Exercise III.4. Let f : R → R be given by f(x) = x^3 − 6x^2 + 11x − 3. Find
f^{-1}[{3}].
Exercise III.5. We would like to define a function f : Z × Z → Q by (p, q) ↦ p/q.
Unfortunately, this does not make sense. Fix the problem, and show that the
resulting function is surjective but not injective.
Exercise III.6. We would like to define a function f : Q → Z by p/q ↦ pq.
Unfortunately, this is not “well-defined”. Figure out what this means and fix the
problem. Is the resulting function injective?
Exercise III.7. Let f : X → Y be a function and let A, B ⊂ X.
(a) Show that f [A ∪ B] = f [A] ∪ f [B].
(b) Show that f [A ∩ B] ⊂ f [A] ∩ f [B].
(c) Give an example where f[A ∩ B] ≠ f[A] ∩ f[B].
Exercise III.8. Let f : X → Y be a function and let C, D ⊂ Y.
(a) Show that f^{-1}[C ∪ D] = f^{-1}[C] ∪ f^{-1}[D].
(b) Show that f^{-1}[C ∩ D] = f^{-1}[C] ∩ f^{-1}[D].
Exercise III.9. Let f : X → Y and g : Y → Z be surjective functions. Show that
g ◦ f is surjective.
Exercise III.10. Let f : X → Y and g : Y → Z be functions.
(a) Show that if f is surjective and g ◦ f is injective, then g is injective.
(b) Give an example where g ◦ f is injective but g is not.
CHAPTER IV
Collections
1. Collections of Sets
We do not disallow the possibility that a set may be an element of another set.
In fact, this idea is very useful. For example, we may talk about the set of lines in
a plane, even though each line is a set of points in the plane. The set of lines is a
set of subsets of the points in the plane. It is common to call sets whose elements
are subsets of a given set a collection of subsets.
Let X be a set and let C be a collection of subsets of X. Then the intersection
and union of the sets in the collection are defined by
• ∩C = {x ∈ X | x ∈ A for all A ∈ C};
• ∪C = {x ∈ X | x ∈ A for some A ∈ C}.
Thus ∩C is the intersection of all the sets in C and ∪C is their union.
Example IV.1. Let A = {n ∈ N | n < 25}, O = {n ∈ A | n is odd},
P = {n ∈ A | n is prime}, and S = {n ∈ A | n is a square}. Let C = {O, P, S}.
Then
• ∩C = ∅, because no square is a prime;
• ∪C = {1, 2, 3, 4, 5, 7, 9, 11, 13, 15, 16, 17, 19, 21, 23}.
Example IV.2. Let A = {n ∈ N | n < 1000}. For each d ∈ N, define
Dd = {n ∈ A | n = dm for some m ∈ N}.
Let D = {Dp | p is prime and p ≤ 7}. Find ∩D.
Solution. The set Dd is the set of positive multiples of d which are less than 1000.
The set D is the collection of all Dp such that p is a prime which is at most 7.
Thus D = {D2, D3, D5, D7}. Then ∩D, being the intersection of these sets, is the
set of natural numbers less than 1000 which are multiples of 2, 3, 5, and 7. Since
these are distinct primes, such a number must be a multiple of 2 · 3 · 5 · 7 = 210.
Conversely, any multiple of 210 which is less than 1000 is in all four sets. Thus
∩D = {210, 420, 630, 840}.
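The intersection in this example can also be computed by brute force. The sketch below is an illustration under stated assumptions: the helper names big_union and big_intersection are introduced here, and N is taken to start at 1, as the solution above does.

```python
from functools import reduce

def big_union(collection):
    """Union of all the sets in a nonempty collection."""
    return set().union(*collection)

def big_intersection(collection):
    """Intersection of all the sets in a nonempty collection."""
    return reduce(lambda s, t: s & t, collection)

# Assumption: N = {1, 2, 3, ...}, so A = {1, ..., 999}.
A = set(range(1, 1000))

def D(d):
    """The multiples of d that lie in A."""
    return frozenset(n for n in A if n % d == 0)

collection = {D(p) for p in (2, 3, 5, 7)}
print(sorted(big_intersection(collection)))
```

Frozensets are used for the members because Python requires the elements of a set to be hashable.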
2. Collections of Functions
We may also consider sets whose members are functions.
Example IV.3. Let X be a set and let Sym(X) be the set of all bijective functions
on X. Then Sym(X) is a collection of functions.
If A and B are sets, we may speak of the set of all functions from A to B. We
shall denote this set by F(A, B):
F(A, B) = {f : A → B}.
Example IV.4. Let A = {1, 2} and B = {5, 6, 7}. Then F(A, B) contains the
following functions:
• 1 ↦ 5 and 2 ↦ 5;
• 1 ↦ 5 and 2 ↦ 6;
• 1 ↦ 5 and 2 ↦ 7;
• 1 ↦ 6 and 2 ↦ 5;
• 1 ↦ 6 and 2 ↦ 6;
• 1 ↦ 6 and 2 ↦ 7;
• 1 ↦ 7 and 2 ↦ 5;
• 1 ↦ 7 and 2 ↦ 6;
• 1 ↦ 7 and 2 ↦ 7.
Also F(B, A) contains the following functions:
• 5 ↦ 1, 6 ↦ 1, 7 ↦ 1;
• 5 ↦ 1, 6 ↦ 1, 7 ↦ 2;
• 5 ↦ 1, 6 ↦ 2, 7 ↦ 1;
• 5 ↦ 1, 6 ↦ 2, 7 ↦ 2;
• 5 ↦ 2, 6 ↦ 1, 7 ↦ 1;
• 5 ↦ 2, 6 ↦ 1, 7 ↦ 2;
• 5 ↦ 2, 6 ↦ 2, 7 ↦ 1;
• 5 ↦ 2, 6 ↦ 2, 7 ↦ 2.
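Listings of this kind can be generated mechanically. In the sketch below, a function between finite sets is again stored as a dictionary; the name all_functions is an assumption introduced for illustration.

```python
from itertools import product

def all_functions(A, B):
    """List every function from A to B, each as a dict; A and B are finite sets."""
    domain = sorted(A)
    return [dict(zip(domain, values))
            for values in product(sorted(B), repeat=len(domain))]

F_AB = all_functions({1, 2}, {5, 6, 7})
F_BA = all_functions({5, 6, 7}, {1, 2})

print(len(F_AB), len(F_BA))
for f in F_AB:
    print(f)
```

Each function is determined by choosing one value in the codomain for every element of the domain, which is why itertools.product enumerates them all.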
Example IV.5. Let F = F(R, R) denote the set of all real valued functions of a
real variable:
F = {f : R → R}.
Let D denote the set of all differentiable functions in F:
D = {f : R → R | f is differentiable}.
Note that D ⊂ F.
The differentiation operator is a function
d/dx : D → F.
Not every function is the derivative of a function, so d/dx is not surjective. Since two
functions which differ by a constant have the same derivative, d/dx is not injective.
3. Power Sets
Let X be a set. The power set of X is denoted P(X) and is defined to be the
set of all subsets of X:
P(X) = {A | A ⊂ X}.
Here are a few examples:
• X = ∅ ⇒ P(X) = {∅};
• X = {0} ⇒ P(X) = {∅, {0}};
• X = {0, 1} ⇒ P(X) = {∅, {0}, {1}, X};
• X = {0, 1, 2} ⇒ P(X) = {∅, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, X};
and so forth. Here are some properties:
• Y ⊂ X ⇒ P(Y ) ⊂ P(X);
• ∩P(X) = ∅;
• ∪P(X) = X.
Let X be any set and let T = {0, 1}. A given function f : X → T may be
viewed as a subset of X by thinking of f as saying, for a given element, whether
or not it is in the subset. The element 1 is thought of as “ON” or “TRUE” and
the element 0 is thought of as “OFF” or “FALSE”. Specifically, given f : X → T,
define A to be the preimage of 1:
A = {a ∈ X | f(a) = 1};
that is, A = f^{-1}[{1}].
On the other hand, given a subset A of X, we can construct a function
χA : X → T
by defining
χA(x) = 0 if x ∉ A;
χA(x) = 1 if x ∈ A.
This is just the characteristic function of the subset A.
Thus the power set of X corresponds to the set of functions from X into T in
a natural way. Another way of stating this is that there exists a bijective function
between P(X) and F(X, T ).
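For a small set X, this correspondence can be verified directly. The following sketch is illustrative; the names power_set, characteristic, and subset_of are assumptions introduced here, and functions into T are stored as dictionaries.

```python
from itertools import combinations

def power_set(X):
    """All subsets of the finite set X, as frozensets."""
    elements = sorted(X)
    return [frozenset(c)
            for r in range(len(elements) + 1)
            for c in combinations(elements, r)]

def characteristic(A, X):
    """The characteristic function of A as a dict from X to {0, 1}."""
    return {x: 1 if x in A else 0 for x in X}

def subset_of(f):
    """The preimage of 1 under a function f from X to {0, 1}."""
    return frozenset(x for x, value in f.items() if value == 1)

X = {0, 1, 2}
subsets = power_set(X)
print(len(subsets))

# Passing from a subset to its characteristic function and back returns the subset.
print(all(subset_of(characteristic(A, X)) == A for A in subsets))
```

The round trip returning every subset unchanged shows that the assignment from subsets to characteristic functions is injective on this example.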
4. Partitions
Let X be a set and let C ⊂ P(X). We say that C covers X if ∪C = X. We
say that the sets in C are mutually disjoint if ∩C = ∅. If for every two distinct sets
C, D ∈ C, we have C ∩ D = ∅, we say that the members of C are pairwise disjoint.
If the sets of a collection are pairwise disjoint, then they are mutually disjoint, but
the converse of this is not necessarily true.
Example IV.6. Let X = {1, 2, 3} and let C = {{1, 2}, {1, 3}, {2, 3}}. Then
∪C = ({1, 2} ∪ {1, 3}) ∪ {2, 3} = {1, 2, 3} ∪ {2, 3} = {1, 2, 3} = X,
so the sets in C cover X. Also
∩C = ({1, 2} ∩ {1, 3}) ∩ {2, 3} = {1} ∩ {2, 3} = ∅,
so the sets in C are mutually disjoint. They are not, however, pairwise disjoint.
Let D = {{1, 2}, {3}}. Then D covers X with pairwise disjoint sets.
A partition of X is a collection of pairwise disjoint nonempty subsets of X
which covers X. The members of a partition are called blocks.
Suppose that C is a partition of X. If x ∈ X, then there is a unique A ∈ C such
that x ∈ A; x is certainly in one of them, because X is covered by the members of
C; x is in no more than one, for otherwise the ones containing x would overlap and
not be disjoint. Put another way, every x ∈ X is in exactly one of the members of
C.
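The definitions of this section translate directly into tests on finite collections. The sketch below is an illustration; the function names are assumptions introduced here, and a collection is represented as a nonempty list of Python sets.

```python
from itertools import combinations

def covers(C, X):
    """True when the union of the sets in C is X."""
    return set().union(*C) == set(X)

def mutually_disjoint(C):
    """True when the intersection of all the sets in C is empty."""
    return len(set.intersection(*map(set, C))) == 0

def pairwise_disjoint(C):
    """True when every two distinct members of C have empty intersection."""
    return all(not (set(A) & set(B)) for A, B in combinations(C, 2))

def is_partition(C, X):
    """Nonempty, pairwise disjoint sets which cover X."""
    return all(len(A) > 0 for A in C) and covers(C, X) and pairwise_disjoint(C)

X = {1, 2, 3}
C = [{1, 2}, {1, 3}, {2, 3}]
D = [{1, 2}, {3}]

print(covers(C, X), mutually_disjoint(C), pairwise_disjoint(C))
print(is_partition(C, X), is_partition(D, X))
```

Applied to Example IV.6, the collection C covers X and is mutually disjoint but fails the pairwise test, while D passes all three conditions.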
Example IV.7. Let x be a point in a space and let S(x, r) be a sphere of radius
r with center x. Then the collection
S = {S(x, r) | r ∈ R and r ≥ 0}
is a partition of space; the blocks of this partition are spheres centered at x. This
is true since each point in space has a unique distance from the point x.
Example IV.8. Let C be the set of cards in a deck and let S be the set of suits.
That is, C contains 52 elements and S = {♠, ♥, ♦, ♣}. There is a natural function
f : C → S which sends a given card to its suit. The preimage of a suit under f is
the set of cards in that suit, for example:
f^{-1}[♠] = {2♠, 3♠, 4♠, 5♠, 6♠, 7♠, 8♠, 9♠, 10♠, J♠, Q♠, K♠, A♠}.
Let 𝒮 = {f^{-1}[s] | s ∈ S}. Then 𝒮 is a collection of subsets of C, each subset
consisting of all the cards in a given suit. It is clear that 𝒮 covers C and that the
sets within 𝒮 are pairwise disjoint. Thus 𝒮 is a partition of C. This is a general
phenomenon: functions induce partitions on their domains. We will explore this in
depth later.
One more thing to notice here: there are as many elements in 𝒮 as there are
in S. Indeed, in some philosophical way, 𝒮 is essentially the same as the set S.
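The partition of the deck by suit can be built by grouping cards according to their image under f. The sketch below is illustrative; the name induced_partition, the encoding of a card as a (rank, suit) pair, and the suit names are assumptions introduced here.

```python
from collections import defaultdict

def induced_partition(f, domain):
    """Group the elements of the domain by their image under f."""
    blocks = defaultdict(set)
    for x in domain:
        blocks[f(x)].add(x)
    return dict(blocks)

ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
suits = ['spades', 'hearts', 'diamonds', 'clubs']
deck = [(rank, suit) for suit in suits for rank in ranks]

def suit_of(card):
    """The function f of the example: send a card to its suit."""
    return card[1]

blocks = induced_partition(suit_of, deck)
print(len(deck), len(blocks))
print(sorted(len(block) for block in blocks.values()))
```

The dictionary returned has one key for each suit that actually occurs, which reflects the remark that the collection of blocks has as many elements as the set of suits.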
5. Exercises
Exercise IV.1. Design a collection C of subsets of N which has all of the following
properties:
(1) C covers N (∪C = N);
(2) distinct sets in C are disjoint (C, D ∈ C and C ≠ D ⇒ C ∩ D = ∅);
(3) each set C ∈ C contains infinitely many elements;
(4) C contains exactly 7 subsets of N.
Recall that we have given the name “partition” to collections of sets satisfying the
first two properties.
Exercise IV.2. Let R be the set of real numbers.
(a) Find a collection of subsets of R which covers R but whose members are not
mutually disjoint.
(b) Find a collection of subsets of R which covers R and whose members are
mutually disjoint but not pairwise disjoint.
(c) Find three different partitions of R, each containing a different number of blocks.
Exercise IV.3. Let X = {1, 2, 3, 4, 5} and let Y = {1, 2, 3}. Find five different
partitions of the set F(X, Y), each of which contains three blocks.
Exercise IV.4. Let X be a set and let A, B ⊂ X.
(a) Show that P(A ∩ B) = P(A) ∩ P(B).
(b) Show that P(A) ∪ P(B) ⊂ P(A ∪ B).
(c) Find an example such that P(A) ∪ P(B) ≠ P(A ∪ B).
Exercise IV.5. Let X be a set. Find an injective function φ : X → P(X).
Exercise IV.6. Let X be a set. Show that there does not exist a surjective
function φ : X → P(X).
(Hint: select an arbitrary function φ : X → P(X), and construct a set in P(X)
which is not in the image of φ.)
Exercise IV.7. Let X be a set. Define a function φ : P(X) → P(X) by A ↦ X ∖ A.
Show that φ is bijective.
Exercise IV.8. Let X be a set and let T = {0, 1}. Show that there is a corre-
spondence between the sets P(X) and F(X, T ).
Exercise IV.9. Let X be a set containing n elements. Try to count the size of the
set P(X).
Exercise IV.10. Let A and B be sets containing m and n elements respectively.
Try to count the size of the set F(A, B).
Exercise IV.11. Let X be a set containing n elements and let P be the set of all
partitions of X. Try to count the size of the set P.
CHAPTER V
Relations
1. Relations
Let A be a set. A relation R on A is a subset of the cartesian product of A
with itself: R ⊂ A × A. If (a, b) ∈ R, we say that a is related to b, and may write
aRb.
For example, suppose that A is the set of all inhabitants of some island. Let U
be the subset of A × A given by
(a, b) ∈ U ⇔ a is the uncle of b.
Let N be the subset of A × A given by
(a, b) ∈ N ⇔ a is the niece of b.
Note that aNb does not imply bUa, nor does aUb imply bNa. However, if we had
S ⊂ A × A given by
(a, b) ∈ S ⇔ a is the sibling of b,
then aSb ⇔ bSa.
Let R ⊂ A × A be a relation. We say that R is:
• reflexive if aRa for all a ∈ A;
• symmetric if aRb ⇔ bRa;
• antisymmetric if aRb ∧ bRa ⇒ a = b;
• transitive if aRb ∧ bRc ⇒ aRc;
• definite if aRb ∨ bRa for all a, b ∈ A.
The relation “is the same person as” is reflexive, symmetric, and transitive; so
is the relation “is the same height as”. The relation “is the parent of” has none
of these properties (except antisymmetry; think about why). The relation “is the
ancestor of” is transitive, and if we allow that one is one’s own ancestor, it is also
reflexive and antisymmetric.
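On a finite set, each of these properties can be tested by examining the ordered pairs in R. The sketch below is illustrative; the function names and the sample relations (divisibility and the usual order on {1, 2, 3, 4}) are assumptions introduced here.

```python
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def is_definite(R, A):
    return all((a, b) in R or (b, a) in R for a in A for b in A)

A = {1, 2, 3, 4}
divides = {(a, b) for a in A for b in A if b % a == 0}
at_most = {(a, b) for a in A for b in A if a <= b}

print(is_reflexive(divides, A), is_symmetric(divides),
      is_antisymmetric(divides), is_transitive(divides), is_definite(divides, A))
print(is_definite(at_most, A))
```

Divisibility fails to be definite because neither of 2 and 3 divides the other, whereas the usual order relates every pair of elements.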
3. Equivalence Relations
Let A be a set and consider the relation
E = {(a, b) ∈ A × A | a = b}.
Then E is simply the relation of equality. The set E is sometimes called the diagonal
of A × A. This is because if we graph E (say that A = R), we obtain the diagonal
line which is the graph of the equation y = x. Notice that the relation of equality
is reflexive, symmetric, and transitive.
Let A be a set and let ≡ be a relation on A. We say that ≡ is an equivalence
relation if it is reflexive, symmetric, and transitive:
• a ≡ a (reflexivity);
• a ≡ b if and only if b ≡ a (symmetry);
• if a ≡ b and b ≡ c, then a ≡ c (transitivity).
Example V.4. Let A be the set of all animals in the world. Define a relation R
by
R = {(a, b) ∈ A × A | a and b are of the same species }.
Note that we could have written this
aRb ⇔ a and b are of the same species.
Then R is an equivalence relation on the set A. For certainly if an animal a is a
pig, then it is a pig (reflexivity); if a and b are both pigs, then b and a are both
pigs (symmetry); and if a and b are both pigs, and b and c are both pigs, then a
and c are both pigs (transitivity).
Example V.5. Let X = N × N. Define a relation on X by
(a, b) ≡ (c, d) ⇔ a + d = b + c.
This is an equivalence relation.
Example V.6. Let Z∗ = Z ∖ {0} be the set of nonzero integers. Let X = Z × Z∗.
Define a relation on X by
(a, b) ≡ (c, d) ⇔ ad = bc.
Show that this is an equivalence relation.
Solution. We wish to show that ≡ is reflexive, symmetric, and transitive.
(Reflexivity) Let (a, b) ∈ X. Then ab = ba by commutativity of multiplication.
This says that (a, b) ≡ (a, b), so ≡ is reflexive.
(Symmetry) Let (a, b), (c, d) ∈ X. Then
(a, b) ≡ (c, d) ⇔ ad = bc ⇔ cb = da ⇔ (c, d) ≡ (a, b),
so ≡ is symmetric.
(Transitivity) Let (a, b), (c, d), (e, f) ∈ X. Suppose that (a, b) ≡ (c, d) and
(c, d) ≡ (e, f). Then ad = bc and cf = de. Multiply the first equation by f and
the second by b and apply commutativity of multiplication in the integers to obtain
adf = bcf and bcf = bde. Then by transitivity of equality, we have adf = bde. Since
d ≠ 0, cancelation gives af = be. Thus (a, b) ≡ (e, f), and ≡ is transitive.
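The three properties can be spot-checked on a finite sample of pairs. This does not replace the proof; it is a sketch under stated assumptions, with the name equivalent and the sample range chosen here for illustration. The last lines indicate why the second coordinate is required to be nonzero.

```python
from itertools import product

def equivalent(p, q):
    """(a, b) ≡ (c, d) if and only if ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

# A finite sample of Z × Z*, with second coordinate nonzero.
sample = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if b != 0]

reflexive = all(equivalent(p, p) for p in sample)
symmetric = all(equivalent(q, p)
                for p, q in product(sample, repeat=2) if equivalent(p, q))
transitive = all(equivalent(p, r)
                 for p, q, r in product(sample, repeat=3)
                 if equivalent(p, q) and equivalent(q, r))
print(reflexive, symmetric, transitive)

# If (0, 0) were allowed, transitivity would fail.
print(equivalent((1, 2), (0, 0)), equivalent((0, 0), (3, 4)),
      equivalent((1, 2), (3, 4)))
```

The pair (0, 0) is related to everything, which is exactly where the cancelation step in the transitivity argument would break down.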
4. Equivalence Classes
Relations of this type are particularly important, because they group the el-
ements of a set into blocks such that the members of one of the blocks, although
not exactly equal, are similar in some sense in which one may be interested. More
precisely, equivalence relations induce partitions on sets.
Let ≡ be an equivalence relation on a set A. We say that two elements a, b ∈ A
are equivalent if a ≡ b. Since ≡ is symmetric, this is the case if and only if b ≡ a.
The equivalence class of a, denoted [a], is the set of all elements of A which are
equivalent to a:
[a] = {b ∈ A | a ≡ b}.
Example V.7. Suppose A is the set of all animals in the world, and ≡ is the
relation of being in the same species. Let p be a pig. Then [p] is the set of all pigs
in the world. One can see that if q is also a pig, then [p] = [q]. Also it is clear that
if a is an anteater, then [p] ∩ [a] = ∅. Note there is exactly one equivalence class
[x] for each species of animal on earth such that x is an animal of that species. We
now proceed to formalize these assertions.
Proposition V.8. Let A be a set and let ≡ be an equivalence relation on A. Then
the following conditions are equivalent:
(1) a ≡ b;
(2) [a] = [b];
(3) b ∈ [a].
Proof. To prove a statement of this kind, we need to show that (1) is logically
equivalent to (2), that (2) is logically equivalent to (3), and that (3) is logically
equivalent to (1). It suffices to show that (1) implies (2), that (2) implies (3), and
that (3) implies (1).
(1) ⇒ (2) Suppose that a ≡ b. By symmetry of ≡, we know that b ≡ a. We
wish to show that [a] = [b]. We show containment both ways.
Let c ∈ [a]. Then a ≡ c by definition of [a]. Thus b ≡ c by transitivity of ≡,
because b ≡ a and a ≡ c. Thus c ∈ [b] by definition of [b]. This shows that [a] ⊂ [b].
Simply by reversing the roles of a and b in the above argument, we see that
[b] ⊂ [a]. Therefore [a] = [b].
(2) ⇒ (3) Suppose that [a] = [b]. We wish to show that b ∈ [a]. Now by
reflexivity, b ≡ b. Thus b ∈ [b]. Since [a] is the same set as [b], we must have b ∈ [a].
(3) ⇒ (1) Suppose that b ∈ [a]. We wish to show that a ≡ b. But this follows
by the definition of [a].
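The three equivalent conditions can be observed on a small example. In the sketch below, the relation "a and b leave the same remainder upon division by 3" on {0, ..., 11} is an assumption chosen for illustration, as are the names equivalence_class and same_remainder.

```python
def equivalence_class(a, A, related):
    """The set [a] of all elements of A related to a."""
    return frozenset(b for b in A if related(a, b))

A = range(12)

def same_remainder(a, b):
    return a % 3 == b % 3

classes = {equivalence_class(a, A, same_remainder) for a in A}
print(len(classes))
for block in sorted(classes, key=min):
    print(sorted(block))

# Conditions (1), (2), and (3) hold or fail together for every pair a, b.
agree = all(
    same_remainder(a, b)
    == (equivalence_class(a, A, same_remainder) == equivalence_class(b, A, same_remainder))
    and same_remainder(a, b) == (b in equivalence_class(a, A, same_remainder))
    for a in A for b in A
)
print(agree)
```

Collecting the classes into a Python set removes repetitions, so twelve elements give rise to only three distinct classes.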
8. Canonical Functions
Let Ā be a partition of a set A, and for a ∈ A let ā denote the block containing
a. Then there is a canonical function
β : A → Ā
given by β(a) = ā. Each element simply is sent to the block containing it. That is,
each element is sent to its equivalence class in the equivalence relation corresponding
to the partition. The function β is surjective, since every block contains an element
(we made it part of our definition of partition that its members are nonempty).
Theorem V.16. Let φ : A → B be a function. Let Ā be the set of equivalence
classes of A induced by φ. Let β : A → Ā be the canonical function given by a ↦ ā.
Then there exists a unique injective function
φ̄ : Ā → B
such that φ = φ̄ ◦ β. If φ is surjective, then φ̄ is bijective.
Proof. Define φ̄ by φ̄(ā) = φ(a). We must show that this is well defined and
injective, that φ = φ̄ ◦ β, and that any other function ψ : Ā → B such that
φ = ψ ◦ β is equal to φ̄.
Note that φ̄ is defined via a choice of representative for a given block in Ā. To
show that φ̄ is well-defined, we must show that the definition of φ̄ is independent of
the choice of representative. Thus let a1, a2 ∈ A such that ā1 = ā2. Thus a1 and a2
are inverse images of the same point in B under the map φ. That is, φ(a1) = φ(a2).
Therefore φ̄(ā1) = φ(a1) = φ(a2) = φ̄(ā2), and φ̄ is well-defined.
To see that φ̄ is injective, let a1, a2 ∈ A such that φ̄(ā1) = φ̄(ā2). Then
φ(a1) = φ(a2). By definition of kernel equivalence, ā1 = ā2, so φ̄ is injective.
To see that φ = φ̄ ◦ β, note that for a ∈ A, φ(a) = φ̄(ā) = φ̄(β(a)). Thus this
holds essentially by definition of φ̄ and of β.
Suppose that ψ : Ā → B is another function such that φ = ψ ◦ β. Then
ψ(ā) = ψ(β(a)) = φ(a) = φ̄(ā), so φ̄ = ψ since it acts the same way on every element
of its domain. Thus φ̄ is the unique function with this property.
Finally, if φ is surjective, then every b ∈ B equals φ(a) = φ̄(ā) for some a ∈ A, so
φ̄ is surjective and hence bijective.
Example V.17. Let A be the set of animals on earth and let S be the set of
species. Let φ : A → S be given by sending an animal to its species. Let Ā be the
partition of A into subsets of A which contain all of the animals of a given species.
Then Ā is the partition of A induced by φ. Let β : A → Ā be the canonical function
which sends an animal to the block which contains it. One can easily see that such
blocks naturally correspond to the set of species. The bijective function φ̄ : Ā → S, whose
existence is guaranteed by the above theorem, sends each block to the species to
which the animals in the block belong.
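The factorization of a function through its induced partition can be carried out explicitly for a finite domain. The sketch below is illustrative; the name canonical_factorization and the sample function (squaring on the integers from −3 to 3) are assumptions introduced here. The canonical function and the induced function on blocks are stored as dictionaries named beta and phi_bar.

```python
def canonical_factorization(phi, A):
    """Return (beta, phi_bar) with phi(a) == phi_bar[beta[a]] for every a in A."""
    A = list(A)
    # beta sends each element to its block: the set of elements with the same image.
    beta = {a: frozenset(x for x in A if phi(x) == phi(a)) for a in A}
    # phi_bar sends a block to the common image of its members; any
    # representative gives the same value, which is the well-definedness claim.
    phi_bar = {block: phi(next(iter(block))) for block in set(beta.values())}
    return beta, phi_bar

def square(x):
    return x * x

A = range(-3, 4)
beta, phi_bar = canonical_factorization(square, A)

print(all(phi_bar[beta[a]] == square(a) for a in A))
print(len(set(phi_bar.values())) == len(phi_bar))
for block, value in sorted(phi_bar.items(), key=lambda item: item[1]):
    print(sorted(block), value)
```

The second printed value checks injectivity of the function on blocks: distinct blocks are sent to distinct values.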
9. Exercises
Exercise V.1. Let A and B be sets and let ≤ be a total order on B. Let f : A → B
be a function and define a relation ≼ on A by
a1 ≼ a2 ⇔ f(a1) ≤ f(a2).
(a) Show that if f is injective, ≼ is a total order on A.
(b) Give an example where f is not injective and ≼ is not a partial order on A.
Exercise V.2. Let X be a set and let C ⊂ P(X). Define a relation ≼ on C by
A ≼ B ⇔ ∃ injective f : A → B.
Is ≼ a partial order on C?
Exercise V.3. Let X be a set and let C ⊂ P(X). Define a relation ≡ on C by
A ≡ B ⇔ ∃ bijective f : A → B.
Show that ≡ is an equivalence relation.
Definition V.18. A circle in the cartesian plane is a subset of R^2 which is the set
of all points equidistant from a given point, called its center; the common distance
is called the radius of the circle. If C ⊂ R^2 is a circle and A ⊂ R^2, we say that A
is inside C if for each a ∈ A, the distance from a to the center of C is less than or
equal to the radius of the circle.
Exercise V.4. Let C ⊂ P(R^2) be the collection of all circles in the cartesian plane.
Define a relation ≼ on C by
C1 ≼ C2 ⇔ C1 is inside C2.
Is ≼ a partial order on C?
Exercise V.5. Let C ⊂ P(R^2) be the collection of all circles in the cartesian plane.
Define a relation ≼ on C by
C1 ≼ C2 ⇔ the center of C1 is inside C2.
Is ≼ a partial order on C?
Exercise V.6. Let C ⊂ P(R^2) be the collection of all circles in the cartesian plane.
Define a relation ≡ on C by
C1 ≡ C2 ⇔ C1 and C2 have the same center.
Is ≡ an equivalence relation?
Exercise V.7. Define a function | · | : R^2 → R by
|(x, y)| = √(x^2 + y^2).
Let C be the partition of R^2 induced by this function.
Describe the members of C.
Exercise V.8. Let X = {1, 2, 3}. Define a function f : P(X) ∖ {∅} → X by
f(A) = the smallest member of A.
Compute the partition of P(X) ∖ {∅} induced by the function f.
CHAPTER VI
Binary Operators
1. Binary Operators
Let A be a set. A binary operator on A is a function
∗ : A × A → A.
A binary operator is simply something that takes two elements of a set and gives
back a third element of the same set.
Example VI.1. Let R be the set of real numbers. Then + : R × R → R, given by
+(x, y) = x + y, is a binary operator. Also · : R × R → R, given by ·(x, y) = xy, is
a binary operator.
In general, in the sets N, Z, Q, R, and C, addition and multiplication are binary
operators.
Example VI.2. Let X be a set and let P(X) be the power set of X. Then union
and intersection are binary operators on P(X); for example
∩ : P(X) × P(X) → P(X)
is defined by ∩(A, B) = A ∩ B, where A, B ⊂ X.
Example VI.3. Let X be a set and let Sym(X) be the set of all permutations of
X. Then ◦ is a binary operator on Sym(X):
◦ : Sym(X) × Sym(X) → Sym(X)
is defined by ◦(φ, ψ) = φ ◦ ψ.
Let A be a set and let ∗ : A × A → A be a binary operator. As in the above
examples, it is customary to write a ∗ b instead of ∗(a, b), where a, b ∈ A. However,
we keep in mind that ∗ is a function and that a ∗ b ∈ A.
2. Closure
Let ∗ : A × A → A be a binary operator on a set A and let B ⊂ A. We say that
B is closed under the operation of ∗ if for every b1 , b2 ∈ B, we have b1 ∗ b2 ∈ B.
Example VI.4. Let E be the set of even integers. Then E is closed under the
operations of addition and multiplication of integers. Indeed, the sum of even
integers is even, and the product of even integers is even.
Let O be the set of odd integers. Then O is closed under multiplication. How-
ever, O is not closed under addition, because the sum of two odd integers is even.
Example VI.5. Let B = {a + b√2 ∈ R | a, b ∈ Q}. Then B is closed under addition
and multiplication of real numbers. For example, if a1 + b1√2 and a2 + b2√2 are
two elements of B, then
(a1 + b1√2) + (a2 + b2√2) = (a1 + a2) + (b1 + b2)√2 ∈ B
and
(a1 + b1√2)(a2 + b2√2) = (a1a2 + 2b1b2) + (a1b2 + a2b1)√2 ∈ B.
Note that these results are in B because Q itself is closed under addition and
multiplication. Therefore a1 a2 + 2b1 b2 ∈ Q, and so forth.
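The two formulas of this example can be carried out with exact rational arithmetic. In the sketch below, an element a + b√2 of B is stored as the pair (a, b) of rationals; this encoding and the names add, mul, and to_float are assumptions introduced for illustration.

```python
from fractions import Fraction
from math import sqrt

def add(x, y):
    """(a1 + b1√2) + (a2 + b2√2), with each element stored as a pair (a, b)."""
    (a1, b1), (a2, b2) = x, y
    return (a1 + a2, b1 + b2)

def mul(x, y):
    """(a1 + b1√2)(a2 + b2√2) = (a1a2 + 2b1b2) + (a1b2 + a2b1)√2."""
    (a1, b1), (a2, b2) = x, y
    return (a1 * a2 + 2 * b1 * b2, a1 * b2 + a2 * b1)

def to_float(x):
    """Approximate real value of a + b√2, used only as a sanity check."""
    a, b = x
    return float(a) + float(b) * sqrt(2)

x = (Fraction(1, 2), Fraction(3))
y = (Fraction(2), Fraction(-1, 3))

print(add(x, y))
print(mul(x, y))
print(abs(to_float(mul(x, y)) - to_float(x) * to_float(y)) < 1e-9)
```

Both results are again pairs of rationals, which is the closure property in this encoding; the floating-point comparison only confirms that the product formula agrees with multiplication of real numbers up to rounding.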
Example VI.6. Let X be a set and let Y ⊂ X. Then P(Y) ⊂ P(X), and the
subset P(Y) is closed under the operations of intersection and union of subsets of
X.
3. Standard Notation
It is very common that binary operations be named addition or multiplication,
even if the elements of the set are not numbers in the common sense.
If the operation on A is named addition and denoted +, then it is standard
that the identity element be named zero and denoted 0 and that the inverse of a
is denoted −a. By convention, one may assume that an operation denoted by + is
commutative and associative. If n is a natural number and a ∈ A, then na means
a added to itself n times.
If the operation on A is denoted ·, it is usually but not always called multiplication
and the · is dropped, so that ab means a · b. The identity element in this
notation is usually called one and written 1. The inverse of a, if it exists, is denoted
a^{-1}. If n is a natural number and a ∈ A, then a^n means a multiplied by itself n
times.
When people refer to general binary operations, usually multiplicative notation
is used, since it is the simplest. We also use ∗ to mean a “generic” binary operation.
thus the ik entry of C is the dot product of the ith row of A with the kth column
of B.
Let Mn(R) be the set of all n × n matrices over R. Then addition of matrices
is a binary operation on Mn (R) which is commutative, associative, and invertible.
Also, multiplication of matrices is a binary operation on Mn (R) which is associative
and has an identity. The identity is simply the matrix given by aij = 1 if i = j
and aij = 0 otherwise. However, this operation is not commutative, and there are
many elements which do not have inverses.
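These properties of matrix multiplication can be observed on 2 × 2 examples. The sketch below is illustrative; matrices are stored as lists of rows, and the names mat_mul and identity and the sample matrices are assumptions introduced here.

```python
def mat_mul(A, B):
    """Product of two n × n matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def identity(n):
    """The matrix with entries 1 on the diagonal and 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
N = [[0, 1], [0, 0]]

print(mat_mul(A, B), mat_mul(B, A))       # the two products differ
print(mat_mul(A, identity(2)) == A)       # the identity leaves A unchanged
print(mat_mul(mat_mul(A, B), N) == mat_mul(A, mat_mul(B, N)))
print(mat_mul(N, N))                      # N is nonzero, yet its square is zero
```

A nonzero matrix whose square is zero cannot have an inverse, since multiplying the equation NN = 0 by an inverse of N would give N = 0.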
5. Exercises
Exercise VI.1. In each case, we define a binary operation ∗ on R. Determine
if ∗ is commutative and/or associative, find an identity if it exists, and find any
invertible elements.
(a) x ∗ y = xy + 1;
(b) x ∗ y = (1/2)xy;
(c) x ∗ y = |x|^y.
Exercise VI.2. Consider the plane R^2. Define a binary operation ∗ on R^2 by
(x1, y1) ∗ (x2, y2) = ((x1 + x2)/2, (y1 + y2)/2).
Thus the “product” of two points under this operation is the point which is midway
between them. Determine if ∗ is commutative and/or associative, find an identity
if it exists, and find any invertible elements.
Exercise VI.3. Let I be the collection of all open intervals of real numbers. We
consider the empty set to be an open interval.
(a) Show that I is closed under the operation of ∩ on P(R).
(b) Show that I is not closed under the operation of ∪ on P(R).
Exercise VI.4. Let X and Y be sets and let ∗ : Y × Y → Y be a binary operation
on Y which is commutative, associative, and invertible. Let f : X → Y be a
bijective function. Define an operation ⋆ on X by
x1 ⋆ x2 = f^{-1}(f(x1) ∗ f(x2)).
Show that ⋆ is commutative, associative, and invertible.
Exercise VI.5. Let X and Y be sets and let ∗ : Y × Y → Y be a binary operation
on Y . Let F(X, Y ) be the set of all functions from X to Y . Show that ∗ induces a
binary operation, which may also be called ∗, on F(X, Y ).
Exercise VI.6. Let X be a set and let ∗ : X × X → X be a binary operation
on X which is associative and invertible. Show that ∗ induces a binary operation,
which may also be called ∗, on P(X). Is it associative? Does it have an identity?
Is it invertible?
CHAPTER VII
Natural Numbers
1. Natural Numbers
We wish to create a set which allows us to count in a more or less formal
way. The numbers we use to count are labeled 0, 1, 2, et cetera, defined in a
manner which reflects what we memorized as infants.
Having built the language of sets, we start with the simplest set, which is the
empty set, and call it 0. Now 1 is naturally thought of as a set containing one
element, and the most obvious choice for this element is 0. Proceeding in this
way, we would obtain
• 0 = ∅;
• 1 = {∅};
• 2 = {∅, {∅}};
• 3 = {∅, {∅}, {∅, {∅}}};
and so forth. We could have written this as
• 0 = ∅;
• 1 = {0};
• 2 = {0, 1};
• 3 = {0, 1, 2};
and so forth. Under this interpretation, a given natural number should be the set
containing all of the previous natural numbers. Having made a plan for defining
natural numbers, we proceed to attempt to formalize it.
We define 0 to be the empty set. If x is a set, the successor of x is denoted x+
and is defined as
x+ = x ∪ {x}.
The natural numbers are the set N defined by the following properties:
(1) 0 ∈ N;
(2) if n ∈ N, then n+ ∈ N;
(3) if S ⊂ N, 0 ∈ S, and n ∈ S ⇒ n+ ∈ S, then S = N.
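The first few natural numbers, as defined here, can be built explicitly. The sketch below is illustrative; it models sets as Python frozensets, and the names successor and natural are assumptions introduced here.

```python
def successor(x):
    """The successor x ∪ {x} of a set x, with sets modeled as frozensets."""
    return x | frozenset([x])

zero = frozenset()

def natural(n):
    """The set representing n, obtained from 0 by taking n successors."""
    x = zero
    for _ in range(n):
        x = successor(x)
    return x

three = natural(3)
print(three == frozenset({natural(0), natural(1), natural(2)}))
print(len(natural(5)))
print(natural(2) in natural(3), natural(2) <= natural(4))
```

Each number built this way is the set of all previous ones, so it has exactly n elements, and a smaller number is both an element and a subset of a larger one.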
2. Induction
Note that the third property of natural numbers asserts that only eventual
successors of 0 are in N; that is, this property asserts that N is a minimal set
containing eventual successors of 0, and that N is the unique set satisfying (1)
through (3). This property is known as the Principle of Mathematical Induction.
Suppose that for every natural number n, we have a proposition p(n) which is
either true or false. Let
S = {n ∈ N | p(n) is true}.
Now if p(0) is true, and if the truth of p(n) implies the truth of p(n+ ), then the
set S contains 0 and it contains the successor of every element in it. Thus, in this
case, S = N, which means that p(n) is true for all n ∈ N. We state this as
Theorem VII.1. Induction Theorem
Let p(n) be a proposition for each n ∈ N. If
(1) p(0) is true;
(2) If p(n) is true, then p(n+ ) is true;
then p(n) is true for all n ∈ N.
For m, n ∈ N, we say that m is less than or equal to n if m ⊂ n:
m ≤ n ⇔ m ⊂ n.
Now the induction theorem can be made stronger by weakening the hypothesis.
The resulting theorem gives a proof technique which is known as strong induction.
Theorem VII.2. Strong Induction Theorem
Let p(n) be a proposition for each n ∈ N. If
(1) p(0) is true;
(2) If p(m) is true for all m ≤ n, then p(n + 1) is true;
then p(n) is true for all n ∈ N.
Proof. Let t(n) be the statement that “p(m) is true for all m ≤ n”.
Our first assumption is that p(0) is true, and since the only natural number
less than or equal to 0 is zero (because the only subset of the empty set is itself),
this means that t(0) is true.
Our second assumption is that if t(n) is true, then p(n+1) is true. Thus assume
that t(n) is true so that p(n + 1) is also true. Then p(i) is true for all i ≤ n + 1.
Thus t(n + 1) is true.
By our original Induction Theorem, we conclude that t(n) is true for all n ∈ N.
This implies that p(n) is true for all n ∈ N.
3. Recursion
We now state the Recursion Theorem, which will allow us to define addition
and multiplication of natural numbers. It is possible to prove this theorem using
strong induction.
Theorem VII.3. Recursion Theorem
Let X be a set, f : X → X, and a ∈ X. Then there exists a unique function
φ : N → X such that φ(0) = a and φ(n+ ) = f (φ(n)) for all n ∈ N.
Let f : N → N be given by f (n) = n+ . Let σm : N → N be the unique function,
whose existence is guaranteed by the Recursion Theorem, defined by σm (0) = m
and σm (n+ ) = f (σm (n)) = (σm (n))+ . Then σm (n) is defined to be the sum of m
and n:
m + n = σm (n).
Let f : N → N be given by f = σm . Let µm : N → N be the unique function,
whose existence is guaranteed by the Recursion Theorem, defined by µm (0) = 0
and µm (n+ ) = f (µm (n)) = σm (µm (n)) = m + µm (n). Then µm (n) is defined to be
the product of m and n:
mn = µm (n).
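The definitions of sum and product above can be mirrored in code. The sketch below is an illustration under stated assumptions: natural numbers are modeled by Python's nonnegative integers, the successor is taken to be n + 1, and the names recursion, sigma, mu, add, and mul are introduced here.

```python
def recursion(a, f):
    """Return the function phi with phi(0) = a and phi(n+) = f(phi(n))."""
    def phi(n):
        value = a
        for _ in range(n):
            value = f(value)
        return value
    return phi

def succ(n):
    """Successor, modeled on Python integers (an assumption of this sketch)."""
    return n + 1

def sigma(m):
    """The function sigma_m: sigma_m(0) = m and sigma_m(n+) = (sigma_m(n))+."""
    return recursion(m, succ)

def add(m, n):
    return sigma(m)(n)

def mu(m):
    """The function mu_m: mu_m(0) = 0 and mu_m(n+) = sigma_m(mu_m(n))."""
    return recursion(0, sigma(m))

def mul(m, n):
    return mu(m)(n)

print(add(3, 4), mul(3, 4))
print(all(add(m, n) == add(n, m) and mul(m, n) == mul(n, m)
          for m in range(6) for n in range(6)))
```

The only arithmetic used is the successor; addition is iterated successor and multiplication is iterated addition, exactly as in the two applications of the Recursion Theorem.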
The following properties of natural numbers can be proved using the above
definitions:
• m + n = n + m (commutativity of addition);
• (m + n) + o = m + (n + o) (associativity of addition);
• mn = nm (commutativity of multiplication);
• (mn)o = m(no) (associativity of multiplication);
• m(n + o) = mn + mo (distributivity of multiplication over addition);
• m + 0 = m (0 is an additive identity);
• 1m = m (1 is a multiplicative identity);
• 0m = 0.
We state two additional properties, which we will use to show that multiplica-
tion of integers is well-defined.
Proposition VII.4. Cancelation Law of Addition
Let a, b, c ∈ N and suppose that a + c = b + c. Then a = b.
Proposition VII.5. Cancelation Law of Multiplication
Let a, b, c ∈ N with c ≠ 0 and suppose that ac = bc. Then a = b.
CHAPTER VIII
Integers
1. Motivation
The goal is to create the integers from the natural numbers. This will give us
a formal number system in which subtraction is possible. We know where we want
to go with this; we just wish to formalize it in a manner that makes proving things
about the integers possible. Thus it is allowable and desirable to use our intuitive
understanding of the number system we wish to devise as a beacon.
The plan is to take ordered pairs of natural numbers, and think of them as
integers. The pair (m, n) is to be thought of as the integer m − n. Thus (5, 0)
should represent 5, and (0, 5) should represent −5. Unfortunately, (3, 8) should
also represent −5. Thus there are too many pairs.
This situation is alleviated via the use of equivalence relations. We take the
set of ordered pairs of natural numbers and partition it into blocks of pairs which
represent the same integer. Here, two pairs represent the same integer if their
components differ by the same amount. Since we do not yet have the operation of
subtraction, instead of defining “differing by the same amount” as a − b = c − d, we say
that (a, b) and (c, d) differ by the same amount if a + d = b + c.
Then we define an integer to be a block in the partition of N × N induced by
this equivalence relation.
2. Definition
Proposition VIII.1. Let X = N × N. Define a relation on X by
(a, b) ≡ (c, d) ⇔ a + d = b + c.
Then ≡ is an equivalence relation.
Proof. We wish to show that ≡ is reflexive, symmetric, and transitive.
(Reflexivity) Let (a, b) ∈ X. Then a + b = b + a because addition of natural
numbers is commutative. Thus (a, b) ≡ (a, b), and ≡ is reflexive.
(Symmetry) Let (a, b), (c, d) ∈ X. Then by symmetry of equality and commu-
tativity of addition of natural numbers,
(a, b) ≡ (c, d) ⇔ a + d = b + c ⇔ c + b = d + a ⇔ (c, d) ≡ (a, b).
Thus ≡ is symmetric.
(Transitivity) Let (a, b), (c, d), (e, f ) ∈ X. Suppose that (a, b) ≡ (c, d) and
(c, d) ≡ (e, f ). Then a + d = b + c and c + f = d + e. Add f to both sides of the
first equation and add b to both sides of the second to obtain a + d + f = b + c + f
and b + c + f = b + d + e. Thus a + d + f = b + d + e. By the commutativity of
addition and cancelation, we obtain a + f = b + e. Thus (a, b) ≡ (e, f ), and ≡ is
transitive.
The set of equivalence classes in this equivalence relation is called the set of
integers, and is denoted Z. The equivalence class of (a, b) is denoted [a, b].
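The relation ≡ can be tried out informally in Python. This is only a sketch and not part of the formal development: the names equivalent and canonical are hypothetical, and natural numbers are modeled by nonnegative Python ints.

```python
def equivalent(p, q):
    """Return True when (a, b) ≡ (c, d), that is, when a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def canonical(p):
    """Pick the representative of [a, b] with at least one coordinate 0.

    Assumption: choosing this representative is a convenience of the sketch;
    the text defines an integer as the whole equivalence class.
    """
    a, b = p
    m = min(a, b)
    return (a - m, b - m)


# (0, 5) and (3, 8) both represent the integer -5.
print(equivalent((0, 5), (3, 8)))   # True
print(canonical((3, 8)))            # (0, 5)
```

Comparing canonical representatives is one possible way to decide whether two pairs lie in the same class.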
3. Addition
We define addition in Z by
[a, b] + [c, d] = [a + c, b + d].
To define addition, we select members from two different equivalence classes
and define their sum in terms of the selected members. What if we had selected
different members? For example, is [3, 5] + [2, 1] = [6, 8] + [9, 8]? We need to
reassure ourselves that the defined operation makes sense in this regard. If it does,
it is called well-defined.
Proposition VIII.2. Addition in Z is well defined.
Proof. To show that addition is well-defined, we select two arbitrary representatives
from each equivalence class and show that they produce the same equivalence class
upon being added.
Let a1 , a2 , b1 , b2 , c1 , c2 , d1 , d2 ∈ N such that
[a1 , b1 ] = [a2 , b2 ] and [c1 , d1 ] = [c2 , d2 ].
This means that (a1 , b1 ) ≡ (a2 , b2 ) and (c1 , d1 ) ≡ (c2 , d2 ), so
(1) a1 + b2 = b1 + a2 ;
(2) c1 + d2 = d1 + c2
by our definition of equivalence.
Our definition of addition of equivalence classes gives that
[a1 , b1 ] + [c1 , d1 ] = [a1 + c1 , b1 + d1 ]
and
[a2 , b2 ] + [c2 , d2 ] = [a2 + c2 , b2 + d2 ].
We wish to show that [a1 + c1 , b1 + d1 ] = [a2 + c2 , b2 + d2 ].
Adding equations (1) and (2) yields:
(a1 + b2 ) + (c1 + d2 ) = (b1 + a2 ) + (d1 + c2 ).
Since addition of natural numbers is commutative and associative,
(a1 + c1 ) + (b2 + d2 ) = (b1 + d1 ) + (a2 + c2 ).
Thus (a1 +c1 , b1 +d1 ) ≡ (a2 +c2 , b2 +d2 ). Therefore [a1 +c1 , b1 +d1 ] = [a2 +c2 , b2 +d2 ],
and addition is well-defined.
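The proposition can be spot-checked numerically. The following Python sketch (the function names are hypothetical, and pairs of nonnegative ints stand in for representatives) recomputes the example [3, 5] + [2, 1] versus [6, 8] + [9, 8] and then tests all representatives with small coordinates. A finite check of this kind illustrates the statement but does not replace the proof.

```python
from itertools import product


def equivalent(p, q):
    """(a, b) ≡ (c, d) iff a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p, q):
    """[a, b] + [c, d] = [a + c, b + d], computed on representatives."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


# The example from the text: different representatives, same class.
print(add((3, 5), (2, 1)))                                    # (5, 6)
print(add((6, 8), (9, 8)))                                    # (15, 16)
print(equivalent(add((3, 5), (2, 1)), add((6, 8), (9, 8))))   # True

# Brute-force check over all pairs with coordinates 0..3.
pairs = list(product(range(4), repeat=2))
well_defined = all(
    equivalent(add(p1, q1), add(p2, q2))
    for p1, p2, q1, q2 in product(pairs, repeat=4)
    if equivalent(p1, p2) and equivalent(q1, q2)
)
print(well_defined)   # True
```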
4. Multiplication
We define multiplication in Z by
[a, b] · [c, d] = [ac + bd, ad + bc].
Proposition VIII.3. Multiplication in Z is well defined.
Proof. Let a1 , a2 , b1 , b2 , c1 , c2 , d1 , d2 ∈ N such that
[a1 , b1 ] = [a2 , b2 ] and [c1 , d1 ] = [c2 , d2 ].
This means that (a1 , b1 ) ≡ (a2 , b2 ) and (c1 , d1 ) ≡ (c2 , d2 ), so
a1 + b2 = b1 + a2 and c1 + d2 = d1 + c2
by our definition of equivalence.
Our definition of multiplication of equivalence classes gives that
[a1 , b1 ][c1 , d1 ] = [a1 c1 + b1 d1 , a1 d1 + b1 c1 ]
and
[a2 , b2 ][c2 , d2 ] = [a2 c2 + b2 d2 , a2 d2 + b2 c2 ].
We wish to show that [a1 c1 + b1 d1 , a1 d1 + b1 c1 ] = [a2 c2 + b2 d2 , a2 d2 + b2 c2 ]. This is
a little tricky, so we introduce some additional notation to shorten things. Define
x = a1 c1 + b1 d1 + a2 d2 + b2 c2 ;
y = a1 d1 + b1 c1 + a2 c2 + b2 d2 .
Now if we show that x = y, we will be done by definition of equivalence. Let
z = a1 d2 + b2 d1 + b1 c2 + a2 c1 .
By the cancelation law of addition of natural numbers, it suffices to show that x +
z = y + z. This is accomplished by showing that each side is equal to 2(a1 + b2 )(c1 + d2 ).
First add z to both sides of the definition of x, expand z on the right side, and
use commutativity of addition to shuffle the terms of z into the expression,
achieving
a1 c1 + a1 d2 + b2 c2 + b2 d1 + b1 d1 + b1 c2 + a2 d2 + a2 c1 = x + z.
Distributivity converts this into
a1 (c1 + d2 ) + b2 (c2 + d1 ) + b1 (d1 + c2 ) + a2 (d2 + c1 ) = x + z.
Now use the fact that c1 + d2 = c2 + d1 to obtain
(a1 + b2 + b1 + a2 )(c1 + d2 ) = x + z.
Since a1 + b2 = a2 + b1 , we have
2(a1 + b2 )(c1 + d2 ) = x + z.
Perform the same manner of computation on the equation defining y, and you
will find that
2(a1 + b2 )(c1 + d2 ) = y + z.
Thus x + z = y + z, so x = y by cancelation, and multiplication is well-defined.
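The computation above can be checked on concrete numbers. The Python sketch below uses hypothetical function names and small representatives chosen only for illustration. It evaluates x, y, and z for one choice of representatives and then tests well-definedness of multiplication over all pairs with small coordinates.

```python
from itertools import product


def equivalent(p, q):
    """(a, b) ≡ (c, d) iff a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def mul(p, q):
    """[a, b] · [c, d] = [ac + bd, ad + bc], computed on representatives."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)


# Representatives with [a1, b1] = [a2, b2] and [c1, d1] = [c2, d2].
a1, b1, a2, b2 = 3, 5, 6, 8
c1, d1, c2, d2 = 2, 1, 9, 8
x = a1 * c1 + b1 * d1 + a2 * d2 + b2 * c2
y = a1 * d1 + b1 * c1 + a2 * c2 + b2 * d2
z = a1 * d2 + b2 * d1 + b1 * c2 + a2 * c1
print(x, y)                                       # 131 131
print(x + z, 2 * (a1 + b2) * (c1 + d2))           # 220 220

# Brute-force check over all pairs with coordinates 0..3.
pairs = list(product(range(4), repeat=2))
well_defined = all(
    equivalent(mul(p1, q1), mul(p2, q2))
    for p1, p2, q1, q2 in product(pairs, repeat=4)
    if equivalent(p1, p2) and equivalent(q1, q2)
)
print(well_defined)   # True
```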
5. Algebraic Properties
Theorem VIII.4. Let a, b, c ∈ Z. Then
(1) a + b = b + a (commutativity of addition);
(2) a + (b + c) = (a + b) + c (associativity of addition);
(3) ∃!z ∈ Z such that a + z = a (additive identity);
(4) ∃!(−a) ∈ Z such that a + (−a) = z (additive inverses);
(5) ab = ba (commutativity of multiplication);
(6) a(bc) = (ab)c (associativity of multiplication);
(7) ∃!e ∈ Z such that ae = a (multiplicative identity);
(8) a(b + c) = ab + ac (distributivity of multiplication over addition).
These eight properties state that Z is a commutative ring. We prove or comment
on each.
Proposition VIII.5. Let a, b ∈ Z. Then a + b = b + a.
Proof. Since a and b are integers, they are represented by pairs of natural numbers,
say a = [m, n] and b = [u, v]. Then
a + b = [m, n] + [u, v] = [m + u, n + v] = [u + m, v + n] = [u, v] + [m, n] = b + a.
Proposition VIII.6. Let a, b, c ∈ Z. Then (a + b) + c = a + (b + c).
Proof. This follows easily from the definitions and the fact that addition is associa-
tive in the natural numbers in a manner entirely analogous to the proof above.
Proposition VIII.7. There exists a unique element z ∈ Z such that for every
a ∈ Z we have a + z = a.
Proof. Let z = [0, 0]. The fact that a + z = a is immediate from the definition and
the analogous fact in N. Later, we will justify calling this element z by the name
zero.
For uniqueness, suppose that y also satisfies a + y = a for all a ∈ Z. Then
z = z + y = y + z = y.
Proposition VIII.8. For every a ∈ Z there exists a unique element −a ∈ Z such
that a + (−a) = z.
Proof. Let a = [m, n], where m, n ∈ N. Define −a = [n, m]. Then a + (−a) =
[m + n, m + n] = [0, 0]. Call this element negative a.
For uniqueness, suppose a + b = z. Then a + b = a + (−a). By commutativity,
b + a = (−a) + a. Adding (−a) to both sides gives b = b + z = b + a + (−a) =
(−a) + a + (−a) = (−a) + z = (−a).
Now we may define subtraction on Z by
a − b = a + (−b).
Clearly subtraction is not commutative or associative.
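A small Python sketch makes the last remark concrete. The function names are hypothetical and pairs of nonnegative ints stand in for representatives. It builds negation and subtraction from the definitions above and exhibits one failure of commutativity and one of associativity.

```python
def equivalent(p, q):
    """(a, b) ≡ (c, d) iff a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def neg(p):
    """−[m, n] = [n, m]."""
    m, n = p
    return (n, m)


def sub(p, q):
    """a − b = a + (−b)."""
    return add(p, neg(q))


a, b, c = (5, 0), (2, 0), (1, 0)   # representatives of 5, 2, and 1

print(equivalent(add(a, neg(a)), (0, 0)))                    # True
print(equivalent(sub(a, b), sub(b, a)))                      # False
print(equivalent(sub(sub(a, b), c), sub(a, sub(b, c))))      # False
```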
6. Embedding
We wish to show that, in a very meaningful sense, the natural numbers can
be regarded as integers. To do this, we create an injective function N ↪ Z which
preserves all of the properties of the natural numbers with which we are concerned.
That is, what matters to us about the natural numbers is not how they were defined,
but how they behave. Specifically, they can be added and multiplied. Thus we want
our injective function to preserve these properties.
Let φ : N → Z. We say that φ is an embedding if
• φ(1) = e, where e is the multiplicative identity of Z;
• φ(m + n) = φ(m) + φ(n);
• φ(mn) = φ(m)φ(n).
There is a unique function φ : N → Z which satisfies all of these properties, and it
is given by φ(n) = [n, 0].
This also gives us additional properties which motivated us in the first place:
• ∀n ∈ N ∃b ∈ Z such that φ(n) + b = φ(0);
• ∀a ∈ Z ∃n ∈ N such that either a = φ(n) or a = −φ(n).
The first of these says that Z contains the additive inverses of the natural numbers,
and the second says that Z is, in some sense, the smallest set that does so.
Thus from now on, whenever it is convenient, we view N as a subset of Z. Then
by a ∈ N ∩ Z we mean that a ∈ φ(N) ⊂ Z. The meaning should be clear
from the context.
In particular, φ(1) = e by definition and φ(0) = z because the additive identity
of Z is unique. Thus we identify 1 with e and 0 with z, and may drop these
temporary names.
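The embedding properties can be spot-checked for small natural numbers. In the Python sketch below, phi and the other function names are hypothetical, and pairs of nonnegative ints stand in for representatives. The finite check only illustrates the properties.

```python
def equivalent(p, q):
    """(a, b) ≡ (c, d) iff a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)


def phi(n):
    """The embedding n ↦ [n, 0], returned as a representative pair."""
    return (n, 0)


preserves_operations = all(
    equivalent(phi(m + n), add(phi(m), phi(n)))
    and equivalent(phi(m * n), mul(phi(m), phi(n)))
    for m in range(10)
    for n in range(10)
)
print(preserves_operations)                        # True

# phi(1) acts as the multiplicative identity on a sample class.
print(equivalent(mul((3, 8), phi(1)), (3, 8)))     # True
```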
7. Order
Let φ : N ↪ Z be the embedding given by n ↦ [n, 0].
We define a relation ≤ on Z by
a ≤ b ⇔ b − a ∈ φ(N).
This leads to other relations:
• a < b ⇔ (a ≤ b) ∧ (a ≠ b);
• a > b ⇔ ¬(a ≤ b);
• a ≥ b ⇔ ¬(a < b).
Proposition VIII.13. The relation ≤ on Z is a total order.
Proposition VIII.14. Let m, n ∈ N. Then m ≤ n if and only if φ(m) ≤ φ(n).
Proposition VIII.15. The relation ≤ on Z has the following properties:
(1) a ≤ b ⇒ a + c ≤ b + c;
(2) (c ≥ 0) ∧ (a ≤ b) ⇒ ac ≤ bc;
(3) (c ≤ 0) ∧ (a ≤ b) ⇒ ac ≥ bc.
We define a function | · | : Z → N by
|a| = a if a ≥ 0;
|a| = −a otherwise.
We call |a| the absolute value of a.
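On representatives, the order can be computed without subtraction. For a = [a1, a2] and b = [b1, b2] we have b − a = [b1 + a2, b2 + a1], and this class has the form [n, 0] exactly when a1 + b2 ≤ b1 + a2 in N. The Python sketch below assumes this reformulation. The names leq and absolute are hypothetical, and the absolute value is returned as a natural number.

```python
def leq(p, q):
    """a ≤ b for a = [a1, a2], b = [b1, b2], tested as a1 + b2 <= b1 + a2."""
    (a1, a2), (b1, b2) = p, q
    return a1 + b2 <= b1 + a2


def absolute(p):
    """|a| as a natural number: a1 - a2 if a ≥ 0, and a2 - a1 otherwise."""
    a1, a2 = p
    return a1 - a2 if a1 >= a2 else a2 - a1


print(leq((0, 5), (3, 0)))   # True:  -5 ≤ 3
print(leq((3, 0), (0, 5)))   # False: 3 ≤ -5 fails
print(absolute((3, 8)))      # 5
print(absolute((8, 3)))      # 5
```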
8. Exercises
Construct the rational numbers as follows.
Exercise VIII.1. Find an appropriate set on which to work. Define a relation
on this set, and show that it is an equivalence relation. Define the set Q of rational
numbers to be the equivalence classes of this equivalence relation.
Exercise VIII.2. Define addition and multiplication on Q and show that they are well
defined.
Exercise VIII.3. Let a, b, c ∈ Q. Show that
(1) a + b = b + a;
(2) a + (b + c) = (a + b) + c;
(3) ∃!0 ∈ Q such that a + 0 = a;
(4) ∃!(−a) ∈ Q such that a + (−a) = 0;
(5) ab = ba;
(6) a(bc) = (ab)c;
(7) ∃!1 ∈ Q such that a · 1 = a;
(8) a ≠ 0 ⇒ ∃a⁻¹ ∈ Q such that aa⁻¹ = 1;
(9) a(b + c) = ab + ac.
The nine properties above assert that Q is a field.
Exercise VIII.4. Define a relation on Q which coincides with the common notion
of their ordering, and show that this is a total order relation.
CHAPTER IX
Modular Integers
1. Well-Ordering Principle
First we establish a few properties of the integers which we need in order to
understand the ring of integers modulo n. One tool which can be used to establish
these properties is the Well-Ordering Principle.
Proposition IX.1. Well-Ordering Principle
Let X ⊂ N be a nonempty set of natural numbers. Then X contains a smallest
element; that is, there exists x0 ∈ X such that for every x ∈ X, x0 ≤ x.
Proof. Since X is nonempty, it contains an element, say x1 . If x1 is the smallest
member of X, we are done, so assume that the set
Y = {x ∈ X | x < x1 }
is nonempty. Since there are only finitely many natural numbers less than a given
natural number, Y is finite.
Proceed by induction on |Y |. If |Y | = 1, then Y contains exactly
one element, which is trivially the smallest member of Y .
Now assume that |Y | = n. By induction, we assume that any nonempty
set with fewer than n elements contains a smallest member. Since Y is nonempty,
let x2 ∈ Y . If x2 is the smallest member of Y , we are done, so assume that the set
Z = {x ∈ Y | x < x2 }
is nonempty. Since x2 ∉ Z, |Z| < n, so Z contains a smallest member (by
our inductive hypothesis), say x0 . Then x0 is also the smallest member of Y ,
since every element of Y outside Z is at least x2 .
This completes the proof by induction.
Thus every nonempty finite set of natural numbers has a smallest element, and since Y is
finite and nonempty, it has a smallest element. This element is the smallest member of X.
2. Division Algorithm
Definition IX.2. Let m, n ∈ Z. We say that m divides n, and write m | n, if
there exists an integer k such that n = km.
Exercise IX.1. Show that the relation | is a partial order on the set of positive
integers.
Proposition IX.3. Division Algorithm for Integers
Let m, n ∈ Z with m ≠ 0. There exist unique integers q, r ∈ Z such that
n = qm + r and 0 ≤ r < |m|.
Proof. Let X = {z ∈ Z | z = n−km for some k ∈ Z}. The subset of X consisting of
nonnegative integers is a subset of N, and by the Well-Ordering Principle, contains
a smallest member, say r. That is, r = n − qm for some q ∈ Z, so n = qm + r. We
know 0 ≤ r. Also, r < |m|, for otherwise r − |m| is nonnegative, less than
r, and in X, contradicting the choice of r.
For uniqueness, assume n = q1 m + r1 and n = q2 m + r2 , where q1 , r1 , q2 , r2 ∈ Z,
0 ≤ r1 < |m|, and 0 ≤ r2 < |m|. Then m(q2 − q1 ) = r1 − r2 ; also −|m| < r1 − r2 < |m|.
Since m | (r1 − r2 ), we must have r1 − r2 = 0. Thus r1 = r2 , which forces q1 = q2 .
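The quotient and remainder of the proposition can be computed directly. The Python sketch below uses a hypothetical function name, divide, and assumes m ≠ 0. It relies on the fact that Python's % operator returns a result in the range 0 ≤ r < |m| when the modulus |m| is positive.

```python
def divide(n, m):
    """Return (q, r) with n == q*m + r and 0 <= r < abs(m); m must be nonzero."""
    if m == 0:
        raise ValueError("m must be nonzero")
    r = n % abs(m)       # remainder with 0 <= r < |m|
    q = (n - r) // m     # exact, since m divides n - r
    return q, r


print(divide(7, 3))      # (2, 1)
print(divide(-7, 3))     # (-3, 2)
print(divide(7, -3))     # (-2, 1)
print(divide(-7, -3))    # (3, 2)
```

Note that the remainder is never negative, even when n or m is negative, in agreement with the condition 0 ≤ r < |m|.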
Definition IX.4. Let m, n ∈ Z. A greatest common divisor of m and n, denoted
gcd(m, n), is a positive integer d such that
(1) d | m and d | n;
(2) If e | m and e | n, then e | d.
Proposition IX.5. Let m, n ∈ Z, not both zero. Then there exists a unique d ∈ Z such that
d = gcd(m, n), and there exist integers x, y ∈ Z such that
d = xm + yn.
Proof. Let X = {z ∈ Z | z = xm + yn for some x, y ∈ Z}. Then the subset of X
consisting of positive integers contains a smallest member, say d, where d = xm+yn
for some x, y ∈ Z.
Now m = qd + r for some q, r ∈ Z with 0 ≤ r < d. Then m = q(xm + yn) + r,
so r = (1 − qx)m + (−qy)n ∈ X. Since r < d and d is the smallest positive integer
in X, we have r = 0. Thus d | m. Similarly, d | n.
If e | m and e | n, then m = ke and n = le for some k, l ∈ Z. Then d =
xke + yle = (xk + yl)e. Therefore e | d. This shows that d = gcd(m, n).
For uniqueness of a greatest common divisor, suppose that e also satisfies the
conditions of a gcd. Then d | e and e | d. Thus d = ie and e = jd for some i, j ∈ Z.
Then d = ijd, so ij = 1. Since i and j are integers, then i = ±1. Since d and e are
both positive, we must have i = 1. Thus d = e.
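The idea of the proof, that gcd(m, n) is the smallest positive member of X, can be illustrated by a finite search. In the Python sketch below the function name and the search window bound are hypothetical devices. The set X itself ranges over all integers x and y, so the sketch only agrees with the proposition when the window is large enough to contain a suitable pair.

```python
from math import gcd


def smallest_positive_combination(m, n, bound=10):
    """Smallest positive value of x*m + y*n with |x|, |y| <= bound."""
    values = [
        x * m + y * n
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if x * m + y * n > 0
    ]
    return min(values)


print(smallest_positive_combination(12, 18))   # 6
print(gcd(12, 18))                             # 6
```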
Exercise IX.2. Let m, n ∈ Z and suppose that there exist integers x, y ∈ Z such
that xm + yn = 1. Show that gcd(m, n) = 1.
Exercise IX.3. Let m, n ∈ N and suppose that m | n. Show that gcd(m, n) = m.
3. Euclidean Algorithm
There is an effective procedure for finding the greatest common divisor of two
integers. It is based on the following proposition.
Proposition IX.6. Let m, n ∈ Z, and let q, r ∈ Z be the unique integers such that
n = qm + r and 0 ≤ r < m. Then gcd(n, m) = gcd(m, r).
Proof. Let d1 = gcd(n, m) and d2 = gcd(m, r). Since “divides” is a partial order
on the positive integers, it suffices to show that d1 | d2 and d2 | d1 .
By definition of common divisor, we have integers w, x, y, z ∈ Z such that
d1 w = n, d1 x = m, d2 y = m, and d2 z = r.
Then d1 w = qd1 x + r, so r = d1 (w − qx), and d1 | r. Also d1 | m, so d1 | d2 by
definition of gcd.
On the other hand, n = qd2 y + d2 z = d2 (qy + z), so d2 | n. Also d2 | m, so
d2 | d1 by definition of gcd.
Now let m, n ∈ Z be integers with m > 0, and write n = mq + r, where 0 ≤ r < m.
Let r_0 = n, r_1 = m, r_2 = r, and q_1 = q. Then the equation becomes r_0 = r_1 q_1 + r_2 .
Repeat the process by writing m = r q_2 + r_3 , which is the same as r_1 = r_2 q_2 + r_3 , with
0 ≤ r_3 < r_2 . Continue in this manner, so in the ith stage, we have r_{i−1} = r_i q_i + r_{i+1} ,
with 0 ≤ r_{i+1} < r_i . Since the r_i are nonnegative and strictly decreasing, one of them must eventually be zero.
Let k be the smallest integer such that r_{k+1} = 0. By the above proposition
and induction,
gcd(n, m) = gcd(m, r) = · · · = gcd(r_{k−1} , r_k ).
But r_{k−1} = r_k q_k + r_{k+1} = r_k q_k . Thus r_k | r_{k−1} , so gcd(r_{k−1} , r_k ) = r_k . Therefore
gcd(n, m) = r_k . This process for finding the gcd is known as the Euclidean
Algorithm.
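The process just described can be sketched in Python as follows. The function name euclid and the recorded list of steps are hypothetical, and the sketch assumes n ≥ 0 and m > 0. Each recorded tuple (r_prev, r_cur, q, r_next) stands for one equation r_prev = r_cur · q + r_next.

```python
def euclid(n, m):
    """Return (gcd, steps) for n >= 0 and m > 0 using repeated division."""
    steps = []
    r_prev, r_cur = n, m
    while r_cur != 0:
        q, r_next = divmod(r_prev, r_cur)   # r_prev = r_cur*q + r_next
        steps.append((r_prev, r_cur, q, r_next))
        r_prev, r_cur = r_cur, r_next
    return r_prev, steps                    # last nonzero remainder


d, steps = euclid(252, 198)
for r_prev, r_cur, q, r_next in steps:
    print(f"{r_prev} = {r_cur} * {q} + {r_next}")
print(d)   # 18
```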
In order to find integers x and y such that xm + yn = gcd(m, n), use
the equations derived above and work backward. Start with r_k = r_{k−2} − r_{k−1} q_{k−1} .
Substitute the previous equation r_{k−1} = r_{k−3} − r_{k−2} q_{k−2} into this one to obtain
r_k = r_{k−2} − (r_{k−3} − r_{k−2} q_{k−2} ) q_{k−1} = r_{k−2} (q_{k−2} q_{k−1} + 1) − r_{k−3} q_{k−1} .
Continue in this way until you arrive back at the beginning.
For example, let n = 210 and m = 165. Work forward to find the gcd:
• 210 = 165 · 1 + 45;
• 165 = 45 · 3 + 30;
• 45 = 30 · 1 + 15;
• 30 = 15 · 2 + 0.
Therefore, gcd(210, 165) = 15. Now work backwards to find the coefficients:
• 15 = 45 − 30 · 1;
• 15 = 45 − (165 − 45 · 3) = 45 · 4 − 165;
• 15 = (210 − 165) · 4 − 165 = 210 · 4 − 165 · 5.
Therefore, 15 = 210 · 4 + 165 · (−5).
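The backward substitution can also be carried out mechanically. The Python sketch below uses a hypothetical function name, bezout, and assumes n ≥ 0 and m ≥ 0, not both zero. It is a recursive formulation of the same idea: the recursion first works forward through the divisions, and the substitution r = n − qm is applied as each call returns.

```python
def bezout(n, m):
    """Return (d, x, y) with d = gcd(n, m) and d == x*n + y*m."""
    if m == 0:
        return n, 1, 0
    q, r = divmod(n, m)          # n = q*m + r
    d, x, y = bezout(m, r)       # d = x*m + y*r
    # Substitute r = n - q*m:  d = y*n + (x - q*y)*m
    return d, y, x - q * y


print(bezout(210, 165))   # (15, 4, -5)
```

The output reproduces the coefficients found by hand above.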
4. Prime Integers
Definition IX.7. An integer p ∈ Z is called prime if
(1) p ≥ 2;
(2) p | ab ⇒ p | a or p | b, where a, b ∈ N.
Definition IX.8. An integer p ∈ Z is called irreducible if
(1) p ≥ 2;
(2) p = ab ⇒ a = 1 or b = 1, where a, b ∈ N.
Exercise IX.4. Let p ∈ Z. Show that p is prime if and only if p is irreducible.
Exercise IX.5. Let a, p ∈ Z such that p is prime.
Show that gcd(a, p) = 1 or gcd(a, p) = p.
Here is an interesting exercise. The standard proof is by contradiction.
Exercise IX.6. Show that there are infinitely many prime integers.
(Hint: assume there are only finitely many, multiply them, and add 1.)
The following series of exercises constitutes a proof that every integer greater
than one has a unique factorization into prime integers.
Exercise IX.7. Let p ∈ Z be prime and let m, n ∈ Z.
Show that if p | mn, then p | m or p | n.
Exercise IX.8. Let p ∈ Z be prime and let n1 , . . . , nr ∈ Z.
Show that if p | n1 . . . nr , then p | ni for some i = 1, . . . , r.
(Hint: proceed by induction on r.)
Exercise IX.9. Let a ∈ Z such that a ≥ 2.
Show that a = p1 . . . pr , where pi is prime for i = 1, . . . , r.
(Hint: proceed by strong induction on a.)
Exercise IX.10. Let p1 , . . . , pr , q1 , . . . , qs be prime integers.
Show that if p1 . . . pr = q1 . . . qs , then r = s and that the qj's can be relabeled so
that pi = qi for i = 1, . . . , r.
(Hint: assume not, and let m be the smallest integer that has two different prime
factorizations.)
5. Congruence Modulo n
Definition IX.9. Let n ∈ N, and define a relation ≡n on Z by
a ≡n b ⇔ n | (a − b).
This relation is called congruence modulo n; that is, if a ≡n b, we say that a is
congruent to b modulo n. Sometimes this is written a ≡ b (mod n). If the n is
understood, we may drop the “(mod n)” from the notation.
Proposition IX.10. Let n ∈ N. Then ≡n is an equivalence relation on Z.
Proof. We wish to show that ≡n is reflexive, symmetric, and transitive.
(Reflexivity) Let a ∈ Z. Now 0 · n = 0 = a − a; thus n | (a − a), so a ≡ a.
Therefore ≡ is reflexive.
(Symmetry) Let a, b ∈ Z. Suppose that a ≡ b; then n | (a − b). Then there
exists k ∈ Z such that nk = a − b. Then n(−k) = b − a, so n | (b − a). Thus b ≡ a.
Similarly, b ≡ a ⇒ a ≡ b. Therefore ≡ is symmetric.
(Transitivity) Let a, b, c ∈ Z, and suppose that a ≡ b and b ≡ c. Then nk = a−b
and nl = b − c for some k, l ∈ Z. Then a − c = nk + nl = n(k + l), so n | (a − c).
Thus a ≡ c. Therefore ≡ is transitive.
Proposition IX.11. Let n ∈ N and let a1 , a2 ∈ Z. By the Division Algorithm,
there exist unique integers q1 , r1 , q2 , r2 ∈ Z such that
• a1 = nq1 + r1 , where 0 ≤ r1 < n;
• a2 = nq2 + r2 , where 0 ≤ r2 < n.
Then a1 ≡ a2 (mod n) if and only if r1 = r2 .
Proof.
(⇒) Suppose that a1 ≡ a2 . Then n | (a1 − a2 ). This means that nk = a1 − a2
for some k ∈ Z. But a1 − a2 = n(q1 − q2 ) + (r1 − r2 ). Then n(k − q1 + q2 ) = r1 − r2 ,
so n | (r1 − r2 ).
Multiplying the inequality 0 ≤ r2 < n by −1 gives −n < −r2 ≤ 0. Adding this
inequality to the inequality 0 ≤ r1 < n gives −n < r1 − r2 < n. But r1 − r2 is an
integer multiple of n; the only possibility, then, is that r1 − r2 = 0. Thus r1 = r2 .
(⇐) Suppose that r1 = r2 . Then a1 − a2 = nq1 − nq2 = n(q1 − q2 ). Thus
n | (a1 − a2 ), so a1 ≡ a2 .
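The proposition can be spot-checked over a range of integers. In the Python sketch below the function names are hypothetical and n is assumed to be a positive integer. The remainder is taken with Python's % operator, which returns a value in the range 0 ≤ r < n for positive n.

```python
def congruent(a, b, n):
    """a ≡ b (mod n) iff n divides a - b; n is assumed positive."""
    return (a - b) % n == 0


def remainder(a, n):
    """The remainder r of a on division by n, with 0 <= r < n."""
    return a % n


agree = all(
    congruent(a, b, n) == (remainder(a, n) == remainder(b, n))
    for n in range(1, 8)
    for a in range(-20, 21)
    for b in range(-20, 21)
)
print(agree)                                                     # True
print(congruent(17, 3, 7), remainder(17, 7), remainder(3, 7))    # True 3 3
```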
6. Integers Modulo n
Definition IX.12. The partition of Z induced by the equivalence relation ≡n is
called the set of integers modulo n, and is denoted Zn . For an integer a ∈ Z, denote
its equivalence class under the equivalence relation by [a]n . If the n is understood,
we may write this equivalence class as [a] or ā.
An element r ∈ Z is called a preferred representative for [a]n if r ∈ [a]n and
0 ≤ r < n.
The division algorithm for the integers assures us that there is a unique pre-
ferred representative for each equivalence class. Also, as r ranges over the integers
from 0 to n − 1, the equivalence classes [r]n are distinct. Thus there are exactly
n equivalence classes in the set of integers modulo n; that is, |Zn | = n. For
example,
Z7 = {[0], [1], [2], [3], [4], [5], [6]}.
Proposition IX.13. Let n ∈ N. Define the binary operations of addition and
multiplication in Zn by
[a] + [b] = [a + b] and [a] · [b] = [ab].
These operations are well-defined.
Proof. Select a1 , a2 , b1 , b2 ∈ Z such that a1 ≡ a2 and b1 ≡ b2 ; say a1 − a2 = kn and
b1 − b2 = ln for some k, l ∈ Z.
(Addition) We wish to show that [a1 + b1 ] = [a2 + b2 ], i.e., that a1 + b1 ≡ a2 + b2 .
We simply add the equations above to obtain
a1 − a2 + b1 − b2 = kn + ln;
thus
(a1 + b1 ) − (a2 + b2 ) = (k + l)n;
from this, n | ((a1 + b1 ) − (a2 + b2 )), so a1 + b1 ≡ a2 + b2 .
(Multiplication) We wish to show that [a1 b1 ] = [a2 b2 ], i.e., that a1 b1 ≡ a2 b2 .
To do this, adjust the original equations to obtain
a1 = a2 + kn and b1 = b2 + ln
and multiply them to obtain
a1 b1 = a2 b2 + a2 ln + b2 kn + kln² ,
whence
a1 b1 − a2 b2 = (a2 l + b2 k + kln)n;
thus n | (a1 b1 − a2 b2 ), so a1 b1 ≡ a2 b2 .
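A numerical illustration of well-definedness, sketched in Python. The function names are hypothetical, n is assumed to be a positive integer, and each class is represented by its preferred representative, computed with the % operator. Replacing a and b by other members of their classes does not change the results.

```python
def add_mod(a, b, n):
    """Preferred representative of [a] + [b] = [a + b] in Zn."""
    return (a + b) % n


def mul_mod(a, b, n):
    """Preferred representative of [a] · [b] = [ab] in Zn."""
    return (a * b) % n


n = 7
a1, a2 = 3, 17    # 3 ≡ 17 (mod 7)
b1, b2 = 5, -2    # 5 ≡ -2 (mod 7)

print(add_mod(a1, b1, n), add_mod(a2, b2, n))   # 1 1
print(mul_mod(a1, b1, n), mul_mod(a2, b2, n))   # 1 1
```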
8. Order of an Element in Zn
For any k ∈ N and any a ∈ Zn , define ka to be a added to itself k times:
ka = Σ_{i=1}^{k} a.
Precedence of Operators
(1) NOT
(2) AND, OR
(3) XOR, NOR, NAND
(4) IMP
(5) IFF
APPENDIX B
APPENDIX C
ZFC Axioms
The Zermelo-Fraenkel axioms are intended to place set theory on a solid logical
foundation. Together with the Axiom of Choice, these form the ZFC axioms of set
theory, upon which the bulk of modern mathematics is based.
Axiom C.1 (Axiom of Extension). Two sets are equal if and only if they have the
same elements.
∀A, ∀B : A = B ⇐⇒ (∀C : C ∈ A ⇔ C ∈ B)
Axiom C.2 (Axiom of the Empty Set). There is a set with no elements.
∃∅, ∀x : ¬(x ∈ ∅)
Axiom C.3 (Axiom of Pairing). If A and B are sets, then there is a set containing
A and B as its only elements.
∀A, ∀B, ∃C, ∀D : D ∈ C ⇐⇒ (D = A ∨ D = B)
Axiom C.4 (Axiom of Union). If A is a set, there is a set whose elements are
precisely the elements of the elements of A.
∀A, ∃B, ∀C : C ∈ B ⇐⇒ (∃D : C ∈ D ∧ D ∈ A)
Axiom C.5 (Axiom of Infinity). There is a set N such that ∅ is in N and whenever
A is in N , so is A ∪ {A}.
∃N : ∅ ∈ N ∧ (∀A : A ∈ N ⇒ A ∪ {A} ∈ N )
Axiom C.6 (Axiom of Powers). If A is a set, there is a set whose elements are
precisely the subsets of A.
∀A, ∃P(A), ∀B : B ∈ P(A) ⇐⇒ (∀C : C ∈ B ⇒ C ∈ A)
Axiom C.7 (Axiom of Regularity). If A is a nonempty set, there is an element of A which
is disjoint from A.
∀A : ¬(A = ∅) ⇒ (∃B : B ∈ A ∧ ¬(∃C : C ∈ A ∧ C ∈ B))
Axiom C.8 (Axiom of Separation). Given any set A and any proposition p(x),
there is a subset of A containing precisely those x in A for which p(x) is true.
∀A, ∃B, ∀C : C ∈ B ⇐⇒ C ∈ A ∧ p(C).
Axiom C.9 (Axiom of Replacement). Given any set A and any proposition p(x, y)
where p(x, y1 ) and p(x, y2 ) implies y1 = y2 , there is a set containing precisely those
y for which p(x, y) is true for some x in A.
Axiom C.10 (Axiom of Choice). Given any set of pairwise disjoint nonempty sets, there is a set
that contains exactly one element in each of the nonempty sets.