B1.2 Set Theory
Martin Bays

HT23 Oxford

Contents
0.1 Acknowledgements

1 Introduction
1.1 Russell’s paradox
1.2 Zermelo-Fraenkel Set Theory
1.3 Foundations
1.4 Consistency
1.5 Why study set theory?
1.6 Cardinality
1.7 Structure of the course

2 The first axioms
2.1 Extensionality
2.2 Empty set
2.3 Pairing
2.4 Union
2.5 Powerset

3 Formulas and comprehension
3.1 The formal language L
3.2 Comprehension

4 Products and relations
4.1 Cartesian products
4.2 Relations
4.2.1 Functions
4.2.2 Order relations
4.2.3 Equivalence relations

5 The axiom of infinity and the natural numbers
5.1 Natural numbers – discussion
5.2 Inductive sets and the axiom of infinity
5.3 The order on N
5.4 Recursion on N
5.5 Arithmetic on N
5.6 Defining Z, Q, R (not on syllabus)

6 Cardinality
6.1 Classes
6.2 Cardinalities
6.3 Comparing cardinalities
6.4 Finite sets
6.5 Countable sets
6.6 Cardinal arithmetic

7 Replacement and Foundation
7.1 Replacement
7.2 Foundation

8 Well-ordered sets and ordinals
8.1 Well-ordered sets
8.2 Ordinals
8.3 Transfinite recursion
8.4 Ordinal arithmetic

9 The Axiom of Choice
9.1 The well-ordering principle
9.2 Cardinal comparability
9.3 Zorn’s lemma
9.4 ZFC

10 Cardinal numbers
10.1 Cardinal arithmetic with Choice
10.2 Cardinal exponentiation and CH (off-syllabus)

11 Example: Infinite dimensional vector spaces

A References
A.1 Exam errata

0.1 Acknowledgements
These notes follow the outline of Jonathan Pila’s notes for this course, which
were themselves originally based on notes of Robin Knight. Other sources include
the books of Hils-Loeser, Goldrei, Jech, and Kunen (see references below),
and lecture notes of Itay Kaplan.

1 Introduction
What is a set? One standard informal answer might be: “A set is an unordered
collection of objects, called its elements”. We can formalise this intuitive idea
of the data given by a set as a test for equality:

Principle of Extensionality: Two sets A and B are equal if and only if they
have the same elements,

A = B ⇔ ∀x (x ∈ A ↔ x ∈ B).

But this leaves a trickier question: What sets are there? If we want to say
something holds “for all sets A”, which A must we consider? It is tempting to
give a broad definition, such as:

By a “set” we understand any collection to a whole M of specific
well-separated objects m (called the “elements” of M) of our
intuition or thought.¹ (Cantor 1895)

But too broad conceptions of set lead to paradoxes.

1.1 Russell’s paradox


We might expect:

Unrestricted Comprehension: For any well-defined property P(x), there is
a set whose elements are precisely those x which satisfy P(x).

But take P (x) to be the property: x is a set which is not an element of itself.
Suppose R is the set whose elements are precisely the x satisfying P (x). Then
for any set x,
x ∈ R ⇔ x ∉ x.
In particular,
R ∈ R ⇔ R ∉ R.
This contradiction shows that Unrestricted Comprehension is inconsistent.

1.2 Zermelo-Fraenkel Set Theory


Around the start of the 20th century, Zermelo and (later) Fraenkel developed a
version of set theory which avoids Russell’s paradox and similar paradoxes.
This Zermelo-Fraenkel set theory is the subject of this course. One key
feature of this theory is its adoption of the axiomatic method: it consists of
axioms, statements about the universe of sets, intended to suffice to derive all
the theorems of ordinary mathematics.
This axiom system is denoted by “ZFC”, where C denotes one axiom known
as the Axiom of Choice; we also sometimes consider the system “ZF” which
omits this axiom.
1 “Unter einer ‚Menge‘ verstehen wir jede Zusammenfassung M von bestimmten
wohlunterscheidenen Objekten m uns[e]rer Anschauung oder unseres Denkens (welche die
‚Elementen‘ von M genannt werden) zu einem Ganzen.”

1.3 Foundations
It has become common (though not universal) practice to consider ZFC as
forming the foundations of mathematics.
Such a position gives in particular a clear answer to the question: what is a
proof of a mathematical statement? This proceeds as follows.
Given a mathematical statement S, we first encode it into a formal (first-
order) statement σ about sets – remarkably, such encoding appears to always
be possible. We then identify the notion of a “proof of S” with the notion of
a formal proof of σ from ZFC; the latter has a clear unambiguous definition
(presented in B1.1 Logic).

1.4 Consistency
Is ZFC consistent? Could there be some paradox which it fails to avoid?
Sadly, Gödel’s 2nd Incompleteness Theorem shows that if ZFC is consistent,
then we cannot prove (in the above sense, i.e. from ZFC) that it is consistent.
Mathematicians and set theorists vary in the extent to which they “believe”
that it is consistent.
What we can say for sure is that the mathematics humanity has developed
so far has not revealed any inconsistency in ZFC.

1.5 Why study set theory?


(1) Sets are very natural primitive mathematical structures, so are of intrinsic
mathematical interest.
(2) As natural mathematical objects, sets arise in many areas of mathematics,
so we need to be ready to deal with them.
(3) Since set theory can form a foundation for mathematics, studying founda-
tional issues (e.g. relative consistency of axioms) in set theory suffices for
addressing such questions in mathematics as a whole.

This course concentrates on (1) and (2), but also introduces the necessary
preliminaries for the Part C course Axiomatic Set Theory, which concentrates
on (3).

1.6 Cardinality
One key concern in the study of sets is the size (“cardinality”) of a set. Sets
can be finite, countable, or uncountable – but there is much more to say than
that.
We will define sets X and Y to have the same cardinality if there is a bijection
X → Y , and we will see that the ZFC axioms suffice to make this a rich and
useful concept, answering in particular questions like:

• Is there a complex number which is not the zero of any integer polynomial?
• How many lines does it take to cover the real plane?
• Can the subsets of R be exhaustively indexed by real numbers?

1.7 Structure of the course


We will study:

• The axioms of ZFC, introducing them gradually throughout the course.


• Formalisation of mathematics in set theory (concentrating on N).
• Cardinalities.
• Ordinals: These measures of the “length of an infinite process” are important
in particular for “transfinite” inductive arguments.
• Axiom of Choice: We study this important axiom in detail, giving a number
of equivalent formulations.

2 The first axioms


2.1 Extensionality
ZF1 (Extensionality): For all x and y, x is equal to y if and only if x and y
have the same elements:

∀x ∀y (x = y ↔ ∀z (z ∈ x ↔ z ∈ y)).

To make sense of such axioms, we adopt the following way of thinking.


We work in a mathematical universe V consisting of objects, which we
call sets. When we say “for all x” (written ∀x) we mean “for all sets x in
the universe V”; similarly ∃x refers to existence in V. Given two sets x and
y of V, it may or may not be that x ∈ y holds; we say that x is an ele-
ment of y when it does. Our axioms are statements using these concepts, and
we assume that V is such that the axioms are true in V. This gives us information
about the universe V.²
This is how we will discuss the axioms. At first, we know nothing about the
universe. As we introduce the axioms, we find out more and more about it.
We are not making a philosophical claim that this universe V we work in is
the “real universe” of sets.
The Extensionality axiom ZF1 says that an object a of V is determined by
the information of which objects b of V satisfy b ∈ a. So we may identify a with
the set of such b – this justifies us calling the objects of V “sets” and reading ∈
as “is an element of”.
Now note that the elements of a set in V are also sets, as are the elements of
its elements, and so on. Such sets are called hereditary sets. So the objects
of V are hereditary sets.
There are no cows in our universe V, there are only sets. Nor are there sets
of cows in V, there are only sets of sets – which are actually sets of sets of sets,
and so on.
From now on, we reserve the word set to mean: an object in V.
2 Those with B1.1 Logic will recognise this as the notion we formalised there of V being a
model of the axioms.



2.2 Empty set


We now begin to “populate” our universe V by giving axioms which guarantee
the existence of sets. First off:
ZF2 (Empty Set): There is a set with no elements:
∃x ∀y y ∉ x.

We can already deduce that this empty set is unique:


Theorem 2.1. There is a unique set with no elements.

Proof. Existence is by the axiom Empty Set. Suppose x and y each have no
elements, i.e. ∀z z ∉ x and ∀z z ∉ y. Then x and y have the same elements, i.e.
∀z (z ∈ x ↔ z ∈ y), since both sides of “↔” are false for all z. So x = y by
Extensionality.

We denote this unique empty set by ∅, as usual.

2.3 Pairing
ZF3 (Pairing): For any x and y (not necessarily distinct), there is a set whose
elements are precisely x and y:
∀x ∀y ∃z ∀w (w ∈ z ↔ (w = x ∨ w = y)).

(Here, ∨ means “(inclusive) or”; P ∨ Q is true iff at least one of P and Q is.)
Again, we have uniqueness by Extensionality:
Theorem 2.2. For any x and y there is a unique set whose elements are pre-
cisely x and y.

Proof. Exercise (Sheet 1).

We denote this unique set by {x, y}, and by {x} in the case that y = x.
This is already enough to build up a rich collection of sets: we have the
existence of ∅, {∅}, {∅, {∅}}, {∅, {∅, {∅}}}, and so on. But we don’t yet know
that any sets with more than two elements exist!
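
The sets listed above can be modelled concretely. The following Python sketch is an illustration only and not part of the notes: it assumes we represent (hereditarily finite) sets as nested `frozenset` values, and the names `EMPTY`, `pair`, `s1`, `s2`, `s3` are ours.

```python
# Illustrative model: a set is a frozenset whose elements are again frozensets.
EMPTY = frozenset()

def pair(x, y):
    """Pairing: the set whose elements are precisely x and y."""
    return frozenset({x, y})

s1 = pair(EMPTY, EMPTY)   # {∅}
s2 = pair(EMPTY, s1)      # {∅, {∅}}
s3 = pair(EMPTY, s2)      # {∅, {∅, {∅}}}

# Extensionality is built into frozenset equality: the order of the
# arguments to pair() is irrelevant.
print(pair(EMPTY, s1) == pair(s1, EMPTY))

# Pairing alone never produces a set with more than two elements.
print([len(s) for s in (EMPTY, s1, s2, s3)])
```

In this model `pair(x, x)` has a single element, matching the notation {x}.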
End of lecture 1

2.4 Union
ZF4 (Union): For any set x, there is a set whose elements are precisely the
elements of the elements of x:
∀x ∃y ∀z (z ∈ y ↔ ∃w (z ∈ w ∧ w ∈ x)).

(Here, ∧ means “and”.)


By Extensionality again, this set is unique, and we denote it by ⋃x. Note
⋃∅ = ∅.
We can now define some familiar notation.
Given sets x and y, define x ∪ y := ⋃{x, y}. This notation is justified by the
following easy exercise.

Exercise 2.3. For any sets x, y, z, we have z ∈ x ∪ y iff z ∈ x or z ∈ y.
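
As a sanity check of Exercise 2.3 on a small instance, here is a minimal Python sketch. It assumes sets are modelled as nested `frozenset` values; the function names `big_union`, `pair` and `union` are our own and are not notation from the notes.

```python
def big_union(x):
    """Union axiom: the set of all elements of elements of x."""
    return frozenset(z for w in x for z in w)

def pair(x, y):
    """Pairing: the set {x, y}."""
    return frozenset({x, y})

def union(x, y):
    """Binary union defined from Pairing and Union, as the union of {x, y}."""
    return big_union(pair(x, y))

EMPTY = frozenset()
a = frozenset({EMPTY})   # {∅}
b = frozenset({a})       # {{∅}}
u = union(a, b)

# z is in the union exactly when z is in a or z is in b.
for z in (EMPTY, a, b):
    assert (z in u) == (z in a or z in b)
```

This checks one instance only; the exercise asks for a proof from the axioms.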


Given sets x, y, z, define {x, y, z} := {x, y} ∪ {z}. More generally, recursively
define for n ≥ 2:
{x_1, . . . , x_{n+1}} := {x_1, . . . , x_n} ∪ {x_{n+1}}.
Then this behaves as the notation indicates, namely:
∀y (y ∈ {x_1, . . . , x_n} ↔ ⋁_{i=1}^{n} y = x_i).

2.5 Powerset
A set x is a subset of a set y, written x ⊆ y, if every element of x is an element
of y.
x ⊆ y ⇔ ∀z (z ∈ x → z ∈ y).

ZF5 (Powerset): For any set x, there exists a set whose elements are precisely
the subsets of x:
∀x ∃y ∀z (z ∈ y ↔ z ⊆ x).

By Extensionality, this set of all subsets of x is unique. We denote it by
P(x), and call it the power set of x.
Example 2.4.
• P(∅) = {∅}.
• P(P(∅)) = P({∅}) = {∅, {∅}}.
• P(P(P(∅))) = P({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}}.
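
The computations in Example 2.4 can be reproduced mechanically. The sketch below assumes sets are modelled as nested `frozenset` values; `powerset` is a helper we define here, built on the standard `itertools.combinations`.

```python
from itertools import combinations

def powerset(x):
    """Return the set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

EMPTY = frozenset()
p1 = powerset(EMPTY)   # P(∅)
p2 = powerset(p1)      # P(P(∅))
p3 = powerset(p2)      # P(P(P(∅)))

print(len(p1), len(p2), len(p3))
```

For a finite set with n elements the result has 2^n elements, which is why the sizes double in the exponent at each step.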

3 Formulas and comprehension


Russell’s paradox showed that the unrestricted comprehension principle, the
existence of {x : P (x)} for any property P (x), is inconsistent.
In ZF, we weaken this principle to comprehension restricted to a set: given
a set y, we want the existence of {x ∈ y : P (x)}.
However, before we can state this as an axiom, we must formalise a notion
of a “property” P (x). To see the problem, consider trying to make sense of:
{n ∈ N : n has no English definition less than a thousand letters long }.
This would be a non-empty set, since there are only finitely many ways to
arrange a thousand letters, so it would have a least element, which would be the
least natural number which has no English definition less than a thousand letters
long – but that is an English definition less than a thousand letters long of this
natural number, which is meant to have no such definition; contradiction³.
To avoid such paradoxes, and to facilitate reasoning about the axioms, we
will require P to be expressible in the following formal language of set theory,
which B1.1 students will recognise as first-order logic with a binary relation ∈.
3 This is a version of the “Berry paradox”.

3.1 The formal language L


The formulas of L are built up recursively as follows:
• An expression of the form x = y or x ∈ y is a formula (with x, y any
variables).
• If ϕ and ψ are formulas then so are ¬ϕ, (ϕ ∧ ψ), (ϕ ∨ ψ), (ϕ → ψ), and
(ϕ ↔ ψ).
• If ϕ is a formula then so are ∀x ϕ and ∃x ϕ.
Here, ¬ϕ is the logical negation of ϕ, true iff ϕ is false, and can be read
as “not ϕ”. Also, (ϕ → ψ) means “ψ holds if ϕ does”, so it is equivalent to
¬(ϕ ∧ ¬ψ).
We define some useful abbreviations:
• x ∉ y abbreviates ¬x ∈ y.
• x ⊆ y abbreviates ∀z (z ∈ x → z ∈ y).
• ∀x ∈ y ϕ abbreviates ∀x (x ∈ y → ϕ).
• ∃x ∈ y ϕ abbreviates ∃x (x ∈ y ∧ ϕ).
An occurrence of a variable in a formula is free if it is not in the scope of
a quantifier binding that variable, like the first x in ∃y (y ∈ x ∧ ∀x x ∉ y). The free variables of a
formula ϕ are the variables which occur free in ϕ. We often write e.g. ϕ(x, y) to
denote a formula whose free variables are x and y.
A sentence is a formula with no free variables. So a sentence is either true
or false in our universe V of sets.
A formula ϕ(x) with one free variable x can be viewed as a property: for a
given value of x, it is either true or false.
We have already seen that each of ZF1-4 can be expressed by a single sentence
of L. The whole ZFC axiom system will be expressible by (infinitely many)
sentences of L. This isn’t actually important in this course, and we won’t
always spell out how it’s done, but it will be crucial in the Part C Axiomatic
Set Theory course.
A formula with parameters is the result of replacing some of the variables
in a formula with sets. For example, if a is a set, ϕ(x) := a ∈ x is a formula
with parameter a and free variable x, expressing the property of having the set
a as an element.
If ϕ(x) is a formula with parameters and b is a set, we write ϕ(b) for the
result of replacing each free occurrence of x in ϕ(x) with b, so ϕ(b) asserts that b
satisfies the property ϕ. Similarly, if y is a variable not occurring in ϕ(x), then
ϕ(y) is the result of replacing each free occurrence of x with y (which in B1.1
was denoted ϕ[y/x]).
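
The recursive definition of formulas and of free variables can be made concrete as a small syntax tree. The following Python sketch is one possible encoding and is an assumption on our part: the class names `Atom`, `Not`, `BinOp`, `Quant` and the function `free_vars` do not come from the notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    """Atomic formula: left = right (rel "=") or left ∈ right (rel "in")."""
    rel: str
    left: str
    right: str

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class BinOp:
    """Binary connective: op is one of "and", "or", "implies", "iff"."""
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:
    """Quantifier: kind is "forall" or "exists", binding the variable var."""
    kind: str
    var: str
    body: object

def free_vars(phi):
    """Return the set of variables with at least one free occurrence in phi."""
    if isinstance(phi, Atom):
        return {phi.left, phi.right}
    if isinstance(phi, Not):
        return free_vars(phi.body)
    if isinstance(phi, BinOp):
        return free_vars(phi.left) | free_vars(phi.right)
    if isinstance(phi, Quant):
        return free_vars(phi.body) - {phi.var}
    raise TypeError(f"not a formula: {phi!r}")

# ∃y (y ∈ x ∧ ∀x ¬ x ∈ y): only the first occurrence of x is free.
example = Quant("exists", "y", BinOp(
    "and",
    Atom("in", "y", "x"),
    Quant("forall", "x", Not(Atom("in", "x", "y"))),
))

# ZF1 (Extensionality) has no free variables, so it is a sentence.
zf1 = Quant("forall", "x", Quant("forall", "y", BinOp(
    "iff",
    Atom("=", "x", "y"),
    Quant("forall", "z", BinOp("iff", Atom("in", "z", "x"), Atom("in", "z", "y"))),
)))
```

A formula is a sentence exactly when `free_vars` returns the empty set.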

3.2 Comprehension
ZF6 (Comprehension): For any formula ϕ(x) with parameters and any set
y, there is a set z whose elements are precisely those elements x of y which
satisfy ϕ(x):
∀y ∃z ∀x (x ∈ z ↔ (x ∈ y ∧ ϕ(x))).

By Extensionality, this set is unique, and we denote it by {x ∈ y : ϕ(x)}.


Remark 3.1. ZF6 can be expressed by L-sentences, as follows.
For each formula ϕ(x, w1 , . . . , wn ),

∀w1 . . . ∀wn ∀y ∃z ∀x (x ∈ z ↔ (x ∈ y ∧ ϕ(x, w1, . . . , wn)))

is an L-sentence expressing the instances of ZF6 for those formulas with param-
eters which result from substituting sets for the variables w1 , . . . , wn . So we can
express ZF6 by the axiom scheme consisting of one such sentence for each such
formula.
With this restricted version of comprehension, the argument of Russell’s
paradox does not lead to inconsistency; instead, it proves something interesting.
Theorem 3.2. There is no set of all sets: there is no set Ω such that ∀x x ∈ Ω.

Proof. Suppose Ω is such. By Comprehension, consider R := {x ∈ Ω : x ∉ x}.
Then R ∈ Ω, so R ∈ R iff R ∉ R, contradiction.
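
The same argument shows slightly more: for any set y, the set R_y := {x ∈ y : x ∉ x} is not an element of y. The Python sketch below illustrates this on a few finite sets, assuming the nested `frozenset` model; `comprehension` and `russell` are names we introduce here. In this particular model no set is an element of itself, so R_y is simply y.

```python
def comprehension(y, phi):
    """Restricted comprehension: the elements x of y for which phi(x) holds."""
    return frozenset(x for x in y if phi(x))

def russell(y):
    """The Russell subset of y: those x in y with x not an element of x."""
    return comprehension(y, lambda x: x not in x)

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})

# Whatever y we start from, its Russell subset escapes y.
for y in (EMPTY, one, two, frozenset({two})):
    assert russell(y) not in y
```

The check is only a finite illustration; the general statement is the content of the proof of Theorem 3.2.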

End of lecture 2
Comprehension allows us to implement some more of the usual constructions
of set theory.
Lemma 3.3. Let a be a non-empty set. Then there is a unique set ⋂a such
that for all x,
x ∈ ⋂a ⇔ ∀y ∈ a x ∈ y.

Proof. Uniqueness is by Extensionality. Let

⋂a := {x ∈ ⋃a : ∀y ∈ a x ∈ y},

which exists by Comprehension (and Union).
Then this has the desired property: if x is in every element of a then, since
a ≠ ∅, x is in some element of a, so x ∈ ⋃a; so then x ∈ ⋂a. The converse is
immediate.

We leave ⋂∅ undefined, since it has no sensible definition.
Definition 3.4. For any sets a and b, define:
a ∩ b := ⋂{a, b}
a \ b := {x ∈ a : x ∉ b}.
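
The construction in the proof of Lemma 3.3, namely carving the intersection out of the union by Comprehension, translates directly into code. This is a minimal sketch under the assumption that sets are nested `frozenset` values; all function names are ours.

```python
def big_union(a):
    """The set of all elements of elements of a."""
    return frozenset(x for y in a for x in y)

def big_intersection(a):
    """Intersection of a non-empty set a, taken as a subset of its union."""
    if not a:
        raise ValueError("the intersection of the empty set is left undefined")
    return frozenset(x for x in big_union(a) if all(x in y for y in a))

def intersection(a, b):
    """Binary intersection, defined as the intersection of the pair {a, b}."""
    return big_intersection(frozenset({a, b}))

def difference(a, b):
    """Relative complement: the elements of a which are not in b."""
    return frozenset(x for x in a if x not in b)

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})

print(intersection(one, two) == one)
print(difference(two, one) == frozenset({one}))
```

Raising an error on the empty set mirrors the decision in the text to leave that case undefined.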

4 Products and relations


In this section, we start to see how some of the usual notions of mathematics
can be handled in the set theory we have established so far.

4.1 Cartesian products


If X and Y are sets, we want to be able to consider their Cartesian product
X × Y . Its elements should be ordered pairs of elements of X and Y , so the first
problem is how to encode this notion when all we have are (unordered) sets.
For this, we use the following standard coding trick.

Definition 4.1. Given sets x and y, the ordered pair with first co-ordinate
x and second co-ordinate y is the set

⟨x, y⟩ := {{x}, {x, y}}.

The following theorem justifies the terminology.


Theorem 4.2. For any x, y, x′, y′, we have ⟨x, y⟩ = ⟨x′, y′⟩ if and only if x = x′
and y = y′.

Proof. ⇐: Immediate.
⇒: Suppose
{{x}, {x, y}} = {{x′}, {x′, y′}}.

First, suppose y = x. Then {{x}, {x, y}} = {{x}}, so {x′, y′} = {x},
hence x′ ∈ {x} and y′ ∈ {x}, so x′ = x = y′, and in particular x = x′ and
y = y′, as required.
So we may assume y ≠ x, and symmetrically y′ ≠ x′.
In particular, {x} ≠ {x′, y′}, since otherwise x′ = x = y′. Similarly,
{x, y} ≠ {x′}.
Hence {x} = {x′}, so x = x′, and {x, y} = {x′, y′}, so y′ ∈ {x, y}. Now
y′ ≠ x, since y′ ≠ x′ = x, so y′ = y, as required.
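
Theorem 4.2 can be checked exhaustively on a small universe. The sketch below assumes sets are modelled as nested `frozenset` values; `kpair` is our name for the ordered pair of Definition 4.1, and the three-element `universe` is an arbitrary choice.

```python
from itertools import product

def kpair(x, y):
    """Ordered pair coded as the set {{x}, {x, y}} (Definition 4.1)."""
    return frozenset({frozenset({x}), frozenset({x, y})})

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})
universe = [EMPTY, one, two]

# Two coded pairs are equal exactly when both coordinates agree.
for x, y, x2, y2 in product(universe, repeat=4):
    assert (kpair(x, y) == kpair(x2, y2)) == (x == x2 and y == y2)
```

Note that `kpair(x, x)` collapses to the one-element set {{x}}, which is the case the proof treats first.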

We can now define the Cartesian product using Powerset and Comprehen-
sion.
Proposition 4.3. Let X and Y be sets. There is a unique set X × Y , called the
Cartesian product of X and Y , with the property that the elements of X × Y
are precisely the ordered pairs ⟨x, y⟩ where x ∈ X and y ∈ Y .

Proof. Uniqueness is by Extensionality. If x ∈ X and y ∈ Y, then ⟨x, y⟩ =
{{x}, {x, y}} ⊆ P(X ∪ Y), so ⟨x, y⟩ ∈ P(P(X ∪ Y)). So

X × Y := {z ∈ P(P(X ∪ Y )) : ∃x ∈ X ∃y ∈ Y z = ⟨x, y⟩}

has the desired property – this set exists by Comprehension, since z = ⟨x, y⟩
can be expressed in L, namely by

∀w (w ∈ z ↔ (∀u (u ∈ w ↔ u = x) ∨ ∀u (u ∈ w ↔ (u = x ∨ u = y)))).
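
The proof builds the product by Comprehension inside P(P(X ∪ Y)). The following Python sketch follows that construction literally for tiny sets, assuming the nested `frozenset` model; `powerset`, `kpair` and `cartesian` are helper names of our own. It is deliberately inefficient, since the double power set grows very quickly.

```python
from itertools import combinations

def powerset(x):
    """The set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

def kpair(x, y):
    """Ordered pair coded as {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian(X, Y):
    """Select from P(P(X ∪ Y)) those z which are ordered pairs from X and Y."""
    candidates = powerset(powerset(X | Y))
    return frozenset(
        z for z in candidates
        if any(z == kpair(x, y) for x in X for y in Y)
    )

EMPTY = frozenset()
one = frozenset({EMPTY})
two = frozenset({EMPTY, one})

prod = cartesian(two, one)   # pairs with first coordinate in 2, second in 1
print(len(prod))
```

A practical implementation would generate the pairs directly; the point here is only that the set defined by Comprehension contains exactly those pairs.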

Notation 4.4. From now on we will allow ourselves to use the ⟨x, y⟩ notation in
the formulas used in Comprehension, as well as our other defined terms {x, y},
⋃x, P(x), ∅, and so on. As in the previous proof, it is always possible to
eliminate these expressions to write an equivalent formula directly in L. The
following example illustrates the general technique (and how much paper we
will save with it!):
∀x {x, ⋃y} ∈ z
⇔ ∀x ∃w (w = {x, ⋃y} ∧ w ∈ z)
⇔ ∀x ∃w (∀u (u ∈ w ↔ (u = x ∨ u = ⋃y)) ∧ w ∈ z)
⇔ ∀x ∃w (∀u (u ∈ w ↔ (u = x ∨ ∀v (v ∈ u ↔ ∃t (v ∈ t ∧ t ∈ y)))) ∧ w ∈ z)

4.2 Relations
Definition 4.5. A binary relation is a set R of ordered pairs; we then write
xRy to mean ⟨x, y⟩ ∈ R (and we use this as an abbreviation in formulas).
Remark 4.6. If ⟨x, y⟩ ∈ R then x, y ∈ ⋃⋃R, since x, y ∈ {x, y} ∈ ⟨x, y⟩ ∈ R
(from which we obtain x, y ∈ {x, y} ∈ ⋃R and hence x, y ∈ ⋃⋃R).
Definition 4.7. The domain of a binary relation R is the set

dom(R) := {x : ∃y xRy},

and the range of R is the set

ran(R) := {y : ∃x xRy};

these sets exist by Comprehension applied to ⋃⋃R, using Remark 4.6. So R ⊆ dom(R) × ran(R).


A binary relation on a set X is a binary relation R with R ⊆ X × X, i.e.
with dom(R) ⊆ X and ran(R) ⊆ X.

Note that with this definition, = and ∈ and ⊆ are not relations, since the
domain would be a set of all sets, though their restrictions to any given set are
relations.
End of lecture 3

4.2.1 Functions
Definition 4.8. A function is a binary relation f with the property that for
all x, there is at most one y such that ⟨x, y⟩ ∈ f .
We write f (x) = y to mean ⟨x, y⟩ ∈ f (and we use this as an abbreviation
in formulas).
The restriction of f to a set a ⊆ dom(f ) is the function

f |a := f ∩ (a × ran(f )) = {⟨x, y⟩ ∈ f : x ∈ a}.

The image of a set a ⊆ dom(f ) under f is the set

f [a] := ran(f |a ) = {y : ∃x ∈ a f (x) = y}.
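
The notions of domain, range, restriction and image can be tried out on a finite relation. For readability this Python sketch represents ordered pairs as Python tuples rather than as coded sets, which is a simplification on our part; the function names are also ours.

```python
def dom(R):
    """Domain: the first coordinates of the pairs in R."""
    return frozenset(x for (x, _) in R)

def ran(R):
    """Range: the second coordinates of the pairs in R."""
    return frozenset(y for (_, y) in R)

def is_function(R):
    """R is a function iff each x has at most one y with (x, y) in R.

    Since R is a set of distinct pairs, this holds exactly when the
    number of pairs equals the number of distinct first coordinates.
    """
    return len(dom(R)) == len(R)

def restrict(f, a):
    """Restriction of f to a: the pairs of f whose first coordinate is in a."""
    return frozenset((x, y) for (x, y) in f if x in a)

def image(f, a):
    """Image of a under f, defined as the range of the restriction."""
    return ran(restrict(f, a))

f = frozenset({(0, "a"), (1, "b"), (2, "a")})
print(is_function(f), sorted(dom(f)), sorted(ran(f)))
print(sorted(image(f, {0, 2})))
```

Adding the pair `(0, "b")` to `f` would give a relation that is no longer a function.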



Definition 4.9. Given sets X and Y, a function from X to Y is a function
f with dom(f) = X and ran(f) ⊆ Y. We write f : X → Y for such a function.

So any function f is a function from dom(f ) to ran(f ).


Proposition 4.10. Given sets X and Y, there is a set Y^X whose elements are
precisely the functions from X to Y.

Proof. Any f : X → Y is an element of P(X × Y), so by Comprehension it
suffices to see that the property of a set f ⊆ X × Y being a function X → Y is
expressible in L. We can express it as follows:

ϕ(f) := ∀x ∈ X ∃y ∈ Y (f(x) = y ∧ ∀y′ (f(x) = y′ → y′ = y)).

Remark 4.11. To partially explain the notation Y^X: for finite sets X and Y we
have |Y^X| = |Y|^|X|.
Remark 4.12. The empty set is a function, called the empty function, ∅ : ∅ → ∅.
So Y^∅ = {∅} for any set Y.
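
For finite sets the counting fact in Remark 4.11 and the edge case in Remark 4.12 can be checked by enumeration. In this sketch functions are represented as sets of Python tuples, which is our simplification, and `function_set` is a hypothetical helper name.

```python
from itertools import product

def function_set(X, Y):
    """Return the set of all functions from X to Y, each as a set of pairs."""
    xs = list(X)
    ys = list(Y)
    return frozenset(
        frozenset(zip(xs, values))
        for values in product(ys, repeat=len(xs))
    )

funcs = function_set({0, 1, 2}, {"a", "b"})
print(len(funcs))

# The only function with empty domain is the empty function.
print(function_set(set(), {"a", "b"}) == frozenset({frozenset()}))
```

When X is non-empty and Y is empty the enumeration yields no functions at all, in line with the definition.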

4.2.2 Order relations


Definition 4.13.

• A strict partial order (or just strict order) on a set X is a relation
< ⊆ X × X satisfying for all x, y, z ∈ X:

Irreflexivity: ¬x < x;
Transitivity: if x < y and y < z then x < z.
It is a strict total order if also we have for all x, y ∈ X that x < y or
y < x or x = y.

• A (totally) ordered set is a set X equipped with a (total) order.
• If an order is denoted by <, we write x ≤ y as an abbreviation for
(x < y ∨ x = y), and x > y for y < x.
• A least element of a subset Y ⊆ X of an ordered set is a y ∈ Y such that
y ≤ y′ for all y′ ∈ Y (so y is unique if it exists).
• A minimal element of a subset Y ⊆ X of an ordered set is a y ∈ Y such
that y′ < y for no y′ ∈ Y.

Total orders are also known as linear orders.

4.2.3 Equivalence relations


Definition 4.14. An equivalence relation on a set X is a binary relation
∼ ⊆ X × X satisfying for all x, y, z ∈ X:

Reflexivity: x ∼ x;

Symmetry: if x ∼ y then y ∼ x;
Transitivity: if x ∼ y and y ∼ z then x ∼ z.

The set of equivalence classes of ∼ is then

X/ ∼ := {S ∈ P(X) : ∃x ∈ X S = {y ∈ X : y ∼ x}}
= {{y ∈ X : y ∼ x} : x ∈ X}.
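
The definition of X/∼ can be computed directly for a finite set. The sketch below is illustrative: `quotient` is our name, the relation is passed as a Python predicate, and congruence modulo 3 is just a convenient example.

```python
def quotient(X, related):
    """Return the set of equivalence classes {y in X : y ~ x}, for x in X.

    `related(y, x)` is assumed to be an equivalence relation on X.
    """
    return frozenset(
        frozenset(y for y in X if related(y, x))
        for x in X
    )

X = frozenset(range(6))
classes = quotient(X, lambda y, x: (y - x) % 3 == 0)
print(sorted(sorted(c) for c in classes))
```

Because the result is a set, elements x that give the same class contribute only once, just as in the set-builder definition.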

5 The axiom of infinity and the natural numbers


One of the main purposes of set theory is to clarify the nature of infinite objects,
but the axioms we have introduced so far do not imply the existence of an
infinite set. We also claimed that all of mathematics can be formalised in terms
of sets, but so far we don’t even know how to treat the natural numbers and their
structure. We deal with both of these problems at once.

5.1 Natural numbers – discussion


Informally, we can encode each natural number as a set by defining n := {0, . . . , n − 1}:

0 := ∅; 1 := {0}; 2 := {0, 1}; 3 := {0, 1, 2} . . .

0 = ∅; 1 = {∅}; 2 = {∅, {∅}}; 3 = {∅, {∅}, {∅, {∅}}} . . .


(recall that these sets exist by Pairing and Union). So

n + 1 = {0, . . . , n} = {0, . . . , n − 1} ∪ {n} = n ∪ {n}.
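
This informal encoding is easy to experiment with. The Python sketch below assumes sets are modelled as nested `frozenset` values; `succ` and `numeral` are names we introduce, and the Python integer argument of `numeral` lives outside the model.

```python
def succ(x):
    """Successor step from the identity n + 1 = n ∪ {n}."""
    return x | frozenset({x})

def numeral(n):
    """Build the set encoding n by applying succ to the empty set n times."""
    x = frozenset()
    for _ in range(n):
        x = succ(x)
    return x

three = numeral(3)
print(three == frozenset({numeral(0), numeral(1), numeral(2)}))
print(len(numeral(5)))
```

Each encoded number has exactly as many elements as the number it encodes.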

It would be tempting to then define N := {0, 1, 2, . . .}. However, we cannot
directly write an axiom which asserts the existence of such a set (∃x ∀y (y ∈
x ↔ (y = 0 ∨ y = 1 ∨ y = 2 . . .)) is an infinite expression, so not an L-sentence);
a deeper problem, if we’re trying to establish foundations and define the natural
numbers starting only from set theory, is that we would already need the natural
numbers to make sense of this “. . .” notation.
Instead, we proceed as follows.

5.2 Inductive sets and the axiom of infinity


Definition 5.1. The successor of a set x is x+ := x ∪ {x}.
Definition 5.2. A set x is inductive if ∅ ∈ x and x is closed under the successor
operation, i.e. ∀y (y ∈ x → y + ∈ x).

ZF7 (Infinity): An inductive set exists:

∃x (∅ ∈ x ∧ ∀y (y ∈ x → y ∪ {y} ∈ x)).

Proposition 5.3. There exists a unique least inductive set; we denote it by N.
That is, there is a unique set N which is inductive and which is a subset of
every inductive set.

End of lecture 4

Proof. Uniqueness is immediate: if N and N′ are least inductive sets, then N ⊆
N′ and N′ ⊆ N, so N = N′ (by Extensionality).
By Infinity (ZF7), there is an inductive set I. Consider

N := ⋂{I′ ⊆ I : I′ is inductive};

this is a set by Comprehension (inductivity can be expressed in L as in the
statement of ZF7).
Note that the intersection of inductive sets is inductive. So N is inductive,
and if J is inductive then J ∩ I ⊆ I is inductive, so N ⊆ J ∩ I ⊆ J. So N is least
inductive.

Later, we will also use ω to denote this set N.


We want to build up mathematics within our universe of sets, so from now
on we define⁴:
Definition 5.4. A natural number is an element of N. We use numerals to
denote elements of N: 0 = ∅, 1 = 0+ , and so on.
Theorem 5.5 (Induction on N). Suppose ϕ(x) is an L-formula with parameters
such that ϕ(0) holds, and if n ∈ N and ϕ(n) holds then ϕ(n+ ) holds. Then ϕ(n)
holds for all n ∈ N.
In other words,
(ϕ(∅) ∧ ∀n ∈ N (ϕ(n) → ϕ(n+))) → ∀n ∈ N ϕ(n).


Proof. The assumption on ϕ precisely means that the set X := {n ∈ N : ϕ(n)} ⊆
N (which exists by Comprehension) is inductive, hence X = N by definition of
N.

5.3 The order on N


We will use this induction principle to show that we can define in set-theoretic
terms the structure we expect to find on N. First we consider the ordering and
successor, then we move on to defining the arithmetic operations.
Definition 5.6. Define a binary relation < on N by: x < y ⇔ x ∈ y.

Note that this is indeed a relation, i.e. {⟨x, y⟩ ∈ N × N : x ∈ y} is a set.


Lemma 5.7. < is a strict partial order on N.

Proof.

Transitivity: We prove by induction on N that

∀n ∈ N ∀x ∀y ((x ∈ y ∧ y ∈ n) → x ∈ n).

So let ϕ(n) := ∀x ∀y ((x ∈ y ∧ y ∈ n) → x ∈ n). Then ϕ(0) holds trivially
(since y ∈ 0 holds for no y).
4 To avoid potential confusion: we do not redefine the notion of formula at this point.

Suppose ϕ(n) holds; we show that ϕ(n+) holds. So suppose x ∈ y ∈ n+ =
n ∪ {n}. If y = n, then x ∈ n ⊆ n+ so x ∈ n+. Otherwise, y ∈ n, so
x ∈ y ∈ n, and by ϕ(n) we obtain x ∈ n, so again x ∈ n+. Hence ϕ(n+)
holds.
By induction on N (Theorem 5.5), we deduce ∀n ∈ N ϕ(n), as required.
Irreflexivity: We prove ∀n ∈ N n ∉ n by induction on N. Clearly 0 ∉ 0.
Suppose n ∉ n, but n+ ∈ n+ = n ∪ {n}. Since n ∈ n+ but n ∉ n, we
have n+ ≠ n, so n+ ∈ n. But then n ∈ n+ ∈ n, so n ∈ n by transitivity,
contradicting n ∉ n.

Lemma 5.8. For all n, m ∈ N:

(i) n+ ≠ 0.
(ii) If n ∈ m then n+ ∈ m+.
(iii) If n ≠ 0, then n = k+ for a unique k ∈ N.

Proof. (i) n ∈ n ∪ {n} = n+, so n+ ≠ 0 = ∅.
(ii) We prove by induction on m that ∀m ∈ N ∀n ∈ m n+ ∈ m+. This is
trivial for m = 0. Suppose it holds for m, and let n ∈ m+ = m ∪ {m}; we conclude
by showing n+ ∈ m++.
If n = m, then n+ = m+ ∈ m++, as required. Otherwise, n ∈ m, so
n+ ∈ m+ by the induction hypothesis, so again n+ ∈ m++ as required.
(iii) Existence: ∀n ∈ N (n = 0 ∨ ∃k ∈ N n = k+) holds by a trivial induction
(at the successor step, just use n+ = n+).
Uniqueness: Suppose k, l ∈ N and k+ = l+ but k ≠ l. Then k ∈ k+ =
l+ = l ∪ {l} but k ≠ l, so k ∈ l. Then k+ ∈ l+ = k+ by (ii), contradicting
irreflexivity.

Theorem 5.9. < is a strict total order on N.

Proof. Given Lemma 5.7, it remains to show totality, i.e. ∀n ∈ N ∀m ∈ N ϕ(n, m)
where
ϕ(n, m) := (m ∈ n ∨ m = n ∨ n ∈ m).
We first prove by induction on n that ∀n ∈ N (0 ∈ n ∨ 0 = n). This is
immediate for n = 0, and if it holds for n, then 0 ∈ n+ = n ∪ {n} since either
0 = n ∈ n+ or 0 ∈ n ⊆ n+.
Now let n ∈ N. We conclude by proving ∀m ∈ N ϕ(n, m) by induction on
m. We have ϕ(n, 0) by what we proved above.
Now suppose ϕ(n, m); we conclude by proving ϕ(n, m+). If m = n then
n ∈ m+, and if n ∈ m then n ∈ m ∈ m+, and then n ∈ m+ by transitivity.
Otherwise, m ∈ n. In particular, n ≠ 0. By Lemma 5.8(iii), n = k+ = k ∪ {k}
for some k ∈ N. So either m = k, in which case m+ = k+ = n, or m ∈ k, in
which case m+ ∈ k+ = n by Lemma 5.8(ii).

We consider N as a totally ordered set with this ordering.


Lemma 5.10. For n, m ∈ N, we have n ≤ m ⇔ n ⊆ m.

Proof. ⇒: If n = m then certainly n ⊆ m. If n < m and k ∈ n, then k ∈ m
by transitivity, so again n ⊆ m.
⇐: Suppose n ≰ m. By totality, m < n, so then m ⊆ n by the previous
step. Then n ⊈ m, since otherwise n = m by Extensionality, contrary to
assumption.

End of lecture 5

Theorem 5.11. Any non-empty subset X of N has a unique least element,
denoted min X.

Proof. Uniqueness given existence is immediate: if n and n′ are each least, then
n ≤ n′ ≤ n, so n = n′.
For existence, suppose X ⊆ N has no least element. We show X = ∅ by
proving ∀n ∈ N ∀m ∈ n m ∉ X by induction. For n = 0 this is trivial, and if it
holds for n then it holds for n+, since otherwise n would be a least element of
X.
Remark. By an easy induction, any n ∈ N is a subset of N (i.e. N is transitive)
(Exercise Sheet 2). So n = {m ∈ N : m < n}, and we can see this as justifying
our informal notation n = {0, . . . , n − 1}.
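
The results of this section can be spot-checked on the first few natural numbers. The sketch below assumes the nested `frozenset` model; `succ`, `numeral` and `least` are our own helper names, and the comparison with Python integers is only a cross-check from outside the model.

```python
def succ(x):
    """The successor of x, namely x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """The set encoding the natural number n."""
    x = frozenset()
    for _ in range(n):
        x = succ(x)
    return x

def least(X):
    """Return the element of the non-empty finite set X below all others.

    Uses Lemma 5.10: n ≤ m iff n ⊆ m (frozenset `<=` is the subset test).
    """
    for y in X:
        if all(y <= z for z in X):
            return y
    raise ValueError("no least element")

nums = [numeral(n) for n in range(6)]

for i, a in enumerate(nums):
    for j, b in enumerate(nums):
        assert (a in b) == (i < j)     # the order is membership
        assert (a <= b) == (i <= j)    # Lemma 5.10: ≤ coincides with ⊆

# Each n is the set of the m below it.
assert all(nums[n] == frozenset(nums[:n]) for n in range(6))
```

For example, `least(frozenset({nums[4], nums[2], nums[5]}))` returns the encoding of 2.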

5.4 Recursion on N
Theorem 5.12 (Definition by recursion on N). Let X be a set and g : X → X
a function, and let x0 ∈ X. Then there exists a unique function f : N → X
such that

• f (0) = x0 ;
• f (n+ ) = g(f (n)) for all n ∈ N.

Proof. For n ∈ N, say h : n+ → X is an n-approximation if h(0) = x0 and


h(m+ ) = g(h(m)) for all m ∈ n.
Claim. For each n ∈ N, there exists a unique n-approximation.

Proof. Existence: By induction on n. {⟨0, x0 ⟩} is a 0-approximation. If h


is an n-approximation, set h′ := h ∪ {⟨n+ , g(h(n))⟩}. Then h′ is an
n+ -approximation.
Uniqueness: By induction on n. For n = 0, {⟨0, x0 ⟩} is the unique 0-
approximation. Suppose the uniqueness for n, and let h1 and h2 be n+ -
approximations. Then h1 |n+ = h2 |n+ , since these are n-approximations,
and then h1 (n+ ) = g(h1 (n)) = g(h2 (n)) = h2 (n+ ), so h1 = h2 .
Claim

We conclude by proving that there exists a unique f as in the statement.



Uniqueness: Suppose f, f ′ : N → X are as in the statement. Let n ∈


N. Then f |n+ and f ′ |n+ are n-approximations, so f |n+ = f ′ |n+ by the
uniqueness in the Claim, and in particular f (n) = f ′ (n). Hence f = f ′ .

Existence: Each n-approximation h : n+ → X is a subset of N × X. The


property of being an n-approximation is expressible in L, by translating
the definition. So by Comprehension, there is a set

H := {h ∈ P(N × X) : ∃n ∈ N [h is an n-approximation]}.
Let f := ⋃H.
We show that f is as required.
• f : N → X: Let n ∈ N. There exists an n-approximation h, so
⟨n, h(n)⟩ ∈ f . If ⟨n, x⟩ ∈ f , then ⟨n, x⟩ ∈ h′ for some h′ ∈ H. Then
h′ is an m-approximation for some m with n ≤ m, so h′ |n+ is an
n-approximation, so h′ |n+ = h|n+ , and so x = h(n).
• f (0) = x0 : If h ∈ H is the 0-approximation, we have ⟨0, x0 ⟩ ∈ h ⊆ f .
• f (n+ ) = g(f (n)) for n ∈ N: Let h ∈ H be the n+ -approximation.
Then f (n+ ) = h(n+ ) = g(h(n)) = g(f (n)).
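The shape of this proof can be mimicked computationally. The following is only a sketch under the assumption that we model N by Python integers and functions by dictionaries; `approximation` and `f_up_to` are hypothetical names. It builds n-approximations as in the Claim and takes their union, checking along the way that approximations agree wherever both are defined.

```python
from typing import Callable, Dict, TypeVar

X = TypeVar("X")

def approximation(n: int, x0: X, g: Callable[[X], X]) -> Dict[int, X]:
    """The n-approximation h : n+ -> X, built as in the existence part of the Claim."""
    h = {0: x0}                      # the 0-approximation {<0, x0>}
    for m in range(n):
        h[m + 1] = g(h[m])           # extend h by the pair <m+, g(h(m))>
    return h

def f_up_to(n: int, x0: X, g: Callable[[X], X]) -> Dict[int, X]:
    """Union of the k-approximations for k <= n: a finite piece of f = ⋃H."""
    f: Dict[int, X] = {}
    for k in range(n + 1):
        for m, x in approximation(k, x0, g).items():
            # Uniqueness in the Claim: approximations agree where both are defined.
            assert f.get(m, x) == x
            f[m] = x
    return f

doubling = f_up_to(5, 1, lambda x: 2 * x)
print(doubling)   # {0: 1, 1: 2, 2: 4, 3: 8, 4: 16, 5: 32}
```

Of course a program can only ever build finitely many approximations; the point of the theorem is that Comprehension and Union assemble all of them into a single set f.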

We deduce a version which allows us to treat parameters uniformly:


Corollary 5.13. Suppose A and X are sets, and g0 : A → X and g+ : A×X →
X are functions. Then there exists a unique f : A × N → X such that:

• f (a, 0) = g0 (a);
• f (a, n+ ) = g+ (a, f (a, n)) for all n ∈ N.

Proof. For each a, by Theorem 5.12 there is a unique fa : N → X such that


fa (0) = g0 (a) and fa (n+ ) = g+ (a, fa (n)) for all n ∈ N. So if we can define
f (a, n) := fa (n), this will be unique with the desired property.
To see that there is such a function f , observe that we can express in a
formula that fa is defined by recursion using g0 (a) and g+ (a, x), so there is a
formula ϕ(x, y) such that for all a ∈ A, fa is unique such that ϕ(a, fa ) holds.
Then ϕ defines by Comprehension the function F : A → X^N ; a ↦ fa , i.e.
F = {⟨x, y⟩ ∈ A × X^N : ϕ(x, y)}.
Then the function defined by f (a, n) := F (a)(n), i.e.

f = {⟨⟨a, n⟩, x⟩ ∈ (A × N) × X : ⟨n, x⟩ ∈ F (a)},

is as required.
More concisely:

f = {⟨⟨a, n⟩, x⟩ ∈ (A × N) × X : ∃h ∈ X^N ((h(0) = g0 (a) ∧ ∀m ∈ N h(m+ ) = g+ (a, h(m))) ∧ h(n) = x)}.

5.5 Arithmetic on N
Definition 5.14. Define functions +, ·, ^ : N × N → N using Corollary 5.13
such that for all n, m ∈ N

• – n + 0 = n
  – n + m⁺ = (n + m)⁺
• – n · 0 = 0
  – n · m⁺ = n · m + n
• – n ^ 0 = 1
  – n ^ m⁺ = (n ^ m) · n

We normally write nᵐ for n ^ m. We use the usual operator precedence rules,
so n + m · k means n + (m · k) rather than (n + m) · k.

Explicitly, to obtain + we apply Corollary 5.13 with g0 (n) = n and g+ (n, k) = k⁺;
then for · we apply it with g0 (n) = 0 and g+ (n, k) = k + n; then for ^ we
apply it with g0 (n) = 1 and g+ (n, k) = k · n.
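To see the three applications of Corollary 5.13 side by side, here is a small sketch in which Python integers stand in for elements of N. The function `rec` is a hypothetical helper playing the role of the corollary; only `succ` uses built-in arithmetic, and the three operations are then obtained from the same choices of g0 and g+ as above.

```python
from typing import Callable

def rec(g0: Callable[[int], int],
        gplus: Callable[[int, int], int]) -> Callable[[int, int], int]:
    """Return f with f(a, 0) = g0(a) and f(a, n+) = gplus(a, f(a, n)), as in Corollary 5.13."""
    def f(a: int, n: int) -> int:
        value = g0(a)
        for _ in range(n):
            value = gplus(a, value)
        return value
    return f

def succ(k: int) -> int:
    return k + 1          # stands in for k+ = k ∪ {k}

add = rec(lambda n: n, lambda n, k: succ(k))        # n + 0 = n,  n + m+ = (n + m)+
mul = rec(lambda n: 0, lambda n, k: add(k, n))      # n · 0 = 0,  n · m+ = n · m + n
power = rec(lambda n: 1, lambda n, k: mul(k, n))    # n ^ 0 = 1,  n ^ m+ = (n ^ m) · n

print(add(3, 4), mul(3, 4), power(3, 4))   # 7 12 81
```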
Now we use induction to confirm the usual arithmetic properties of these
operations.
Theorem 5.15. For all n, m, k ∈ N:

(i) n + 1 = n+ .
(ii) (n + m) + k = n + (m + k) (Associativity of +).
(iii) 0 + k = k
(iv) n + k + = n+ + k
(v) n + k = k + n (Commutativity of +).
(vi) n · 1 = n.
(vii) n · (m + k) = n · m + n · k (Distributivity of · over +).
(viii) (n · m) · k = n · (m · k) (Associativity of ·).
(ix) n · k = k · n (Commutativity of ·).
(x) m^(n+k) = m^n · m^k.
(xi) m^(n·k) = (m^n)^k.

Proof. We will be brief. Where we use induction below, it is on k, with m, n ∈ N


arbitrary.

(i) n + 1 = n + 0+ = (n + 0)+ = n+ .
(ii) • Base case: (n + m) + 0 = n + m = n + (m + 0).
• Inductive step: (n + m) + k + = ((n + m) + k)+ = (n + (m + k))+ =
n + (m + k)+ = n + (m + k + ).

(iii) • Base case: 0 + 0 = 0.


• Inductive step: 0 + k + = (0 + k)+ = k + .
(iv) • Base case: n + 0+ = (n + 0)+ = n+ = n+ + 0.
• Inductive step: n + k ++ = (n + k + )+ = (n+ + k)+ = n+ + k + .
(v) • Base case: n + 0 = n = 0 + n, by (iii).
• Inductive step: n + k + = (n + k)+ = (k + n)+ = k + n+ = k + + n,
by (iv).
(vi) n · 1 = n · 0+ = (n · 0) + n = 0 + n = n, by (iii).
(vii) • Base case: n · (m + 0) = n · m = n · m + 0 = n · m + n · 0.
• Inductive step: n · (m + k + ) = n · (m + k)+ = n · (m + k) + n =
(n · m + n · k) + n = n · m + (n · k + n) = n · m + n · k + (using
associativity of +).
(viii) • Base case: (n · m) · 0 = 0 = n · 0 = n · (m · 0).
• Inductive step: (n · m) · k + = (n · m) · k + n · m = n · (m · k) + n · m =
n · (m · k + m) = n · (m · k + ) (using distributivity).
(ix) Exercise.
(x) • Base case: m^(n+0) = m^n = m^n · 1 = m^n · m^0.
• Inductive step: m^(n+k⁺) = m^((n+k)⁺) = m^(n+k) · m = (m^n · m^k) · m =
m^n · (m^k · m) = m^n · m^(k⁺).
(xi) • Base case: m^(n·0) = m^0 = 1 = (m^n)^0.
• Inductive step: m^(n·k⁺) = m^(n·k+n) = m^(n·k) · m^n = (m^n)^k · m^n = (m^n)^(k⁺).

5.6 Defining Z, Q, R (not on syllabus)


Although it is not on the course syllabus, we briefly indicate how we can use N
and its arithmetic structure to construct within our universe of sets more of the
familiar structures of mathematics.
First, we define Z as (N × N)/ ∼ where (n, m) ∼ (n′ , m′ ) ⇔ n + m′ =
m + n′ ; we identify (n, m)/ ∼ with n − m and define addition and multiplication
correspondingly.
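For concreteness, one possible reading of "correspondingly" can be sketched as follows. The formulas for `add` and `mul` below are the usual ones suggested by thinking of (n, m) as n − m; they are our assumption and are not spelled out in these notes. The check at the end confirms on a small range that both operations respect ∼.

```python
from itertools import product
from typing import Tuple

Pair = Tuple[int, int]   # a pair (n, m) of naturals, standing for n - m

def equivalent(p: Pair, q: Pair) -> bool:
    """(n, m) ~ (n', m') iff n + m' = m + n'."""
    (n, m), (n2, m2) = p, q
    return n + m2 == m + n2

def add(p: Pair, q: Pair) -> Pair:
    # (n - m) + (n' - m') = (n + n') - (m + m')
    return (p[0] + q[0], p[1] + q[1])

def mul(p: Pair, q: Pair) -> Pair:
    # (n - m)(n' - m') = (n n' + m m') - (n m' + m n')
    (n, m), (n2, m2) = p, q
    return (n * n2 + m * m2, n * m2 + m * n2)

# Well-definedness on a small range: equivalent inputs give equivalent outputs.
pairs = list(product(range(4), repeat=2))
well_defined = all(
    equivalent(add(p, q), add(p2, q)) and equivalent(mul(p, q), mul(p2, q))
    for p in pairs for p2 in pairs if equivalent(p, p2)
    for q in pairs
)
print(well_defined)
```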
Then we can define Q as (Z × (N \ {0}))/ ∼′ where (n, m) ∼′ (n′ , m′ ) ⇔
n · m′ = m · n′ ; then identify (n, m)/ ∼′ with n/m ∈ Q and define addition and
multiplication accordingly.
Now R can be defined as the set of Dedekind cuts in Q: that is, we identify
r ∈ R with {q ∈ Q : q < r} ⊆ Q – the point being that we can define the set
of such subsets of Q as the downwards-closed proper non-empty subsets with
no greatest element (so this is a subset of P(Q) by Comprehension). We define
addition and multiplication accordingly.
We can then proceed to develop real and complex analysis based on this
definition of R, defining in particular C = R × R, identifying (a, b) with a + ib
6 CARDINALITY 20

and defining addition and multiplication accordingly. You may like to think how
we could continue in this vein to define your favourite objects of mathematics
(including those of logic!).
End of lecture 6

6 Cardinality
One key contribution of set theory is to give a rigorous mathematical develop-
ment of the notion of the “size” of an infinite object, which we call its cardinality.
We first explore what we can understand of this with the axioms we have so far.
Then we add another axiom, Replacement, which will let us reach larger cardi-
nalities. Later, we will add the Axiom of Choice, and see that this significantly
clarifies the structure of cardinalities (while still leaving some natural questions
undecided).

6.1 Classes
If ϕ(x) is a formula, it may or may not be that there is a set {x : ϕ(x)}. We
have {x : x ̸= x} = ∅, but there is no set {x : x = x}.
Nonetheless, it is convenient to reuse some of the notation and terminology
we use for sets to talk about {x : ϕ(x)}.
Definition 6.1.

• If ϕ(x) is a formula with parameters, we call {x : ϕ(x)} a class. We denote


classes with boldface characters.
• If X = {x : ϕ(x)} and Y = {x : ψ(x)} are classes:
– a ∈ X means ϕ(a);
– X and Y are equal if ∀x (ψ(x) ↔ ϕ(x)).
– X is a subclass of Y, denoted X ⊆ Y, if ∀x (ϕ(x) → ψ(x)).
• V := {x : x = x}, the class of all sets.
• Sets are classes: a set a is identified with the class {x : x ∈ a}.
• A proper class is a class which is not a set.

Remark 6.2.

• By Theorem 3.2, V is a proper class.

• The elements of a class are always sets, not proper classes.


• Comprehension says that a subclass of a set is a set.

6.2 Cardinalities
Definition 6.3. Sets X and Y have the same cardinality (or are equinu-
merous), written X ∼ Y , if there exists a bijection X → Y .

We think of ∼ as a class relation on V, defined by

ϕ(X, Y ) := ∃f [f is a bijection X → Y ].

Then actually ∼ is a class equivalence relation:


Lemma 6.4. For any sets X, Y, Z:

• X ∼ X.
• If X ∼ Y then Y ∼ X.

• If X ∼ Y and Y ∼ Z then X ∼ Z.

Proof. Straightforward by considering identity functions, inverses, and compo-


sitions respectively.

Provisional Definition 6.5. The cardinality |X| of a set X is the equivalence


class of X under ∼:
|X| := {Y : Y ∼ X}.
(This is a proper class, unless X = ∅.)
So |X| = |Y | ⇔ X ∼ Y .

Later, using the axiom of choice, we will redefine |X| to be a particular


canonical element of this class, which we will call a cardinal number.

6.3 Comparing cardinalities


Definition 6.6.

• |X| ≤ |Y | if there exists an injection X → Y .


• |X| < |Y | if |X| ≤ |Y | and |X| ≠ |Y |.

Lemma 6.7. These are well-defined: if |X| = |X ′ | and |Y | = |Y ′ | and there


exists an injection X → Y , then there exists an injection X ′ → Y ′ .

Proof. Immediate by composing with bijections.

Lemma 6.8 (Tarski’s Fixed Point Theorem). Let X be a set. Then any mono-
tone function H : P(X) → P(X) has a fixed point, where:

• H : P(X) → P(X) is monotone if A ⊆ B implies H(A) ⊆ H(B) (for


A, B ⊆ X).

• A fixed point of H is a C ⊆ X with H(C) = C.


Proof. Let D := {A ⊆ X : A ⊆ H(A)}, and let C := ⋃D.

• C ⊆ H(C): Let c ∈ C. Then c ∈ A ∈ D say. Then A ⊆ H(A), so


c ∈ H(A). But A ⊆ C, so H(A) ⊆ H(C) by monotonicity, so c ∈ H(C).

• H(C) ⊆ C: Since C ⊆ H(C), by monotonicity H(C) ⊆ H(H(C)), so


H(C) ∈ D. Hence H(C) ⊆ C.

So H(C) = C, as required.
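For a finite X the construction in this proof can be carried out by brute force. The sketch below does exactly that; the set X and the monotone map H are arbitrary choices made for illustration, and `powerset` and `tarski_fixed_point` are hypothetical helper names.

```python
from itertools import chain, combinations
from typing import Callable, FrozenSet, Iterable, List

def powerset(xs: Iterable[int]) -> List[FrozenSet[int]]:
    items = list(xs)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

def tarski_fixed_point(X: FrozenSet[int],
                       H: Callable[[FrozenSet[int]], FrozenSet[int]]) -> FrozenSet[int]:
    """C := ⋃D where D = {A ⊆ X : A ⊆ H(A)}, exactly as in the proof."""
    D = [A for A in powerset(X) if A <= H(A)]
    return frozenset().union(*D)

X = frozenset(range(6))

def H(A: FrozenSet[int]) -> FrozenSet[int]:
    # A monotone map chosen for illustration: always contain 0, and add a + 1 for a ∈ A while below 4.
    return frozenset({0}) | frozenset(a + 1 for a in A if a + 1 < 4)

C = tarski_fixed_point(X, H)
print(sorted(C), H(C) == C)   # [0, 1, 2, 3] True
```

Note that the C produced this way is the largest fixed point of H, since every fixed point belongs to D.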
Theorem 6.9 (Schröder-Bernstein Theorem⁵). If |X| ≤ |Y | ≤ |X| then |X| =
|Y |.

Proof. Say f : X → Y and g : Y → X are injections.


Define H : P(X) → P(X) by

H(A) := g[f [A]^c′ ]^c = X \ g[Y \ f [A]],

where we define for this proof the complements A^c := X \ A for A ⊆ X and
D^c′ := Y \ D for D ⊆ Y .
Then H is monotone, since f [·] and g[·] are inclusion-preserving while
complement is inclusion-reversing; explicitly:

A ⊆ B ⊆ X ⇒ f [A] ⊆ f [B]
⇒ f [A]^c′ ⊇ f [B]^c′
⇒ g[f [A]^c′ ] ⊇ g[f [B]^c′ ]
⇒ H(A) = g[f [A]^c′ ]^c ⊆ g[f [B]^c′ ]^c = H(B).

By Lemma 6.8, there is A ⊆ X with H(A) = A. Then A^c = H(A)^c = g[f [A]^c′ ].

So we have bijections f |A : A → f [A] and g|f [A]^c′ : f [A]^c′ → A^c , and putting
them together yields a bijection f |A ∪ (g|f [A]^c′ )⁻¹ : X → Y .

Corollary 6.10. < is a strict partial (class) order on V/ ∼, i.e. for all X, Y, Z:

(i) |X| ≮ |X|.
(ii) If |X| < |Y | and |Y | < |Z| then |X| < |Z|.

Proof. (i) Immediate from the definition.

(ii) We have |X| ≤ |Z| by composing injections witnessing |X| ≤ |Y | ≤ |Z|.


If |X| = |Z|, then |X| ≤ |Y | ≤ |X| so |X| = |Y | by Schröder-Bernstein,
contrary to assumption.

It is perhaps natural to expect this order to be total, so that we can really


think of cardinality as a linear scale of largeness. However, this does not follow
from ZF, and we will see later that, modulo ZF, this order is total if and only
if the Axiom of Choice holds.
End of lecture 7
⁵ This is also known variously as the Cantor/Schröder-Bernstein Theorem, or the
Cantor-Bernstein Theorem.

6.4 Finite sets


Definition 6.11. A set X is finite if |X| = |n| for some n ∈ N. Otherwise, X
is infinite.
Lemma 6.12. Let X be finite.

(i) Any subset of X is finite.


(ii) (“Pigeonhole principle”) Any injective function f : X → X is surjective.
(iii) N is infinite.

Proof. Exercise (sheet 2). Hint: For (i) and (ii), prove the result by induction
when X = n ∈ N, and then compose with bijections when X is arbitrary. For
the inductive step in (ii), if f : n+ → n+ is an injection but k ∈ n+ \ ran(f ),
then f ′ := σ ◦ f : n+ → n is injective where σ = (n k) : n+ → n+ is the
transposition, hence f ′ |n : n → n is also injective, but f ′ (n) ∈ n \ ran(f ′ |n ),
contradicting the induction hypothesis.
Lemma 6.13. Let n, m ∈ N. Then n < m ⇔ |n| < |m|.

Proof. First note that if n ≤ m, then n ⊆ m by Lemma 5.10, so |n| ≤ |m| since
the inclusion n → m is injective.
Suppose n < m, so in particular n ≤ m and so |n| ≤ |m|. If |n| = |m|, then
there is a bijection f : m → n, but then f is also a function f : m → m which
is injective but not surjective, contradicting Lemma 6.12(ii). So |n| < |m|.
Conversely, if |n| < |m| then |n| ̸≥ |m| so n ̸≥ m, so n < m.

So the order on the natural numbers agrees with the order on their cardi-
nalities, which partially justifies:
Notation 6.14. If n ∈ N, we usually write the cardinality |n| as n.⁶ For example, |∅| = 0,
|{3}| = 1.

6.5 Countable sets


Notation 6.15. We write ℵ0 (“aleph null”) for the cardinality |N|.⁷
Definition 6.16. A set X is

• countable if |X| ≤ ℵ0 .
• countably infinite if it is countable and infinite.
• uncountable if it is not countable.
Theorem 6.17. A set X is countably infinite if and only if |X| = ℵ0 .
In other words, there is no infinite cardinality below ℵ0 .
⁶ For now this is an abuse of notation, since the set n is not actually equal to the proper
class |n|, but this will be fixed when we eventually redefine | · |.
⁷ Later we will redefine ℵ0 along with |N|, such that ℵ0 = |N| will remain true.

Proof. If |X| = ℵ0 then X is infinite since N is (by Lemma 6.12(iii)), so X is


countably infinite.
Conversely, suppose |X| ≤ |N| and X is infinite; we show that |X| = |N|. We
may assume X ⊆ N, since X is in bijection with its image under an injection
X → N.
Recall that by Theorem 5.11, any non-empty subset ∅ ̸= Y ⊆ N has a
unique least element min Y . Since X is infinite and subsets of finite sets are
finite, X \ n ̸= ∅ for any n ∈ N.
So define by recursion f : N → X by f (0) := min X and f (n+ ) := min(X \
f (n)+ ) (the “first element of X after f (n)”).
Then f is injective because n > m ⇒ f (n) > f (m) by induction on n. (This
is trivial for n = 0. Suppose for n and suppose m < n+ ; then f (n) ≥ f (m) by
the IH, and f (n+ ) = min(X \ f (n)+ ) > f (n), so f (n+ ) > f (m) as required.)
This shows that |N| ≤ |X|, and we conclude |X| = |N| (by Schröder-
Bernstein) as required.
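The recursion defining f in this proof is easy to animate. The following sketch assumes that X ⊆ N is given by a membership test `member` (a hypothetical interface of our choosing); each step searches for min(X \ f(n)+), which terminates precisely because X is assumed infinite.

```python
from itertools import count, islice
from math import isqrt
from typing import Callable, Iterator

def enumerate_subset(member: Callable[[int], bool]) -> Iterator[int]:
    """Yield f(0), f(1), ... where f(0) = min X and f(n+) = min(X minus f(n)+).

    `member` decides membership in X ⊆ N. The search at each step only
    terminates if X is infinite, which is the hypothesis of the theorem.
    """
    lower = 0                      # we search among the elements of X that are >= lower
    while True:
        x = next(k for k in count(lower) if member(k))
        yield x
        lower = x + 1              # removing f(n)+ discards everything <= f(n)

squares = enumerate_subset(lambda k: isqrt(k) ** 2 == k)
print(list(islice(squares, 6)))   # [0, 1, 4, 9, 16, 25]
```

The output is strictly increasing, which is the injectivity argument of the proof in miniature.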

Corollary 6.18. A non-empty set X is countable if and only if there exists a


surjection N → X.

Proof. ⇐: If f : N → X is surjective, then g(x) := min{n ∈ N : f (n) = x}


defines an injection g : X → N (which exists by Comprehension within
X × N).
⇒: By Theorem 6.17, by composing with a bijection we may suppose that
either X = N, in which case the result is immediate, or X = n for some
n ∈ N. Then n > 0 since X ̸= ∅, and we can define a surjection f : N → n
by f (i) = i if i < n, and f (i) = 0 otherwise.

Remark 6.19. A natural generalisation would be: |X| ≤ |Y | whenever a surjection
Y → X exists. We cannot yet prove this for uncountable Y , but it will
follow from the Axiom of Choice.

6.6 Cardinal arithmetic


Definition 6.20. Define addition, multiplication, and exponentiation of cardi-
nalities by:

• |X| + |Y | := |X ∪ Y | if X ∩ Y = ∅.
• |X| · |Y | := |X × Y |.

• |X|^|Y | := |X^Y |.
Exercise 6.21. These do define well-defined operations on cardinalities (con-
sider bijections). (To see that |X| + |Y | is always defined, note that |X| =
|{0} × X| and |Y | = |{1} × Y |, and ({0} × X) ∩ ({1} × Y ) = ∅.)

End of lecture 8

Proposition 6.22.

(a) For all cardinalities κ, λ, µ:


(i) κ + λ = λ + κ
(ii) κ + (λ + µ) = (κ + λ) + µ
(iii) κ + 0 = κ
(iv) κ · λ = λ · κ
(v) κ · (λ · µ) = (κ · λ) · µ
(vi) κ · 1 = κ
(vii) κ · (λ + µ) = κ · λ + κ · µ
(viii) κ^(λ+µ) = κ^λ · κ^µ
(ix) κ^(λ·µ) = (κ^λ)^µ .
(b) These operations agree on finite cardinalities with the operations on N de-
fined by recursion in Definition 5.14.

(c) If κ ≤ κ′ and λ ≤ λ′ then:


• κ + λ ≤ κ′ + λ′
• κ · λ ≤ κ′ · λ′

• κ^λ ≤ κ′^λ′ if κ ̸= 0.

Proof.

(a) (i) Say κ = |X| and λ = |Y | and X ∩ Y = ∅. Then X ∪ Y = Y ∪ X, so


|X| + |Y | = |Y | + |X|.
(ii),(iii) Similar equalities of sets show these.
(iv) ⟨x, y⟩ ↦ ⟨y, x⟩ defines a bijection X × Y → Y × X, so |X| · |Y | =
|X × Y | = |Y × X| = |Y | · |X|.
(v)-(ix) Similar bijections show these (see Sheet 3).
(b) Exercise (Sheet 3).
(c) Exercise (Sheet 3).
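Part (b) can be sanity-checked on small finite sets before attempting the exercise. In the sketch below, `disjoint_union` and `functions` are hypothetical helpers: the first is the tagging trick from Exercise 6.21, and the second lists all functions Y → X, so that its size is |X|^|Y |.

```python
from itertools import product

def disjoint_union(X, Y):
    """({0} × X) ∪ ({1} × Y), the tagging trick from Exercise 6.21."""
    return {(0, x) for x in X} | {(1, y) for y in Y}

def functions(Y, X):
    """All functions Y -> X, each encoded as a frozenset of pairs (y, f(y))."""
    Y = list(Y)
    return {frozenset(zip(Y, values)) for values in product(list(X), repeat=len(Y))}

X, Y = {"a", "b", "c"}, {0, 1}
print(len(disjoint_union(X, Y)))     # 5 = 3 + 2
print(len(set(product(X, Y))))       # 6 = 3 · 2
print(len(functions(Y, X)))          # 9 = 3 ^ 2
```

This is evidence rather than proof: the exercise asks for an induction showing agreement for all n, m ∈ N.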

Proposition 6.23. |P(X)| = 2^|X| for any set X.

Proof. The function F : P(X) → 2^X defined by

F (Y )(x) = 0 if x ∉ Y ,
F (Y )(x) = 1 if x ∈ Y

(i.e. F (Y ) is the indicator function of Y in X) is a bijection.



Theorem 6.24 (Cantor). Let X be a set. Then there is no surjection X →


P(X).

Proof. Let f : X → P(X). Let

D := {x ∈ X : x ∉ f (x)} ⊆ X.

Suppose a ∈ X and D = f (a). Then a ∈ D iff a ∉ f (a) = D, contradiction. So
D ∈ P(X) \ ran(f ), so f is not surjective.
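The diagonal set D can be computed explicitly when X is finite. The sketch below (with hypothetical helper names `powerset` and `diagonal`) runs through every function f : X → P(X) for a three-element X and confirms that D is never in the range of f.

```python
from itertools import chain, combinations, product

def powerset(xs):
    items = list(xs)
    return [frozenset(s) for s in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def diagonal(X, f):
    """D := {x ∈ X : x ∉ f(x)} for a map f : X -> P(X) given as a dict."""
    return frozenset(x for x in X if x not in f[x])

X = [0, 1, 2]
subsets = powerset(X)
# Try every one of the 8^3 = 512 functions X -> P(X).
missed = all(
    diagonal(X, dict(zip(X, values))) not in values
    for values in product(subsets, repeat=len(X))
)
print(missed)   # True
```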
Corollary 6.25. κ < 2^κ for any cardinality κ.

Proof. Say κ = |X|, so 2^κ = |P(X)| (by Proposition 6.23). Then x ↦ {x} is an
injection X → P(X), so κ ≤ 2^κ . If κ = 2^κ , then there is a bijection X → P(X),
contradicting Theorem 6.24.
Lemma 6.26. ℵ0 · ℵ0 = ℵ0 .

Proof.

• ℵ0 ≤ ℵ0 · ℵ0 : n ↦ ⟨n, 0⟩ is an injection N → N × N.

• ℵ0 · ℵ0 ≤ ℵ0 : ⟨n, m⟩ ↦ 2^n · 3^m is an injection N × N → N. This follows
from the Fundamental Theorem of Arithmetic (unique prime factorisation),
whose proof we omit.

Alternatively, ⟨n, m⟩ ↦ ½(n + m)(n + m + 1) + m = C(n + m + 1, 2) + m, where
C(a, 2) denotes the binomial coefficient “a choose 2”, is a bijection
N × N → N (again, we omit details).
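Since we omit the details, a quick numerical check of both maps may be reassuring. This is only a finite test on a 30 × 30 grid, with hypothetical function names; it is no substitute for the proofs.

```python
def prime_power_code(n: int, m: int) -> int:
    """<n, m> -> 2^n * 3^m, injective by unique prime factorisation."""
    return 2 ** n * 3 ** m

def cantor_pair(n: int, m: int) -> int:
    """<n, m> -> (n + m)(n + m + 1)/2 + m."""
    return (n + m) * (n + m + 1) // 2 + m

grid = [(n, m) for n in range(30) for m in range(30)]

# No two pairs in the grid receive the same code.
print(len({prime_power_code(n, m) for n, m in grid}) == len(grid))   # True

# The 465 pairs with n + m < 30 hit each of 0, ..., 464 exactly once.
triangle = sorted(cantor_pair(n, m) for n, m in grid if n + m < 30)
print(triangle == list(range(465)))                                  # True
```

The second check reflects how the bijection works: the pairs with n + m = d occupy the block of d + 1 consecutive values starting at d(d + 1)/2.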
Theorem 6.27.

(i) |Q| = ℵ0 .
(ii) |R| = 2^ℵ0 .

Proof.

(i) |Q| ≥ ℵ0 since n ↦ n/1 is an injection, and ⟨⟨n, m⟩ , k⟩ ↦ (n − m)/(k + 1) is a
surjection (N × N) × N → Q, and |(N × N) × N| = ℵ0 by Lemma 6.26 (twice), so
|Q| ≤ ℵ0 by Corollary 6.18.

(ii) The map which associates a real with its Dedekind cut, x ↦ {q ∈ Q : q <
x}, is an injection R → P(Q) since Q is dense in R. So |R| ≤ |P(Q)| =
2^|Q| = 2^ℵ0 .
For the converse, we can use ternary expansions. Define Φ : 2^N → R by

Φ(f ) := Σ_{n=0}^∞ f (n)/3^n .

Then Φ is injective (using Σ_{n=1}^∞ 3^−n = 1/2 < 1; binary expansions would
not work), so 2^ℵ0 ≤ |R|.
7 REPLACEMENT AND FOUNDATION 27

7 Replacement and Foundation


The axiom system we have established so far, ZF1-7, is more or less the system
originally proposed by Zermelo. To obtain ZF, we add two further axioms
which were proposed a little later. Replacement radically increases the power
of the system, while Foundation trims the universe by denying the existence of
“pathological” sets like u = {u}.

7.1 Replacement
Definition 7.1. If X and Y are classes, a formula with parameters ϕ(x, y)
defines a class function F : X → Y if:

• ϕ(x, y) implies x ∈ X and y ∈ Y, and


• for all x ∈ X there is a unique y such that ϕ(x, y) holds.

We then write F(x) = y to mean ϕ(x, y).

End of lecture 9
Example 7.2. P : V → V is the class function defined by

ψ(x, y) := ∀w (w ∈ y ↔ w ⊆ x).

ZF8 (Replacement): If a is a set and F : a → V is a class function, then its


range F[a] := {F(x) : x ∈ a} is a set.

Remark 7.3. Then F : a → F[a] is actually a function, by Comprehension.


So we could equivalently state Replacement as: a class function on a set is a
function.
Remark 7.4. As with Comprehension, we can formalise Replacement by an
axiom scheme consisting of, for each L-formula ϕ(x, y, z1 , . . . , zn ), the sentence

∀z1 . . . ∀zn ∀w (∀x ∈ w ∃y (ϕ(x, y, z1 , . . . , zn ) ∧ ∀y ′ (ϕ(x, y ′ , z1 , . . . , zn ) → y ′ = y))


→ ∃v ∀u (u ∈ v ↔ ∃x ∈ w ϕ(x, u, z1 , . . . , zn ))).

One immediate application of Replacement is to strengthen our recursion


principle on N.
Theorem 7.5 (Recursion on N, class form). If x0 is a set and G : V → V is a
class function, then there exists a unique function f with dom(f ) = N such that

• f (0) = x0 ;
• f (n+ ) = G(f (n)) for all n ∈ N.

Proof. Exactly as in the proof of Theorem 5.12, for each n there is a unique n-
approximation. Then F(n) := [the unique n-approximation] is a class function
F : N → V, so by Replacement,
H := F[N] = {h : ∃n ∈ N [h is an n-approximation]}
is a set. Set f := ⋃H, and conclude exactly as in Theorem 5.12.
8 WELL-ORDERED SETS AND ORDINALS 28

Example 7.6. There is a cardinality greater than any of ℵ0 , 2^ℵ0 , 2^(2^ℵ0) , . . ., in the
following sense.
Applying recursion with x0 := N and G := P, we obtain a function f with
dom(f ) = N and f (0) = N, f (1) = P(N), f (2) = P(P(N)), . . ..
Then ran(f ) = f [N] is a set which we could write as {N, P(N), P(P(N)), . . .}.
Let X := ⋃f [N]. Then for any n ∈ N, we have f (n) ⊆ X and hence
|X| ≥ |f (n)|; applying this to n+ gives |X| ≥ |f (n+ )| = 2^|f (n)| > |f (n)| by
Corollary 6.25.
One can show that ZF1-7 do not suffice to prove the existence of such a
cardinality.

7.2 Foundation
ZF9 (Foundation): Every non-empty set x has an ∈-minimal element, i.e. an
element y ∈ x such that no element of x is an element of y:

∀x (x ̸= ∅ → ∃y ∈ x y ∩ x = ∅).

This axiom forbids certain “pathological” behaviour of sets:

Theorem 7.7. (i) There is no x with x ∈ x.


(ii) There are no x and y with x ∈ y ∈ x.
(iii) More generally, there is no infinite descending ∈-chain, i.e. no function
f with dom(f ) = N and f (n+ ) ∈ f (n) for all n ∈ N (so f (0) ∋ f (1) ∋
f (2) ∋ . . .).

Proof. (i) If x ∈ x, then {x} violates Foundation: the only element of {x} is
x, but x ∩ {x} = {x} ≠ ∅.

(ii) If x ∈ y ∈ x, then {x, y} violates Foundation since y ∈ x ∩ {x, y} and


x ∈ y ∩ {x, y}.
(iii) Exercise (Sheet 3).
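For hereditarily finite sets, an ∈-minimal element can simply be searched for. The sketch below uses Python frozensets, which can only be built from already existing objects and so are automatically well-founded; hence the search always succeeds on non-empty input. The name `epsilon_minimal` is ours.

```python
def epsilon_minimal(x: frozenset) -> frozenset:
    """Return some y ∈ x with y ∩ x = ∅; x must be a non-empty set of frozensets."""
    for y in x:
        if not (y & x):
            return y
    raise ValueError("no ∈-minimal element found")

empty = frozenset()
one = frozenset({empty})            # 1 = {0}
two = frozenset({empty, one})       # 2 = {0, 1}

# In {1, 2} the element 2 is not minimal, since 1 ∈ 2 ∩ {1, 2}; the element 1 is.
print(epsilon_minimal(frozenset({one, two})) == one)   # True
```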

One can show that if ZF1-8 are consistent, then so are ZF1-9: adding Foundation
cannot introduce a contradiction. In particular, any set we prove to
exist using ZF1-8 (such as N and R) does not violate Foundation. So adding
Foundation is “harmless”, and substantially simplifies the set theoretic universe.
However, Foundation will not actually be used in the remainder of these notes.
ZF1-9 form the axiom system ZF. Later we will add the Axiom of Choice
(AC) to form ZFC, but we delay this until we need it.
End of lecture 10

8 Well-ordered sets and ordinals


One way the natural numbers arise is as a measure of size, and we have now
generalised this to infinite cardinalities. Another way natural numbers arise is in
enumerating elements of an ordered set in which “the nth element” makes sense.

We now generalise this ordinal sense of a natural number to infinite (transfinite)


ordinal numbers. First we define the orders which can be enumerated in this
sense – those in which the “next” element always exists, even if not every element
is of this form – then we define and study a notion of ordinal number with which
we can enumerate such orders, so that each element is “the αth element” for
some ordinal number α.
We won’t use Foundation in this section; it would slightly simplify some of
the proofs, but it isn’t necessary.

8.1 Well-ordered sets


Definition 8.1. A well-ordered set (or well-order) is a totally ordered set
(X, <) which is well-founded, meaning:

• Every non-empty subset ∅ ≠ Y ⊆ X has a least element.

We denote this least element min Y (or min_X Y ).

Example 8.2.

• N is well-ordered by < = ∈, by Theorem 5.11.

• Z is not well-ordered by its usual order <, since Z has no least element.
Same for R.
• [0, 1] ⊆ R is not well-ordered by <, since (0, 1] lacks a least element.
• {−1/n : n ∈ N \ {0}} ∪ N ⊆ R is well-ordered by <.

• Any subset Y of a well-ordered set (X, <) is well-ordered by the restriction


of <, and we write this well-order as (Y, <).
Definition 8.3. Let (X, <) be a well-ordered set.

• An initial segment of X is a subset S ⊆ X which is downwards closed


in X, i.e. ∀y ∈ S ∀x ∈ X (x < y → x ∈ S). It is a proper initial segment
if S ̸= X.
• We consider initial segments as well-ordered sets, with the restriction of
<.

• For a ∈ X, define X<a := {x ∈ X : x < a}.


Remark 8.4. The proper initial segments of X are precisely the sets X<a for
a ∈ X: indeed, if S ⊊ X is a proper initial segment then S = X<min(X\S) .
Definition 8.5. An embedding of a totally ordered set (X, <) in a totally
ordered set (Y, <′ ) is a function θ : X → Y which is strictly monotone, i.e.
x < x′ ⇒ θ(x) <′ θ(x′ ) for all x, x′ ∈ X.
An isomorphism is a surjective embedding, and we write (X, <) ≅ (Y, <′ )
and say the ordered sets are isomorphic if an isomorphism exists.

Well-orders are highly rigid:



Lemma 8.6. If (X, <) is a well-order and θ : (X, <) → (X, <) is an embedding,
then θ(x) ≥ x for all x ∈ X.

Proof. Suppose not. Then a := min{x ∈ X : θ(x) < x} exists. But then
θ(a) < a, so θ(θ(a)) < θ(a) since θ is an embedding, contradicting minimality
of a.

Lemma 8.7. A well-order is not isomorphic to any of its proper initial seg-
ments.

Proof. If σ : X → X<x is an isomorphism, then σ(x) < x, contradicting


Lemma 8.6.

Lemma 8.8. Let (X, <) be a well-order.

(i) The only isomorphism X → X is the identity.

(ii) If (X, <) ≅ (Y, <′ ), then there is a unique isomorphism X → Y .

Proof. (i) If σ : X → X is an isomorphism, then so is σ⁻¹ , so by Lemma 8.6,
for all x ∈ X we have σ(x) ≥ x and σ⁻¹ (x) ≥ x. Applying σ to the latter
gives x ≥ σ(x), hence x ≤ σ(x) ≤ x, hence σ(x) = x.
(ii) If σ, τ : X → Y are isomorphisms then τ⁻¹ (σ(x)) = x for all x ∈ X by (i),
so σ = τ .

Any two well-orders are comparable:


Theorem 8.9. Let (X, <) and (Y, <′ ) be well-orders. Then either (X, <) is
isomorphic to an initial segment of (Y, <′ ), or (Y, <′ ) is isomorphic to an initial
segment of (X, <).

Proof. Define

σ := {⟨x, y⟩ ∈ X × Y : (X<x , <) ≅ (Y<′ y , <′ )}.

Then σ is a function, since if ⟨x, y⟩ , ⟨x, y ′ ⟩ ∈ σ, then (Y<′ y , <′ ) ≅ (Y<′ y′ , <′ ),
so y = y ′ by Lemma 8.7. Symmetrically, σ is injective.
Let ⟨x, y⟩ ∈ σ, so say τ : X<x → Y<′ y is an isomorphism. Then if x′ < x,
then τ |X<x′ : X<x′ → Y<′ τ (x′ ) is an isomorphism, so ⟨x′ , τ (x′ )⟩ ∈ σ.
Hence X ′ := dom(σ) ⊆ X is an initial segment of X, and symmetrically
Y ′ := ran(σ) ⊆ Y is an initial segment of Y , and σ : X ′ → Y ′ is an isomorphism,
since σ(x′ ) = τ (x′ ) <′ y = σ(x).

So X ′ ≅ Y ′ . If X ′ and Y ′ are proper initial segments, say X ′ = X<x and
Y ′ = Y<′ y , then ⟨x, y⟩ ∈ σ, contradicting X ′ = dom(σ) (since x ∉ X<x ). So either
X ′ = X or Y ′ = Y , as required.

End of lecture 11

8.2 Ordinals
Definition 8.10. A set a is transitive if every element of a is a subset of a,
i.e. x ∈ y ∈ a ⇒ x ∈ a.
Definition 8.11. An ordinal is a transitive set which is well-ordered by ∈.
That is, an ordinal is a transitive set α such that (α, <) is a well-ordered⁸
set, where < := {⟨β, γ⟩ ∈ α × α : β ∈ γ}.
We use < and ∈ interchangeably to denote the order on an ordinal.
We denote the class of ordinals by ON.

By Theorem 5.11 and the Remark following it, N is an ordinal. We use ω to


denote N when we consider it as an ordinal.
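For hereditarily finite sets, the two definitions above can be tested directly. In the sketch below (helper names are ours), well-foundedness is automatic because the sets are finite and Python frozensets cannot contain themselves, so `is_ordinal` only needs to check transitivity of the set and comparability of its elements under ∈.

```python
def is_transitive(a: frozenset) -> bool:
    """Every element of a is a subset of a."""
    return all(y <= a for y in a)

def is_ordinal(a: frozenset) -> bool:
    """Transitive, and any two elements are comparable under ∈ or equal."""
    total = all(x in y or x == y or y in x for x in a for y in a)
    return is_transitive(a) and total

def nat(k: int) -> frozenset:
    """The finite ordinal k, built by k successor steps n ↦ n ∪ {n} from ∅."""
    n = frozenset()
    for _ in range(k):
        n = n | frozenset({n})
    return n

zero, one = nat(0), nat(1)
odd_one_out = frozenset({zero, one, frozenset({one})})   # the set {0, 1, {1}}

print(is_ordinal(nat(4)))                                     # True
print(is_transitive(odd_one_out), is_ordinal(odd_one_out))    # True False
```

The second example shows that transitivity alone is not enough: 0 and {1} are distinct elements of {0, 1, {1}}, yet neither is a member of the other.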
Lemma 8.12. Any element of an ordinal is an ordinal.

Proof. Let β ∈ α ∈ ON. Then β ⊆ α by transitivity of α, so the restriction


(β, ∈) is a well-order. But β is transitive, since if x ∈ y ∈ β then y ∈ α and
x ∈ α by transitivity of α, so then x ∈ β by the transitivity property of the
order ∈ on α. So β is an ordinal.
Lemma 8.13. Let β ∈ ON.

(i) If α ∈ β then α = β<α . In particular, the elements of β are precisely the


proper initial segments of β.
(ii) If α is transitive (in particular, if α ∈ ON), then α ⊊ β iff α ∈ β.

Proof. (i) β<α = {γ ∈ β : γ ∈ α} = α, since γ ∈ α ⇒ γ ∈ β by transitivity of


β.
(ii) If α ⊆ β then α is an initial segment of β by transitivity of α.
So α ⊊ β iff α is a proper initial segment of β, so we conclude by (i).

Theorem 8.14. The class ON is well-ordered by ∈, i.e. for all α, β, γ ∈ ON:

(i) α ∉ α (Irreflexivity)
(ii) α ∈ β ∈ γ ⇒ α ∈ γ (Transitivity)
(iii) α ∈ β or α = β or β ∈ α (Totality)
(iv) Any non-empty class of ordinals has an ∈-least element. (Well-foundedness)

Proof. (i) By Lemma 8.13(ii).


(ii) By transitivity of γ.
(iii) By Lemma 8.13(ii), it suffices to show that α ⊆ β or β ⊆ α. Suppose not.
Then γ := α ∩ β is a proper subset of α and of β. But γ is transitive since
α and β are, so γ ∈ α ∩ β by Lemma 8.13(ii), so γ ∈ γ, which contradicts
(i) since γ is an ordinal by Lemma 8.12.
⁸ Using Foundation, we could equivalently say “totally ordered”.

(iv) This is immediate from Foundation, but we can also argue directly as
follows. Given a non-empty class Γ of ordinals and γ ∈ Γ, if Γ ∩ γ = ∅
then min Γ = γ, and otherwise min Γ = min(Γ ∩ γ), which exists since γ
is an ordinal.

Corollary 8.15. Any transitive set of ordinals is an ordinal.

Proof. Theorem 8.14 shows that ∈ defines a well-order on any set of ordinals.
Theorem 8.16. ON is a proper class.⁹

Proof. Suppose ON is a set. Then ON is transitive by Lemma 8.12, so ON is an


ordinal by Corollary 8.15. But then ON ∈ ON, contradicting Theorem 8.14(i).

Lemma 8.17. Isomorphic ordinals are equal.

Proof. By Theorem 8.14(iii) and Lemma 8.13(i), if α, β ∈ ON are not equal


then one is a proper initial segment of the other, so by Lemma 8.7 they are not
isomorphic.

Theorem 8.18 (Hartogs’ Theorem). If X is a set, then there exists α ∈ ON


with |α| ̸≤ |X|.

Proof. Suppose for a contradiction that |α| ≤ |X| for all α ∈ ON. Then for
every α ∈ ON, there is an injection f : α → X, and then f (x) < f (y) ⇔ x ∈ y
defines a well-order on f [α] ⊆ X which is isomorphic to α.
Consider a well-order (Y, <) as an ordered pair ⟨Y, <⟩, and let W ⊆
P(X) × P(X × X) be the set of all well-orders (Y, <) with Y ⊆ X such that
(Y, <) is isomorphic to some ordinal, and let F : W → ON be the class function
such that F((Y, <)) is the ordinal isomorphic to (Y, <), which is unique by
Lemma 8.17. Then F[W ] = ON by the previous paragraph, so ON is a set by
Replacement, contradicting Theorem 8.16.
Theorem 8.19. Every well-ordered set (X, <) is isomorphic to a unique ordinal
by a unique isomorphism.

Proof. Uniqueness of the ordinal is by Lemma 8.17, and uniqueness of the iso-
morphism is by Lemma 8.8.
For existence, by Theorem 8.18 say α ∈ ON with |α| ̸≤ |X|. Then α is not
isomorphic to an initial segment of X, so by Theorem 8.9, X is isomorphic to
an initial segment of α, which is an ordinal by Lemma 8.13.
Lemma 8.20.

(a) (i) 0 = ∅ is an ordinal.


(ii) If α is an ordinal, then so is its successor α+ = α ∪ {α}.
(iii) If Γ is a set of ordinals, then ⋃Γ is an ordinal.

(b) Every β ∈ ON is of precisely one of the following three types:


⁹ This is known as the Burali-Forti paradox.

(i) Zero ordinal: β = 0.


(ii) Successor ordinal: β = α+ for some α ∈ ON.
(iii) Limit ordinal: β = ⋃β and β ̸= 0.

End of lecture 12

Proof. (a) By Lemma 8.12, 0, α+ , and ⋃Γ are sets of ordinals, so by
Corollary 8.15 it remains only to show that they are transitive. We leave this
verification as an exercise (Sheet 1).
(b) 0 is not a successor ordinal nor a limit ordinal. A successor ordinal α+ is
not a limit ordinal, since α ∉ ⋃α+ .
Suppose β ∈ ON is not a successor ordinal, and let α ∈ β. Then β ≠ α+ ,
and so α+ ∈ β by Theorem 8.14(iii), so α ∈ ⋃β. Conversely, ⋃β ⊆ β by
transitivity of β. So β = ⋃β is either 0 or a limit ordinal.

Example 8.21.
• ω = N is the first limit ordinal, since ω = ⋃ω and every
n ∈ ω is either zero or a successor.
• We have ordinals ω + , ω ++ , . . . and their limit ⋃{ω, ω + , ω ++ , . . .} (defining
this set by recursion on ω). We will soon define ordinal addition and write
these as ω + 1, ω + 2, . . . and ω + ω.

8.3 Transfinite recursion


Theorem 8.22 (Transfinite Induction). Let ϕ(x) be a formula with parameters.
Suppose that ϕ(β) holds for every β ∈ ON for which ϕ(γ) holds for all γ ∈ β.
Then ϕ(α) holds for all α ∈ ON.

Proof. Otherwise, let β := min{β ∈ ON : ¬ϕ(β)} (using Theorem 8.14(iv)).


Then ϕ(γ) holds for all γ ∈ β by the minimality of β, so ϕ(β) holds, contradic-
tion.

Theorem 8.23 (Transfinite Recursion). Let G : V → V be a class function.


Then there exists a unique class function F : ON → V such that for all α ∈ ON

F(α) = G(F|α ).

(This makes sense because, by Replacement, F|α : α → F[α] is a set for any
α ∈ ON.)

Sketch proof (not examinable). Analogous to the proof of Theorem 5.12.


For α ∈ ON, define an α-approximation to be a function fα : α+ → V such that fα(β) = G(fα|β) for all β ∈ α+.
We show by transfinite induction that a unique α-approximation fα exists for each α ∈ ON. Indeed, let α ∈ ON and suppose fβ is the unique β-approximation for each β ∈ α. Note then that fβ|γ+ = fγ whenever γ ∈ β ∈ α, since it is a γ-approximation. So gα := ⋃{fβ : β ∈ α} is a function (using Replacement), with domain ⋃{β+ : β ∈ α} = α. If fα is an α-approximation and β ∈ α, then again fα|β+ = fβ, so fα := gα ∪ {(α, G(gα))} is the unique α-approximation.
Now F(α) := fα(α) is the unique class function in the statement; it is unique because again the restriction of F to any α+ is an α-approximation.

We typically apply this in the following form.


Corollary 8.24. Let x0 be a set and let S : V → V be a class function. Then
there is a unique class function F : ON → V such that:

• F(0) = x0 .
• F(α+ ) = S(F(α)) for all α ∈ ON.
• If η ∈ ON is a limit ordinal, then F(η) = ⋃{F(β) : β ∈ η} = ⋃F[η].

Proof. Define G(f) as follows. If f is a function with domain an ordinal β: if β = 0 then set G(f) := x0, else if β = α+ is a successor (i.e. has a largest element α) then set G(f) := S(f(α)), else set G(f) := ⋃ran(f). Otherwise (if f is not a function with domain an ordinal), set G(f) := ∅ (say).
Now apply Theorem 8.23 to obtain F with F(β) = G(F|β), and note that it is as required.
Remark 8.25. In fact the proof of Theorem 8.23 yields a uniform version (analo-
gous to Corollary 5.13): if G = Gb has a parameter b, then F = Fb also has this
parameter, and Fb (α) = Gb (Fb |α ) holds for all b. Hence also in Corollary 8.24,
S and x0 may depend on a parameter.
Example 8.26 (Cumulative Hierarchy (not on syllabus)). Apply Corollary 8.24
with S = P to obtain a class function F : ON → V such that, writing Vα for
F(α), we have

• V0 = ∅
• Vα+ = P(Vα )
• Vη = ⋃{Vβ : β ∈ η} if η is a limit ordinal.

This is called the von Neumann cumulative hierarchy. One proves by transfinite
induction that Vα ⊆ Vβ for α ⊆ β. The rank of a set x is then defined as the
least α ∈ ON such that x ⊆ Vα (i.e. x ∈ Vα+ ), if such exists. The axiom of
Foundation is equivalent, modulo the other axioms of ZF, to the statement that
every set is an element of some Vα , i.e. that every set has a rank.
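
The first few levels of the hierarchy are small enough to compute explicitly. The following is a minimal sketch, assuming we only look at the finite levels V0, ..., V4 (V5 already has 65536 elements); the function names powerset, V and rank, and the cut-off parameter bound, are illustrative choices rather than notation from the notes.

```python
from itertools import combinations


def powerset(s):
    """P(s) as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n):
    """The finite level V_n, obtained by iterating the successor step n times from V_0 = ∅."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


def rank(x, bound=4):
    """Least n <= bound with x ⊆ V_n, or None if there is none below the bound."""
    for n in range(bound + 1):
        if x <= V(n):
            return n
    return None


print([len(V(n)) for n in range(5)])
```

The sizes grow as iterated powers of 2, and the inclusions Vα ⊆ Vβ for α ⊆ β can be checked directly on these levels.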

8.4 Ordinal arithmetic


We now extend our recursive definitions of the arithmetic operations from ω to
ON:
Definition 8.27. Define by Corollary 8.24 (and Remark 8.25) the unique class functions +, ·, ^ : ON × ON → ON (writing α^β for exponentiation) such that for all α, β ∈ ON:

• – α + 0 = α
  – α + β+ = (α + β)+
  – α + η = ⋃{α + β : β ∈ η} for η a limit ordinal.
• – α · 0 = 0
  – α · β+ = α · β + α
  – α · η = ⋃{α · β : β ∈ η} for η a limit ordinal.
• – α^0 = 1
  – α^(β+) = (α^β) · α
  – α^η = ⋃{α^β : β ∈ η} for η a limit ordinal.

Example 8.28.

• 1 + ω = ⋃{1 + n : n ∈ ω} = ω ≠ ω+ = ω + 1.

• α · 1 = α · 0 + α = 0 + α = α, where the last equality holds by transfinite induction on α.

• 2 · ω = ⋃{2 · n : n ∈ ω} = ω ≠ ω + ω = ω · 2.

• 2^ω = ⋃{2^n : n ∈ ω} = ω ≠ ω · ω = ω^2.

• 2^ω = ω is countable, so it is not in bijection with the set of functions ω → 2; beware this conflict in notation!
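
These computations can be experimented with for ordinals below ω^ω. The sketch below relies on an assumption that is not proved in these notes: every such ordinal can be written uniquely as ω^e1 · c1 + ... + ω^ek · ck with natural numbers e1 > ... > ek and coefficients ci ≥ 1 (its Cantor normal form), and sums and products can be computed on these expressions. The representation as tuples of (exponent, coefficient) pairs and the names nat, add, mul are ours; exponentiation is not implemented.

```python
def nat(n):
    """The finite ordinal n."""
    return ((0, n),) if n else ()


omega = ((1, 1),)


def add(a, b):
    """Ordinal sum a + b: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    e, c = b[0]
    head = tuple(t for t in a if t[0] > e)
    same = [t for t in a if t[0] == e]
    if same:
        return head + ((e, same[0][1] + c),) + b[1:]
    return head + b


def mul(a, b):
    """Ordinal product a · b, expanded term by term along b (left distributivity)."""
    if not a or not b:
        return ()
    ea, ca = a[0]
    result = ()
    for e, c in b:
        if e > 0:
            term = ((ea + e, c),)             # a · (ω^e · c) = ω^(ea + e) · c
        else:
            term = ((ea, ca * c),) + a[1:]    # a · c for a finite c >= 1
        result = add(result, term)
    return result


one, two = nat(1), nat(2)
print(add(one, omega) == omega, add(omega, one) == omega)
print(mul(two, omega) == omega, mul(omega, two) == add(omega, omega))
```

The output reflects the non-commutativity in the example: 1 + ω and 2 · ω collapse to ω, while ω + 1 and ω · 2 do not.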
Fact 8.29. The set of countable ordinals is closed under these arithmetic oper-
ations. Uncountable ordinals do nonetheless exist, by Hartogs’ theorem.
Definition 8.30. Let (A, <A ) and (B, <B ) be linear orders.

• The reverse lexicographic product order (or just product order) is


the linear order (A, <A ) × (B, <B ) := (A × B, <× ) where

(a, b) <× (a′ , b′ ) ⇔ (b <B b′ ∨ (b = b′ ∧ a <A a′ )).

• The sum order is the linear order (A, <A ) + (B, <B ) := ((A × {0}) ∪
(B × {1}), <+ ) where for all a, a′ ∈ A and b, b′ ∈ B:

(a, 0) <+ (a′ , 0) ⇔ a <A a′


(b, 1) <+ (b′ , 1) ⇔ b <B b′
(a, 0) <+ (b, 1).

Theorem 8.31. Let α, β ∈ ON.

(a) (α + β, ∈) ≅ (α, ∈) + (β, ∈).
(b) (α · β, ∈) ≅ (α, ∈) × (β, ∈).
End of lecture 13

Proof. (a) By transfinite induction on β for a fixed α:


• β = 0: Immediate.
• β = γ+: α + β = (α + γ)+, which inductively is isomorphic to the extension of (α, ∈) + (γ, ∈) by a new greatest element, which is isomorphic to (α, ∈) + (γ+, ∈).
• β limit: α + β = ⋃{α + γ : γ ∈ β}, and inductively (α + γ, ∈) ≅ (α, ∈) + (γ, ∈) for each γ ∈ β.
Let σγ : (α, ∈) + (γ, ∈) → (α + γ, ∈) be the unique (by Lemma 8.8) isomorphisms. Then they form a chain: if δ ∈ γ then σγ restricts to an isomorphism of (α, ∈) + (δ, ∈) with an initial segment of α + γ, which is also an ordinal and so must be α + δ (by the uniqueness in Theorem 8.19); hence σγ extends σδ.
So their union σ := ⋃{σγ : γ ∈ β} (which is a set by Replacement) is an isomorphism of ⋃{(α, ∈) + (γ, ∈) : γ ∈ β} = (α, ∈) + (β, ∈) with ⋃{α + γ : γ ∈ β} = α + β.
(b) By the same argument, except that for the successor stage we argue as
follows:

(α · γ+, ∈) = (α · γ + α, ∈) ≅ (α, ∈) × (γ, ∈) + (α, ∈) ≅ (α, ∈) × (β, ∈),

where the penultimate isomorphism uses the IH and (a), and the final iso-
morphism is by the definitions of the sum and product orders.
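
For finite ordinals the two isomorphisms of Theorem 8.31 can be written down and checked by machine. The sketch below lists the elements of a sum order and a product order in increasing order, taking the factors as Python lists already sorted increasingly; the function names sum_order and product_order are illustrative, and the explicit maps (a, 0) ↦ a, (b, 1) ↦ m + b and (a, b) ↦ m · b + a are the finite instances of the isomorphisms, not a construction taken from the notes.

```python
from itertools import product


def sum_order(A, B):
    """Elements of A + B in increasing order: all of A × {0}, then all of B × {1}."""
    return [(a, 0) for a in A] + [(b, 1) for b in B]


def product_order(A, B):
    """Elements of A × B in increasing reverse lexicographic order
    (second coordinates are compared first)."""
    pos_a = {a: i for i, a in enumerate(A)}
    pos_b = {b: i for i, b in enumerate(B)}
    return sorted(product(A, B), key=lambda p: (pos_b[p[1]], pos_a[p[0]]))


m, n = 2, 3
A, B = list(range(m)), list(range(n))

# (a, b) ↦ m·b + a is order-preserving from m × n onto the ordinal m·n.
print([m * b + a for (a, b) in product_order(A, B)])
# (a, 0) ↦ a and (b, 1) ↦ m + b is order-preserving from m + n onto the ordinal m + n.
print([x if tag == 0 else m + x for (x, tag) in sum_order(A, B)])
```

Both printed lists are increasing runs of consecutive naturals, which is what the theorem predicts in the finite case.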

Lemma 8.32. If B ⊆ α ∈ ON is a subset of an ordinal α, then the induced


order (B, ∈) is isomorphic to some β ≤ α.

Proof. Let β be the ordinal isomorphic to (B, ∈). If β ̸≤ α, then α < β, so α is isomorphic to a proper initial segment of B, say B<b = {x ∈ B : x ∈ b}. But then α embeds into the proper initial segment α<b of α, contradicting Lemma 8.6.
Theorem 8.33. For all α, β, γ ∈ ON:

(a) (i) (α + β) + γ = α + (β + γ).


(ii) β < γ ⇒ α + β < α + γ.
(iii) α ≤ γ ⇒ α + β ≤ γ + β.
(iv) α + β = α + γ ⇒ β = γ.
(v) α ≤ β ⇒ ∃δ ≤ β α + δ = β.
(b) (i) (α · β) · γ = α · (β · γ).
(ii) α · (β + γ) = α · β + α · γ.
(iii) For α ̸= 0, β < γ ⇒ α · β < α · γ.
(iv) α ≤ γ ⇒ α · β ≤ γ · β.

Proof. (a) (i) By the corresponding associativity of the sum of orders.


(ii) If β < γ then (β, ∈) is a proper initial segment of (γ, ∈), so (α, ∈) + (β, ∈) is a proper initial segment of (α, ∈) + (γ, ∈), so α + β < α + γ.
(iii) (α + β, ∈) is isomorphic to a suborder of (γ + β, ∈) by considering
ordered sums, so α + β ≤ γ + β by Lemma 8.32.
(iv) By (ii) and totality.
(v) By Lemma 8.32, (β \ α, ∈) is isomorphic to (δ, ∈) for some δ ≤ β. Then

(β, ∈) ≅ (α, ∈) + (β \ α, ∈) ≅ (α + δ, ∈),

so β = α + δ.
(b) Exercise. Consider product orders, and apply Lemma 8.32 for (iv). (iii) can
also be proven by induction.

9 The Axiom of Choice


Definition 9.1. The Axiom of Choice, AC, is the following statement:
If X is a set of disjoint non-empty sets, then there exists a set C such that
|C ∩ a| = 1 for all a ∈ X.

So C “chooses” an element of each a ∈ X.


We first give some immediate reformulations of AC.
Lemma 9.2. The following are equivalent:

(a) AC
(b) Every set X has a choice function, a function h : P(X) \ {∅} → X such that h(A) ∈ A for all ∅ ≠ A ⊆ X.

End of lecture 14

Proof. • (a) ⇒ (b): The set

Y := {{A} × A : ∅ ≠ A ⊆ X}

is a set of disjoint non-empty sets, so say C is such that |C ∩ ({A} × A)| = 1 for all ∅ ≠ A ⊆ X. So for each ∅ ≠ A ⊆ X there is precisely one a such that a ∈ A and ⟨A, a⟩ ∈ C, so setting h(A) := a defines a choice function.
Explicitly,
h = {⟨A, a⟩ ∈ C : a ∈ A ∧ A ∈ P(X) \ {∅}}.

• (b) ⇒ (a): If X is a set of disjoint non-empty sets and h is a choice function for ⋃X, then C := h[X] = {h(a) : a ∈ X} is as required.
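
Both directions of this proof can be traced on a finite example, where of course no appeal to AC is needed. In the sketch below the choice function is simply "take the least element", which is available because the sets consist of integers; the names nonempty_subsets, choice_function, choice_set and tagged_family are hypothetical helpers introduced for illustration.

```python
from itertools import combinations


def nonempty_subsets(X):
    """All non-empty subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]


def choice_function(X):
    """A choice function h on P(X) minus the empty set: h(A) is the least element of A."""
    return {A: min(A) for A in nonempty_subsets(X)}


def choice_set(family, h):
    """(b) ⇒ (a): C := h[family] = {h(a) : a in family}."""
    return {h[a] for a in family}


def tagged_family(X):
    """(a) ⇒ (b): the sets {A} × A, which are disjoint even when the sets A overlap."""
    return [frozenset((A, a) for a in A) for A in nonempty_subsets(X)]


family = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
X = frozenset().union(*family)
h = choice_function(X)
C = choice_set(family, h)
print(sorted(C))
```

Tagging each element a with the set A it was taken from is what makes the family Y in the proof disjoint, so that the first formulation of AC applies to it.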

9.1 The well-ordering principle


Definition 9.3. The well-ordering principle, WO, is the statement that
every set can be well-ordered, i.e. that for every X there exists < such that
(X, <) is a well-order.
Lemma 9.4. WO holds if and only if every set is equinumerous with an ordinal.

Proof. Let X be a set. If X can be well-ordered, then it is in bijection with


an ordinal by Theorem 8.19. Conversely, if f : X → α is a bijection with an
ordinal α, then x < y ⇔ f (x) ∈ f (y) defines a well-order on X.

Theorem 9.5. AC ⇔ WO.

Proof. ⇐: Let X be a set. By WO, X can be well-ordered, and then

min : P(X) \ {∅} → X

is a choice function.

⇒ (Zermelo’s theorem): Let h : P(X) \ {∅} → X be a choice function. Define


by Corollary 8.24 a chain of injections (fα )α∈ON from ordinals to X such
that
• f0 = ∅
• fα+ = fα ∪ {⟨α, h(X \ ran(fα))⟩} if X \ ran(fα) ≠ ∅, and fα+ = fα otherwise
• fη = ⋃{fβ : β ∈ η} for η a limit ordinal.

Then by transfinite induction, for all α ∈ ON:


• Either dom(fα ) = α, or ran(fβ ) = X for some β < α.
By Hartogs’ theorem, the second case must occur for some α ∈ ON, so let
β ∈ ON be least such that ran(fβ ) = X (which exists by Theorem 8.14(iv)).
Then dom(fβ ) = β, so fβ : β → X is a bijection, and we conclude by
Lemma 9.4.
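
On a finite set the recursion in Zermelo's theorem needs only successor steps and terminates after |X| of them. The following is a minimal sketch of that finite case; the function name well_order and the particular choice function h (longest string, ties broken alphabetically) are arbitrary illustrative choices.

```python
def well_order(X, h):
    """Enumerate X as f(0), f(1), ... by repeatedly applying the choice
    function h to the set of elements not yet enumerated."""
    f = []                    # f[i] plays the role of f_α(i)
    remaining = set(X)
    while remaining:          # successor step: X \ ran(f) is non-empty
        x = h(frozenset(remaining))
        f.append(x)
        remaining.remove(x)
    return f


def h(A):
    """A choice function on non-empty sets of strings: the longest string,
    with ties broken by taking the alphabetically last one."""
    return max(A, key=lambda s: (len(s), s))


X = {"pear", "fig", "apple", "kiwi"}
f = well_order(X, h)
print(f)
```

The resulting list is a bijection from the ordinal 4 to X, and the induced order x < y ⇔ f⁻¹(x) ∈ f⁻¹(y) is the well-order of Lemma 9.4.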

9.2 Cardinal comparability


Definition 9.6. Cardinal comparability, CC, is the statement that the or-
dering < on cardinalities is total, i.e. for any two sets X and Y , either |X| ≤ |Y |
or |Y | ≤ |X|.
Theorem 9.7. WO ⇔ CC.

Proof. ⇒: By comparability of well-orders (Theorem 8.9), if sets X and Y


can be well-ordered then one admits an injection to the other.
⇐: Let X be a set. By Hartogs’ Theorem, say |α| ̸≤ |X|. By CC, |X| ≤ |α|,
so there exists an injection f : X → α, and then x < y ⇔ f (x) ∈ f (y)
defines a well-order on X.

9.3 Zorn’s lemma


Definition 9.8. A chain in a partially ordered set (X, <) is a subset C ⊆ X
which is totally ordered by <. An upper bound for a subset A ⊆ X is an
element u ∈ X such that u ≥ a for all a ∈ A. An element m ∈ X is maximal if
m ̸< x for all x ∈ X.
Zorn’s Lemma, ZL, is the statement:

• If (X, <) is a partially ordered set in which every chain has an upper
bound, then (X, <) has a maximal element.

Theorem 9.9. AC ⇔ ZL.

Proof. ⇒: Let (X, <) be a partial order in which every chain has an upper
bound.
By WO (and Lemma 9.4), there exists a bijection θ : α → X for some
ordinal α.
Define an increasing sequence of chains by transfinite recursion (Corol-
lary 8.24):
– C0 := ∅;
– Cβ+ := Cβ ∪ {θ(β)} if β ∈ α and θ(β) > x for all x ∈ Cβ, and Cβ+ := Cβ otherwise;
– Cη := ⋃{Cβ : β ∈ η} if η is a limit ordinal.

Then, by transfinite induction, Cβ ⊆ Cγ if β ≤ γ, and each Cβ is a chain.


In particular, Cα is a chain, so say u ∈ X is an upper bound for Cα .
Suppose u is not maximal, say x > u. Let β = θ⁻¹(x) ∈ α. Then x > u ≥ y for all y ∈ Cβ ⊆ Cα, so x = θ(β) ∈ Cβ+ ⊆ Cα by definition of Cβ+, contradicting u being an upper bound for Cα. So u is a maximal element.

⇐: Let X be a set. Let P ′ := P(X) \ {∅}. Say h ⊆ P ′ × X is a partial choice


function if it is a function with dom(h) ⊆ P ′ and such that h(A) ∈ A for
all A ∈ dom(h). Then the partial choice functions form a partial order
with respect to inclusion, and any chain has an upper bound, namely the
union of the chain. So by Zorn’s Lemma, a maximal partial choice function
h exists.
We conclude by showing that h is a choice function, i.e. that dom(h) = P ′ .
Suppose not, say A ∈ P ′ \ dom(h). Then A ̸= ∅ by definition of P ′ , so
say a ∈ A. But then h ∪ {⟨A, a⟩} is a partial choice function properly
extending h, contradicting maximality of h.

Remark 9.10. Zorn’s Lemma is often applied in the following special form (which the above proof shows is actually equivalent to our statement): if a is a set and X ⊆ P(a) is a non-empty set of subsets of a which is closed under unions of non-empty chains, i.e. if ∅ ≠ C ⊆ X is totally ordered by inclusion then ⋃C ∈ X, then X has a maximal element with respect to inclusion. This follows from our statement of Zorn’s Lemma by considering the partial order (X, ⊆); indeed, the empty chain has an upper bound since X is non-empty, and any non-empty chain is upper-bounded by its union.

9.4 ZFC
From now on, we assume AC. We could take any of the above equivalent forms
as the axiom; we use our first formulation.

AC (Choice): If X is a set of disjoint non-empty sets, then there exists a set


C such that |C ∩ a| = 1 for all a ∈ X:

∀x ((∀y ∈ x (y ≠ ∅ ∧ ∀y′ ∈ x (y′ ≠ y → y ∩ y′ = ∅))) → ∃z ∀y ∈ x ∃u ∈ z ∩ y ∀v ∈ z ∩ y u = v)

This completes our axiom system ZFC = ZF + AC.


Fact 9.11. Assume ZF is consistent. Gödel proved (using the constructible
universe) that ZFC is then also consistent, i.e. that ZF does not prove ¬AC;
this is covered in the part C course Axiomatic Set Theory. Paul Cohen later
proved (using forcing) that ZF doesn’t prove AC either.
Even the weak form of Choice in which every element of X is of cardinality
2 is not a consequence of ZF (if ZF is consistent). As Russell put it: “To choose
one sock from each of infinitely many pairs of socks requires the Axiom of Choice,
but for shoes the Axiom is not needed” (the idea being that we can consider the
set of left shoes, but the elements of a pair of socks are indistinguishable).

End of lecture 15

10 Cardinal numbers
By WO, every set is equinumerous with an ordinal (this was Lemma 9.4). Using
this, we now redefine our notation |X|:
Definition 10.1. The cardinality |X| of a set X is the smallest ordinal equinu-
merous with X:
|X| := min{α ∈ ON : α ∼ X}.

This accords with our previous notation:


Lemma 10.2. Let X and Y be sets.

(i) |X| = |Y | ⇔ X ∼ Y .
(ii) |X| ≤ |Y| if and only if an injection X → Y exists.

Proof. (i) Immediate.


(ii) By Lemma 8.32, an injection |X| → |Y | exists iff |X| ≤ |Y |, and the result
follows.

Lemma 10.3. For an ordinal α, the following are equivalent.

(i) α = |X| for some set X.


(ii) α = |α|.
(iii) For all β ∈ α, β ̸∼ α.
Definition 10.4. An ordinal satisfying these properties is called a cardinal or
a cardinal number. The infinite cardinals are sometimes also known as initial
ordinals.
The class of cardinals is denoted CN.

Proof. • (i) ⇒ (ii): If α = |X| then α ∼ X so |α| = |X| = α.


• (ii) ⇒ (i): Immediate.

• (ii) ⇔ (iii): Immediate from the definition, since α ∼ α.

Lemma 10.5. (i) If κ is a cardinal, then there exists a cardinal greater than κ.
(ii) If K is a set of cardinals, then ⋃K is a cardinal.

Proof. (i) By Corollary 6.25, |P(κ)| > |κ| = κ. Alternatively: By Hartogs’


Theorem and cardinal comparability, there is an ordinal α such that |α| >
|κ| = κ.
(ii) ⋃K is an ordinal by Lemma 8.20(a)(iii). Suppose |⋃K| ∈ ⋃K. Then |⋃K| ∈ κ for some κ ∈ K, so |⋃K| < |κ| since κ is a cardinal, contradicting κ ⊆ ⋃K. So |⋃K| = ⋃K.

This lemma justifies the following definition:

Definition 10.6. Define by transfinite recursion (Corollary 8.24) a class func-


tion ON → CN; α 7→ ℵα such that:

• ℵ0 = ω;
• ℵα+ is the smallest cardinal greater than ℵα ;
• ℵη = ⋃{ℵβ : β ∈ η} if η is a limit ordinal.

In particular, we redefine ℵ0 := ω = |N|.


We also write ℵα as ωα when we think of it as an ordinal rather than a
cardinal (see below).

Theorem 10.7.

(i) ℵα ≥ α for all α ∈ ON.


(ii) If α < β ∈ ON then ℵα < ℵβ .

(iii) Every infinite cardinal is of the form ℵα for some α ∈ ON.


(iv) CN is a proper class10 .

Proof. (i) By transfinite induction on α.


(ii) By transfinite induction on β.
10 This is known as Cantor’s paradox.

(iii) Let κ be an infinite cardinal. Consider the set

α := {β : ℵβ < κ} = {β ∈ κ : ℵβ < κ},

where the equality is by (i). Then α is an initial segment of κ by (ii), so α is an ordinal. So α ∉ α, hence ℵα ≥ κ. We conclude by showing ℵα ≤ κ.
If α = 0, this follows from κ being infinite.
If α is a limit ordinal, then ℵα = ⋃{ℵβ : β ∈ α} ≤ κ since each ℵβ < κ.
If α is a successor ordinal, say α = γ+, then ℵγ < κ, so ℵα = ℵγ+ ≤ κ by definition of ℵγ+.
(iv) By (ii) and (iii), ℵα 7→ α is a well-defined surjective class function CN \
ω → ON. So if CN were a set, then by Replacement so would be ON,
contradicting Theorem 8.16.

10.1 Cardinal arithmetic with Choice


Definition 10.8. We now consider the cardinal arithmetic operations (addition,
multiplication, and exponentiation) defined in Definition 6.20 as operations on
cardinals:

κ + λ := |(κ × {0}) ∪ (λ × {1})|,   κ · λ := |κ × λ|,   κ^λ := |{f : λ → κ}|.

Warning: This leads to an unfortunate ambiguity, since these cardinal arith-


metic operations rarely agree with the ordinal arithmetic operations. In prac-
tice, we get around this by notational conventions: we reserve κ, λ, µ, ν and ℵα
for cardinals, and arithmetic expressions involving these and |X| refer to car-
dinal arithmetic, while expressions involving α, β, γ, δ and ωα refer to ordinal
arithmetic.11

In ZFC, cardinal addition and multiplication are very simple:


Theorem 10.9. Let κ be an infinite cardinal.

(i) κ · κ = κ
(ii) If λ is a cardinal with 1 ≤ λ ≤ κ, then κ + λ = κ = κ · λ.
(iii) If λ is an infinite cardinal, then κ + λ = max(κ, λ) = κ · λ.

Proof. (i) By transfinite induction. Assume |α| · |α| = |α| for all infinite
ordinals α < κ. Then

|α| · |α| < κ for all α < κ; (*)

for finite α, this is because |α| · |α| is finite by Proposition 6.22(b).


Define an ordering ◁ on κ × κ by

(α, β) ◁ (α′ , β ′ ) ⇔ (α, β, max(α, β)) <r (α′ , β ′ , max(α′ , β ′ ))


11 To add to the confusion, κ+ is usually defined as the smallest cardinal greater than κ (so

ℵα+ = (ℵα )+ ), which is not the ordinal successor unless κ is finite.



where <r is reverse lexicographic order.


This is a well-order, so (κ × κ, ◁) is isomorphic to an ordinal γ. Let S be a proper initial segment, say S = {(α′, β′) ∈ κ × κ : (α′, β′) ◁ (α, β)}. Set δ := max(α, β). Then S ⊆ δ+ × δ+, and δ+ ∈ κ (indeed, this holds if δ is finite since κ is infinite, and if δ is infinite then |δ+| = |δ| < κ), so by (*), |S| ≤ |δ+| · |δ+| < κ.
Hence γ ≤ κ, since otherwise γ would have a proper initial segment γ<κ of cardinality κ.
So κ · κ ≤ κ. Conversely, κ = |κ × {0}| ≤ κ · κ. So κ · κ = κ.

(ii) By the monotonicity properties of Proposition 6.22(c) and (i),

κ ≤ κ + λ ≤ κ + κ = κ · 2 ≤ κ · κ = κ
κ ≤ κ · λ ≤ κ · κ = κ

(iii) By (ii) and commutativity of cardinal addition and multiplication (Propo-


sition 6.22(a)).
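
For κ = ω the ordering ◁ from the proof of (i) can be listed explicitly: pairs are compared first by their maximum, then by second coordinate, then by first coordinate. The sketch below is only the countable instance of the argument; the names key and enumerate_pairs are ours.

```python
def key(p):
    """Sort key realising ◁: reverse lexicographic order on (a, b, max(a, b))."""
    a, b = p
    return (max(a, b), b, a)


def enumerate_pairs(n):
    """The pairs in n × n listed in ◁-increasing order."""
    return sorted(((a, b) for a in range(n) for b in range(n)), key=key)


order = enumerate_pairs(6)
print(order[:9])

# Every proper initial segment below (a, b) lies inside δ+ × δ+ with δ = max(a, b),
# so the position of (a, b) in the list is less than (δ + 1)².
print(all(i < (max(p) + 1) ** 2 for i, p in enumerate(order)))
```

Because each box n × n is an initial segment of the next one, the lists for increasing n extend each other, and every proper initial segment of (ω × ω, ◁) is finite; this is the finiteness that the inequality (*) generalises to arbitrary κ.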

Lemma 10.10. If f : X → Y is a surjection, then |X| ≥ |Y |.

Proof. Let h be a choice function for X. Then g(y) := h({x ∈ X : f (x) = y})
defines an injection Y → X. So |Y | ≤ |X|.
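
The construction of g can be followed on a finite example, where picking the least element of each fibre serves as the choice function. This is a sketch under that assumption; the function name injection_from_surjection and the default h=min are illustrative.

```python
def injection_from_surjection(f, X, Y, h=min):
    """Given a surjection f : X → Y, return g : Y → X with g(y) := h({x in X : f(x) = y})."""
    g = {}
    for y in Y:
        fibre = frozenset(x for x in X if f(x) == y)  # non-empty because f is onto
        g[y] = h(fibre)
    return g


def f(x):
    """A surjection from {0, ..., 9} onto {0, 1, 2}."""
    return x % 3


X, Y = range(10), range(3)
g = injection_from_surjection(f, X, Y)
print(g)
```

Since f(g(y)) = y for every y, distinct elements of Y are sent to distinct elements of X, which is why g is injective.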
Theorem 10.11. A countable union of countable sets is countable.
More generally, if κ is an infinite cardinal, and X is a set such that |X| ≤ κ and |a| ≤ κ for all a ∈ X, then |⋃X| ≤ κ.

Proof. For every a ∈ X, there exists an injection a → κ. By Choice, we can uniformly choose such injections: let Ia be the set of injections f : a → κ, let h be a choice function on ⋃{Ia : a ∈ X}, and let fa := h(Ia).
Let g : X → κ be an injection.
Then Z := {⟨g(a), fa(x)⟩ : x ∈ a ∈ X} is a subset of κ × κ, so |Z| ≤ κ · κ = κ.
Finally, ⟨g(a), fa(x)⟩ ↦ x is a surjection Z → ⋃X, so by Lemma 10.10, |⋃X| ≤ |Z| ≤ κ.
Remark. Choice was essential here: ZF does not prove that a countable union
of countable sets is countable.

10.2 Cardinal exponentiation and CH (off-syllabus)


In contrast, very little is determined by ZFC about cardinal exponentiation.

Definition 10.12.

• The Continuum Hypothesis (CH) is the assertion: 2^ℵ0 = ℵ1. In other words: every uncountable subset of R is in bijection with R.

• The Generalised Continuum Hypothesis (GCH) is the assertion: 2^ℵα = ℵα+ for all ordinals α.

Fact 10.13. • CH is independent of ZFC. That is, assuming ZFC is consistent, it proves neither CH nor ¬CH, so both ZFC+CH and ZFC+¬CH are consistent. The same goes for GCH. As with AC (Fact 9.11), consistency of ZFC+GCH is due to Kurt Gödel and is covered in Part C, and that of ZFC+¬CH (hence also ZFC+¬GCH) is due to Paul Cohen using forcing.
• Any counterexample X ⊆ R to CH has to be “complicated”: it cannot be Borel, nor the projection of a Borel set (an analytic set).
• 2^ℵ0 ≠ ℵω. More generally, 2^ℵ0 is not of the form ⋃{αi : i ∈ ω} for any ordinals αi < 2^ℵ0 (i.e. 2^ℵ0 does not have countable cofinality). This is all ZFC tells us about 2^ℵ0, in the sense that for any ℵα which is not of this form, it is consistent with ZFC that 2^ℵ0 = ℵα.
End of lecture 16

11 Example: Infinite dimensional vector spaces


In this section, we illustrate the use of set theory in mathematics by developing
some of the basic theory of linear algebra without assuming finite dimensionality.
(This material is not on the syllabus, but the set theory techniques we use are.)
We use without proof the finite dimensional results (covered in Prelims).
Let V be a vector space over a field K. This implies that V and K are sets,
and the associated algebraic operations (+, · : K ×K → K, and + : V ×V → V ,
and scalar multiplication · : K × V → V ) are functions.
Definition 11.1. A subset B ⊆ V is
• linearly independent if no non-trivial finite linear combination of distinct elements of B is 0, i.e. if for any n ∈ N, a1, ..., an ∈ K, and distinct b1, ..., bn ∈ B,
a1 · b1 + ... + an · bn = 0 ⇒ a1 = ... = an = 0;

• spanning if V = ⟨B⟩ where

⟨B⟩ = {a1 · b1 + ... + an · bn : n ∈ N, a1, ..., an ∈ K, b1, ..., bn ∈ B}.

• a basis if B is both linearly independent and spanning.


Theorem 11.2. (i) A basis exists.
(ii) Any two bases have the same cardinality. This cardinality is called the
dimension of V .

Proof. (i) We apply Zorn’s Lemma. Consider the set I of linearly independent subsets of V as a partial order, ordered by inclusion, ⊆. If C ⊆ I is a chain, then its union is also linearly independent, since any finitely many elements of ⋃C are already elements of some I ∈ C. So by Zorn’s Lemma, there exists a maximal element B ∈ I. We conclude by showing that B is spanning. Suppose not, say v ∈ V \ ⟨B⟩. Then one verifies directly that B ∪ {v} is linearly independent, and v ∉ B, contradicting maximality of B.

(ii) Let B and B′ be bases. The case where B or B′ is finite was done in Prelims. So suppose B and B′ are infinite.
Let P^<ω(B) := {B0 ⊆ B : |B0| < ℵ0} be the set of finite subsets of B. Then |P^<ω(B)| = |B|. Indeed, |B| ≤ |P^<ω(B)| via b ↦ {b}. Conversely, P^<ω(B) = ⋃{B^(n) : n ∈ N} where B^(n) := {B0 ⊆ B : |B0| = n}, and for n ≥ 1 the map (b1, ..., bn) ↦ {b1, ..., bn}, defined on the n-tuples of distinct elements of B, is a surjection onto B^(n), so |B^(n)| ≤ |B^n| = |B|, so |P^<ω(B)| ≤ |B| by Theorem 10.11.
If B0 ∈ P^<ω(B), then ⟨B0⟩ ∩ B′ is finite by the finite-dimensional case. But V = ⟨B⟩ = ⋃{⟨B0⟩ : B0 ∈ P^<ω(B)}, so B′ = ⋃{⟨B0⟩ ∩ B′ : B0 ∈ P^<ω(B)} is a union of |P^<ω(B)| = |B| finite sets, so by Theorem 10.11 again, |B′| ≤ |B|.
By symmetry, |B′| = |B|.
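
In a finite vector space, the chain construction behind Zorn's Lemma (as in the proof of Theorem 9.9) becomes a greedy loop: run through an enumeration of V and keep each vector that is independent of those already kept. The sketch below does this for V = GF(2)^3, an example chosen by us, with vectors encoded as integer bitmasks and addition given by XOR; the names reduce_vector, greedy_basis and span are hypothetical helpers.

```python
def reduce_vector(v, pivots):
    """Reduce v against the stored vectors, indexed by their highest set bit.
    Returns 0 exactly when v lies in their span."""
    while v:
        top = v.bit_length() - 1
        if top not in pivots:
            return v
        v ^= pivots[top]
    return 0


def greedy_basis(vectors):
    """Keep each vector that is not in the span of the vectors kept so far."""
    pivots = {}   # highest set bit -> reduced vector with that leading bit
    basis = []    # the vectors actually kept, in order of appearance
    for v in vectors:
        r = reduce_vector(v, pivots)
        if r:
            pivots[r.bit_length() - 1] = r
            basis.append(v)
    return basis


def span(basis):
    """All GF(2)-linear combinations of the given vectors."""
    out = {0}
    for b in basis:
        out |= {x ^ b for x in out}
    return out


B1 = greedy_basis(range(8))
B2 = greedy_basis([7, 3, 5, 6, 1, 2, 4, 0])
print(B1, B2)
```

Different enumerations of V give different bases, but both are spanning and have the same size, in line with parts (i) and (ii) of the theorem.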

A References
[Copied directly from Jonathan Pila’s notes, with a few amendments]
Text to expand and supplement these notes:

[1] D. Goldrei, Classic set theory, Chapman and Hall/CRC, Boca Raton, 1998.

Highly recommended general audience sources:

[2] S. Aaronson, The Busy Beaver Frontier, https://scottaaronson.blog/?p=4916


[3] S. Lavine, Understanding the infinite, Harvard, 1994.

[4] R. Rucker, Infinity and the mind, Princeton University Press, 1995.
[5] J. Stillwell, Roads to Infinity, CRC Press, AK Peters, 2010.

Other textbooks (more advanced) and notes:

[6] P. Aczel, Non-well founded sets, CSLI Lecture Notes, 14, Stanford, CA. On
web.
[7] M. Hils and F. Loeser, A First Journey through Logic, AMS Student Math-
ematical Library volume 89, 2019.

[8] T. Jech, Set theory, Academic Press, 1978.


[9] R. Knight, b1 Set Theory Lecture Notes.
[10] K. Kunen, Set theory, North-Holland, Amsterdam, 1980.
[11] A. Levy, Basic Set Theory, Springer, Berlin, 1979; reprinted by Dover.

Original texts and interpretations:

[12] G. Cantor, Contributions to the founding of the theory of transfinite numbers


(translated by P. Jourdain), Dover, 1955. p85.

[13] P. Cohen, Set theory and the continuum hypothesis, W. A. Benjamin, 1966.
[14] P. Cohen and R. Hersh, Non-Cantorian set theory, Scientific American 217
(1967), 104–116.
[15] R. Dedekind, Essays on the theory of numbers, Dover, 1963.
[16] K. Gödel, The consistency of the axiom of choice and the generalized con-
tinuum hypothesis with the axioms of set theory, Annals of Mathematics
Studies 3, Princeton University Press, 1940.
[17] K. Gödel, What is Cantor’s continuum problem? American Mathematical
Monthly, 54 (1947), 515–525. Revised version in Benacerraf and Putnam,
Philosophy of mathematics: selected readings, Prentice-Hall, 1964.

Modern perspectives, histories, and commentaries:

[18] S. Feferman, In the Light of Logic, OUP, Oxford, 1998.


[19] A. Fraenkel, Y. Bar Hillel, A. Levy, Foundations of set theory, North-
Holland, Amsterdam, 1973.
[20] J. Ferreiros, On the relations between Georg Cantor and Richard Dedekind,
Historia Math. 20 (1993), 343–363.
[21] M. Hallett, Cantorian set theory and limitation of size, Oxford Logic Guides
10, OUP, 1984.
[22] J. Hamkins, Is the dream solution of the continuum hypothesis attainable?
Notre Dame J. Symbolic Logic 56 (2015), 135–145.
[23] P. Maddy, Defending the Axioms, OUP, 2011.
[24] Y. Manin, Georg Cantor and his heritage, http://arxiv.org/abs/math/0209244
(just ignore the stuff about P ̸= N P )
[25] C. McLarty, What does it take to prove Fermat’s Last Theorem? Grothendieck
and the logic of number theory, Bull. Symbolic Logic 16 (2010), 359–377.
[26] G. H. Moore, Zermelo’s axiom of choice, Springer, 1982.
[27] U. Rehmann (Editor-in-Chief), Encyclopedia of Mathematics, entry on “Antinomy”, https://www.encyclopediaofmath.org/index.php/Antinomy
[28] S. Shelah, Logical dreams, Bull. Amer. Math. Soc. 40 (2003), 203–228.
[29] J. Stillwell, The continuum problem, Amer. Math. Monthly 109 (2002),
286–297.
[30] W. H. Woodin, Strong axioms of infinity and the search for V , Proc. ICM
Hyderabad (2010). Available online at http://www.mathunion.org/ICM/ICM2010.1/
Also the corresponding lecture can be viewed online.

A.1 Exam errata


Past exams obtained from some sources are “as given” while those obtained from
the Mathematical Institute sometimes have errors (or obscurities) corrected.
The latter are therefore recommended.
Here are a few errata from recentish exams:
2017.2.b.ii. The set X should be assumed non-empty.
2015.1.b.ii. Remove ‘strictly’, or it’s wrong!
2015.2.b.ii. Beware that ‘contained’ here means as a subset (not as an
element).
2015.3.c.ii. Has a too easy (but correct) solution.
2014.2.c.iii. This is too hard and should not be attempted.
