Set Theory

Andrew Marks
October 8, 2021

These notes cover introductory set theory. Starred sections below are optional.
They discuss interesting mathematics connected to concepts covered in
the course. Thanks to Cecelia Higgins, Jacob Manaker, Forte Shinko, Marlon
Trifunovic, Spencer Unger, and Eric Wang for corrections and helpful
conversations about the material in the notes.

Contents
1 Introduction
    1.1 Independence in modern set theory*
2 The axioms of ZFC
    2.1 Classes and von Neumann-Bernays-Gödel set theory*
3 Wellorderings
4 Ordinals
5 Transfinite induction and recursion
    5.1 Goodstein’s theorem*
6 The cumulative hierarchy
7 The Mostowski collapse
8 The axiom of choice
    8.1 Fragments of the axiom of choice*
9 Cardinality in ZF
    9.1 Cardinality in models of the axiom of determinacy*
    9.2 Resurrecting Tarski’s theory of cardinal algebras*
10 Cofinality
11 Cardinal arithmetic in ZFC
    11.1 Some equivalents of CH
    11.2 The singular cardinals hypothesis*
12 Filters and ultrafilters
    12.1 Measurable cardinals*
13 Ultraproducts
    13.1 Ultraproducts of metric spaces and asymptotic cones*
    13.2 The Ax-Grothendieck theorem*
14 Clubs and stationary sets
15 Applications of Fodor: ∆-systems and Silver’s theorem
16 Trees
    16.1 Compactness and incompactness in set theory*
17 Suslin trees and ♦
18 Models of set theory and absoluteness
19 The reflection theorem
20 Gödel’s constructible universe L
    20.1 Gödel operations and fine structure*
21 Condensation in L and GCH
22 V = L implies ♦
23 L and large cardinals
    23.1 Finding right universe of set theory*
24 The basics of forcing
25 Forcing = Truth
26 The consistency of ¬CH

1 Introduction
Set theory began with Cantor’s proof in 1874 that the natural numbers do not
have the same cardinality as the real numbers. Cantor’s original motivation
was to give a new proof of Liouville’s theorem that there are non-algebraic real
numbers.¹ However, Cantor soon began researching set theory for its own sake.
Already by 1878 he had articulated the continuum problem: whether there is
any cardinality between that of the natural numbers and the real numbers.
Cantor’s ideas had a profound influence on mathematics, and by 1900, Hilbert
included the continuum problem as the first in his famous list of 23 problems
for mathematics in the 20th century.
Let’s recall Cantor’s definition of cardinality. If X and Y are sets, say that
X has cardinality less than or equal to Y and write |X| ≤ |Y | if there is an
injective function from X to Y . Say that X and Y have the same cardinality
and write |X| = |Y | if there is a bijection from X to Y . These definitions agree
with our usual ways of counting the number of elements of finite sets. Cantor’s
insight was to also use these definitions to compare the size of infinite sets.
Let’s recall a few basic facts about cardinality:²
Exercise 1.1. If X is a nonempty set, then |X| ≤ |Y | if and only if there is a
surjection from Y to X.
Say that a set X is finite if it has the same cardinality as a set of the form
{0, . . . , n − 1} for some natural number n. If X is not finite, say that X is
infinite. The smallest size of an infinite set is that of the natural numbers N (see
Exercise 1.2). Finally, say a set X is countable if |X| ≤ |N|.
Exercise 1.2. If X is a set, either X has the same cardinality as a finite set,
or |N| ≤ |X|.
Exercise 1.3 (A countable union of countable sets is countable). If Xi is a
countable set for every i ∈ N, then ⋃_{i ∈ N} Xi is countable.
Exercise 1.4. If X is an infinite set, and Y is a countable set, then |X| =
|X ∪ Y |.
Exercise 1.5 (Cantor-Schröder-Bernstein). |X| = |Y | if and only if |X| ≤ |Y |
and |Y | ≤ |X|.
We write X ⊆ Y if X is a subset of Y . That is, ∀z(z ∈ X → z ∈ Y ). P(X)
denotes the collection of all subsets of X:

P(X) = {Y : Y ⊆ X}.
1 Recall that a real number is called algebraic if it is a root of a nonzero polynomial with
rational coefficients. For example, √2 is algebraic since it is a root of the polynomial x^2 − 2.
Cantor showed that the cardinality of the real numbers is greater than that of the algebraic
numbers. Thus, there must be non-algebraic numbers.
2 In this section, we freely use the axiom of choice.

Exercise 1.6. Show that there is a bijection from P(N) to the real numbers R.
[Hint: x ↦ 1/2 + (1/π) tan⁻¹(x) is a bijection from R to (0, 1). Then show there is a
bijection from (0, 1) to P(N) using binary expansions and Exercise 1.4.]
Recall Cantor’s diagonal argument that N has strictly smaller cardinality
than P(N) (and hence R). Suppose f : N → P(N) is any function. Then f is not
onto P(N) and so |P(N)| ≰ |N| by Exercise 1.1. We prove this by constructing
a subset of N that is not in ran(f). Let D = {n ∈ N : n ∉ f(n)}. Then this set
D diagonalizes against f. Since n ∈ D ↔ n ∉ f(n), D cannot equal f(n) for
any n. Hence, D ∉ ran(f) and f is not onto.
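
The diagonal construction can be checked mechanically on a finite set standing in for N. The following Python sketch is only an illustration and not part of the argument: the names diagonal_set and all_subsets, and the choice X = {0, 1, 2}, are assumptions made for this example. It enumerates every function f : X → P(X) and confirms that D = {x ∈ X : x ∉ f(x)} is never in ran(f).

```python
from itertools import product

def diagonal_set(domain, f):
    """Return D = {x in domain : x not in f(x)}."""
    return frozenset(x for x in domain if x not in f(x))

def all_subsets(domain):
    """List every subset of a finite domain as a frozenset."""
    items = sorted(domain)
    return [frozenset(x for x, keep in zip(items, bits) if keep)
            for bits in product([False, True], repeat=len(items))]

X = {0, 1, 2}
subsets = all_subsets(X)  # the 8 elements of P(X)

# Enumerate every function f : X -> P(X) (8**3 = 512 of them) and
# confirm that the diagonal set is never in the range of f.
checked = 0
for values in product(subsets, repeat=len(X)):
    f = dict(zip(sorted(X), values))
    D = diagonal_set(X, lambda x: f[x])
    assert D not in f.values()
    checked += 1
```

A finite check of this kind proves nothing about N itself; it only shows the same construction at work, which is the content of Exercise 1.7 for a three-element X.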

Figure 1: Cantor’s diagonal argument. In this figure we’re identifying subsets
of N with infinite binary sequences via their characteristic functions. That is,
we let the nth bit of the infinite binary sequence be 1 if n is an element of the
set, and 0 otherwise.

This exact same argument generalizes to show that given any set X, its
powerset P(X) has larger cardinality.
Exercise 1.7. Show that for every set X, there is no surjection f : X → P(X),
and hence |P(X)| ≰ |X|. [Hint: define D = {x ∈ X : x ∉ f(x)}. Then show
D ∉ ran(f).]
Cantor had realized that as a consequence of this theorem, there can be
no universal set: a set containing all other sets. Every set would inject (via
the identity function) into a universal set. But Exercise 1.7 shows that the
powerset of the universal set could not inject into the universal set. Bertrand
Russell traced through this argument (letting f be the identity function and X
be a supposed universal set), and isolated the resulting contradiction into what
is now known as Russell’s paradox. If
D = {x : x ∉ x},
then is D ∈ D? If D ∉ D, then D ∈ D by definition, contradiction. But if
D ∈ D, then D ∉ D by definition, contradiction.
Russell’s writings about this paradox caused a brief crisis in the foundations
of set theory. Allowing ourselves to construct a set containing all mathematical
objects satisfying some given property leads to contradictions. What sets,
then, should we be allowed to construct? Is the whole enterprise of set theory
inconsistent?
The resolution to Russell’s paradox that set theorists have adopted is the
so-called iterative conception of set theory.³ All sets are arranged into a cumulative
hierarchy. We begin with a simple collection of sets, and then apply some basic
operations to iteratively create more sets. This produces the hierarchy V of all
sets. The precise set existence axioms we will use will be discussed in the next
section. They are known as Zermelo-Fraenkel set theory or ZF. We use ZFC to
denote ZF + the axiom of choice. The first part of this class will discuss
these axioms of ZFC and axiomatic set theory.

Figure 2: A picture of the set theoretic universe, known as V. At step α, we
construct all sets of “rank” α. Vα denotes all sets of rank less than α.

Note that we will never define what a set is in these notes. We’re taking an
axiomatic viewpoint. ZFC includes some true principles about sets, but not all
of them. We caution that it is false to say “a set is an element of a model of set
theory”. First, this would be circular; a model is defined in model theory using
sets. Second, there are strange models of set theory which we do not want to
use to define what sets are. It would be similarly wrong to say that a natural
number is an element of a model of PA; there are nonstandard models of PA
with infinite elements greater than any natural number.⁴
3 Other alternatives to ZFC have also been explored, such as Russell’s type theory or Quine’s
new foundations. They are rarely considered in modern set theory.
4 Indeed, Gödel’s incompleteness theorem says that it’s hopeless to try to axiomatize all
true sentences about the natural numbers. It is similarly hopeless to try to axiomatize all
true principles about sets.

The point in examining models of set theory for us will not be to build the
“correct” model. Rather, our goal in examining models of set theory will be to
understand what the axioms of set theory can prove.

1.1 Independence in modern set theory*

In the second part of our class, we’ll begin to discuss some topics around
independence in set theory.
In reaction to Russell’s paradox, many mathematicians hoped to find a
foundation for set theory that could be proved to be free of paradoxes. Gödel’s work
in 1931 shattered this hope; we can never prove that the ZFC axioms of set
theory are consistent using simple means. Gödel showed that any consistent computable
set of axioms which can interpret and prove basic theorems about the natural
numbers cannot prove its own consistency. From a modern viewpoint,
mathematical theories are arranged along a hierarchy of consistency strength, where
T1 ≤CON T2 if Con(T2) → Con(T1). That is, the consistency of T2 implies the
consistency of T1.

Figure 3: A picture of some common theories arranged by their consistency
strengths.

An important class of set theoretic assumptions with strong consistency
strength are large cardinal assumptions. These are assumptions that there exist
“very large” cardinal numbers. For example, an inaccessible cardinal is an
uncountable cardinal number κ so that κ is regular (cf(κ) = κ) and κ is a strong
limit (i.e. λ < κ implies 2^λ < κ). Informally, this means κ cannot be reached
from below by adding smaller cardinals or applying the powerset operation to
smaller cardinals. If κ is an inaccessible cardinal, then if we stop building the
set-theoretic universe at stage κ (i.e. if we take Vκ), then we obtain a model of ZFC.
Since ZFC + “there exists an inaccessible cardinal” proves there is a model of
ZFC, by Gödel’s completeness theorem, ZFC + “there exists an inaccessible cardinal”
implies Con(ZFC), and therefore, by Gödel’s second incompleteness theorem, ZFC
cannot prove there is an inaccessible cardinal (assuming ZFC is consistent). This
is a typical phenomenon. If κ is a large cardinal, then the universe restricted
to height κ satisfies ZFC, and more generally will contain many
“smaller” large cardinals.
Many other interesting set theoretical statements end up being equivalent
in consistency strength to large cardinal assumptions. For example, ZF +
“all sets of real numbers are Lebesgue measurable” is equiconsistent with an
inaccessible cardinal, and ZFC + “there is a saturated ideal on ω1” is equiconsistent
with a Woodin cardinal. One of the most important open problems in
modern set theory is proving that the proper forcing axiom PFA is equiconsistent
with a supercompact cardinal.
Because of Gödel’s incompleteness theorem, none of these large cardinals
can be proved to exist from ZFC (and we cannot prove they are consistent
without assuming the consistency of even “larger” cardinals). However, they
are a vital part of the study of modern set theory, and they are viewed as the
“natural” way to increase the consistency strength of the theory of ZFC. The
consistency strength of all “natural” theories has been empirically found to be
linearly ordered and indeed wellordered. This is important evidence that these
theories are mathematically important.⁵ Large cardinals also create beautiful
and intricate structure in the set theoretic universe which has important and
concrete mathematical consequences (for example, in our understanding of the
real numbers). They are freely used and investigated in modern set theory. We
cannot prove they are consistent, but we deeply believe they are because of the
beautiful and important mathematical structures they create.
This, then, is one source of independence in set theory. Any statement that
implies Con(ZFC) must be either false or independent of ZFC.
However, there is a completely different method for proving independence
from the axioms of ZFC: forcing and inner models. In 1938, Gödel proved that
inside any model of ZF set theory, there is an inner model L consisting of what
are called the constructible sets. This is in a sense the smallest possible
universe of set theory. It contains only the sets one must have by virtue of
these sets being explicitly definable. Gödel showed that this inner model known
as L always satisfies both the axiom of choice and the continuum hypothesis.
This was reassuring to mathematicians who were worried about the validity and
acceptability of the axiom of choice. By Gödel’s theorem if ZF is consistent and
has a model, then ZF + the axiom of choice is also consistent. So using the
axiom of choice cannot add new inconsistencies to set theory.
A major advantage of inner models such as Gödel’s L is that they have an
extremely detailed and canonical structure. Indeed, there is a whole study of
“fine structure theory”, which analyzes these canonical models in great detail.
Unfortunately, Gödel’s L is deficient in that sufficiently large cardinals (e.g.
measurable cardinals) cannot exist inside L. One aim of modern inner model
theory is to construct inner models that are compatible with having all large
cardinals, and to understand their structure.
5 The naturalness assumption is very important here; neither linearity nor wellfoundedness
holds if we consider all theories extending ZFC.
Complementing Gödel’s constructible universe was Cohen’s 1963 invention
of the method of forcing. Given a countable model of ZFC, Cohen showed how
one can add sets to the model to create a larger “outer model” of ZFC. Cohen
used this technique to show that given any countable model of ZFC, one can add
many real numbers to it in order to find an outer model where the continuum
hypothesis is false.
These two results combine to show that if there is a model of ZFC, then
there is a model of both ZFC + CH and another model of ZFC + ¬CH. Thus,
ZFC cannot prove that CH is either true or false, and CH is independent from
ZFC. Philosophers of set theory still fiercely debate questions such as whether
there could be new intuitively justified axioms for set theory that resolve the
continuum hypothesis, or whether CH even has a definite truth value.⁶
This, then, is the second source of independence in set theory: we can prove
a statement ϕ is independent from ZFC by constructing outer or inner models
that satisfy both ϕ and its negation. An imperfect analogy is that starting with
any field, we can study its subfields and field extensions. If we find two different
fields, one of which has property ψ and the other does not, then we know the
field axioms do not imply ψ.
The invention of forcing led to a renaissance of independence results in set
theory, settling problems many of which had stood open for decades. For example,
Suslin’s problem,⁷ which had been open since 1920, was shown to be independent of ZFC
by Solovay and Tennenbaum in 1971. Forcing is also intimately tied to inner
model theory. The canonical structure given by inner models is often a necessary
starting point for a good understanding of the outer models we force to create.
Forcing and inner models also found applications in many different fields of
mathematics. For example, Kaplansky’s conjecture in functional analysis, and
Whitehead’s problem in group theory are independent of ZFC.
There is a deep contrast between the type of independence that comes from
having consistency strength, and the type that comes from forcing/inner models.
While very simple statements (e.g. Π^0_1 statements in arithmetic) can be
independent of ZFC by virtue of having consistency strength, statements which
are shown to be independent of ZFC by forcing and inner models must be very
complicated by so-called absoluteness results. For example, we cannot use forcing
to show any Σ^1_2 sentence is independent of ZFC by Shoenfield absoluteness.
Indeed, assuming certain large cardinals exist, CH is in some sense the “simplest”
statement that can be proved independent from ZFC by forcing.⁸
Set theory remains a vibrant and active field of research, and many open
problems remain. Indeed, even Cantor’s original goal of understanding basic
cardinal arithmetic is still an unfinished puzzle; very simple-seeming questions
about the possible behavior of cardinality in ZFC remain open. For example,
there are deep open questions about the possible behaviors of the exponential
function κ ↦ 2^κ in models of set theory.
6 Classical large cardinal axioms cannot help resolve this question by a theorem of Levy
and Solovay.
7 Suslin’s problem asks the following: suppose (L, <L) is a dense complete linear order
without endpoints in which every collection of pairwise disjoint open intervals is countable.
Then must L be order-isomorphic to the real numbers?
8 It is a theorem of Woodin that assuming the existence of large cardinals, all Σ^2_1 statements
are absolute for set forcing, assuming CH (which is itself a Σ^2_1 statement).

Open Problem 1.8. If ℵ_ω is a strong limit, is 2^{ℵ_ω} < ℵ_{ω_1}?


A weaker theorem along these lines follows from work of Shelah in pcf theory
(see [J] for an introduction).
Theorem 1.9 (Shelah). If ℵ_ω is a strong limit, then 2^{ℵ_ω} < ℵ_{ω_4}.

2 The axioms of ZFC
In this section, we will introduce the axioms of ZFC. The axioms of ZFC are in
the language of set theory L_∈, which consists of a single binary relation ∈ of
set membership. Throughout this section, we will introduce notation for certain
sets, functions and relations which are defined in terms of the ∈ relation. For
example, x ⊆ y will abbreviate ∀z(z ∈ x → z ∈ y). We will also use bounded
quantifiers freely: (∃y ∈ x)φ is defined to mean ∃y(y ∈ x ∧ φ), and (∀y ∈ x)φ
is defined to mean ∀y(y ∈ x → φ). The exists unique quantifier: ∃!yϕ(y)
abbreviates ∃y(ϕ(y) ∧ ∀y′(ϕ(y′) → y′ = y)).
The axiom of Extensionality: Every set is determined by its members.

∀x∀y[x = y ↔ ∀z(z ∈ x ↔ z ∈ y)]

This axiom essentially defines what it means to be a set. A set x is determined
precisely by which elements it contains. (A set has no order or other
data.)
The axiom of Foundation: Every nonempty set contains an ∈-minimal
element.
∀x[x ≠ ∅ → ∃y ∈ x ∀z ∈ x (z ∉ y)]
Here x ≠ ∅ abbreviates ∃y(y ∈ x). The axiom of foundation says that the
relation ∈ on every nonempty set has a minimal element: some y ∈ x with no
predecessors under ∈ in x.
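
To make the notion of an ∈-minimal element concrete, here is a small Python sketch on hereditarily finite sets. It is an illustration under stated assumptions: sets are modeled as nested frozensets, and the name membership_minimal is made up for this example. Python's frozensets can never contain themselves, so every nonempty example necessarily has an ∈-minimal element; the code only shows how to find one.

```python
def membership_minimal(x):
    """Return every y in x such that no z in x is an element of y."""
    return [y for y in x if not any(z in y for z in x)]

# Hereditarily finite sets built from the empty set, as nested frozensets.
empty = frozenset()
one = frozenset({empty})            # {∅}
two = frozenset({empty, one})       # {∅, {∅}}
singleton_one = frozenset({one})    # {{∅}}

# In {∅, {∅}, {∅, {∅}}} only ∅ has no element that lies in the set.
minimal_in_chain = membership_minimal(frozenset({empty, one, two}))

# In {{{∅}}, {∅, {∅}}} neither element is a member of the other,
# so both elements are ∈-minimal.
minimal_in_pair = membership_minimal(frozenset({singleton_one, two}))
```

The second example shows that an ∈-minimal element need not be unique, since ∈ restricted to a set is in general not a linear order.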
The axiom of foundation also defines what it means to be a set, but in a
more technical sense. We will prove shortly that the axiom of foundation is
equivalent to the statement that every set is an element of the von Neumann
universe V of sets; those that can be obtained from ∅ by iteratively applying
the set existence axioms.⁹
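
The first few finite stages of this universe are small enough to compute. The Python sketch below assumes the recursion V0 = ∅ and Vn+1 = P(Vn) for finite n, and models sets as nested frozensets; the function names powerset and stage are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def stage(n):
    """Return the finite stage V_n, where V_0 = ∅ and V_{k+1} = P(V_k)."""
    v = frozenset()
    for _ in range(n):
        v = powerset(v)
    return v

# Number of sets in V_0, ..., V_4.
sizes = [len(stage(n)) for n in range(5)]
```

Each stage has 2 to the power of the previous stage's size many elements, so the next stage already has 65536 elements and the one after that is far too large to enumerate.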
These first two axioms define what it means to be a set. All the other axioms
of ZFC are set existence axioms which state that certain sets exist.
The axiom of Pairing: Given two sets x and y, there is a set containing
exactly these two sets.

∀x∀y∃w[x ∈ w ∧ y ∈ w ∧ ∀z(z ∈ w → z = x ∨ z = y)]

We let {x, y} denote this set w whose only two elements are x and y. Similarly,
we’ll use {x} to denote the set whose only element is x. The existence of the
set {x} is by the pairing axiom when x = y, so {x} = {x, x}.
Proposition 2.1. The pairing axiom and the axiom of foundation imply that
there is no set x such that x ∈ x.
Proof. Assume for a contradiction there is such a set x, and consider {x}. Then
the only element of {x} is x. However, since x ∈ x, x is not ∈-minimal in {x},
contradicting the axiom of foundation.
9 Precisely, we’ll show that ZFC − Foundation proves that Foundation ↔ ∀x(x ∈ V). Here
V is defined as the class of sets that are in V_α for some ordinal α, where V_0 = ∅,
V_{α+1} = P(V_α), and V_λ = ⋃_{α<λ} V_α for limit ordinals λ.

Since {x, y} = {y, x} we will also define an ordered pair, where the order of
the two elements matters.
Definition 2.2 (Ordered pairs). We define (a, b) = {{a}, {a, b}}.
The key property of an ordered pair is the following:
Exercise 2.3. Show that for all sets a, b, c, d, we have (a, b) = (c, d) if and only
if a = c and b = d.
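
Before proving this in general, the property can be sanity-checked on a small universe. In the Python sketch below, sets are modeled as nested frozensets and the helper name pair is an assumption of this example. It tests the equivalence for every choice of a, b, c, d among three hereditarily finite sets; this is evidence, not a proof.

```python
from itertools import product

def pair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} as nested frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# A small universe of hereditarily finite sets to test against.
empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
universe = [empty, one, two]

# (a, b) = (c, d) should hold exactly when a = c and b = d.
pairs_agree = all(
    (pair(a, b) == pair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(universe, repeat=4)
)
```

Note the degenerate case a = b, where (a, a) = {{a}, {a, a}} collapses to {{a}}; a full proof of the exercise has to handle it separately.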
The axiom of Union: Given any set of sets x, there is a set containing
exactly all the elements of these sets, denoted ⋃x. Precisely, letting y = ⋃x
denote ∀z[z ∈ y ↔ ∃w ∈ x(z ∈ w)], the axiom of union states

∀x∃y[y = ⋃x]

Writing z = x ∪ y for ∀w(w ∈ z ↔ w ∈ x ∨ w ∈ y), pairing and union prove
that for all sets x and y, x ∪ y exists, since x ∪ y = ⋃{x, y}.
The axiom of Nullset: There is a set with no elements. We let x = ∅
abbreviate ¬∃y(y ∈ x). Nullset states:

∃x[x = ∅]

The axiom of Infinity: There exists an inductive set.

∃x[∅ ∈ x ∧ ∀y(y ∈ x → y ∪ {y} ∈ x)]

A set x is inductive if ∅ ∈ x and y ∈ x implies y ∪ {y} ∈ x, so the
above axiom says that an inductive set exists. There are many different ways
to axiomatize the existence of an infinite set, but the version of the axiom of
infinity that we have given will work nicely with how we define the von Neumann
ordinals. In Section 4 we’ll define that if y is an ordinal, then y∪{y} is the ordinal
successor of y. Note that the infinite set x whose existence is guaranteed by the
axiom of infinity must have the following set as a subset: {∅, {∅}, {∅, {∅}}, . . .}.
We will eventually call this set ω: the set containing the ordinals {0, 1, 2, . . .}.
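
The first few elements of this set can be built directly from the successor operation y ↦ y ∪ {y}. The Python sketch below models sets as frozensets; the names successor and von_neumann are assumptions made for this illustration.

```python
def successor(y):
    """Return y ∪ {y}."""
    return y | frozenset({y})

def von_neumann(n):
    """Return the n-th finite von Neumann ordinal, starting from 0 = ∅."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = successor(ordinal)
    return ordinal

# 0 = ∅, 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2}
zero, one, two, three = (von_neumann(k) for k in range(4))
```

With this encoding the ordinal n is the set {0, 1, . . . , n − 1}, so it has exactly n elements and m < n corresponds to m ∈ n.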
The axiom of Powerset: For every set x, there is a set containing all the
subsets of this set. We let y = P(x) abbreviate ∀z(z ∈ y ↔ z ⊆ x).

∀x∃y[y = P(x)]

The axiom schema of Separation: If x is a set, then every subset of x
that’s definable (from parameters) exists. Formally, for every formula ϕ in the
language of set theory, the following is an axiom:

∀x, w1, . . . , wn ∃y∀z[z ∈ y ↔ z ∈ x ∧ ϕ(x, z, w1, . . . , wn)]

We will use {z ∈ x : ϕ(z, w1, . . . , wn)} to abbreviate the set whose existence
is given by this axiom. More generally, we will use {z : ϕ(z, w1, . . . , wn)} to
denote the collection of all sets z satisfying the formula ϕ(z, w1, . . . , wn). In
general, this will not be a set (e.g. {z : z ∉ z}). We will instead call such a
collection a class, and by a class in these notes, we mean all sets z satisfying
some formula ϕ(z, w1, . . . , wn) where w1, . . . , wn are fixed set parameters. In the
case where such a collection is a set then y = {z : ϕ(z, w1, . . . , wn)} abbreviates
∀z(z ∈ y ↔ ϕ(z, w1, . . . , wn)).
If we let z = x ∩ y abbreviate ∀w(w ∈ z ↔ w ∈ x ∧ w ∈ y), then Separation
implies that for all x and y, there is some set z so that z = x ∩ y. We can
similarly use separation to show that the sets x \ y and ⋂x (for nonempty x) exist.
Letting X × Y = {(x, y) : x ∈ X ∧ y ∈ Y}, we can similarly use separation on
P(P(X ∪ Y)) to prove that X × Y is a set.
A binary relation R on X × Y is a subset of X × Y. We say R is a binary
relation on X if it is a binary relation on X × X. We sometimes write x R y
instead of (x, y) ∈ R. A function from X to Y is a subset f ⊆ X × Y so that
∀x ∈ X ∃!y ∈ Y (x, y) ∈ f. We write f(x) = y for (x, y) ∈ f. We will use X^Y to
denote the set of all functions from Y to X
X^Y = {f : f is a function from Y to X}
which is also a set by the separation axiom. We define injections, surjections,
bijections, and inverses of functions as usual. If R is a binary relation, we let
R⁻¹ = {(y, x) : (x, y) ∈ R}.
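
The definition of a function as a set of ordered pairs can be tested directly on finite sets. In the Python sketch below, Python tuples stand in for ordered pairs, and the names is_function and inverse_relation are assumptions of this example rather than notation from the notes.

```python
def is_function(f, X, Y):
    """Check f ⊆ X × Y and that each x in X has exactly one y in Y with (x, y) in f."""
    if not all(x in X and y in Y for (x, y) in f):
        return False
    return all(sum(1 for y in Y if (x, y) in f) == 1 for x in X)

def inverse_relation(R):
    """Return the inverse relation {(y, x) : (x, y) ∈ R}."""
    return {(y, x) for (x, y) in R}

X = {0, 1}
Y = {"a", "b"}
f = {(0, "a"), (1, "a")}   # a function from X to Y that is not injective
g = {(0, "a"), (0, "b")}   # not a function: 0 has two values and 1 has none
```

Here inverse_relation(f) is a perfectly good binary relation on Y × X, but it is not a function from Y to X, which matches the usual fact that only bijections have inverse functions.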
Note that separation is actually an axiom schema: there is a separation
axiom for every formula ϕ of the language of set theory. We’ll later prove that
ZFC is not finitely axiomatizable, so this is a necessity. Recall that in PA,
induction is an axiom schema, and similarly, PA is not finitely axiomatizable.
The axiom schema of Replacement: The axiom of replacement says that
if F is a class function and X is a set, then {F(x) : x ∈ X} is a set. A class
function F is a class of ordered pairs so that there do not exist (x, y) ∈ F
and (x, y′) ∈ F with y ≠ y′. Formally, for each formula ϕ, the following is an
axiom:

∀v1, . . . , vn ∀X[(∀x ∈ X ∃!y ϕ(x, y, v1, . . . , vn)) →
(∃Y ∀y(y ∈ Y ↔ ∃x ∈ X ϕ(x, y, v1, . . . , vn)))]
Instead of the replacement axiom, sometimes ZFC is axiomatized using the
axiom schema of collection. The collection axiom says that if x is a set, and ϕ
defines a class from each element of x, then there is a set which meets every
one of these classes that is nonempty.
∀x, v1, . . . , vn ∃y∀z ∈ x[∃wϕ(w, z, v1, . . . , vn) → ∃w ∈ y ϕ(w, z, v1, . . . , vn)]
We will show in Section 6 that, over the other axioms of ZF, separation and
collection together are equivalent to replacement.
The axiom of Choice: Every set of pairwise disjoint nonempty sets has a
“choice set”.

∀x[(∀y ∈ x(y ≠ ∅) ∧ (∀y ∈ x)(∀y′ ∈ x)(y ≠ y′ → y ∩ y′ = ∅))
→ (∃z)(∀y ∈ x)(∃w ∈ y)(∀w′ ∈ y)(w′ ∈ z ↔ w′ = w)]

There are many different equivalent ways of formulating the axiom of choice.
For example, assuming ZF, the axiom of choice is equivalent to Zorn’s lemma and
the wellordering principle. We’ll prove some of these equivalences in subsequent
sections.
We will let ZF denote all the above axioms except the axiom of choice. We
let AC denote the axiom of choice. We let ZFC denote all the above axioms, so
ZFC = ZF + AC.
There are many equivalent ways of axiomatizing ZFC, and the above is just
one possibility. The axiomatization we have given above is also not minimal
in the sense that some of our axioms imply others. (For example, replacement
and nullset imply separation). The reason we have stated all these axioms even
when there are redundancies is that we will often be interested in models of
fragments of ZFC. For example, if α > ω is a limit ordinal, then Vα is a model
of ZFC − Replacement (and in particular it is important that Vα is a model of
the separation axiom).
Exercise 2.4. Show that the replacement axiom and the nullset axiom imply
the separation axiom. (That is, if a model of set theory satisfies the replacement
schema and nullset, then it satisfies the separation schema).
Exercise 2.5. Let n be a natural number. Show there do not exist sets x1, . . . , xn
such that x1 ∈ x2 ∈ . . . ∈ xn ∈ x1. (That is, show that for each n, ZFC proves
the sentence

¬∃x1 . . . xn (x1 ∈ x2 ∧ x2 ∈ x3 ∧ . . . ∧ xn ∈ x1).)

State which axioms of ZFC you use to prove this.


Exercise 2.6. Carefully prove that the following sets exist. State what axioms
of ZFC you use:
1. For all sets x and y, there is a set z so that ∀w(w ∈ z ↔ w ∈ x ∧ w ∉ y).
2. For all nonempty sets x there is a set y so that ∀w(w ∈ y ↔ ∀z ∈ x(w ∈
z)).
3. For all sets a, b there is a set z so that z = {f : f is a function from a to b}.
Exercise 2.7. Show that in ZF, the following are equivalent.
1. AC.
2. For every x such that ∀y ∈ x(y ≠ ∅), there is a function f : x → ⋃x such
that f(y) ∈ y for all y ∈ x.

2.1 Classes and von Neumann-Bernays-Gödel set theory*


Classes play an important role in set theory. We’ve already mentioned some
important classes such as the von Neumann universe V, and Gödel’s constructible
universe L.
There are alternate ways of axiomatizing set theory where we explicitly give
classes formal existence instead of just associating to each formula the class
it defines. Having classes as formally defined objects is often convenient. For
example, many large cardinal axioms say there is a proper class inner model M
of ZFC and a class function j : V → M which is an elementary embedding.
One such way to axiomatize set theory and directly talk about classes is von
Neumann-Bernays-Gödel set theory. In this axiomatization, all the objects of
study are classes. We define a set to be a class which is an element of some
other class. A proper class is a class which is not a set. By convention,
uppercase letters in NBG denote classes, while lowercase letters denote only
sets. So for example, ∃xϕ abbreviates ∃x∃Y (x ∈ Y ∧ ϕ). The axioms of von
Neumann-Bernays-Gödel set theory, abbreviated NBG, include the axiom of
extensionality, and all the remaining axioms of ZF, where all quantifiers in these
other axioms range just over sets.¹⁰ There is one final axiom schema, the class
comprehension axiom schema: for every formula ϕ in which all quantifiers range
over sets, the axiom

∀X1, . . . , Xn ∃Y ∀x[x ∈ Y ↔ ϕ(x, X1, . . . , Xn)]

saying that ϕ defines a class.
NBG is conservative over ZF; it proves exactly the same formulas about
sets. This is easily proved by showing every model of ZF can be extended to a
model of NBG by adding all definable proper classes to our universe. Similarly,
if we remove all the proper classes from a model of NBG, we obtain a model of
ZF. Hence, if we want to discuss classes in this formal way, we can work with
NBG without changing any of the facts we’ll prove about sets. NBG also has
other advantages, for example it is finitely axiomatizable, while ZF is not.
Choice in NBG is generally taken to be the axiom that there is a global choice
class: a class function F so that F(x) ∈ x for every nonempty set x.
10 For example, the axiom of infinity becomes ∃x∃Z(x ∈ Z ∧ x ≠ ∅ ∧ ∀y ∈ x(y ∪ {y} ∈ x)).

3 Wellorderings
Wellfounded relations and wellorderings are central to the study of set theory.
They are important in defining what sets are: the axiom of foundation says
that the ∈ relation on every set is wellfounded. They also naturally lead to
the definition of the ordinals, which are an essential part of set theory used to
index steps in transfinite constructions and to create notions of rank. Cantor
was originally led to develop the theory of the ordinals to prove that given any
two sets X and Y either |X| ≤ |Y | or |Y | ≤ |X|.
A strict partial order is a pair (P, <P ) where P is a set and <P is a
binary relation on P that is irreflexive and transitive, so that a ≮P a and
a <P b ∧ b <P c → a <P c for all a, b, c ∈ P . We say that (P, <P ) is linear if
for all a, b ∈ P such that a ≠ b, either a <P b or b <P a. We write a ≤P b to
mean a <P b or a = b.
A strict partial order (P, <P ) is wellfounded if every nonempty subset
X ⊆ P contains an element that is <P-minimal inside X. That is, for every nonempty
X ⊆ P there is some a ∈ X so that b ≮P a for all b ∈ X. A linear wellfounded
strict partial order is called a wellordering.
Suppose (P, <P ) and (Q, <Q ) are strict partial orders. Then we say a
function f : P → Q is order-preserving if for all a, b ∈ P a <P b implies
f (a) <Q f (b).
Lemma 3.1. If (P, <P) is a wellfounded strict partial order and f : P → P is
an order-preserving function from (P, <P) to (P, <P), then f(a) ≮P a for all
a ∈ P.
Proof. Let X = {a ∈ P : f(a) <P a} be the set of points which are moved
“downward”. Assume for a contradiction that X is nonempty. Then by definition
of wellfoundedness, X must have a <P-minimal element a. By definition of
X, f(a) <P a. Since a is minimal in X, and f(a) <P a, we must have f(a) ∉ X.
Now since f is order preserving, and f(a) <P a, we must have f(f(a)) <P f(a).
But then f(a) ∈ X by definition of X. Contradiction!
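
Lemma 3.1 can be checked by brute force on a small example. The Python sketch below is an illustration under assumptions chosen for this example: P is the set {1, 2, 3, 4, 6, 12} ordered by strict divisibility (a finite strict partial order, hence wellfounded), and the helper names are hypothetical. It enumerates all 6^6 maps from P to P, keeps the order-preserving ones, and looks for any that move a point downward.

```python
from itertools import product

P = [1, 2, 3, 4, 6, 12]

def less(a, b):
    """Strict divisibility: a <P b iff a != b and a divides b."""
    return a != b and b % a == 0

# All pairs (a, b) with a <P b, computed once.
RELATED = [(a, b) for a in P for b in P if less(a, b)]

def is_order_preserving(f):
    """Check that a <P b implies f(a) <P f(b)."""
    return all(less(f[a], f[b]) for (a, b) in RELATED)

def moves_some_point_down(f):
    """Check whether f(a) <P a for some a in P."""
    return any(less(f[a], a) for a in P)

maps = [dict(zip(P, values)) for values in product(P, repeat=len(P))]
order_preserving = [f for f in maps if is_order_preserving(f)]

# Lemma 3.1 predicts that this list is empty.
violations = [f for f in order_preserving if moves_some_point_down(f)]
```

The order-preserving hypothesis matters: the constant map sending everything to 1 moves 2 down to 1, but it is not order preserving, so it is no counterexample.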
We say that a bijection f from P to Q is an isomorphism from (P, <P ) to
(Q, <Q ) if for all a, b ∈ P , a <P b ↔ f (a) <Q f (b). Hence, both f and f −1 are
order preserving.
Lemma 3.2. If (P, <P ) is a wellordering and f : P → P is an isomorphism
from (P, <P ) to (P, <P ), then f is the identity.
Proof. By the previous lemma, f (a) ≥P a for all a ∈ P . Since f −1 is also an
isomorphism from (P, <P ) to (P, <P ), for all b ∈ P , f −1 (b) ≥P b, so letting
b = f (a), we see a ≥P f (a) for all a ∈ P . Hence, f is the identity.
Corollary 3.3. If (P, <P ) and (Q, <Q ) are isomorphic wellorderings, then there
is a unique isomorphism between them.
Proof. If f, g were two isomorphisms that were not equal, then f⁻¹ ∘ g would
be an isomorphism from (P, <P ) to (P, <P ) that is not the identity,
contradicting Lemma 3.2.

If (P, <P ) is a wellordering and x ∈ P , then the initial segment of (P, <P )
below x, denoted (P, <P ) ↾ x, is the wellordering (Q, <P ∩ (Q × Q)), where Q =
{a ∈ P : a <P x}. An initial segment of (P, <P ) is an ordering (P, <P ) ↾ x
for some x ∈ P . Note for example that an initial segment of a wellordering is
always a wellordering.

Lemma 3.4. No wellordering (P, <P ) is isomorphic to an initial segment of
itself.
Proof. Suppose (P, <P ) is isomorphic to (P, <P ) ↾ x for some x ∈ P
via the function f , which is therefore an order-preserving function from P to
itself. Then f (x) <P x, contradicting Lemma 3.1.

Lemma 3.5. Suppose (P, <P ) and (Q, <Q ) are wellorderings. Then exactly
one of the following holds:
• (P, <P ) is isomorphic to (Q, <Q ).
• (P, <P ) is isomorphic to an initial segment of (Q, <Q ).
• An initial segment of (P, <P ) is isomorphic to (Q, <Q ).
Furthermore, this isomorphism is unique.
Proof. Consider the set of pairs

F = {(x, y) : there exists an isomorphism from (P, <P ) ↾ x to (Q, <Q ) ↾ y}.

We claim that F is the function witnessing that this lemma is true.
Suppose x ∈ P , and y, y′ ∈ Q are such that y <Q y′. Then we cannot have
(x, y) ∈ F and (x, y′) ∈ F since composing one isomorphism and the inverse of
the other would contradict Lemma 3.4. So F is a function.
Similarly, if (x, y), (x′, y′) ∈ F and x <P x′, then we claim y <Q y′.
Otherwise, suppose f is the isomorphism from (P, <P ) ↾ x to (Q, <Q ) ↾ y, and f′ is
the isomorphism from (P, <P ) ↾ x′ to (Q, <Q ) ↾ y′. If y′ ≤Q y, then f⁻¹ ∘ f′ is an
order-preserving function from (P, <P ) ↾ x′ to itself, and f⁻¹(f′(x)) <P x,
which contradicts Lemma 3.1.
So F is an order-preserving partial function from P to Q.
We claim that both dom(F ) and ran(F ) are closed downwards. That is, if
y <Q y′ and y′ ∈ ran(F ), then y ∈ ran(F ). This is since if f is an isomorphism
from (P, <P ) ↾ x to (Q, <Q ) ↾ y′, then restricting f to the initial segment
given by f⁻¹(y) gives an isomorphism from (P, <P ) ↾ f⁻¹(y) to (Q, <Q ) ↾ y.
Similarly, if x <P x′ and x′ ∈ dom(F ), then x ∈ dom(F ).
We finally claim P \ dom(F ) and Q \ ran(F ) cannot both be nonempty. If
so, let x be the <P -minimum element of P \ dom(F ) and y be the <Q -minimum
element of Q \ ran(F ). But then F ↾ {x′ ∈ P : x′ <P x} is an isomorphism from
(P, <P ) ↾ x to (Q, <Q ) ↾ y, so (x, y) ∈ F . Contradiction.
Uniqueness follows from Corollary 3.3.

The significance of Lemma 3.5 is that it shows the relation “(P, <P ) is iso-
morphic to an initial segment of (Q, <Q )” is a linear ordering of the isomorphism
classes of wellorderings. Lemma 3.4 already proves this is an irreflexive ordering.
It is a good exercise to check this is a wellfounded ordering.
To simplify this global order of wellorderings and proofs about it, we intro-
duce the ordinals. Isomorphism classes of wellorders are proper classes which
are awkward to deal with. Instead, ordinals will give us a unique representative
of each isomorphism class of wellorderings.
Remark 3.6. All the results of this section apply more generally to setlike class
wellorderings. That is, classes X with a class linear order <X so that for every
a ∈ X, {b ∈ X : b < a} is a set.

4 Ordinals
In this section, we’ll introduce ordinals and show that they are canonical rep-
resentatives of each isomorphism class of wellorderings. This will make it much
easier to deal with the global structure we found in Section 3 on all wellorderings
under the relation “(P, <P ) is isomorphic to an initial segment of (Q, <Q )”.
Our definition of ordinal will use the notion of a transitive set. We call a set
x transitive if for every a ∈ x, if b ∈ a, then b ∈ x. Careful: the ∈ relation
on a transitive set need not be transitive. For example, the set {∅, {∅}, {{∅}}} is
transitive, but the ∈ relation on this set is not transitive (∅ ∈ {∅}
and {∅} ∈ {{∅}}, but ∅ ∉ {{∅}}).
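The distinction between a transitive set and a transitive ∈ relation can be checked mechanically for hereditarily finite sets. The sketch below models sets as Python `frozenset` values; the helper names `is_transitive_set` and `membership_is_transitive_on` are hypothetical names chosen for this illustration.

```python
def is_transitive_set(x):
    """x is transitive: every element of an element of x is an element of x."""
    return all(b in x for a in x for b in a)

def membership_is_transitive_on(x):
    """The relation ∈ restricted to x is transitive: c ∈ b ∈ a implies c ∈ a."""
    return all(c in a for a in x for b in x for c in x if c in b and b in a)

empty = frozenset()                    # ∅
one = frozenset({empty})               # {∅}
single_one = frozenset({one})          # {{∅}}
example = frozenset({empty, one, single_one})
```

Here `is_transitive_set(example)` holds while `membership_is_transitive_on(example)` fails, because ∅ ∈ {∅} ∈ {{∅}} but ∅ ∉ {{∅}}.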
Exercise 4.1. x is transitive iff for every y ∈ x, y ⊆ x.

Exercise 4.2 (Unions and intersections of transitive sets are transitive). If X
is a set and every x ∈ X is transitive, then ⋃X and ⋂X are transitive.
Definition 4.3. An ordinal is a transitive set x so that the ∈ relation on x is a
wellordering. We let ORD denote the class of all ordinals. If α, β are ordinals,
we define α < β iff α ∈ β. We will use lowercase Greek letters α, β, γ, λ, . . . for
ordinals.
For example, it is easy to check that the sets ∅, {∅}, and {∅, {∅}} are ordinals.
We will see eventually that these are the first three ordinals which we’ll call 0,
1, and 2.
A technical detail in this section is that we will not use the axiom of foun-
dation. This is because we’ll want to use the ordinals later to prove that the
axiom of foundation is equivalent to ∀x(x ∈ V ). Note for example that if α is
an ordinal, α ∉ α just by the definition of ordinal: if α ∈ α, this would imply
that ∈ is not an irreflexive relation on α, and hence not a strict partial order.
So we don’t need to use the axiom of foundation and Proposition 2.1 to show
that α ∉ α.
Our first goal is to prove that the order < on the ordinals is a wellordering.
Lemma 4.4. If α ≠ β are ordinals, and α ⊊ β, then α ∈ β.
Proof. Let γ be the ∈-least element of the set β \ α. Since α is transitive, it
follows that α is the initial segment of β given by γ. Thus, α = {ξ ∈ β : ξ <
γ} = {ξ ∈ β : ξ ∈ γ} = γ, so α ∈ β.

Lemma 4.5. If α is an ordinal and β ∈ α, then β is an ordinal, and β is an
initial segment of α under <.
Proof. First we show that β is transitive. Suppose b ∈ a ∈ β. Then a ∈ α and
b ∈ α since α is transitive. Since α is linearly ordered by ∈ we must have that
either b ∈ β, b = β, or β ∈ b. If b = β or β ∈ b, then the set {b, a, β} (which
exists by Pairing) would have no ∈-minimal element, contradicting that ∈ is a
wellordering of α. So we must have b ∈ β.

Next, β = {γ : γ ∈ β} = {γ ∈ α : γ ∈ β} = {γ ∈ α : γ < β} since every
element of β is an element of α by transitivity.
Finally, ∈ is a wellordering of β, since β is an initial segment of α.
It follows from this lemma that each ordinal is equal to the set of ordinals
that are less than it.

α = {β : β ∈ α} = {β ∈ ORD : β ∈ α} = {β ∈ ORD : β < α}.

Now we’re ready to prove the trichotomy property for the ordering < on
ORD.
Lemma 4.6. If α, β are ordinals and α ≠ β, then either α ⊆ β or β ⊆ α.

Proof. Let γ = α ∩ β. Now γ is an ordinal since it is transitive (the
intersection of two transitive sets is transitive), and any subset of a
wellordering is a wellordering.
Suppose γ is not equal to α or β. Then γ ∈ α and γ ∈ β by Lemma 4.4. So
γ ∈ γ, which contradicts the definition of an ordinal.

Applying Lemma 4.4 gives the following corollary:


Corollary 4.7. If α ≠ β are ordinals, then α < β or β < α. So < is a linear
ordering of the class of ordinals.
Next, we show that < is a wellordering of ORD, which we’ve already shown
is linear.
Lemma 4.8. < is a wellfounded ordering of the class of ordinals.
Proof. Suppose A ⊆ ORD is a nonempty class of ordinals, and α ∈ A. If α is
not the least element of A, then A ∩ α is a nonempty subset of α. Hence, it has
a least element, which is then the least element of A.
Definition 4.9. If A is a nonempty class of ordinals, we let inf(A) denote its
least element.
We can similarly define sup(A) for any set of ordinals. First, we show that
any set of ordinals has an upper bound:

Lemma 4.10. If X is any set of ordinals, then there is an ordinal β such that
β ≥ α for every α ∈ X.
Proof. Consider the set β = ⋃X. β is transitive since it is a union of transitive
sets. It is a set of ordinals, and so it is linearly ordered by ∈ by Corollary 4.7
and every nonempty subset of β has a minimal element by Lemma 4.8. Finally, if α ∈ X,
then α ⊆ β, and so α ≤ β.
Definition 4.11. If X is a set of ordinals, we let sup X = inf({β : (∀α ∈
X)[β ≥ α]}) denote the least upper bound of X.

Next, we show every wellorder is isomorphic to a unique ordinal. First we
give an exercise:
Exercise 4.12. If α is a set of ordinals, then α is an ordinal iff α is a transitive
set.
Note that a set α of ordinals being transitive is equivalent to being closed
downwards under <; i.e. α is transitive if β ∈ α, and γ < β, implies γ ∈ α.
Lemma 4.13. If (P, <P ) is a wellordering, then (P, <P ) is isomorphic to a
unique ordinal.
Proof. Every wellordering is isomorphic to at most one ordinal by Lemma 3.4.
Consider F = {(x, α) : α is isomorphic to the initial segment (P, <P ) ↾ x}. It is
clear that F is an order-preserving map, that dom(F ) is closed downwards, and
ran(F ) is a transitive set of ordinals. Finally, if dom(F ) ≠ P , then letting x be
the least element of P not in dom(F ), we see that F is an isomorphism from
(P, <P ) ↾ x to a set of ordinals which is closed downwards, and thus must be an
ordinal. Contradiction. Hence, F is an isomorphism from P to an ordinal.
Definition 4.14. If P is a wellordering, we let ot((P, <P )) denote the unique
ordinal isomorphic to P ; the ordertype of P .
There are two types of ordinals: successor ordinals and limit ordinals:
Definition 4.15. If α is an ordinal, we define α + 1 to be α ∪ {α}. We say α
is a successor ordinal if there is an ordinal β so that α = β + 1. If α is not
a successor ordinal, we call α a limit ordinal.
We have the following easy lemma:
Lemma 4.16. If α is an ordinal, α + 1 is an ordinal.
Proof. First, α + 1 is transitive. Suppose a ∈ α + 1 and b ∈ a. We want to show
b ∈ α + 1. Case 1: if a ∈ α, then b ∈ α since α is transitive, hence b ∈ α + 1.
Case 2: if a = α, then since b ∈ α, b ∈ α ∪ {α} = α + 1.
The verification that α + 1 is wellfounded and linearly ordered by ∈ is similarly
simple, since every element of α + 1 is either α, or an element of α.
We have the following simple facts about α + 1.
Lemma 4.17. If α is an ordinal, then β < α + 1 if and only if β ≤ α.
Proof. If β < α + 1, then either β ∈ α and so β < α, or β ∈ {α} and so
β = α.
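The successor operation and Lemma 4.17 can be tried out on the first few finite ordinals. This is a minimal sketch that models hereditarily finite sets as `frozenset` values; `succ`, `is_ordinal`, and `naturals` are names introduced only for this illustration, and the wellfoundedness check is omitted because a strict linear order on a finite set is automatically wellfounded.

```python
def succ(a):
    """The successor α + 1 = α ∪ {α}."""
    return a | frozenset({a})

def is_ordinal(x):
    """Check the definition of ordinal for a hereditarily finite set x."""
    transitive_set = all(b in x for a in x for b in a)
    linear = all(a == b or a in b or b in a for a in x for b in x)
    transitive_rel = all(c in a for a in x for b in x for c in x
                         if c in b and b in a)
    # Irreflexivity is automatic here: a frozenset can never contain itself.
    return transitive_set and linear and transitive_rel

# The ordinals 0, 1, 2, 3, 4, 5 built by iterating the successor from ∅.
naturals = [frozenset()]
for _ in range(5):
    naturals.append(succ(naturals[-1]))
```

Each entry of `naturals` passes `is_ordinal`, the ordinal built after n steps has exactly n elements, and membership in `succ(a)` agrees with "element of a, or equal to a".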
Next, we want to show that there are nonzero limit ordinals. The least such
ordinal is ω.
Definition 4.18. We let
ω = ⋂{x : x is inductive}

We call the elements of ω natural numbers. We let 0 denote the ordinal ∅.

To see that ω is a set, note that if we let x0 be any inductive set (which
exists by the Infinity axiom), then ω is a subset of x0 which can be defined by
the separation axiom.
Lemma 4.19 (Mathematical Induction). Suppose A ⊆ ω is such that 0 ∈ A
and for all y, y ∈ A → y + 1 ∈ A. Then A = ω.
Proof. A ⊆ ω by assumption. A is inductive, so ω ⊆ A.
Lemma 4.20. Every natural number is an ordinal.
Proof. Let A be the elements of ω that are ordinals. Now ∅ is an ordinal, and
if α is an ordinal, then α + 1 is an ordinal. Hence by Lemma 4.19, A = ω, and
every natural number is an ordinal.
Lemma 4.21. ω is an ordinal.
Proof. It is easy to see that ω is transitive by induction. Let A = {α ∈ ω : (∀β ∈
α)β ∈ ω}. Clearly ∅ ∈ A. Next, suppose α ∈ A. Then α + 1 ∈ A, since every
β ∈ α + 1 has either β = α, or β ∈ α. Hence A = ω by Lemma 4.19.
By Lemma 4.20 and Corollary 4.7, ω is linearly ordered by ∈, and by Lemma 4.8
this ordering is wellfounded.
Exercise 4.22. ω is the least nonzero ordinal which is not a successor ordinal.
A sequence is a function f with domain ω. We often write ⟨fn : n ∈ ω⟩
to represent sequences. Similarly, a transfinite sequence is a function f
whose domain is an infinite ordinal α, and we use the notation ⟨fβ : β < α⟩ for
transfinite sequences. We will also consider ORD length sequences which will
be class functions on ORD.
Exercise 4.23. Show that a strict partial order (P, <P ) is wellfounded iff there
is no sequence ⟨a_n : n ∈ ω⟩ such that (∀n ∈ ω) a_{n+1} <P a_n.
Exercise 4.24.
1. For each α, α = {β ∈ ORD : β < α}.
2. If C is a nonempty class of ordinals, then ⋂C is an ordinal, ⋂C ∈ C,
and ⋂C = inf C.
3. If X is a nonempty set of ordinals, then ⋃X is an ordinal. If X has no
maximal element, then ⋃X = sup X.
4. If α is an ordinal, α + 1 = inf{β : β > α}.
Exercise 4.25.
1. Show that the class ORD of all ordinals is not a set.
2. Say that an ordinal is countable if it is isomorphic to a wellordering on
a subset of ω. Prove that the class of countable ordinals is a set. Carefully
state the axioms of ZFC that you use.

5 Transfinite induction and recursion
We can now formulate the principles of transfinite induction and recursion.
Transfinite induction is a proof technique we use to prove statements about
all ordinals, analogously to how we use ordinary induction to prove statements
about all natural numbers. Transfinite recursion lets us recursively define func-
tions on the ordinals, similarly to how ordinary recursion lets us recursively
define functions on ω.
Theorem 5.1 (Transfinite induction). Suppose C is a class of ordinals such
that
• 0 ∈ C,
• For all α, α ∈ C → α + 1 ∈ C,
• If λ is a nonzero limit ordinal, then (∀α < λ)(α ∈ C) → λ ∈ C.
Then C = ORD.
Proof. Suppose λ is the least ordinal such that λ ∉ C. Then λ is either 0, a
successor ordinal, or a nonzero limit ordinal, and the corresponding condition
above gives λ ∈ C, a contradiction.
If X and Y are classes (perhaps proper classes), then a class function F
from X to Y is a class that is a subclass of X × Y = {(x, y) : x ∈ X ∧ y ∈ Y },
such that for every x ∈ X, there is a unique y ∈ Y such that (x, y) ∈ F .
Theorem 5.2 (Transfinite recursion). Let G be a class function (on V ). Then
there is a unique class function F such that for all α ∈ ORD,

F (α) = G(F ↾ α) (*)

Note that F ↾ α = F ↾ {β : β < α}.


Proof. First we prove uniqueness. Suppose f, f′ are two class functions whose
domains are ordinals or ORD, satisfying (*) for all α ∈ dom(f ) ∩ dom(f′). Then we claim
f = f′ on dom(f ) ∩ dom(f′). Suppose not, and let α be the least ordinal
such that f (α) ≠ f′(α). Then we have a contradiction, since by choice of α,
f′ ↾ α = f ↾ α. So since (*) is true on dom(f ) and dom(f′),

f (α) = G(f ↾ α) = G(f′ ↾ α) = f′(α).

Now we define F . Let F be the class of pairs (β, y) such that there exists
a function f such that dom(f ) ∈ ORD, f (α) = G(f ↾ α) for all α ∈ dom(f ),
β ∈ dom(f ), and f (β) = y.
F is a function by the uniqueness we’ve proved above. We claim dom(F ) =
ORD. If not, let β be the least ordinal not in dom(F ). Then by definition,
there is some function f with dom(f ) = β such that (*) is true on dom(f )
(namely F ↾ β, which is a set by Replacement).
Now let f′ = f ∪ {(β, G(f ))}. Then f′ also satisfies (*), and has domain β + 1.
Contradiction!
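The finite part of this construction is easy to run. The sketch below builds the function F on {0, ..., bound − 1} satisfying F(n) = G(F ↾ n), exactly as in the proof: each new value is obtained by applying G to the function built so far. The name `recursion_below` and the two sample choices of G are assumptions made for illustration.

```python
def recursion_below(G, bound):
    """Return F (as a dict) on {0, ..., bound-1} with F[n] = G(F restricted to n)."""
    F = {}
    for n in range(bound):
        # Here F has domain exactly {0, ..., n-1}, so it plays the role of F ↾ n.
        F[n] = G(dict(F))
    return F

# G(f) = ran(f): the recursion F(n) = {F(m) : m < n} produces the finite ordinals.
von_neumann = recursion_below(lambda f: frozenset(f.values()), 5)

# G(f) = 1 + (sum of the values of f): the recursion produces the powers of 2.
powers = recursion_below(lambda f: 1 + sum(f.values()), 6)
```

With the first choice of G, `von_neumann[n]` is the set with n elements representing the ordinal n; with the second, `powers[n]` equals 2**n.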

We give some examples of transfinite induction and recursion. We will start
with some operations on ordinals.
Definition 5.3. Define ordinal addition by recursion as follows:
• α + 0 = α.

• α + (β + 1) = (α + β) + 1
• α + λ = sup{α + β : β < λ} for limit λ.
Technically for each fixed α, we’re defining the function β ↦ α + β by
recursion.
There is a different way to conceive of ordinal addition rather than this
recursive definition of “iterated successor”. The ordertype of α + β is the same
as the ordertype of the order of α followed by β.
Lemma 5.4. For all ordinals α, β, the ordertype of α + β is the same as the
ordertype of the set α ⊔ β = ({0} × α) ∪ ({1} × β) equipped with the ordering ≺
where (m, γ) ≺ (n, δ) if m < n, or m = n and γ < δ.
Proof. We prove this for each α by transfinite induction on β. Our base case is
that α is isomorphic to α ⊔ ∅, and if α + β is isomorphic to α ⊔ β, then if we add
one more point to each order, α + (β + 1) is isomorphic to α ⊔ (β + 1). Finally,
given any two isomorphic wellorders, there is a unique isomorphism between
them. So if α + β is isomorphic to α ⊔ β for each β < λ via the isomorphism
fβ , then ⋃_{β<λ} fβ is an isomorphism between α + λ and α ⊔ λ.

Similarly, we can define ordinal multiplication and exponentiation:
Definition 5.5. Define ordinal multiplication by recursion:

• α · 0 = 0.
• α · (β + 1) = (α · β) + α
• α · λ = sup{α · β : β < λ} for limit λ.
Exercise 5.6. Show that α · β is isomorphic to “β many copies of α”. That is,
α · β is isomorphic to the lexicographic order on β × α, where (γ, δ) <lex (λ, ξ)
iff γ < λ, or γ = λ and δ < ξ.
Caution: neither + nor · is commutative. For example, 1 + ω = ω ≠
ω + 1 and 2 · ω = ω ≠ ω · 2.
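To experiment with these failures of commutativity, here is a minimal sketch of ordinal addition and multiplication for ordinals below ω^ω. It assumes each such ordinal is stored in Cantor normal form as a tuple of (exponent, coefficient) pairs with strictly decreasing natural-number exponents and positive coefficients; this representation and the names `add`, `mul`, `ZERO`, `ONE`, `TWO`, `OMEGA` are choices made for the illustration, not notation from the notes.

```python
# ω^2·3 + ω·1 + 5 is stored as ((2, 3), (1, 1), (0, 5)); zero is the empty tuple.
ZERO = ()
ONE = ((0, 1),)
TWO = ((0, 2),)
OMEGA = ((1, 1),)

def add(a, b):
    """Ordinal sum a + b: terms of a below the leading term of b are absorbed."""
    if not b:
        return a
    e, c = b[0]
    head = tuple(t for t in a if t[0] > e)
    same = sum(k for (x, k) in a if x == e)
    return head + ((e, c + same),) + b[1:]

def mul(a, b):
    """Ordinal product a · b, using a·(β + γ) = a·β + a·γ term by term."""
    if not a or not b:
        return ZERO
    e0, c0 = a[0]
    result = ZERO
    for e, c in b:
        if e > 0:
            term = ((e0 + e, c),)            # a · (ω^e · c) = ω^(e0 + e) · c
        else:
            term = ((e0, c0 * c),) + a[1:]   # a · c for a finite c > 0
        result = add(result, term)
    return result
```

For instance, `add(ONE, OMEGA)` returns `OMEGA` while `add(OMEGA, ONE)` returns `((1, 1), (0, 1))`, and `mul(TWO, OMEGA)` returns `OMEGA` while `mul(OMEGA, TWO)` returns `((1, 2),)`.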

Figure 4: The ordinals up through ω^ω. Source: https://commons.wikimedia.org/wiki/File:Omega-exp-omega-labeled.svg

Figure 5: Ordinal multiplication is not commutative. 2 · ω ≠ ω · 2.

Definition 5.7. Define ordinal exponentiation by recursion:
• α^0 = 1
• α^{β+1} = α^β · α
• α^λ = sup{α^β : β < λ} for limit λ.

Cantor normal form gives a unique way of expressing every ordinal in
terms of the above operations and smaller ordinals:
Exercise 5.8. For every ordinal α > 0, there are nonzero natural numbers k0 , . . . , kn
and ordinals α ≥ β0 > β1 > . . . > βn such that

α = ω^{β0} · k0 + . . . + ω^{βn} · kn .

Furthermore, this representation is unique.


Cantor normal form is not so useful for understanding arbitrary ordinals.
For example, there are ordinals α such that ω^α = α, and whose Cantor normal
forms are “trivial”. This follows from the following exercise:
Exercise 5.9. Suppose ⟨γα : α ∈ ORD⟩ is a strictly increasing sequence of
ordinals such that if λ is a limit ordinal, then γλ = sup{γα : α < λ}. Then this
sequence contains arbitrarily large fixed points. That is, arbitrarily large α
such that γα = α.

Next, we give a characterization of when a relation is wellfounded in terms of
ranks. Say that a relation R on a set X is wellfounded if for every nonempty set Y ⊆ X,
there is an element a ∈ Y that is R-minimal inside Y , so that for every b ∈ Y , ¬(b R a).
(We previously defined wellfoundedness for partial orders and not for arbitrary
relations.) So for example, every wellfounded relation is irreflexive: if a R a,
then the set {a} would have no R-minimal element.

Definition 5.10. Suppose R is a relation on a set X. Define sets Xα where
α ∈ ORD by transfinite recursion as follows.
• X0 = {x ∈ X : x is R-minimal in X}.
• Xα = ⋃_{β<α} Xβ ∪ {x ∈ X : x is R-minimal in X \ ⋃_{β<α} Xβ }.

Define rankR : X → ORD ∪ {∞} by rankR (x) = inf{α : x ∈ Xα } if there is some
α ∈ ORD such that x ∈ Xα . Otherwise, let rankR (x) = ∞.
Here ∞ should be understood as just a formal symbol. We interpret ∞ as
being larger than any ordinal, and ∞ < ∞. If C ⊆ ORD is empty, we define
inf(C) = ∞. The rank function has two key properties:
Lemma 5.11.
1. For all a ∈ X, rankR (a) = sup{rankR (b) + 1 : b R a}.
2. If a R b, then rankR (a) < rankR (b).
Proof. It is clear by induction that if α < β, then Xα ⊆ Xβ . So rankR (a) = α
iff a is R-minimal in X \ ⋃_{β<α} Xβ .
Now we prove (1). Assume a is R-minimal in X \ ⋃_{β<α} Xβ . Then for all
b R a, we must have rankR (b) < α (otherwise a would not be R-minimal in this
set), so rankR (b) + 1 ≤ α, so rankR (a) ≥ sup{rankR (b) + 1 : b R a}. We must
also have that rankR (a) ≤ sup{rankR (b) + 1 : b R a} since if α > rankR (b) for
all b R a, then clearly a is R-minimal in X \ ⋃_{β<α} Xβ if it is contained in this
set.
(2) follows from (1).
Lemma 5.12. A relation R on a set X is wellfounded if and only if rankR (a) ∈
ORD for every a ∈ X.
Proof. Suppose R is wellfounded. Let Y = {a ∈ X : rankR (a) = ∞}, and suppose
for a contradiction that Y is nonempty. Let a be an R-minimal element of Y .
Then rankR (a) = sup{rankR (b) + 1 : b R a} by Lemma 5.11.(1). Since a is
R-minimal in Y , every b R a has rankR (b) ∈ ORD, so this is the sup of a set of
ordinals, and is hence an ordinal. Contradiction.
Conversely, suppose rankR (a) ∈ ORD for every a ∈ X, and Y ⊆ X is nonempty.
Since ORD is wellfounded, let α be the minimal element of {rankR (a) : a ∈ Y },
and let x ∈ Y be such that rankR (x) = α. Then by Lemma 5.11.(2), x is R-minimal in Y .
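For a finite relation, the stages Xα of Definition 5.10 can be computed directly, which gives a concrete way to see both lemmas at work. This is a sketch under the assumption that R is given as a Python set of pairs (b, a) meaning b R a; the function name `ranks` and the use of `math.inf` for the formal symbol ∞ are choices made for this illustration.

```python
import math

def ranks(X, R):
    """Return the rank function of R on the finite set X, following Definition 5.10."""
    rank = {}
    remaining = set(X)
    alpha = 0
    while remaining:
        # Elements that are R-minimal in what is left after the earlier stages.
        minimal = {x for x in remaining
                   if not any((b, x) in R for b in remaining)}
        if not minimal:
            break  # no new elements will ever appear: the rest have rank ∞
        for x in minimal:
            rank[x] = alpha
        remaining -= minimal
        alpha += 1
    for x in remaining:
        rank[x] = math.inf
    return rank

# Proper divisibility on the divisors of 12 (wellfounded).
D = [1, 2, 3, 4, 6, 12]
divides = {(b, a) for a in D for b in D if a != b and a % b == 0}
divisor_ranks = ranks(D, divides)

# A relation with a cycle a R b R a (not wellfounded).
cyclic_ranks = ranks(["a", "b", "c", "d"], {("a", "b"), ("b", "a"), ("a", "c")})
```

In the first example every rank is finite, matching Lemma 5.12. In the second, the elements on or above the cycle receive rank ∞ and only the isolated point "d" gets rank 0.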
Exercise 5.13. Consider the set of polynomials with natural number
coefficients under the eventual domination ordering: p <∗ q if ∃x∀x′ >
x (p(x′) < q(x′)). Show that this is a wellordering isomorphic to ω^ω.
Exercise 5.14. Say an ordinal α is indecomposable iff there do not exist β, γ <
α such that α = β + γ. That is, α cannot be decomposed into the sum of two
smaller ordinals. So ω and ω^2 are examples of ordinals that are indecomposable.
Show the following are equivalent:
1. α is indecomposable.

2. α is a power of ω. That is, α = ω^ξ for some ξ.
3. ∀X ⊆ α, either (X, ∈) is isomorphic to α or (α \ X, ∈) is isomorphic to
α.
Exercise 5.15. Show that there is an ordinal ξ such that ω^ξ = ξ. Describe the
least such ordinal (it is called ε0 ).
Exercise 5.16. Show that in ZFC − Infinity (i.e. ZFC without the axiom of
infinity), the following are equivalent:
1. The axiom of infinity.

2. ∃x(∅ ∈ x ∧ (∀y ∈ x)({y} ∈ x)).


Exercise 5.17. Recall if A is a subset of a topological space X and x ∈ A, then
x is isolated in A if there is an open set U such that U ∩ A = {x}. A is
perfect if there are no isolated points in A. Prove the continuum hypothesis is
true for closed subsets of R:
1. Show that if A ⊆ R is closed, then there is a countable set C ⊆ A so that
A \ C is perfect and closed. [Hint: by transfinite recursion, let A_0 = A,
A_{α+1} = A_α \ {x : x is isolated in A_α }, and if λ is a limit, then A_λ =
⋂_{α<λ} A_α . Show that for every α, A_α is closed (it is equal to A minus an
open set). Then show there is a countable ordinal α such that A_{α+1} = A_α
and hence A_α is perfect (to see this, use the fact that R has a countable
basis: all intervals (a, b) with rational endpoints).]
2. Show that if A ⊆ R is a nonempty closed perfect set, then there is an
injection from P(N) to A.

5.1 Goodstein’s theorem*


A hereditary representation of a number in base b is that number expressed
as a sum of powers of b, where the exponents are also recursively represented
as sums of powers of b, and the exponents of the exponents and so on. So for
example:

537 = 2^9 + 2^4 + 2^3 + 2^0 = 2^{2^{2^1+2^0}+2^0} + 2^{2^{2^1}} + 2^{2^1+2^0} + 2^0

where the last representation is hereditary in base 2.


Given a number n, the Goodstein sequence G2 (n), G3 (n), . . . is defined by
setting G2 (n) = n, and Gb+1 (n) is obtained by writing Gb (n) in hereditary base
b, replacing all occurrences of b with b + 1, and then subtracting one. So for
example, if we start at the number 3

• G2 (3) = 2^1 + 2^0 = 3
• G3 (3) = 3^1 + 3^0 − 1 = 3^1 = 3
• G4 (3) = 4^1 − 1 = 3
• G5 (3) = 3 − 1 = 2
• G6 (3) = 2 − 1 = 1
• G7 (3) = 1 − 1 = 0

If we start at the number 4,


• G2 (4) = 2^{2^1} = 4
• G3 (4) = 3^{3^1} − 1 = 2 · 3^2 + 2 · 3 + 2 = 26
• G4 (4) = 2 · 4^2 + 2 · 4 + 2 − 1 = 2 · 4^2 + 2 · 4 + 1 = 41
• G5 (4) = 2 · 5^2 + 2 · 5 + 1 − 1 = 2 · 5^2 + 2 · 5 = 60
• G6 (4) = 2 · 6^2 + 2 · 6 − 1 = 2 · 6^2 + 6 + 5 = 83
• ...
• G_{3·2^{402653211} − 1} (4) = 0
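The first few terms of these sequences can be reproduced with a short program. The sketch below is one straightforward way to implement the definition; the function names `bump` and `goodstein` and the `max_terms` cutoff are assumptions introduced for this illustration (the cutoff is needed because the sequence starting at 4 is astronomically long).

```python
def bump(n, b):
    """Write n in hereditary base b and replace every occurrence of b by b + 1."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            # The exponent is itself rewritten hereditarily before being used.
            result += digit * (b + 1) ** bump(exponent, b)
        exponent += 1
    return result

def goodstein(n, max_terms):
    """Return [G_2(n), G_3(n), ...], stopping at 0 or after max_terms terms."""
    terms, b = [n], 2
    while terms[-1] > 0 and len(terms) < max_terms:
        terms.append(bump(terms[-1], b) - 1)
        b += 1
    return terms
```

For example, `goodstein(3, 10)` returns `[3, 3, 3, 2, 1, 0]` and `goodstein(4, 5)` returns `[4, 26, 41, 60, 83]`, matching the two lists above.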
Now the Goodstein sequence G2 (n), G3 (n), . . . starting at any number will
eventually reach 0. To see this, replace the expressions of each number Gb (n)
in hereditary base b, with the ordinal G∗b (n) obtained by replacing b with ω. So
for example,
• G∗2 (4) = ω^{ω^1}
• G∗3 (4) = ω^2 · 2 + ω · 2 + 2
• G∗4 (4) = ω^2 · 2 + ω · 2 + 1
• G∗5 (4) = ω^2 · 2 + ω · 2
• G∗6 (4) = ω^2 · 2 + ω + 5
• ...

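Then G∗2 (n), G∗3 (n), G∗4 (n), . . . is a strictly decreasing sequence of ordinals
(replacing b with b + 1 does not change the associated ordinal, while subtracting
one strictly decreases it), which must therefore eventually reach 0. We therefore
have Goodstein’s theorem: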
Theorem 5.18 (Goodstein’s theorem). For every natural number n, the se-
quence G2 (n), G3 (n), . . . eventually reaches 0.
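The key step of the argument, that the associated ordinals strictly decrease, can be observed numerically on an initial stretch of a Goodstein sequence. In this sketch an ordinal below ε0 is encoded by its Cantor normal form as a tuple of (exponent, coefficient) pairs in decreasing order of exponent, where each exponent is again such a tuple; with this encoding Python's built-in tuple comparison agrees with the ordering of the ordinals. The encoding and the names `to_ordinal`, `bump`, and `goodstein_with_bases` are assumptions made for this illustration.

```python
def to_ordinal(n, b):
    """Encode the ordinal obtained from n in hereditary base b by replacing b with ω."""
    terms, exponent = [], 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            terms.append((to_ordinal(exponent, b), digit))
        exponent += 1
    return tuple(reversed(terms))  # largest exponent first

def bump(n, b):
    """Write n in hereditary base b and replace every occurrence of b by b + 1."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            result += digit * (b + 1) ** bump(exponent, b)
        exponent += 1
    return result

def goodstein_with_bases(n, max_terms):
    """Return [(2, G_2(n)), (3, G_3(n)), ...], stopping at 0 or after max_terms terms."""
    pairs, b, value = [(2, n)], 2, n
    while value > 0 and len(pairs) < max_terms:
        value = bump(value, b) - 1
        b += 1
        pairs.append((b, value))
    return pairs

# The ordinals attached to the first 30 terms of the sequence starting at 4.
ordinals = [to_ordinal(value, b) for b, value in goodstein_with_bases(4, 30)]
strictly_decreasing = all(x > y for x, y in zip(ordinals, ordinals[1:]))
```

The natural numbers in this stretch of the sequence increase at every step, yet `strictly_decreasing` is true for the attached ordinals.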
This is a famous example of a theorem about the natural numbers which
cannot be proved in Peano Arithmetic PA.
Theorem 5.19 (Kirby-Paris, 1982). PA ⊬ Goodstein’s theorem

The crux of their proof is that while PA can formalize and prove some basic
facts about transfinite induction, it only has sufficient power to handle
transfinite induction through ordinals less than some finite height tower of
powers of ω. Proving Goodstein’s theorem truly requires being able to perform
transfinite induction through ε0 = sup{ω, ω^ω, ω^{ω^ω}, . . .}. Indeed,
PA + Goodstein’s theorem ⊢ Con(PA). Another famous theorem which is true but
not provable in PA is the Paris-Harrington theorem in Ramsey theory.

6 The cumulative hierarchy
We define the cumulative hierarchy V of sets by transfinitely iterating the
powerset operation.
Definition 6.1. For each ordinal α, we define a set Vα as follows:
• V0 = ∅
• Vα+1 = P(Vα ) for all α
• Vλ = ⋃_{β<λ} Vβ , for λ a limit ordinal.
Let V be the class ⋃_{α∈ORD} Vα = {x : ∃α x ∈ Vα }.
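The finite levels of the hierarchy are small enough to compute explicitly. The following sketch builds V0 through V4 as nested `frozenset` values; the helper names `powerset` and `V` are chosen for this illustration, and V5 (which has 65536 elements) is deliberately left out to keep the example light.

```python
def powerset(s):
    """Return the set of all subsets of the finite set s."""
    subsets = [frozenset()]
    for item in s:
        # Each existing subset appears once without and once with the new item.
        subsets += [sub | {item} for sub in subsets]
    return frozenset(subsets)

def V(n):
    """Return the level V_n of the cumulative hierarchy for a natural number n."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

levels = [V(n) for n in range(5)]
sizes = [len(level) for level in levels]
```

The sizes come out as 0, 1, 2, 4, 16, each level is a subset of the next, and each level is a transitive set.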
We have the following proposition:
Proposition 6.2.
1. Vα is transitive.
2. α ≤ β → Vα ⊆ Vβ .
3. α ∈ Vα+1 \ Vα .
4. Vα ∈ Vα+1 \ Vα .
Proof. We prove (1) by transfinite induction. For our base case, ∅ is transitive.
At successor steps suppose a ∈ Vα+1 and b ∈ a. Then a ⊆ Vα by definition of
Vα+1 , so since b ∈ a, we have b ∈ Vα . Since Vα is transitive by our induction
hypothesis, if c ∈ b, then c ∈ Vα . Hence, b = {c : c ∈ b} ⊆ Vα , so b ∈ Vα+1 . At
limit steps a union of transitive sets is transitive.
We prove (2) for each α by transfinite induction on β. Clearly Vα ⊆ Vα .
Assume Vα ⊆ Vβ . If x ∈ Vα , then x ∈ Vβ , so since Vβ is transitive
by (1), every y ∈ x is in Vβ . Hence, x = {y : y ∈ x} ⊆ Vβ , so x is an element of
Vβ+1 . (2) is clear for limit ordinals β.
To prove (3), note that {β ∈ Vα : β ∈ ORD} = α by our induction hypothesis
and (2). So α ∈ Vα+1 , and α ∉ Vα .
We prove (4) without the axiom of foundation. Clearly Vα ∈ Vα+1 . Suppose
Vα ∈ Vα . Then Vα ⊆ Vβ for some β < α. Since α ⊆ Vα by (3), we would have
α ⊆ Vβ , and hence α ∈ Vβ+1 . But then α ∈ Vα , contradicting (3).
We can now use the cumulative hierarchy to define rank for elements of V :
Definition 6.3. If x ∈ V , then rank(x) is the least ordinal α such that x ⊆ Vα .
So for instance, rank(Vα ) = α, and rank(α) = α.
Exercise 6.4. Rank can also be defined by transfinite induction: rank(x) =
⋃{rank(y) + 1 : y ∈ x}.
Our next goal is to prove that the Foundation axiom is equivalent to the
axiom that every set is in V . To do this, we first define the transitive closure of
a set.

Definition 6.5. For any set x, let x_0 = x, and x_{n+1} = x_n ∪ ⋃x_n . Then define
the transitive closure of x to be the set TC(x) = ⋃_{n<ω} x_n .
Exercise 6.6. TC(x) is the smallest transitive set containing x.
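For hereditarily finite sets, both the transitive closure and the rank can be computed by following the definitions literally. In this sketch sets are modelled as `frozenset` values, and `transitive_closure` and `rank` are names chosen for the illustration.

```python
def transitive_closure(x):
    """Iterate x_{n+1} = x_n ∪ ⋃x_n until nothing new is added."""
    closure = x
    while True:
        bigger = closure | frozenset(b for a in closure for b in a)
        if bigger == closure:
            return closure
        closure = bigger

def rank(x):
    """rank(x) = sup{rank(y) + 1 : y ∈ x}, with the sup of the empty set being 0."""
    return max((rank(y) + 1 for y in x), default=0)

empty = frozenset()                 # ∅
one = frozenset({empty})            # {∅}
single_one = frozenset({one})       # {{∅}}
x = frozenset({single_one})         # {{{∅}}}
tc = transitive_closure(x)
```

Here `tc` is the set {∅, {∅}, {{∅}}}, which is transitive and contains x as a subset, and `rank(x)` is 3.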
Proposition 6.7. Assume ZF − Foundation. Then the following are equivalent:
• The axiom of foundation
• ∀x(x ∈ V ).
Proof. Note that we didn’t use the axiom of foundation in our development of
the ordinals or V . For example, we defined an ordinal to be a transitive set
α so that ∈ is a strict wellorder of α. So the fact that α ∉ α follows by definition
(otherwise the ∈ ordering would not be strict), and not by foundation.
First, let’s prove that ∀x(x ∈ V ), assuming the axiom of foundation. Let x
be an arbitrary set. We claim for all y ∈ TC(x), there is an α such that y ∈ Vα .
Then setting ξ = sup{rank(y) + 1 : y ∈ TC(x)} (which exists by the axiom
of replacement since it is the sup of the range of the function y ↦ rank(y) + 1
on the set TC(x)), x ⊆ TC(x) ⊆ Vξ , so x ∈ Vξ+1 . To prove the claim, suppose
for a contradiction that y is an ∈-minimal element of TC(x) such that y ∉ V .
Since TC(x) is transitive, every element of y is in TC(x), and so by minimality
every element of y is in Vα for some α. Hence if β = sup{rank(z) + 1 : z ∈ y},
then y ⊆ Vβ , hence y ∈ Vβ+1 , contradiction.
Conversely, assume ∀x(x ∈ V ) and suppose x is a nonempty set. We want to show
x has an ∈-minimal element. Let α be the least element of {rank(y) : y ∈ x} and let
y ∈ x be such that rank(y) = α. Since rank(y) = sup({rank(z) + 1 : z ∈ y}),
it must be that no element of y is in x, otherwise α would not be the minimal
rank of an element of x.
One use of ranks is that they allow us to select set-sized subsets of
proper classes in a canonical way. Suppose C is a nonempty proper class. Then
let Ĉ = {x ∈ C : ∀y ∈ C rank(x) ≤ rank(y)}. Then Ĉ is always a nonempty
subset of C. This idea is sometimes known as Scott’s trick and allows us to do
things like formalize ultrapowers of proper class inner models in a coherent way.
We can use the above trick to show that the axiom of Replacement is equiv-
alent to Separation and Collection. Recall that the only difference between
replacement and collection is that replacement says that there is a unique y
such that ϕ(x, y, w1 , . . . , wn ), whereas collection allows there to be a proper
class of such y. However, we can use replacement to prove collection by taking
the elements of minimal rank satisfying the formula.
Exercise 6.8. Assume ZF − Separation − Replacement. Show the following are
equivalent:
1. Separation + Collection
2. The axiom schema of replacement: for each formula ϕ:

∀X∀w1 , . . . , wn [((∀x ∈ X)(∃!y)ϕ(x, y, w1 , . . . , wn )) →


∃Z∀y(y ∈ Z ↔ (∃x ∈ X)ϕ(x, y, w1 , . . . , wn ))]

Exercise 6.9. TC(x) is the smallest transitive set containing x. That is, for
all sets x, if y is any transitive set containing x, then TC(x) ⊆ y.
Exercise 6.10. Suppose κ is a strongly inaccessible cardinal (i.e. κ is regular,
and λ < κ implies 2^λ < κ).

1. Show that for all α < κ, |Vα | < κ.


2. Show that |Vκ | = κ.

7 The Mostowski collapse
Definition 7.1. Suppose R is a relation on a class X. Recall R is wellfounded
if for every nonempty set Y ⊆ X, there is an R-minimal element (i.e. some
y ∈ Y so that for all z ∈ Y , (z, y) ∉ R). For each x ∈ X, we use the notation
predR (x) to denote the class of predecessors of x, namely predR (x) = {y ∈
X : (y, x) ∈ R}. We say R is setlike if for every x ∈ X, predR (x) is a set. We
say R is extensional if predR (x) = predR (y) → x = y for all x, y ∈ X.
So for example, the axiom of extensionality says that the relation ∈ is ex-
tensional.
Theorem 7.2. Suppose R is an extensional wellfounded relation on a class
X. Then there is a transitive class Y and an isomorphism f between (X, R)
and (Y, ∈). That is, f : X → Y is a bijection such that for all x, y ∈ X,
(x, y) ∈ R ↔ f (x) ∈ f (y). Furthermore, f and Y are unique.
One way to prove this is by transfinite recursion on the relation R; you can
perform transfinite recursion along any wellfounded relation (not just on the
ordinals). You’ll do this as part of your homework. Then you can recursively
define the function f by f (x) = {f (y) : (y, x) ∈ R}.
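On a finite set this recursive definition can be run directly. The sketch below assumes R is a wellfounded relation on a finite set X, given as a Python set of pairs (y, x) meaning y R x; on an illfounded relation the recursion would not terminate. The name `mostowski_collapse` and the two sample relations are chosen for this illustration.

```python
def mostowski_collapse(X, R):
    """Return the dict f with f[x] = {f[y] : (y, x) in R}, for wellfounded R on finite X."""
    f = {}

    def collapse(x):
        if x not in f:
            f[x] = frozenset(collapse(y) for y in X if (y, x) in R)
        return f[x]

    for x in X:
        collapse(x)
    return f

# An extensional wellfounded relation: distinct points have distinct predecessor sets.
X1 = ["a", "b", "c", "d"]
R1 = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
f1 = mostowski_collapse(X1, R1)

# A wellfounded relation that is not extensional: 2 and 3 have the same predecessors.
X2 = [1, 2, 3]
R2 = {(1, 2), (1, 3)}
f2 = mostowski_collapse(X2, R2)
```

In the first example f1 is injective, its range is a transitive set, and y R x holds exactly when f1[y] ∈ f1[x]. In the second, f2[2] and f2[3] coincide, which shows why extensionality is needed for f to be a bijection.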

Figure 6: An example of the Mostowski collapse

We give a different proof.


Proof. Note that we are not assuming R is transitive! The first step in our proof
will be handling this issue. Let R∗ be the relation on X where (x, y) ∈ R∗ iff
there is a finite sequence x0 , . . . , xn with n ≥ 1 such that x = x0 , y = xn , and
xi R xi+1 for all i < n. Now R∗ is clearly transitive and since R is wellfounded,
it follows that
R∗ is also wellfounded, and is thus a wellfounded partial order. Let Xα = {x ∈
X : rankR∗ (x) ≤ α}, where rank is as defined in Definition 5.10 for wellfounded
partial orders. Note that (x, y) ∈ R implies rankR∗ (x) < rankR∗ (y).
We define functions fα : Xα → V by transfinite induction, where α < β
implies fα ⊆ fβ . Suppose we have defined fα for all α < β, such that:

• fα is an injection,
• ran(fα ) is transitive,
• fα (x) ∈ fα (y) iff (x, y) ∈ R for all x, y ∈ dom(fα ), and
• rankR∗ (x) = rank(fα (x)) for all x ∈ dom(fα ).

Then for x such that rankR∗ (x) = β, define fβ (x) = {f_{rankR∗ (y)} (y) : (y, x) ∈
R}. Now by definition, each element of fβ (x) is an element of ran(fα ) for
some α < β. So ran(fβ ) is transitive. We also have rank(fβ (x)) = β, since
rank(fβ (x)) = sup{rank(y) + 1 : y ∈ fβ (x)}. Since rank(fβ (x)) = β, to check
that fβ is an injection we just need to ensure that if x, x′ are distinct and
rankR∗ (x) = rankR∗ (x′) = β, then fβ (x) ≠ fβ (x′). This follows since fα is
injective for α < β, and since R is extensional.
To finish, let f = ⋃_{α∈ORD} fα , and let Y = ran(f ).
Suppose X is a set and E is a relation on X so that the model M = (X; E) is
a model of ZFC, where we interpret E as the ∈ relation, and X as the universe of
the model. Then E must be an extensional relation. If E is also wellfounded,¹¹
then (X, E) is isomorphic to a transitive set Y equipped with the ∈ relation.
We say a model of the form (Y, ∈ ↾ Y ) is a transitive model.
It is simple to see that ZFC does not prove there is a transitive model of
ZFC. (Of course this is also a consequence of Gödel’s theorem.) If that were
true, then there would exist an infinite descending sequence of transitive models
under the ∈ relation. Hence, if there is a transitive model of ZFC, then there is
a transitive model which does not contain any transitive model of ZFC.¹² More
strongly, we’ll eventually see ZFC + Con(ZFC) does not prove there is a transitive
model of ZFC.¹³
11 Note that assuming there is a model of ZFC, we can find a model M = (X; E) so that E is an illfounded relation on X. This is by compactness. Add countably many constants c0, c1, . . . to our language of ZFC and let ϕn be the sentence cn ∈ cn−1 ∈ . . . ∈ c0. Then ZFC + {ϕn : n ∈ ω} is a consistent theory by the compactness theorem. So it has a model. Note that this illfounded model will still be a model of the axiom of regularity; M “thinks” that E is wellfounded. It simply does not contain a set with no E-minimal element.
12 In fact, it is a theorem of Shepherdson (which was rediscovered by Cohen) that if there is a transitive model of ZFC, then there is a minimal transitive model M in the sense that for all transitive models N of ZFC, M ⊆ N. This minimal model M has a simple description; it is Lα, where α is the inf of the heights of transitive models of ZFC.
13 To see this, suppose ZFC + Con(ZFC) and let M be a transitive model of ZFC that does not contain any other transitive model of ZFC (which must exist since ∈ is wellfounded). Then it is easy to see that ω in M must be equal to the real ω, and since Con(ZFC) is true in the universe (i.e. there is no natural number coding a proof of a contradiction in ZFC), then Con(ZFC) is still true in M, and so M ⊨ ZFC + Con(ZFC), but M does not contain any transitive model of ZFC.

8 The axiom of choice
Recall that the axiom of choice says that if X is a set whose elements are
nonempty and pairwise disjoint, then there is a choice set C so C contains
exactly one element from each set in X. There are a few special cases of the
axiom of choice which are true in ZF. For example, ZF proves the axiom of
choice is true:

1. When X is finite (by induction).
2. When there is a linear ordering of ⋃X and every x ∈ X is finite
(pick the least element in each x ∈ X).
However, ZF without choice cannot prove the following:

1. There is a choice set for {{x + r : r ∈ Q} : x ∈ R}.
2. There is a choice set for {{y ∈ P(N) : x △ y is finite} : x ∈ P(N)}.
3. There is a wellordering of R.
In this section, we’ll prove some basic consequences of the axiom of choice.

Definition 8.1. Suppose (P, <P ) is a partial ordering. We say that a subset
X ⊆ P is a chain if for all x, y ∈ X, either x ≤P y or y ≤P x. We say that
X ⊆ P is an antichain if for all x, y ∈ X, x ≮P y and y ≮P x. If X ⊆ P ,
then we say that a ∈ P is an upper bound for X if for all b ∈ X, b ≤P a. We
say an element a ∈ P is maximal in P if there is no b ∈ P such that a <P b.

Lemma 8.2 (Zorn). Suppose (P, <P ) is a nonempty partial ordering so that
for all nonempty chains X, there is an upper bound for X in P . Then P has a
maximal element.
Proof. By assumption, given any chain X ⊆ P, there exists an upper bound a for X (if X is empty, any element of the nonempty set P is an upper bound). Note that if there is no upper bound a′ for X with a′ ∉ X, then every upper bound a for X lies in X and is maximal in P: any b with a <P b would be an upper bound for X with b ∉ X. Hence, for each chain X, the set

{(X, a) : a is an upper bound for X, and either a ∉ X or a is maximal in P}

is nonempty. So consider the collection of such sets:

{{(X, a) : a is an upper bound for X, and either a ∉ X or a is maximal in P} : X is a chain}

These sets are nonempty, and any two elements of this collection are disjoint; they are sets of ordered pairs with different first coordinates. Hence, by the axiom of choice, there is a function G so dom(G) = {X ⊆ P : X is a chain} and so that G(X) is an upper bound for X, and either G(X) ∉ X or G(X) is maximal in P.
Now by transfinite recursion, consider the unique function F such that F(α) = G(ran(F ↾ α)). Hence, F is a class function from ORD to P. We claim that for each α, ran(F ↾ α) = {F(β) : β < α} is a chain. This is true by transfinite induction. It is true for α = 0 since ∅ is a chain. It is true at limit α since an increasing union of chains is a chain. Finally, if it is true at α, it is true at α + 1, since F(α) is an upper bound for the chain {F(β) : β < α}, and the union of a chain and an upper bound for it is also a chain.
Now F cannot be an injection, since ORD is a proper class, and there cannot
be an injective class function from a proper class to a set (apply the axiom of
replacement to its inverse to get a contradiction). So there exists ξ < α such
that F (ξ) = F (α). Then the chain X = {F (β) : β < α} includes F (ξ) = F (α).
Hence, by the definition of G, since F (α) is an upper bound for X that is
contained in X, we must have that F (α) is maximal in P .
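In a finite partial order no choice is needed, but the shape of the argument can still be traced. The sketch below is an illustration only: the names `upper_bounds`, `is_maximal`, `G`, and `climb` are hypothetical, and `G` simply takes the first suitable element where the proof uses the axiom of choice.

```python
def upper_bounds(P, leq, chain):
    """All elements of P that are above every element of the chain."""
    return [a for a in P if all(leq(b, a) for b in chain)]


def is_maximal(P, leq, a):
    """True if no element of P lies strictly above a."""
    return not any(leq(a, b) and a != b for b in P)


def G(P, leq, chain):
    """An upper bound outside the chain if one exists, else one inside it."""
    bounds = upper_bounds(P, leq, chain)
    outside = [a for a in bounds if a not in chain]
    return outside[0] if outside else bounds[0]


def climb(P, leq):
    """Iterate F(n) = G({F(k) : k < n}) until a value repeats."""
    chain = []
    while True:
        a = G(P, leq, chain)
        if a in chain:
            # No upper bound lies outside the chain, so a is maximal.
            return a
        chain.append(a)


# Hypothetical example: {1, ..., 12} ordered by divisibility.
P = list(range(1, 13))
divides = lambda a, b: b % a == 0
top = climb(P, divides)
```

Here the chain built is 1, 2, 4, 8, and the repetition occurs at 8, which is maximal under divisibility in this set.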
The following is a special case of Zorn’s lemma for the ⊆ relation on subsets
of P(X):
Corollary 8.3 (The Hausdorff Maximality principle). Suppose X ≠ ∅ and A ⊆ P(X) is nonempty. Suppose that for all B ⊆ A, if ∀x, y ∈ B, x ⊆ y or y ⊆ x, then there is some z ∈ A so that x ⊆ z for all x ∈ B. Then there exists some a ∈ A so that there is no b ∈ A so that a ⊊ b.
Let’s prove that every set can be wellordered, assuming AC.
Theorem 8.4 (Zermelo’s wellordering theorem). There is a wellordering of
every set X.
Proof. It suffices to find a bijection from an ordinal to X.
Let y be such that y ∉ X. Let G : P(X) → X ∪ {y} be such that G(A) ∈ A for all nonempty A ⊆ X, and G(∅) = y. Such a function G exists by the axiom of choice applied to the set {{(A, a) : a ∈ A ∨ (A = ∅ ∧ a = y)} : A ∈ P(X)}. Then by transfinite recursion, we can find a function F : ORD → X ∪ {y} so that F(α) = G(X \ ran(F ↾ α)).
We claim that if α < β and F(α) ∈ X, then F(β) ≠ F(α). This is true by definition of F(β), since F(α) ∈ ran(F ↾ β), and since G(A) ∈ A unless A = ∅, in which case G(∅) = y ∉ X.
Since ORD is a proper class, there cannot be an injection from ORD to the set X ∪ {y}. Thus, there must be α < β so that F(α) = F(β). By the above claim, we must have that F(α) = y. Let γ be the least ordinal such that F(γ) = y, so that X \ ran(F ↾ γ) = ∅ and hence ran(F ↾ γ) = X. Then by our claim, F ↾ γ is an injection from γ to X, and hence a bijection from γ to X.
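For a finite set the recursion F(α) = G(X \ ran(F ↾ α)) is just a loop that repeatedly removes the chosen element. A minimal sketch, assuming a choice function `choose` is supplied by the caller (here the built-in `max`, purely for illustration):

```python
def wellorder(X, choose):
    """Enumerate a finite set X using a choice function on its nonempty subsets.

    The n-th element of the result plays the role of F(n) = G(X \\ ran(F | n)).
    """
    enumeration = []
    remaining = set(X)
    while remaining:
        x = choose(frozenset(remaining))
        enumeration.append(x)
        remaining.remove(x)
    return enumeration


# Hypothetical example: `max` serves as the choice function on sets of strings.
fruits = {"pear", "fig", "apple"}
order = wellorder(fruits, max)
```

The loop stops exactly when nothing is left to choose from, which corresponds to the least γ with F(γ) = y in the proof.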
One could also prove the wellordering theorem from the Hausdorff maximal-
ity principle by taking a maximal injection from an ordinal to a subset of X.
The class of injections from ordinals to X is a set by Hartog’s theorem in the
next section.
The following is one of the oldest open problems in set theory:
Open Problem 8.5. Assume ZF. Are the following equivalent:
• (The partition principle) for all sets X and Y , if there is a surjection from
X to Y , then there is an injection from Y to X.
• The axiom of choice.

8.1 Fragments of the axiom of choice*
There are small fragments of the axiom of choice which are still not provable
in ZF. For example, the axiom of countable choice ACω says that if X is
a countable set whose elements are nonempty and pairwise disjoint, then there
is a choice set C for X. The axiom of dependent choice (DC) says that if R is a binary relation on a nonempty set X and for all a ∈ X there exists b ∈ X such that a R b (i.e. R is entire), then there is a sequence ⟨xn : n ∈ ω⟩ such that for all n ∈ ω, xn R xn+1.
It is easy to see that in ZF, AC implies DC which implies ACω. To prove DC assuming AC, let G be a function on X so that for all a ∈ X, a R G(a), which exists by AC. Then use recursion to find some F so that F(0) ∈ X and F(n + 1) = G(F(n)). To prove ACω from DC, take a bijection f from ω to a countably infinite set X, and apply DC to the relation R on ⋃X where a R b if there exists n so that a ∈ f(n) and b ∈ f(n + 1). The resulting sequence picks one element from f(n) for all n from some point on, and the finitely many remaining sets are handled by finite choice. None of the implications AC → DC → ACω can be reversed.
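The recursion F(n + 1) = G(F(n)) used to derive DC from AC can be sketched as a generator. This is only an illustration under stated assumptions: the names `dependent_sequence` and `next_prime` are hypothetical, the relation is "a R b iff b is a prime greater than a" on the natural numbers, and G is an explicit function, so no choice is involved.

```python
from itertools import islice


def dependent_sequence(x0, G):
    """Yield x0, G(x0), G(G(x0)), ...; each term is R-related to the next."""
    x = x0
    while True:
        yield x
        x = G(x)


def next_prime(a):
    """A witness G(a) for the relation: the least prime greater than a."""
    b = a + 1
    while b < 2 or any(b % d == 0 for d in range(2, int(b ** 0.5) + 1)):
        b += 1
    return b


first_terms = list(islice(dependent_sequence(1, next_prime), 6))
```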
ZF by itself is a pretty dismal theory. For example ZF does not prove that
a countable union of countable sets is countable. Indeed, ZF cannot even prove
that R is not a countable union of countable sets. A recent theorem of Gitik is
that assuming certain large cardinal axioms, there is a model of ZF where there
are no uncountable regular cardinals.
But these are scary bedtime stories; mathematicians almost always work in
ZF + DC, even when they are being careful about the use of the axiom of choice.
DC is absolutely essential for developing lots of basic mathematics (e.g. all of
analysis), and DC does not have any of the “pathological” consequences of AC.
For example, DC does not imply the Banach-Tarski paradox, or existence of a
nonmeasurable subset of R, or a wellordering of the real numbers.

9 Cardinality in ZF
We’ll begin our discussion of cardinality in ZF without assuming the axiom
of choice. We will sometimes emphasize that a theorem is true just in ZF by
writing it in the assumptions, but all the definitions and theorems in the section
are true without assuming the axiom of choice.
Definition 9.1. If X and Y are sets, say that X and Y have the same car-
dinality and write |X| = |Y | if there is a bijection from X to Y . Formally,
cardinality is an equivalence relation, and |X| denotes the equivalence class un-
der this equivalence relation14 . Say that X has cardinality less than or equal to
Y and write |X| ≤ |Y | if there is an injection from X to Y .
Assuming ZF, if X is nonempty and |X| ≤ |Y|, then there is a surjection from Y to X (but the converse is not provable in ZF). If we also assume the axiom of choice, then for nonempty X, |X| ≤ |Y| if and only if there is a surjection from Y to X.
Exercise 9.2. Suppose X and Y are nonempty sets.
1. (ZF) If |X| ≤ |Y |, then there is a surjection from Y to X.
2. (ZFC) |X| ≤ |Y | if and only if there is a surjection from Y to X.
Theorem 9.3 (ZF, Cantor-Schröder-Bernstein). Assume X and Y are sets,
|X| ≤ |Y | and |Y | ≤ |X|. Then |X| = |Y |.
Proof. Let f : X → Y and g : Y → X be injections. We define sequences of sets ⟨An : n ∈ ω⟩ and ⟨Bn : n ∈ ω⟩ simultaneously by induction. Let A0 = X, Bn = Y \ f(An), and An+1 = X \ g(Bn). Then it is easy to see that the An are decreasing and the Bn are increasing.
Let A = ⋂n An, and B = ⋃n Bn. Now define a function h : X → Y by h = (f ↾ A) ∪ (g⁻¹ ↾ g(B)); this is defined on all of X since X \ A = ⋃n g(Bn) = g(B). To see h is a bijection note that, since f is injective, f(A) = f(⋂n An) = ⋂n f(An) = ⋂n (Y \ Bn) = Y \ ⋃n Bn = Y \ B. So since f(A) and B are disjoint h is one-to-one, and since f(A) ∪ B = Y, we see h is onto.
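The construction of h can be made concrete when membership in A is decidable. The sketch below is an illustrative assumption, not part of the notes: it takes X = Y = N with the injections f(n) = g(n) = n + 1, and tests x ∈ A by tracing x backwards through g⁻¹ and f⁻¹. If the trace stops at an element outside ran(g), then x ∈ A; if it stops at an element outside ran(f), that element lies in B0 and x ∈ g(B).

```python
def f(x):
    return x + 1  # injection X -> Y


def g(y):
    return y + 1  # injection Y -> X


def f_inv(y):
    return y - 1 if y >= 1 else None  # None when y is not in ran(f)


def g_inv(x):
    return x - 1 if x >= 1 else None  # None when x is not in ran(g)


def in_A(x):
    """Decide whether x lies in A, the intersection of the sets A_n."""
    while True:
        y = g_inv(x)
        if y is None:
            return True   # x is not in ran(g), so x is in every A_n
        x = f_inv(y)
        if x is None:
            return False  # y is in B_0, so the original point is in g(B)


def h(x):
    """The bijection h = (f restricted to A) union (g^{-1} restricted to g(B))."""
    return f(x) if in_A(x) else g_inv(x)


values = [h(x) for x in range(10)]
```

With these particular injections every backward trace terminates, A is the set of even numbers, and h swaps 2k with 2k + 1.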
In ZF without assuming choice, there can be many incomparable cardinalities. For example, it is possible that |R| ≰ |ω1| and |ω1| ≰ |R|. An important class of cardinalities are those that come from the sizes of ordinals.
Definition 9.4. Say that an ordinal α is a cardinal if |α| ≠ |β| for all β < α.
We will use Greek letters κ, λ, . . . for cardinals. By transfinite recursion, let ωα
be the αth infinite ordinal which is a cardinal. That is, ωα is the least ordinal
whose cardinality is greater than ωβ for all β < α. Let ℵα denote the cardinality
of ωα .
So for example, ω0 = ω, and ω1 is the least uncountable ordinal.
Exercise 9.5. Show that if α is a limit ordinal, then ωα = supβ<α ωβ .
14 When it is convenient to have a set representing this cardinality, we can use Scott’s trick and take all elements of minimal rank that have a given cardinality.

We have the following theorem relating the cardinality of arbitrary sets to
ordinals.
Theorem 9.6 (ZF, Hartog’s theorem). For each set X there is an ordinal which
cannot be injected into X.
Proof. Consider κ = {α : there is an injection from α to X}. This is equal to
{α : there is a wellordering of a subset of X of ordertype α}, which is a set by
the axiom of replacement. It is closed downward, and is hence an ordinal. Since
κ is not an element of itself, it does not inject into X.
The least ordinal that cannot be injected into X is called the Hartog’s
number of X, and is denoted h(X). It is always a cardinal.
Note that for all ordinals α, there is an ordinal λ of greater cardinality than
α: it is {β : |β| ≤ |α|} = h(α). We have special notation for this:
Definition 9.7. If κ is a cardinal, we let κ⁺ denote the least ordinal of cardinality greater than κ.
We’ve already proved:
Theorem 9.8 (Cantor, ZF). For every set X, |X| < |P(X)|.
So for every cardinality, there is a strictly larger cardinality. A different way to map a set to a set of larger cardinality is the map X ↦ X ∪ h(X). In ZFC for infinite cardinals κ this is just the map κ ↦ κ⁺.
We have the following operations that we define on cardinalities:
Definition 9.9. Suppose X and Y are sets. Then |X| + |Y| is defined to be the cardinality of X ⊔ Y, the disjoint union of X and Y.15 |X| · |Y| is defined to be the cardinality of the product X × Y. |Y|^|X| is defined to be the cardinality of the set of functions from X to Y.
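On finite sets these definitions reduce to ordinary arithmetic, which gives a quick sanity check. A minimal sketch; the helper names `disjoint_union` and `functions` are hypothetical, and functions are represented as dictionaries:

```python
from itertools import product


def disjoint_union(X, Y):
    """Tag elements with 0 or 1 so that shared elements are counted twice."""
    return {(0, x) for x in X} | {(1, y) for y in Y}


def functions(X, Y):
    """All functions from X to Y, each represented as a dict."""
    domain = sorted(X)
    return [dict(zip(domain, values))
            for values in product(sorted(Y), repeat=len(domain))]


X, Y = {1, 2, 3}, {2, 3}
size_sum = len(disjoint_union(X, Y))       # |X| + |Y|
size_product = len(set(product(X, Y)))     # |X| · |Y|
size_power = len(functions(X, Y))          # |Y|^|X|
```

Note that X and Y overlap in this example, which is why the disjoint union rather than the plain union is used for addition.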
Exercise 9.10.
1. The operations +, ·, exp are well defined on cardinalities. If X, X′, Y, Y′ are sets and |X| = |X′| and |Y| = |Y′|, then |X| + |Y| = |X′| + |Y′|, |X| · |Y| = |X′| · |Y′|, and |X|^|Y| = |X′|^|Y′|.
2. The operations are nondecreasing in every coordinate. Suppose X, X′, Y are sets, and |X| ≤ |X′|. Then |X| + |Y| ≤ |X′| + |Y|, |X| · |Y| ≤ |X′| · |Y|, |X|^|Y| ≤ |X′|^|Y|, and (provided Y is nonempty) |Y|^|X| ≤ |Y|^|X′|.
3. The operations + and · on cardinalities are commutative and associative.
Our next goal is to show that the operations of addition and multiplication of
cardinality of infinite ordinals are trivial, and ℵα + ℵβ = ℵα · ℵβ = max(ℵα , ℵβ ).
We begin with some easy facts about how cardinality behaves on ordinals:
Exercise 9.11. For all infinite ordinals α, |α| = |α + 1|.
15 Recall X ⊔ Y = ({0} × X) ∪ ({1} × Y).

We have the following:
Lemma 9.12. (ZF) If α is an infinite ordinal, then |α × α| = |α|.
Proof. We prove |α × α| = |α| by transfinite induction on α. For our base case,
|ω × ω| = |ω| (e.g. via the pairing function f((n, m)) = ½(n + m)(n + m + 1) + m).
Now suppose that for all infinite β < α we have |β × β| = |β|.
Case 1: α has the same cardinality as β for some β < α. Then |α| · |α| =
|β| · |β| = |β| = |α| by our induction hypothesis.
Case 2: suppose α is a cardinal. Then we claim that the following ordering ≺ on α × α is a wellordering of ordertype α. Define (β, γ) ≺ (β′, γ′) iff

• max(β, γ) < max(β′, γ′), or
• max(β, γ) = max(β′, γ′) and β < β′, or
• max(β, γ) = max(β′, γ′), β = β′ and γ < γ′.

It is easy to check this is a wellordering.
Suppose (β, γ) ∈ α × α. It suffices to show that the initial segment {(β′, γ′) : (β′, γ′) ≺ (β, γ)} of ≺ on α × α has ordertype less than α. For then the ordertype map sending each (β, γ) ∈ α × α to the ordertype of this initial segment is an injection from α × α to α.
Let λ = max(β, γ) + 1, so λ < α since α is a limit ordinal. Since (β′, γ′) ≺ (β, γ) implies β′, γ′ < λ, the initial segment given by (β, γ) has cardinality ≤ |λ × λ|, which is less than |α|: if λ is finite this is clear, and otherwise |λ × λ| = |λ| < |α| by our induction hypothesis and since α is a cardinal. So this initial segment of the wellordering ≺ on α × α must have ordertype less than α. Hence, every element of α × α is mapped to an ordinal less than α, so |α| · |α| ≤ |α|, and hence |α| · |α| = |α|.
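Both ingredients of this proof can be checked by brute force on ω × ω. The sketch below is illustrative: the names `cantor_pair`, `canonical_key`, and `canonical_rank` are hypothetical, and the closed form in `canonical_rank` is an assumption of the sketch, obtained by counting the m² pairs with smaller maximum, rather than something stated in the notes.

```python
def cantor_pair(n, m):
    """The pairing function from the base case of the lemma."""
    return (n + m) * (n + m + 1) // 2 + m


def canonical_key(pair):
    """Sort key realising the ordering: compare max, then beta, then gamma."""
    b, g = pair
    return (max(b, g), b, g)


def canonical_rank(pair):
    """Number of predecessors of pair in the ordering on omega x omega."""
    b, g = pair
    m = max(b, g)
    return m * m + b if b < m else m * m + m + g


N = 30
square = [(b, g) for b in range(N) for g in range(N)]
ordered = sorted(square, key=canonical_key)
```

Every pair has only finitely many predecessors, so the ordering on ω × ω has ordertype ω; this is the finite shadow of the claim that initial segments have ordertype less than α.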
Exercise 9.13. (ZF) Suppose X and Y are infinite sets. Then |X| + |Y | ≤
|X| · |Y |.

Corollary 9.14. (ZF) |ℵα | + |ℵβ | = |ℵα | · |ℵβ | = |ℵmax(α,β) |.


Proof. Without loss of generality, assume α ≤ β. Then

|ℵβ | ≤ |ℵα | + |ℵβ | ≤ |ℵα | · |ℵβ | ≤ |ℵβ | · |ℵβ | = ℵβ .

where the last equality is by the above Lemma, since ℵβ is an infinite ordinal.
Cardinal addition and multiplication on nonwellorderable sets in ZF can be
quite interesting. Indeed, if cardinal addition and multiplication is too simple
on infinite sets, then the axiom of choice must be true.
Theorem 9.15 (ZF). Suppose that for all infinite sets X and Y , |X| + |Y | =
|X| × |Y |. Then the axiom of choice is true.
Proof. Suppose that for all infinite sets X and Y, |X| + |Y| = |X| × |Y|. It suffices to show that every infinite set can be wellordered. Assume X is infinite, and f is an injection from X × h(X) to X ⊔ h(X).
Case 1: there is some x ∈ X so that for all α ∈ h(X), f(x, α) ∈ X. Then α ↦ f(x, α) is an injection from h(X) to X, which is a contradiction.
Case 2: For all x ∈ X, there is some α ∈ h(X) such that f(x, α) ∈ h(X). In this case, let g(x) = f(x, α), where α is least such that f(x, α) ∈ h(X). Then g is an injection from X to h(X), and hence X is wellorderable.
Using similar tricks with Hartog’s number, one can prove the following:
Exercise 9.16 (ZF). The GCH in ZF is the statement that for all infinite sets X, if Y is such that |X| ≤ |Y| ≤ 2^|X|, then |Y| = |X| or |Y| = 2^|X|. Show that in ZF, if GCH is true, then AC is true.

9.1 Cardinality in models of the axiom of determinacy*


The axiom of determinacy is an axiom that contradicts the axiom of choice, and
paints a much different picture of the universe of sets. It is an important topic
of study in modern set theory. Models of ZF + DC + AD in some sense contain
only well behaved definable sets, have very regular behavior and no pathologies.
They are compatible with large cardinals. We briefly discuss what cardinality
is like in these models: they are beautiful examples of models of ZF without
choice.
We’ve already seen two completely different proofs of the existence of un-
countable sets: R is uncountable by Cantor’s diagonal argument. ω1 is uncount-
able since it is the set of all countable ordinals, and cannot be an element of
itself.
In natural models of the axiom of determinacy, R cannot be wellordered and
the cardinalities |R| and |ω1 | are incomparable; neither injects into the other.
However, all other sets are uncountable by virtue of containing a copy of one of
these two sets. Thus, in some sense, these two proofs for R and ω1 are the only
proofs that there exists an uncountable set.
Theorem 9.17 (Woodin, see [CK] and [H]). In “natural” models of AD (such as L(R) or L(P(R)) assuming large cardinals), if X is an uncountable set, then |R| ≤ |X| or |ω1| ≤ |X|.
Note that assuming the axiom of choice, for nonempty X, |X| ≤ |Y | iff
there is a surjection from Y to X. However, this is not true in ZF. If we have a
surjection from Y to X, we need the axiom of choice to choose one element from
each preimage to construct an injection from X to Y. Indeed, it is possible for a set X
and an equivalence relation E on X to satisfy |X| < |X/E|! Taking
quotients can increase the cardinality of sets! In these models of AD, important
cardinalities arise from equivalence relations E on the real numbers such that
|R| < |R/E|.

9.2 Resurrecting Tarski’s theory of cardinal algebras*


In the 1930s, Tarski investigated the theory of cardinal addition in ZF from an
algebraic viewpoint. He created an axiomatic framework for investigating the

addition operation on cardinalities in set theory without the full axiom of choice
(which makes cardinal addition trivial).16
There are a few surprising and nontrivial theorems in this setting. One of
the most famous is the following:
Theorem 9.18 (Lindenbaum and Tarski). Suppose X and Y are sets, n > 0 is a natural number, and |n × X| = |n × Y|. Then |X| = |Y|.
Even the case n = 3 in this theorem is not at all straightforward, and John
Conway and Peter Doyle published a famous paper about this theorem called
“Division by three” [CD].
Tarski published a volume on cardinal algebras in 1949 which contains many
pages of tedious algebraic manipulations, but with some nice applications, such
as the above theorem. Tarski’s theory seems to have been largely forgotten for
more than half a century. However, Kechris and Macdonald realized recently
that there are many natural cardinal algebras being studied in modern set the-
ory, such as in the theory of Borel equivalence relations, or equidecomposability
in group actions with respect to a given σ-algebra [KM] (and which have a similar
flavor in some ways to cardinality in models of set theory without the axiom of
choice). Tarski’s theory yields many new theorems in these settings which were
not previously known.
16 Precisely, a cardinal algebra is a triple (A, +, Σ) where
• A is a set,
• + is a binary function on A, and
• Σ : A^ω → A is a function
satisfying a short list of axioms (for example, one of the axioms is Σ_{n=0}^∞ a_n = a_0 + Σ_{n=0}^∞ a_{n+1} for every sequence ⟨a_n : n ∈ ω⟩). Here we think of A as a set of cardinalities, and + and Σ as addition of two and countably many cardinalities.

10 Cofinality
Cofinality is a way of measuring how hard an ordinal is to approach from below.
It plays a huge role in our understanding of cardinality.
Definition 10.1. Suppose α is an ordinal and C ⊆ α. Then we say that C is
cofinal in α if for all β ∈ α there is some γ ∈ C such that γ ≥ β.
Suppose α + 1 is a successor ordinal. Then the set {α} containing just the
maximal element of α + 1 is cofinal in α + 1. Hence, the cofinality of every
successor ordinal is 1, and cofinality is only interesting for limit ordinals.
If α is a limit ordinal, then C ⊆ α is cofinal iff sup C = α.
Definition 10.2. We define cf(α) to be the least ordinal β such that there is a
function f : β → α such that ran(f ) is cofinal in α.
Exercise 10.3. If α is a limit ordinal, then cf(α) ≥ ω.
We compute a few examples:
• cf(ω^ω) = ω. This is since cf(ω^ω) ≥ ω by Exercise 10.3, and cf(ω^ω) ≤ ω since {ω^n : n ∈ ω} is cofinal in ω^ω.
• cf(ω_ω) = ω, since {ω_n : n ∈ ω} is cofinal in ω_ω by Exercise 9.5.
• cf(ω1) = ω1. This is because if C ⊆ ω1 is cofinal, then ω1 = ⋃C. However, since each ordinal in ω1 is countable and (using the axiom of countable choice) a countable union of countable sets is countable, a countable set cannot be cofinal in ω1.
Lemma 10.4. Suppose α is a limit ordinal. Then cf(α) is the least ordinal β
such that there is a nondecreasing function f : β → α such that ran(f ) is cofinal
in α.
Proof. Suppose f : cf(α) → α is such that ran(f) is cofinal in α. Now we define a nondecreasing f′ : cf(α) → α so that ran(f′) is cofinal in α:

f′(γ) = sup{f(δ) : δ ≤ γ}.

Then f′(γ) ≥ f(γ) for all γ by definition, and f′(γ) < α for all γ < cf(α), by definition of cf(α), since f ↾ β cannot be cofinal in α for β < cf(α). Finally, f′ is nondecreasing by definition.
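The passage from f to f′ is a running supremum. On a finite sequence of natural numbers it can be sketched as follows; the name `running_sup` and the sample sequence are illustrative only, and the finite case of course says nothing about cofinality itself.

```python
from itertools import accumulate


def running_sup(f):
    """Return f' with f'(n) = max{f(k) : k <= n}, for a finite sequence f."""
    return list(accumulate(f, max))


f_values = [3, 1, 4, 1, 5, 9, 2, 6]
f_prime = running_sup(f_values)
```

The result dominates the original sequence pointwise, is nondecreasing, and has the same supremum, which are the three properties used in the proof.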
Indeed, by slightly modifying the above proof we have the following:
Exercise 10.5. For every ordinal α, cf(α) is equal to the shortest length λ of
a strictly increasing sequence cofinal in α.
We will often use the following to compute cofinalities:
Exercise 10.6. Suppose α and β are limit ordinals, and there are nonde-
creasing functions f : α → β and g : β → α such that ran(f ) is cofinal in β and
ran(g) is cofinal in α. Then cf(α) = cf(β).

For example, if α is a limit ordinal, then cf(ωα) = cf(α), since β ↦ ωβ is a nondecreasing cofinal function from α to ℵα, and β ↦ sup{γ : ℵγ ≤ β} is a nondecreasing cofinal function from ℵα to α.
Lemma 10.7. For all ordinals α, cf(cf(α)) = cf(α).

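Proof. Suppose f : β → α is nondecreasing with ran(f) cofinal in α, and g : γ → β is nondecreasing with ran(g) cofinal in β. Then f ◦ g : γ → α is nondecreasing and cofinal. Applying this with β = cf(α) and γ = cf(cf(α)) (such f and g exist by Lemma 10.4 when α is a limit ordinal; the successor case is immediate) gives cf(α) ≤ cf(cf(α)). The reverse inequality holds since cf(β) ≤ β for every ordinal β.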
Lemma 10.8. If α is not a cardinal, then cf(α) < α.
Proof. By definition, if α is not a cardinal, there is a β < α such that there is
a bijection from β to α (and the range of this bijection is clearly cofinal).
Corollary 10.9. For every α, cf(α) is a cardinal.
Proof. If cf(α) was not a cardinal, then cf(cf(α)) < cf(α), by Lemma 10.8,
contradicting Lemma 10.7.

Cofinality breaks cardinals into two types that have very different behavior.
Definition 10.10. We say a cardinal κ is regular if cf(κ) = κ. Otherwise, we
say that κ is singular.
Our first observation is that all successor cardinals are regular.

Theorem 10.11 (ZFC). Every infinite successor cardinal κ⁺ is regular.

Proof. To see this, suppose for a contradiction that C ⊆ κ⁺ has cardinality ≤ κ and is cofinal. Since every element of C has cardinality ≤ κ, by the axiom of choice we can pick an injection fα from each element α of C to κ. Now we can make an injection f from ⋃C to C × κ by letting f(β) = (α, fα(β)) where α ∈ C is least such that β ∈ α. Since |C| ≤ κ, by Lemma 9.12 this implies that ⋃C = sup C has cardinality ≤ |κ × κ| = κ. But sup C = κ⁺ since C is cofinal, contradicting the definition of κ⁺.
We give some examples of singular cardinals. The first infinite singular
cardinal is ℵω , since ω is regular, and each ℵn for n > 0 is a successor cardinal
and hence regular. We computed earlier that cf(ℵω) = ω. Indeed, since cf(ℵα) = cf(α) for all limit α, we have that cf(ℵ_{ω1}) = ω1 < ℵ_{ω1}, so ℵ_{ω1} is also a singular cardinal.
Another example of a singular cardinal is the first fixed point of the function
α ↦ ℵα. Let α0 = ℵ0, α_{n+1} = ℵ_{αn}, and let α = sup_n αn. Then since αn is a strictly increasing sequence of cardinals, sup_n αn is a limit cardinal greater than ℵ_{αn} for all n. But at limits α, ℵα = sup_{β<α} ℵβ, and hence ℵα = α. So we have cf(ℵα) = ω, since αn is a cofinal sequence of ordertype ω.
If κ is a regular limit cardinal, then we say that κ is weakly inaccessible.
We’ll see soon that ZFC cannot prove uncountable weakly inaccessible cardinals exist (assuming ZFC is consistent). This is because if κ is uncountable and weakly inaccessible, then ZFC proves Lκ is a model of ZFC, and since ZFC ⊬ Con(ZFC), ZFC cannot prove an uncountable weakly inaccessible cardinal exists.

We’ll finish this section on cofinality by discussing König’s theorem, which
gives another important map sending each cardinal κ to a cardinality greater
than κ:
Theorem 10.12 (König). If κ is an infinite cardinal, then κ < κ^{cf(κ)}.

Proof. It suffices to show that if ⟨fα : α ∈ κ⟩ is a sequence of functions where fα : cf(κ) → κ, we can find some h : cf(κ) → κ such that h ≠ fα for all α < κ. This implies there is no surjection from κ to the set of functions from cf(κ) to κ, and hence there is no injection from this set to κ.
Let g : cf(κ) → κ be so that ran(g) is cofinal in κ. Now given any collection of values {fα(β) : α ≤ g(β)}, since this collection has cardinality ≤ |g(β) + 1| which is less than κ, there is some element of κ not in this set. Let h : cf(κ) → κ be defined by

h(β) = inf (κ \ {fα(β) : α ≤ g(β)}).

Then for every α ∈ κ, there is some β such that α ≤ g(β), and hence h(β) ≠ fα(β).
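As a worked instance of this diagonalization, consider κ = ℵ_ω, where cf(κ) = ω. The choice of cofinal map g(n) = ω_n is an assumption made for illustration; any cofinal g would do.

```latex
% Worked instance of the proof for \kappa = \aleph_\omega (illustrative choice of g).
\[
  g(n) = \omega_n, \qquad
  h(n) = \min\bigl(\aleph_\omega \setminus \{ f_\alpha(n) : \alpha \le \omega_n \}\bigr)
  \quad (n < \omega).
\]
% The excluded set at stage n has size at most \aleph_n, so h(n) is found below \omega_{n+1}.
\[
  \bigl|\{ f_\alpha(n) : \alpha \le \omega_n \}\bigr| \le \aleph_n < \aleph_{n+1}
  \;\Longrightarrow\; h(n) < \omega_{n+1}.
\]
% Every \alpha < \aleph_\omega is eventually covered by g, so h differs from every f_\alpha.
\[
  \alpha < \aleph_\omega
  \;\Longrightarrow\; \exists n\,(\alpha \le \omega_n)
  \;\Longrightarrow\; h(n) \ne f_\alpha(n)
  \;\Longrightarrow\; h \ne f_\alpha .
\]
```

So no ℵ_ω-sequence of functions from ω to ℵ_ω lists them all, which is the statement ℵ_ω < ℵ_ω^{ℵ_0}.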
One corollary of this theorem is that it tells us something about the cofinality of 2^κ, assuming ZFC.

Corollary 10.13 (ZFC). If κ is an infinite cardinal, then cf(2^κ) > κ.

Proof. (2^κ)^κ = 2^{κ·κ} = 2^κ. However, (2^κ)^{cf(2^κ)} > 2^κ by König’s theorem. So we must have cf(2^κ) > κ.
For example we cannot have |R| = ℵω. This is because |R| = 2^ω, so cf(|R|) > ω by König’s theorem. However, cf(ℵω) = ω. Set theorists have shown using forcing that cf(|R|) ≠ ω is essentially the only restriction ZFC imposes on the cardinality of R.
Definition 10.14. The gimel function is defined to be ℷ(κ) = κ^{cf(κ)} for κ an infinite cardinal.

Note that if κ is regular, then ℷ(κ) = κ^κ = 2^κ. We will see that in ZFC the values of this function determine κ^λ for all κ, λ.

11 Cardinal arithmetic in ZFC
In this section, we further develop the basics of cardinal arithmetic. We emphasize that we are heavily using the axiom of choice in this section. Since AC implies every set can be wellordered, every infinite set has the cardinality of some ℵα. So for example, if κ > ℵα, then κ ≥ ℵ_{α+1}.
Our eventual goal in this section is to get an inductive understanding of the value of κ^λ for infinite cardinals κ, λ. ZFC doesn’t decide even the value of ℵ0^{ℵ0}. However, we’ll give a (recursive) formula for the value of κ^λ just in terms of the gimel function ℷ(κ) = κ^{cf(κ)}. The big question then becomes what possible values are there for the ℷ function in ZFC. This is still partially an open question, although much is known.
In Section 9, we defined sums and products for pairs of cardinals. More
generally, we can add or multiply any set of cardinals.

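Definition 11.1. Suppose κi is a cardinal for all i ∈ I. Then

\[ \sum_{i \in I} \kappa_i = \Bigl| \bigcup_{i \in I} \{i\} \times \kappa_i \Bigr| \]

and

\[ \prod_{i \in I} \kappa_i = \bigl| \{ f : \operatorname{dom}(f) = I \wedge (\forall i \in I)\, f(i) \in \kappa_i \} \bigr|. \]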

Exercise 11.2. These operations are well defined on cardinalities. If |xi| = |yi| for all i ∈ I, then |Σ_{i∈I} xi| = |Σ_{i∈I} yi| and |Π_{i∈I} xi| = |Π_{i∈I} yi|.
Computing infinite sums of cardinals is easy by the following lemma:
Lemma 11.3. If κi > 0 for all i < λ, and λ ≥ ℵ0, then

\[ \sum_{i<\lambda} \kappa_i = \lambda \cdot \sup_{i<\lambda} \kappa_i = \max\bigl(\lambda, \sup_{i<\lambda} \kappa_i\bigr). \]

Proof. We clearly have Σ_{i<λ} κi ≤ Σ_{i<λ} (sup_{i<λ} κi) = λ · sup_{i<λ} κi. For the other inequality we have λ = Σ_{i<λ} 1 ≤ Σ_{i<λ} κi and sup_{i<λ} κi = |⋃_{i<λ} κi| ≤ Σ_{i<λ} κi. So max(λ, sup_{i<λ} κi) ≤ Σ_{i<λ} κi.
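Two worked instances of the lemma, chosen here for illustration, show how the formula is used: a countable sum of the ℵ_n, and an ℵ_1-indexed sum of a constant finite cardinal.

```latex
% Worked examples of Lemma 11.3 (the particular sums are illustrative choices).
\[
  \sum_{n<\omega} \aleph_n
  = \aleph_0 \cdot \sup_{n<\omega} \aleph_n
  = \aleph_0 \cdot \aleph_\omega
  = \max(\aleph_0, \aleph_\omega)
  = \aleph_\omega .
\]
\[
  \sum_{i<\omega_1} 2
  = \aleph_1 \cdot \sup_{i<\omega_1} 2
  = \aleph_1 \cdot 2
  = \max(\aleph_1, 2)
  = \aleph_1 .
\]
```

In the first sum the supremum dominates; in the second the number of summands does.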
Computing products of cardinals is more difficult. One inequality we will often use is that if κi ≤ κ for all i, then Π_{i<λ} κi ≤ κ^λ.

Exercise 11.4. Suppose κ and λ are cardinals. Then Π_{i<λ} κ = κ^λ.

Summing cardinals gives us a different way of understanding singular cardinals.

Lemma 11.5. An infinite cardinal κ is singular iff it is a sum of fewer than κ cardinals smaller than κ. That is, κ is singular iff ∃λ < κ and a sequence ⟨κi : i < λ⟩ where κi < κ for all i < λ such that κ = Σ_{i<λ} κi.

Proof. Suppose ⟨αi : i ∈ λ⟩ is a sequence of ordinals with αi < κ for all i and λ < κ. Then if the sequence αi is cofinal in κ, then sup_{i<λ} αi = κ, so Σ_{i<λ} |αi| = λ · κ = κ. However, if sup_{i<λ} αi < κ, then Σ_{i<λ} |αi| = λ · sup_{i<λ} |αi| = max(λ, sup_{i<λ} |αi|) < κ.
We have the following theorem relating cardinal sums and exponentiation, which generalizes the two different diagonalization proofs that we’ve already discussed: Cantor’s theorem that κ < 2^κ and König’s theorem that κ < κ^{cf(κ)}.
Theorem 11.6 (König). Suppose κi < λi for all i ∈ I. Then

\[ \sum_{i \in I} \kappa_i < \prod_{i \in I} \lambda_i . \]

Proof. Suppose we have a function f : Σ_{i∈I} κi → Π_{i∈I} λi. We need to show f is not a surjection.
Define h ∈ Π_{i∈I} λi as follows. We need to define h(i) ∈ λi for each i ∈ I. Given i, consider the set {f((i, α))(i) : α ∈ κi}. This set is a subset of λi of size at most κi. Since κi < λi, the complement of {f((i, α))(i) : α ∈ κi} inside λi is nonempty and so we can define

h(i) = inf(λi \ {f((i, α))(i) : α ∈ κi}).

Since h(i) ≠ f((i, α))(i) for all i ∈ I and α ∈ κi, we have h ≠ f((i, α)). So h ∉ ran(f).
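Cantor’s theorem is the special case of König’s theorem where we let κi = 1 and λi = 2 for all i < κ:

\[ \kappa = \sum_{i<\kappa} 1 < \prod_{i<\kappa} 2 = 2^{\kappa}. \]

König’s theorem that κ < κ^{cf(κ)} is the special case where I = cf(κ), f : I → κ is cofinal, κi = |f(i)| and λi = κ:

\[ \kappa = \sum_{i<\operatorname{cf}(\kappa)} |f(i)| < \prod_{i<\operatorname{cf}(\kappa)} \kappa = \kappa^{\operatorname{cf}(\kappa)}. \]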
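A further worked instance, chosen here for illustration, applies the theorem with index set ω, κ_n = ℵ_n, and λ_n = ℵ_{n+1}; it also uses Lemma 11.3 for the sum and the product bound for cardinals below ℵ_ω.

```latex
% Illustrative instance of Theorem 11.6 with \kappa_n = \aleph_n and \lambda_n = \aleph_{n+1}.
\[
  \aleph_\omega
  = \sum_{n<\omega} \aleph_n
  < \prod_{n<\omega} \aleph_{n+1}
  \le \prod_{n<\omega} \aleph_\omega
  = \aleph_\omega^{\aleph_0} .
\]
```

This recovers ℵ_ω < ℵ_ω^{ℵ_0} and also shows that the product of the ℵ_{n+1} is already strictly larger than ℵ_ω.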

Our next goal is to describe a little of what ZFC proves about cardinal exponentiation κ^λ. First, we have the following theorem that says we can determine the value of κ^λ recursively if we know the values of the functions κ ↦ 2^κ and κ ↦ κ^{cf(κ)}.
Theorem 11.7. Fix an infinite cardinal λ. Then for all infinite cardinals κ, the value of κ^λ is the following:
1. If κ ≤ λ, then κ^λ = 2^λ.
2. If κ > λ and (∃µ < κ) µ^λ ≥ κ, then κ^λ = µ^λ.
3. If κ > λ and (∀µ < κ) µ^λ < κ, then
(a) if cf(κ) > λ, then κ^λ = κ.
(b) if cf(κ) ≤ λ, then κ^λ = κ^{cf(κ)}.
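Proof. For (1),

\[ 2^{\lambda} \le \kappa^{\lambda} \le (2^{\kappa})^{\lambda} = 2^{\kappa \cdot \lambda} = 2^{\lambda}. \]

For (2),

\[ \mu^{\lambda} \le \kappa^{\lambda} \le (\mu^{\lambda})^{\lambda} = \mu^{\lambda}. \]

For (3a), since cf(κ) > λ every function f : λ → κ is bounded, so κ^λ = |⋃_{α<κ} α^λ| ≤ Σ_{α<κ} |α|^λ. But

\[ \kappa = \sum_{\alpha<\kappa} 1 \le \sum_{\alpha<\kappa} |\alpha|^{\lambda} \le \sum_{\alpha<\kappa} \kappa = \kappa \cdot \kappa = \kappa, \]

so κ ≤ κ^λ ≤ κ.
For (3b), if cf(κ) ≤ λ, then first write κ = Σ_{i<cf(κ)} κi where 1 < κi < κ for each i. Then since Σ_{i<cf(κ)} κi ≤ Π_{i<cf(κ)} κi, we have

\[ \kappa^{\lambda} \le \Bigl( \prod_{i<\operatorname{cf}(\kappa)} \kappa_i \Bigr)^{\lambda} = \prod_{i<\operatorname{cf}(\kappa)} \kappa_i^{\lambda} \le \prod_{i<\operatorname{cf}(\kappa)} \kappa = \kappa^{\operatorname{cf}(\kappa)} \le \kappa^{\lambda}. \]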

Recall Cantor’s continuum hypothesis is the statement that there is no cardinal κ with |N| < κ < |R|. Since assuming AC, every set has the same cardinality as an ordinal, and the next cardinal after ℵ0 = |N| is ℵ1, we can reformulate this as 2^{ℵ0} = ℵ1. This statement is abbreviated CH:

CH : 2^{ℵ0} = ℵ1.

The generalized continuum hypothesis or GCH is the statement

GCH : for all infinite κ, 2^κ = κ⁺.

We will show eventually that L ⊨ GCH.


If GCH holds, then we can simplify Theorem 11.7.
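Exercise 11.8. Assuming GCH, then for κ, λ infinite,

\[ \kappa^{\lambda} = \begin{cases} \kappa & \text{if } \lambda < \operatorname{cf}(\kappa) \\ \kappa^{+} & \text{if } \operatorname{cf}(\kappa) \le \lambda < \kappa \\ \lambda^{+} & \text{if } \kappa \le \lambda \end{cases} \]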
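The three cases of this exercise can be tabulated mechanically for small alephs. The sketch below is illustrative and rests on stated assumptions: GCH is assumed, only the cardinals ℵ_ν with ν < ω·2 are handled, an index ω·a + b is encoded as the tuple `(a, b)` so that tuple comparison agrees with the ordinal order, and the names `succ`, `cf_index`, and `gch_power` are hypothetical.

```python
OMEGA = (1, 0)  # the index omega, encoded as omega*1 + 0


def succ(index):
    """Index of the successor cardinal of aleph_index."""
    a, b = index
    return (a, b + 1)


def cf_index(index):
    """Index of cf(aleph_index), for indices below omega*2.

    aleph_0 and successor cardinals are regular; cf(aleph_omega) = aleph_0.
    """
    return (0, 0) if index == OMEGA else index


def gch_power(kappa, lam):
    """Index of aleph_kappa ^ aleph_lam, assuming GCH."""
    if lam < cf_index(kappa):
        return kappa            # lambda < cf(kappa)
    if lam < kappa:
        return succ(kappa)      # cf(kappa) <= lambda < kappa
    return succ(lam)            # kappa <= lambda
```

For example, under GCH this gives ℵ_5^{ℵ_2} = ℵ_5, ℵ_2^{ℵ_5} = ℵ_6, and ℵ_ω^{ℵ_0} = ℵ_{ω+1}; the middle case only ever fires at the singular cardinal ℵ_ω in this range.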

Theorem 11.7 describes the value of κ^λ in terms of values of 2^λ and κ^{cf(κ)}. Next, we turn to describing the value of 2^λ. First we need another definition.
Definition 11.9. If κ and λ are cardinals, then κ^{<λ} = ⋃_{µ<λ} κ^µ, the supremum of the cardinals κ^µ for cardinals µ < λ.

For example, κ^{<ω} = κ for all infinite cardinals κ, since κ^n = κ for all n ≥ 1.
We also have the following exercise:
Exercise 11.10. For all infinite cardinals κ, 2^κ ≤ (2^{<κ})^{cf(κ)}. [Hint: choose a sequence κi cofinal in κ, and find an injection from 2^κ into Π_{i<cf(κ)} 2^{κi}.]
Now we have the following theorem which recursively computes the value of
$2^\kappa$ in terms of smaller values $2^\lambda$ for $\lambda < \kappa$, and the gimel function:
Theorem 11.11. Suppose $\kappa$ is an infinite cardinal. Then
1. If $\kappa$ is regular, then $2^\kappa = \gimel(\kappa)$.
2. If $\kappa$ is singular and $2^\lambda$ is eventually constant as $\lambda \to \kappa$ (so $2^{<\kappa}$ is this
constant value), then $2^\kappa = 2^{<\kappa}$.
3. If $\kappa$ is singular and $2^\lambda$ is not eventually constant as $\lambda \to \kappa$, then $2^\kappa = \gimel(2^{<\kappa})$.
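Proof. (1) If $\kappa$ is regular, then $\mathrm{cf}(\kappa) = \kappa$, and $2^\kappa = \kappa^\kappa = \kappa^{\mathrm{cf}(\kappa)}$ by Theorem 11.7.1.
(2) First, note $2^\kappa \ge 2^{<\kappa}$. Next, take $\mathrm{cf}(\kappa) \le \lambda < \kappa$ such that $2^{<\kappa} = 2^\lambda$.
Then $(2^{<\kappa})^{\mathrm{cf}(\kappa)} = (2^\lambda)^{\mathrm{cf}(\kappa)} = 2^{\lambda \cdot \mathrm{cf}(\kappa)} = 2^\lambda = 2^{<\kappa}$. So, using Exercise 11.10 for the middle inequality,
\[ 2^{<\kappa} \le 2^\kappa \le (2^{<\kappa})^{\mathrm{cf}(\kappa)} = 2^{<\kappa}. \]
(3) In this case $\mathrm{cf}(2^{<\kappa}) = \mathrm{cf}(\kappa)$ since $2^{<\kappa} = \bigcup_{\mu<\kappa} 2^\mu$, and by Exercise 10.6.
Hence, $\gimel(2^{<\kappa}) = (2^{<\kappa})^{\mathrm{cf}(\kappa)} \ge 2^\kappa$ by Exercise 11.10. On the other hand,
\[ \gimel(2^{<\kappa}) = (2^{<\kappa})^{\mathrm{cf}(\kappa)} \le (2^\kappa)^{\mathrm{cf}(\kappa)} = 2^\kappa. \]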

11.1 Some equivalents of CH


The continuum hypothesis has many implications and equivalences in areas
of mathematics quite far from set theory. We briefly give a couple of examples.
Our first example is Wetzel’s problem in complex analysis.

Wetzel’s problem: Let $\{f_i\}_{i \in I}$ be a family of pairwise distinct analytic
functions on the complex numbers such that for each $z \in \mathbb{C}$,
the set of values $\{f_i(z) : i \in I\}$ is countable. Does it follow that the
set of functions $\{f_i\}_{i \in I}$ is countable?

Theorem 11.12 (Erdős, 1963). Wetzel’s problem has a positive solution if and
only if CH is false.
Proof. Assume first that CH is false. We will then show that for any family
$\{f_i\}_{i \in I}$ of pairwise distinct analytic functions of size $\aleph_1$, there exists a complex number $z$ so that
the values $\{f_i(z) : i \in I\}$ are pairwise distinct. Hence no uncountable family takes only countably many values at every point, and Wetzel’s problem has a positive
solution.
We claim that for any $i \ne j$ the set $A(i, j) = \{z \in \mathbb{C} : f_i(z) = f_j(z)\}$ is
countable. This is because an analytic function is uniquely determined by its
values on any infinite set that has an accumulation point. Hence, if there are
infinitely many points of $A(i, j)$ inside the bounded set $B_n = \{z : |z| \le n\}$, then these points have an accumulation point, and so
$f_i(z) - f_j(z)$ must be the constant zero function, contradicting $f_i \ne f_j$. So there
are only finitely many points $z$ in $A(i, j) \cap B_n$ for each $n$, and so $|A(i, j)| \le \aleph_0$.
Hence $|\bigcup_{i \ne j} A(i, j)| \le \aleph_0 \cdot \aleph_1 < |\mathbb{R}|$, so some $z \in \mathbb{C}$ lies outside every $A(i, j)$, and the values $f_i(z)$ are pairwise distinct.
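Now assume that CH is true. Let $\{z_\alpha : \alpha < \omega_1\}$ be a wellordering of $\mathbb{C}$. Let
$D$ be the set $\{p + qi : p, q \in \mathbb{Q}\}$, which is countable and dense in $\mathbb{C}$. We will construct a family
$\{f_\alpha : \alpha < \omega_1\}$ of analytic functions such that for all $\beta < \alpha$:
1. $f_\alpha(z_\beta) \in D$, and
2. $f_\alpha(z_\beta) \ne f_\beta(z_\beta)$.
By (2) the functions are pairwise distinct, and by (1) the set of values $\{f_\alpha(z_\beta) : \alpha < \omega_1\}$ is contained in $D \cup \{f_\alpha(z_\beta) : \alpha \le \beta\}$, which is countable.

Given $\alpha$, let $\beta_0, \beta_1, \ldots$ be an enumeration of all the ordinals less than $\alpha$.
Let
\[ f_\alpha(x) = \varepsilon_0 + \varepsilon_1 (x - z_{\beta_0}) + \varepsilon_2 (x - z_{\beta_0})(x - z_{\beta_1}) + \cdots. \]
We will choose the values of $\varepsilon_n$ successively in order to make $f_\alpha$ analytic,
and to ensure that conditions (1) and (2) are satisfied. If $|\varepsilon_n| \to 0$ sufficiently
fast, then $f_\alpha$ will be analytic. Suppose we have already chosen $\varepsilon_0, \ldots, \varepsilon_{n-1}$, so
that $f_\alpha(z_{\beta_i}) \ne f_{\beta_i}(z_{\beta_i})$ for $i < n$. (Note that for $i < n$ the value of $f_\alpha(z_{\beta_i})$ depends only
on the first $n$ values $\varepsilon_0, \ldots, \varepsilon_{n-1}$, since all subsequent terms are 0 at $z_{\beta_i}$.) Then there
is only one choice of $\varepsilon_n$ that would make $f_\alpha(z_{\beta_n}) = f_{\beta_n}(z_{\beta_n})$. Hence, since $D$ is
dense, we can choose a sufficiently small value for $\varepsilon_n$ so that $f_\alpha(z_{\beta_n}) \ne f_{\beta_n}(z_{\beta_n})$
and $f_\alpha(z_{\beta_n}) \in D$. Hence, $f_\alpha$ will be analytic and satisfy (1) and (2) above.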
Another beautiful equivalent of CH is the axiom of symmetry whose study
dates back to Sierpiński.

The axiom of symmetry: Suppose $f : \mathbb{R} \to P_{\omega_1}(\mathbb{R})$ assigns to each
real number $x \in \mathbb{R}$ a countable set of real numbers. Then there
exist $x, y \in \mathbb{R}$ such that $x \notin f(y)$ and $y \notin f(x)$.

Theorem 11.13. The axiom of symmetry holds iff CH is false.

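Proof. Assume CH. Let $\prec$ be a wellordering of $\mathbb{R}$ of ordertype $\omega_1$. Then let
$f(x) = \{y \in \mathbb{R} : y \preceq x\}$, which is countable for every $x$. Then given any $x, y$ either $x \preceq y$ or $y \preceq x$, and hence
$x \in f(y)$ or $y \in f(x)$. So the axiom of symmetry fails.
Now assume ¬CH, and let $f$ be as in the axiom of symmetry. Let $\{x_\alpha : \alpha < \omega_1\}$ be a sequence of $\omega_1$ many distinct
reals. Then $A = \bigcup \{f(x_\alpha) : \alpha < \omega_1\}$ has size at most $\aleph_1 < |\mathbb{R}|$, so there exists some $y \in \mathbb{R}$
so that $y \notin A$. Since $f(y)$ is countable, there also exists some $\alpha < \omega_1$ so that
$x_\alpha \notin f(y)$. Then $x_\alpha$ and $y$ are as required.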
There is a thought experiment due to Freiling (built on ideas of Stuart
Davidson) that claims that the axiom of symmetry should be intuitively true.
Suppose we randomly throw two darts at the interval [0, 1] and they land at the
points x and y.
If we throw the first dart and it lands at x, then the second dart should land
in the set f (x) with probability 0. Indeed, in the sense of Lebesgue measure, if
we randomly pick y ∈ [0, 1] the chance it is in the countable set f (x) is precisely
0.
Since the chance that y lands in f (x) is 0 no matter how we pick x, we
should be able to make this prediction before we throw the first dart and it hits
x: “y ∉ f (x)” almost surely. So if we threw two darts at the same time, then
we should similarly have y ∉ f (x) almost surely. But then if we throw these two
darts simultaneously, then by the symmetry of the situation, we should have
x ∉ f (y) and y ∉ f (x).
The reason that this is an intuitive argument and not a precise proof is
that the switch from randomly picking x and then y to picking x and y at the
same time requires a result like the Fubini theorem from analysis to justify, and
Fubini’s theorem is not true for all functions f . Indeed, the kind of Lebesgue
measurability that Freiling’s intuition relies on can be pushed further to give an
“intuitive” proof that the axiom of choice should be false; you shouldn’t be able
to wellorder the real numbers.
A much more detailed discussion of Freiling’s philosophical arguments, their
connections to precise mathematics, and their relation to measure theory, Baire
category, GCH and the axiom of choice is in Freiling’s paper [F].

11.2 The singular cardinals hypothesis*


An early theorem proved using forcing was Easton’s theorem that on regular
cardinals, the powerset function $\aleph_\alpha \mapsto 2^{\aleph_\alpha}$ can be any function which is nondecreasing
and has $\mathrm{cf}(2^{\aleph_\alpha}) > \aleph_\alpha$ (which is necessary by Corollary 10.13). However, the
possible behavior of the powerset function on singular cardinals remained open.
In the mid sixties, Solovay asked whether it is possible to have $2^{\aleph_n} = \aleph_{n+1}$
for every $n$, and $2^{\aleph_\omega} = \aleph_{\omega+2}$. This ended up being a deep question which
anticipated part of the singular cardinals problem, and requires large cardinals
to answer.
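As we’ve already seen, the gimel function $\gimel(\kappa)$ is the key we need to understand
all of cardinal exponentiation. One trivial case is that if $2^{\mathrm{cf}(\kappa)} \ge \kappa$, then
$2^{\mathrm{cf}(\kappa)} = (2^{\mathrm{cf}(\kappa)})^{\mathrm{cf}(\kappa)} \ge \kappa^{\mathrm{cf}(\kappa)} \ge 2^{\mathrm{cf}(\kappa)}$, so $\kappa^{\mathrm{cf}(\kappa)} = 2^{\mathrm{cf}(\kappa)}$, and we can understand $\kappa^{\mathrm{cf}(\kappa)}$
in terms of the powerset function at a smaller regular cardinal ($\mathrm{cf}(\kappa)$ is regular).
So the really interesting case is what happens to the value $\kappa^{\mathrm{cf}(\kappa)}$ when $2^{\mathrm{cf}(\kappa)} < \kappa$.
Since $\kappa^{\mathrm{cf}(\kappa)} > \kappa$, the smallest value it could possibly take is $\kappa^+$, and this is the
singular cardinals hypothesis, abbreviated SCH.

SCH : if $\kappa$ is singular and $2^{\mathrm{cf}(\kappa)} < \kappa$, then $\kappa^{\mathrm{cf}(\kappa)} = \kappa^+$.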


In a surprising development at the time, Jensen showed that the failure of
SCH has large cardinal strength; if SCH is not true, then certain large cardinals
exist. Eventually set theorists were able to prove from large cardinals there are
models in which SCH fails. For example, it is a result of Magidor that assuming
large cardinals, there is a positive answer to Solovay’s question: a model of ZFC
where $2^{\aleph_n} = \aleph_{n+1}$ for every $n$, and $2^{\aleph_\omega} = \aleph_{\omega+2}$.
The possible behavior of cardinal exponentiation at singular cardinals
is still a topic of current research in set theory, and is intimately tied to large
cardinals. For a longer introduction to this topic and pcf theory (which provides
stunning limitations on the possible values of $2^\kappa$), see [J].

12 Filters and ultrafilters
Filters and ideals are an important way of measuring when sets are “large” in
many areas of mathematics.
Definition 12.1. A filter F on a set X is a collection of subsets of X such
that X ∈ F , ∅ ∉ F , F is closed under finite intersections (if A ∈ F and B ∈ F ,
then A ∩ B ∈ F ), and F is closed upward under ⊆ (if A ∈ F , B ⊆ X, and
A ⊆ B, then B ∈ F ).
An ideal I on a set X is a set of subsets of X such that ∅ ∈ I, X ∉ I, I
is closed under finite unions (if A ∈ I and B ∈ I, then A ∪ B ∈ I), and I is
closed downward under ⊆ (if A ∈ I and B ⊆ A, then B ∈ I).
These definitions are dual to each other. If F is a filter on X, {X \A : A ∈ F }
is an ideal called the dual ideal of F . Similarly, if I is an ideal, then the set of
complements of elements of I forms a dual filter.
Here are some examples of ideals and filters:
• The collection of subsets of [0, 1] having Lebesgue measure 1 is a filter.
The dual ideal is the collection of nullsets.
• If A ⊆ N, we say A has asymptotic density d if
\[ \lim_{n \to \infty} \frac{|A \cap n|}{|n|} = d. \]
For example, the even numbers have asymptotic density 1/2. The collection
of subsets of N of asymptotic density 1 is a filter.
• If X is a set, κ is an infinite cardinal and |X| ≥ κ, then $P_\kappa(X) = \{A \subseteq X : |A| < \kappa\}$
is an ideal. For example, the collection of finite subsets of ω
forms an ideal.
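To make the asymptotic density example concrete, here is a small sketch (an illustration added here, with hypothetical helper names) that computes the ratio $|A \cap n| / |n|$ exactly for a finite $n \ge 1$, where $n$ is identified with $\{0, \ldots, n-1\}$. A finite computation can only suggest the limit, not prove it.

```python
from fractions import Fraction
from math import isqrt

def prefix_density(A, n):
    """Return |A ∩ n| / |n| for n >= 1, where n = {0, ..., n-1}
    and A is given as a membership predicate on the naturals."""
    return Fraction(sum(1 for k in range(n) if A(k)), n)

def is_even(k):
    return k % 2 == 0

def is_square(k):
    r = isqrt(k)
    return r * r == k

for n in (10, 1000, 10000):
    print(n, prefix_density(is_even, n), prefix_density(is_square, n))
```

The ratios for the even numbers stay at 1/2, while those for the perfect squares shrink toward 0, consistent with the squares lying in the ideal of density 0 sets.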
You should think of a filter as a collection of “large” sets, and an ideal as a
collection of “small” sets. If I is an ideal on X and F is its dual filter, then a set
A ⊆ X is I-positive/F -positive if A ∉ I. You should think of I-positive sets
as being “not small”. For example if I is the ideal of subsets of N of asymptotic
density 0, then the I-positive sets are those with positive upper density (i.e.
$\limsup_{n \to \infty} \frac{|A \cap n|}{|n|} > 0$).

Exercise 12.2. Suppose X is a set, and S ⊆ P(X) is nonempty, S is closed
under finite intersections, and ∅ ∉ S. Let F = {A ⊆ X : (∃B ∈ S)(B ⊆ A)}.
Show that F is a filter on X, called the filter generated by S.
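On a finite set the construction in Exercise 12.2 can be carried out by brute force. The sketch below (added for illustration; the function names and the sample family `S` are assumptions of the example) builds the filter generated by a family and checks the axioms of Definition 12.1 directly. It is a sanity check on one instance, not a solution to the exercise.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def generated_filter(X, S):
    """All A ⊆ X that contain some member B of S."""
    S = [frozenset(B) for B in S]
    return {A for A in powerset(X) if any(B <= A for B in S)}

def is_filter(X, F):
    """Check the filter axioms of Definition 12.1 on the finite set X."""
    X = frozenset(X)
    if X not in F or frozenset() in F:
        return False
    closed_meet = all((A & B) in F for A in F for B in F)
    closed_up = all(B in F for A in F for B in powerset(X) if A <= B)
    return closed_meet and closed_up

X = {0, 1, 2, 3}
# Nonempty, closed under finite intersections, and does not contain ∅.
S = [{0, 1, 2}, {1, 2, 3}, {1, 2}]
F = generated_filter(X, S)
print(sorted(sorted(A) for A in F), is_filter(X, F))
```

Here the generated filter consists of exactly the supersets of {1, 2}, the smallest member of `S`.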
The following is an important property of filters, which we will use in sub-
sequent sections:
Definition 12.3. We say that a filter F on X is κ-complete if it is closed
under intersections of size less than κ. That is, for all λ < κ, if $\langle A_\alpha : \alpha < \lambda \rangle$ is
a sequence of elements of F , then $\bigcap_{\alpha<\lambda} A_\alpha \in F$.
For example, the Lebesgue conull filter is $\aleph_1$-complete (the intersection of
countably many conull sets is conull) and the dual filter of $P_\kappa(X)$ is cf(κ)-complete.
Every filter is $\aleph_0$-complete.
A filter F on X is maximal if there is no filter F′ on X such that F′ ⊋ F .
A filter F is an ultrafilter if for all A ⊆ X, either A ∈ F or (X \ A) ∈ F . These
two notions are actually the same:
Lemma 12.4. Suppose F is a filter on X, and A ⊆ X. Then either we can
find a filter F′ ⊇ F on X so that A ∈ F′, or we already have (X \ A) ∈ F .
Proof. Consider S = {B ∩ A : B ∈ F }. Clearly S is closed under finite
intersections, since F is. If ∅ ∉ S, then the filter F′ generated by S is a filter
containing A, by Exercise 12.2, and F′ ⊇ F since B ∩ A ⊆ B for every B ∈ F . If ∅ ∈ S, then there is some B ∈ F such that
B ∩ A = ∅, so B ⊆ (X \ A), so (X \ A) ∈ F .
Lemma 12.5. A filter F on X is a maximal filter iff it is an ultrafilter.
Proof. ⇒: Suppose F is a maximal filter, and A ⊆ X. We cannot have X \ A
and A both in F , since then ∅ would be in F since F is closed under finite
intersections. If neither X \ A nor A are in F , then F is not maximal by
Lemma 12.4.
⇐: Suppose F is an ultrafilter, F′ ⊋ F is a filter, and A ∈ F′ \ F . Then
since A ∉ F , we must have X \ A ∈ F , so both X \ A and A are in F′.
Contradiction.
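Lemma 12.5 can be checked exhaustively on a small set. The following sketch (an added illustration, with names chosen for this example) enumerates every family of subsets of a 3-element set, keeps the filters, and compares the maximal filters with the ultrafilters. It only confirms the lemma in this one finite case.

```python
from itertools import combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

X = frozenset({0, 1, 2})
SUBSETS = powerset(X)

def is_filter(F):
    if X not in F or frozenset() in F:
        return False
    if any((A & B) not in F for A in F for B in F):
        return False
    return all(B in F for A in F for B in SUBSETS if A <= B)

def is_ultrafilter(F):
    return is_filter(F) and all(A in F or (X - A) in F for A in SUBSETS)

# Every family of subsets of X (2^8 = 256 families), kept only if a filter.
filters = [F for F in powerset(SUBSETS) if is_filter(F)]

def is_maximal(F):
    """No filter on X properly extends F."""
    return not any(F < G for G in filters)

maximal = [F for F in filters if is_maximal(F)]
ultra = [F for F in filters if is_ultrafilter(F)]
print(len(filters), len(maximal), len(ultra))
```

On a finite set every filter is principal, so the maximal filters found here are exactly the three principal ultrafilters generated by the points of X.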
A very useful theorem of Tarski is that every filter can be extended to an
ultrafilter:
Theorem 12.6 (Tarski, ZFC). If F is a filter on X, then F can be extended to
an ultrafilter. That is, there is an ultrafilter F 0 ⊇ F .
Proof. Suppose $\mathcal{F}$ is a nonempty chain of filters on X. That is, for every F, F′ ∈ $\mathcal{F}$, either
F ⊆ F′ or F′ ⊆ F . Then it is easy to check that $\bigcup \mathcal{F}$ is a filter on X. Hence,
by the Hausdorff maximality principle applied to the set of all filters F′ on X
so that F′ ⊇ F , there is a maximal filter F′ on X so that F′ ⊇ F . This maximal
filter is an ultrafilter by Lemma 12.5.
One trivial example of an ultrafilter is a principal ultrafilter.
Definition 12.7. Say that a filter F on X is principal if there is some A ⊆ X
such that B ∈ F iff A ⊆ B.
If X is a set and x ∈ X, then {A ⊆ X : {x} ⊆ A} is a principal ultrafilter on
X. It is consistent with ZF that every ultrafilter is principal [B]. However, in
ZFC, there are many nonprincipal ultrafilters. For example, let F be the filter of
cofinite subsets of ω. Then by Theorem 12.6, there is an ultrafilter F′ extending
F which cannot be principal, since for every n ∈ ω, ω \ {n} ∈ F , so ω \ {n} ∈ F′
and {n} ∉ F′.
Ultrafilters have many uses beyond the scope of this course. For example,
given any topological space X, there is a “largest” compact Hausdorff space βX
containing X, which is largest in the sense that any continuous map from X to a compact
Hausdorff space K factors through βX. This universal βX is called the Stone-Čech
compactification of X. When X is discrete, βX is just the space of all ultrafilters on X with
an appropriate topology, and each x ∈ X is mapped to the principal ultrafilter
generated by x. We give some more examples of applications:
Exercise 12.8 (The Stone representation theorem). Suppose B is a boolean
algebra (a structure having constants ⊤ and ⊥, binary functions ∧, ∨ and a
unary function ¬, and satisfying the same equations as the standard two-element
boolean algebra {True, False}). Then B is isomorphic to a field of sets. That is,
there is a set X and an injection f : B → P(X) such that
• f (a ∧ b) = f (a) ∩ f (b).
• f (a ∨ b) = f (a) ∪ f (b).
• f (¬a) = X \ f (a).
• f (⊥) = ∅.
[Hint: Define a filter on B to be a subset F of B such that ⊤ ∈ F , ⊥ ∉ F ,
if a, b ∈ F , then a ∧ b ∈ F , and if a ∈ F and a → b, then b ∈ F (where
a → b iff ¬a ∨ b = ⊤). Define an ultrafilter on B to be a maximal filter. Let
f (a) = {U : U is an ultrafilter on B and a ∈ U }.]
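The map in the hint can be tried out on a small finite boolean algebra. The sketch below (an illustration added here, not a solution to the exercise) uses the divisors of 30 with gcd as ∧, lcm as ∨, and a ↦ 30/a as ¬; this choice of algebra and all function names are assumptions of the example. Ultrafilters are found by brute force, using the fact that in a boolean algebra a filter is maximal exactly when it contains precisely one of a and ¬a for each a.

```python
from itertools import combinations
from math import gcd

N = 30
B = [d for d in range(1, N + 1) if N % d == 0]  # 1, 2, 3, 5, 6, 10, 15, 30

def meet(a, b):
    return gcd(a, b)

def join(a, b):
    return a * b // gcd(a, b)

def neg(a):
    return N // a

def is_ultrafilter(U):
    U = set(U)
    if N not in U or 1 in U:          # top is in U, bottom is not
        return False
    if any(meet(a, b) not in U for a in U for b in U):
        return False
    # Upward closed: a <= b in this algebra means meet(a, b) == a.
    if any(b not in U for a in U for b in B if meet(a, b) == a):
        return False
    return all((a in U) != (neg(a) in U) for a in B)

ultrafilters = [frozenset(c) for r in range(len(B) + 1)
                for c in combinations(B, r) if is_ultrafilter(c)]

def f(a):
    """The Stone map: the set of ultrafilters containing a."""
    return frozenset(U for U in ultrafilters if a in U)

X = frozenset(ultrafilters)
is_homomorphism = all(
    f(meet(a, b)) == f(a) & f(b)
    and f(join(a, b)) == f(a) | f(b)
    and f(neg(a)) == X - f(a)
    for a in B for b in B
) and f(1) == frozenset()
is_injective = len({f(a) for a in B}) == len(B)
print(len(ultrafilters), is_homomorphism, is_injective)
```

In this example the ultrafilters correspond to the primes 2, 3, and 5, and f(a) records which of them divide a.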
One use of ultrafilters is in taking ultralimits. First, we define limits with
respect to a filter:
Definition 12.9. Suppose F is a filter on ω. If $\langle a_n : n \in \omega \rangle$ is a sequence of
real numbers, we say that $\lim_F a_n = x$ iff for all $\varepsilon > 0$, $\{n : |a_n - x| < \varepsilon\} \in F$.
For example, if F is the filter of cofinite subsets of ω, then $\lim_F$ is the usual
limit. More generally, for any filter F , $\lim_F$ satisfies all the usual limit laws.
Exercise 12.10. Suppose F is a filter on ω.
1. Show that for every sequence of real numbers $\langle a_n : n \in \omega \rangle$, $\lim_F a_n$ has at
most one value.
2. Show that if $\lim_F a_n$ and $\lim_F b_n$ exist, then $\lim_F a_n + \lim_F b_n = \lim_F (a_n + b_n)$.
3. Show that if $\lim_F a_n$ exists and c is a constant, then $\lim_F c a_n = c \lim_F a_n$.
4. Let U be an ultrafilter on ω. Suppose $\langle a_n : n \in \omega \rangle$ is a bounded sequence
of real numbers. Show that $\lim_U a_n$ exists.
Note that in Definition 12.9 we could have more generally taken a filter F on
any index set I and then defined the filter limit $\lim_F a_i$ of sequences $\langle a_i : i \in I \rangle$
indexed by I. In this case all the parts of Exercise 12.10 would still be true.
One way of using ultralimits is to define a finitely additive measure on all
subsets of N:
Exercise 12.11. Suppose U is a nonprincipal ultrafilter on N. Define $\mu : P(\mathbb{N}) \to [0, 1]$
by $\mu(A) = \lim_U \frac{|A \cap n|}{|n|}$. Show that µ is finitely additive, µ is defined on
all subsets of N, µ(N) = 1, and µ({n}) = 0 for all n ∈ ω.
An important problem in the history of set theory was whether one could
similarly find a countably additive measure on all subsets of [0, 1].

12.1 Measurable cardinals*


A cardinal κ is called measurable if κ is uncountable and there is a κ-complete nonprincipal ultrafilter on κ. The
name measurable cardinal comes from the connection of this concept with the
measure problem studied by Lebesgue, Banach, Ulam, Tarski, and others. In
the wake of the construction of Vitali sets, set theorists became interested in the
problem of whether there is any measure on [0, 1], a function µ : P([0, 1]) →
[0, 1] such that:
• µ([0, 1]) = 1,
• µ({x}) = 0 for every x ∈ [0, 1],
• µ is countably additive: if $\langle A_n : n \in \omega \rangle$ are disjoint, then $\mu(\bigcup_n A_n) = \sum_n \mu(A_n)$.

Since we are dropping here any geometric requirements like translation invari-
ance (which is ruled out by the existence of Vitali sets), we may as well replace
[0, 1] with any set X and ask if there is a measure µ on X (a countably additive
function µ : P(X) → [0, 1] so that µ(X) = 1 and µ({x}) = 0 for every x ∈ X).
This clearly only depends on the cardinality of X.
It is easy to prove that if κ is the least cardinal such that there is a countably
additive measure on κ, then every measure on κ is much more than countably
additive, it is κ-additive, and a cardinal κ such that there is a κ-additive measure
on κ is called a real-valued measurable cardinal. If µ is a κ-additive measure
on the smallest cardinal κ admitting a measure, then it is easy to see that
µ(A) = 0 if |A| < κ, and hence by κ-additivity, κ is regular. Ulam proved in
1930 (using a technique now called an Ulam matrix) that if κ is real-valued
measurable, then κ must be a limit cardinal and hence is weakly inaccessible.
Thus, the existence of real-valued measurable cardinals cannot be proved in ZFC
and is a large cardinal property.
Now if there exists a κ-complete nonprincipal ultrafilter U on κ, then U gives a κ-additive
measure on κ by letting
\[ \mu(A) = \begin{cases} 1 & \text{if } A \in U \\ 0 & \text{if } A \notin U \end{cases} \]
So a measurable cardinal (as
we’ve defined it above) is also real-valued measurable. A completely different
kind of measure µ on κ is an atomless measure, where for every A ⊆ κ with
µ(A) > 0, there is some B ⊆ A such that 0 < µ(B) < µ(A).
Ulam proved the following dichotomy:

Theorem 12.12 (Ulam). If κ is a real-valued measurable cardinal, then either
• $\kappa \le 2^{\aleph_0}$, there is an atomless measure µ on κ, and there is also a
measure on P([0, 1]) extending Lebesgue measure, or
• κ is measurable (i.e. there is a κ-complete nonprincipal ultrafilter on κ), $\kappa > 2^{\aleph_0}$ (in
fact κ is a strong limit), and every measure µ on κ has an atom which
yields a κ-complete ultrafilter on κ.

Hence, there are really two completely different kinds of real-valued measur-
able cardinals. It is the latter type, measurable cardinals, which have become
by far more important.
One source of the utility of measurable cardinals is the ultrapower construc-
tion. If κ is a measurable cardinal, then let U be a κ-complete ultrafilter on
κ. Then we can take the universe V , and take its ultrapower by this ultrafil-
ter to get an inner model M of V , and an elementary embedding j : V → M .
These types of elementary embedding from the universe into inner models are
fundamental tools in the study of large cardinals.

13 Ultraproducts
Ultraproducts were introduced independently in the 1950s in both logic and
operator algebras. Our goal in this section is to describe the ultraproduct con-
struction which has a myriad of applications in algebra, analysis, combinatorics,
as well as in model theory and set theory.
There are many examples where we would like to take a product of structures
$\langle M_i : i \in I \rangle$, but mod out by phenomena that occur on only a small (e.g. finite)
subset of the structures. The ultraproduct construction gives a precise way of
doing this. The utility of using ultrafilters is that it ensures convergence in a
very general setting when we take limits. It is also key to getting the logical
properties that we would like. For example, negation will work the way we want
because if U is an ultrafilter on I then for each sentence $\varphi$, $\{i : M_i \models \varphi\} \in U$ iff
$\{i : M_i \models \neg\varphi\} \notin U$ by the ultrafilter property of U .
Definition 13.1 (Ultraproducts). Suppose L is a language, $\langle M_i : i \in I \rangle$ are L-structures,
and U is an ultrafilter on I. Let
\[ X = \prod_{i \in I} M_i. \]
So X is the set of functions $f$ with domain I so that $f(i) \in M_i$ for all $i \in I$. Let
∼ be the equivalence relation on X where $f \sim g$ if $\{i : f(i) = g(i)\} \in U$ (the
fact that this is an equivalence relation uses that U is a filter). Now let $\prod_U M_i$
be the structure whose universe is X/∼ and where we interpret the constants,
relations, and functions of L as follows:
• For each constant symbol c of L we let $c^{\prod_U M_i} = [i \mapsto c^{M_i}]_\sim$.
• For each relation symbol R of L, we let $R^{\prod_U M_i}([f_1]_\sim, \ldots, [f_n]_\sim)$ be true if
$\{i : R^{M_i}(f_1(i), \ldots, f_n(i))\} \in U$, and false otherwise.
• For each function symbol g of L, we let $g^{\prod_U M_i}([f_1]_\sim, \ldots, [f_n]_\sim) = [i \mapsto g^{M_i}(f_1(i), \ldots, f_n(i))]_\sim$.
The fact that the function $g^{\prod_U M_i}$ is well defined uses the fact that U is a filter.
If $f_1 \sim f_1', \ldots, f_n \sim f_n'$, then $\{i : g^{M_i}(f_1(i), \ldots, f_n(i)) = g^{M_i}(f_1'(i), \ldots, f_n'(i))\} \in U$
since U is closed under finite intersections.
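The reason we require U to be an ultrafilter is to obtain the following theorem:
Theorem 13.2 (Łoś’s theorem). Suppose $\langle M_i : i \in I \rangle$ are L-structures, U is an
ultrafilter on I, and $\varphi$ is an L-sentence. Then $\prod_U M_i \models \varphi$ iff $\{i : M_i \models \varphi\} \in U$.
Proof. Suppose $\varphi(x_1, \ldots, x_n)$ is a formula with free variables $x_1, \ldots, x_n$. We
prove by induction on formula complexity that
\[ (\forall f_1, \ldots, f_n \in X/{\sim}) \quad \prod_U M_i \models \varphi(f_1, \ldots, f_n) \leftrightarrow \{i : M_i \models \varphi(f_1(i), \ldots, f_n(i))\} \in U. \tag{*} \]
This is true for atomic formulas by definition of ∼ and our definition of the
functions and relations in the ultraproduct. Now we add logical connectives
and quantifiers:
¬ : Suppose we’ve proven (*) for the formula $\varphi$. We would like to prove (*)
for the formula $\neg\varphi$. Then
\begin{align*}
\prod_U M_i \models \neg\varphi(f_1, \ldots, f_n) &\leftrightarrow \neg \prod_U M_i \models \varphi(f_1, \ldots, f_n) \\
&\leftrightarrow \{i : M_i \models \varphi(f_1(i), \ldots, f_n(i))\} \notin U \\
&\leftrightarrow \{i : M_i \models \neg\varphi(f_1(i), \ldots, f_n(i))\} \in U
\end{align*}
where the second-last step is by (*) for $\varphi$, and the last step is since U is an
ultrafilter, so $A \notin U$ iff $I \setminus A \in U$.
∧: Suppose (*) holds for the formulas $\varphi$ and $\psi$. We would like to show it
holds for $\varphi \wedge \psi$. Then
\begin{align*}
&\prod_U M_i \models \varphi(f_1, \ldots, f_n) \wedge \psi(f_1, \ldots, f_n) \\
&\leftrightarrow \Bigl(\prod_U M_i \models \varphi(f_1, \ldots, f_n)\Bigr) \wedge \Bigl(\prod_U M_i \models \psi(f_1, \ldots, f_n)\Bigr) \\
&\leftrightarrow \{i : M_i \models \varphi(f_1(i), \ldots, f_n(i))\} \in U \wedge \{i : M_i \models \psi(f_1(i), \ldots, f_n(i))\} \in U \\
&\leftrightarrow \{i : M_i \models \varphi(f_1(i), \ldots, f_n(i)) \wedge \psi(f_1(i), \ldots, f_n(i))\} \in U
\end{align*}
where the second-last step is by (*) holding for $\varphi$ and $\psi$ and the last step is
since for any filter U , $A \cap B \in U$ iff $A \in U$ and $B \in U$.
∃ : Suppose we’ve already shown that for all $f \in X/{\sim}$, $\prod_U M_i \models \varphi(f) \leftrightarrow \{i : M_i \models \varphi(f(i))\} \in U$,
and now we want to show this for the formula $\exists x \varphi(x)$.
Then
\begin{align*}
\prod_U M_i \models \exists x \varphi(x) &\leftrightarrow \exists f\, \prod_U M_i \models \varphi(f) \\
&\leftrightarrow (\exists f\, \{i : M_i \models \varphi(f(i))\} \in U) \\
&\leftrightarrow \{i : M_i \models \exists x \varphi(x)\} \in U
\end{align*}
(where the last step uses the axiom of choice).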
An easy corollary of Łoś’s theorem is the compactness theorem of first order
logic.
Corollary 13.3. Suppose L is a language, and T is a set of sentences in L
such that for every finite subset S of T , there is a model of S. Then there is a
model of T .
Proof. We may assume T is infinite. Let $[T]^{<\infty}$ be the set of finite subsets of T .
Let F be the filter on $[T]^{<\infty}$ generated by the sets $\{R \in [T]^{<\infty} : S \subseteq R\}$ for $S \in [T]^{<\infty}$.
Let U be an ultrafilter extending F .
For each finite $S \subseteq T$, let $M_S$ be a structure such that $M_S \models S$, and let
M be the ultraproduct of the $M_S$ with respect to the ultrafilter U . Since for
each $S \in [T]^{<\infty}$ we have that $\{R : S \subseteq R\} \in U$, and $M_R \models S$ whenever $S \subseteq R$, we have $M \models S$ by Łoś’s
theorem. Hence $M \models T$.

Another important type of ultraproduct is an ultrapower.
Definition 13.4. If M is a structure, and U is an ultrafilter on a set I, the
ultrapower of M by U is the structure $\prod_U M$.
Ultrapowers of structures are very nontrivial and interesting objects that are
quite different from the original structure (even though they will be elementarily
equivalent to it). For example, phenomena that happen “mod $\varepsilon$” for arbitrarily
small $\varepsilon$ inside M will happen exactly inside the ultrapower $\prod_U M$.
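One way to see how different an ultrapower can be is the following worked example, added here as an illustration. It assumes $M$ is the ordered field of real numbers and $U$ is a nonprincipal ultrafilter on $\omega$, so that every cofinite subset of $\omega$ is in $U$.

```latex
\[
\varepsilon = \bigl[\langle \tfrac{1}{n+1} : n \in \omega \rangle\bigr]_\sim,
\qquad
\{\, n : 0 < \tfrac{1}{n+1} < \tfrac{1}{k} \,\} = \{\, n : n \ge k \,\} \in U
\quad \text{for every integer } k \ge 1 .
\]
```

By the interpretation of $<$ in Definition 13.1, this means $0 < \varepsilon < [n \mapsto 1/k]_\sim$ in $\prod_U M$ for every $k \ge 1$. So the ultrapower contains a positive element below every standard $1/k$, even though no real number has this property.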
By Łoś’s theorem, an ultrapower $\prod_U M$ is elementarily equivalent to M .
Hence, if M and N are structures and $\prod_U M$ and $\prod_V N$ are isomorphic, then
clearly M and N are elementarily equivalent. A beautiful theorem of Keisler
and Shelah is that the converse is true:
Theorem 13.5 (Keisler-Shelah). Two structures M , N in a language L are
elementarily equivalent if and only if there are ultrafilters U and V so that $\prod_U M$
and $\prod_V N$ are isomorphic.
Keisler first proved this theorem assuming the GCH, and then Shelah showed
that the theorem is true in all models of ZFC. We give the special case of
Keisler’s theorem for countable structures as an exercise. Recall a structure M
is κ-saturated if every 1-type over A ⊆ M with |A| < κ is realized.
Exercise 13.6. Fix a countable language L and a nonprincipal ultrafilter U on
ω.
1. Show that if $M_1$ and $M_2$ are elementarily equivalent κ-saturated L-structures
of cardinality κ, then $M_1$ and $M_2$ are isomorphic.
2. Show that no countably infinite L-structure is $\omega_1$-saturated.
3. Show that if $\langle M_i : i \in \omega \rangle$ are countable L-structures then their ultraproduct
$\prod_U M_i$ is $\omega_1$-saturated.
4. Show that if $\langle M_i : i \in \omega \rangle$ are countable L-structures, then $\prod_U M_i$ is either
finite or uncountable.
5. Assume CH is true. Show that if $M_1$, $M_2$ are countable elementarily
equivalent L-structures, then their ultrapowers $\prod_U M_1$ and $\prod_U M_2$ are
isomorphic.

13.1 Ultraproducts of metric spaces and asymptotic cones*


Some very nice examples of ultraproducts come from metric spaces. Unfortunately,
metric spaces are not such natural first-order (discrete) structures in the
sense of model theory; they exist more naturally in the model theory of metric
structures. If we were determined, we could make a metric space a first-order
structure by having countably many relations $D_q(x, y)$ expressing that the distance
between x and y is less than q for each rational number q, and then we
could recover the original metric d by $d(x, y) = \inf\{q : D_q(x, y)\}$. (So we are essentially
using Dedekind cuts to represent real numbers.) Instead of doing the
above (which is ugly), we’ll make a special definition of what an ultraproduct
of pointed metric spaces is.
Definition 13.7. Suppose $\langle (X_n, x_n, d_n) : n \in \omega \rangle$ are triples where each pair
$(X_n, d_n)$ is a metric space (so $X_n$ is a set, $d_n : X_n^2 \to [0, \infty)$ is a metric on $X_n$),
and $x_n \in X_n$ is a point in $X_n$ which we call a “base point”. If U is an ultrafilter
on ω, we define the ultraproduct $\prod_U (X_n, x_n, d_n)$ as follows. Let X be the set
of sequences $\langle a_n : n \in \omega \rangle$ where $a_n \in X_n$, and the sequence $\langle d_n(x_n, a_n) : n \in \omega \rangle$
is bounded. Let ∼ be the equivalence relation on X where $\langle a_n : n \in \omega \rangle \sim \langle b_n : n \in \omega \rangle$
if $\lim_U d_n(a_n, b_n) = 0$, and let d be the metric on X/∼ where
$d([\langle a_n : n \in \omega \rangle]_\sim, [\langle b_n : n \in \omega \rangle]_\sim) = \lim_U d_n(a_n, b_n)$. We define the ultraproduct
$\prod_U (X_n, x_n, d_n)$ to be the triple $(X/{\sim}, [\langle x_n : n \in \omega \rangle]_\sim, d)$.

To be clear, we are not using Definition 13.1 here because we are not viewing
metric spaces as first order structures.
An important example of this type of ultraproduct of metric spaces is the
following:
Definition 13.8. If (X, d) is a metric space with a given basepoint p, and U is a nonprincipal ultrafilter on ω, the
asymptotic cone of (X, p, d) is equal to $\prod_U (X, p, d/n)$.
So we take the product of countably many copies of the same set and base
point X and p, but we keep “zooming out” as n increases by dividing the metric
d by n. Intuitively, the asymptotic cone is a way of roughly viewing (X, d) from
“infinitely far away”. For example,
Exercise 13.9. The asymptotic cone of $(\mathbb{Z}, 0, d_{\mathbb{Z}})$ where $d_{\mathbb{Z}}$ is the usual metric
on $\mathbb{Z}$ is isomorphic to $(\mathbb{R}, 0, d_{\mathbb{R}})$ where $d_{\mathbb{R}}$ is the usual metric on $\mathbb{R}$. [Hint:
show that the map $[\langle a_n : n \in \omega \rangle]_\sim \mapsto \lim_U a_n/n$ is a bijection between the
asymptotic cone and $(\mathbb{R}, d_{\mathbb{R}})$.]
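The surjectivity half of the hint in Exercise 13.9 has a concrete numerical flavour: for a real r, the integers $a_n = \lfloor r n \rfloor$ satisfy $a_n / n \to r$. The sketch below (an added illustration; the function name, the choice of a rational r for exact arithmetic, and starting the index at n = 1 to avoid dividing by 0 are all assumptions of the example) checks the error bound on an initial segment. It is not a proof and does not involve an ultrafilter.

```python
from fractions import Fraction
from math import floor

def approximating_sequence(r, N):
    """Return [a_1, ..., a_N] with a_n = floor(r * n), so that a_n / n -> r."""
    return [floor(r * n) for n in range(1, N + 1)]

r = Fraction(-7, 3)
seq = approximating_sequence(r, 1000)

# r - a_n / n lies in [0, 1/n), so the rescaled points converge to r.
errors = [r - Fraction(a, n) for n, a in enumerate(seq, start=1)]
print(seq[:5], float(errors[-1]))

# The rescaled distances to the base point 0 stay bounded, as
# Definition 13.7 requires of the sequences in X.
bounded = all(abs(Fraction(a, n)) <= abs(r) + 1
              for n, a in enumerate(seq, start=1))
print(bounded)
```

Two sequences that differ by a bounded amount, such as $a_n$ and $a_n + 5$, have rescaled distance $5/n \to 0$ and so represent the same point of the cone.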
Asymptotic cones are very useful in fields like geometric group theory. For
instance, we can view a finitely generated group as a discrete metric space (using
the word metric). After taking an asymptotic cone of this metric space
along with the group structure coming from the ultraproduct, we get a larger
group which is an important tool for studying it. For example, if our original
group is nilpotent, its asymptotic cone will be a Lie group. See e.g. [DK] for
an introduction to asymptotic cones, and an application of them for proving
Gromov’s theorem on polynomial growth.

13.2 The Ax-Grothendieck theorem*


A nice application of ultraproducts is the Ax-Grothendieck theorem. A simple
case of the theorem says that if p is a polynomial on $\mathbb{C}$, and p is injective,
then it is surjective. Let’s give a short proof of this using ultraproducts and the
completeness of the theory of algebraically closed fields of characteristic 0.
For each n, we can write a sentence $\varphi_n$ in the language of fields that says
that for all polynomials p of degree n, if p is injective it is surjective. Now each
finite field GF(k) of order k satisfies every sentence $\varphi_n$ since any injective map
from a finite set to itself is surjective. Now if n divides m and p is prime, then $\mathrm{GF}(p^n)$ is a
subfield of $\mathrm{GF}(p^m)$, and it is not hard to see that for each prime p, $\bigcup_m \mathrm{GF}(p^{m!})$
is the algebraic closure of GF(p). Since the sentences $\varphi_n$ are ∀∃ sentences and
$\mathrm{GF}(p^{m!})$ satisfies $\varphi_n$ for all n, it is easy to see that $\bigcup_m \mathrm{GF}(p^{m!})$ satisfies $\varphi_n$ for
all n.
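The finite-field step of the argument can be checked by brute force in small cases. The sketch below (an illustration added here, with hypothetical function names) only treats the prime fields GF(p), realised as the integers mod p, and only polynomials up to a chosen degree bound. It confirms the pigeonhole fact used above and is not a substitute for the proof.

```python
from itertools import product

def poly_map(coeffs, p):
    """Values of x -> c0 + c1*x + c2*x^2 + ... on GF(p) = Z/p,
    listed for x = 0, ..., p-1."""
    return tuple(
        sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
        for x in range(p)
    )

def injective_implies_surjective(p, max_degree):
    """Check every polynomial over GF(p) of degree <= max_degree."""
    for coeffs in product(range(p), repeat=max_degree + 1):
        values = set(poly_map(coeffs, p))
        injective = len(values) == p
        surjective = values == set(range(p))
        if injective and not surjective:
            return False
    return True

print(injective_implies_surjective(5, 3))
print(poly_map((0, 0, 0, 1), 5))  # x^3 on GF(5)
print(poly_map((0, 0, 1), 5))     # x^2 on GF(5)
```

On GF(5) the map x ↦ x³ permutes the field, while x ↦ x² is neither injective nor surjective.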
Now consider the ultraproduct of these algebraically closed fields of characteristic
p, one for each prime p, with respect to a nonprincipal ultrafilter on the set of primes. By Łoś’s theorem, this ultraproduct is an algebraically closed field,
it has characteristic 0, and it satisfies $\varphi_n$ for each n. Hence, by completeness of
the theory $\mathrm{ACF}_0$ every algebraically closed field of characteristic 0 (including
$\mathbb{C}$) has the property that every injective polynomial is surjective.
The last step of this proof is more generally a type of Lefschetz principle
for algebraically closed fields. If a first order sentence in the language of fields is
true for a sequence of algebraically closed fields of arbitrarily large characteristic,
it is true for all algebraically closed fields of characteristic 0. Just take their
ultraproduct to see this.

14 Clubs and stationary sets
In set theory, the club filter and stationary sets are perhaps the most important
largeness notions for subsets of cardinals κ. They have myriad uses in set theory.
Definition 14.1. Suppose λ is a limit ordinal and C ⊆ λ. Then C
is closed in λ if for all limit ν < λ, if C ∩ ν is cofinal in ν, then ν ∈ C. We
say C is unbounded in λ if for all α ∈ λ there exists β ∈ C so that β ≥ α. We
call a closed unbounded subset of λ a club set in λ.
Alternately, C ⊆ λ is closed if it contains the suprema of increasing sequences
in C.
Exercise 14.2. Show that C ⊆ λ is closed if and only if for every increasing
sequence hαξ : ξ < βi with αξ ∈ C and supξ αξ < λ, we have supξ αξ ∈ C.

For example, let $C \subseteq \omega_1$ be the set of limit ordinals in $\omega_1$. Then C is closed,
and any C′ with $C \subseteq C' \subseteq \omega_1$ is closed.
There is a more topological way of understanding closed subsets of ordinals.
If λ is an ordinal, consider the order topology on λ. That is, the topology
with a subbasis consisting of all rays {γ ∈ λ : β < γ} and {γ ∈ λ : γ < δ} for
every β, δ < λ. So a basis for this topology is all the above rays together with
the open intervals (β, δ) = {γ : β < γ < δ}.
Exercise 14.3. Show that a set C ⊆ λ is closed in the sense of Definition 14.1
if and only if it is closed in the order topology.

Unfortunately tools from topology are not so useful for understanding or-
dinals and club sets, and so we won’t take this topological viewpoint. Instead
interactions between set theory and topology (that is, what is called the field
of set-theoretic topology) mostly focus on using set theoretic techniques to con-
struct interesting topological spaces, and prove independence results in topology.

Exercise 14.4. Show that ω1 equipped with the order topology is sequentially
compact, but not compact.
We’ll begin by showing that an intersection of clubs is a club. Hence, we
will be able to use the club sets to generate a filter.
Lemma 14.5. Suppose λ is a limit ordinal with cf(λ) > ω, and C, C′ ⊆ λ are
clubs in λ. Then C ∩ C′ is a club.
Proof. It is clear that C ∩ C′ is closed. To see that C ∩ C′ is unbounded,
suppose β ∈ λ. Let $\alpha_0 \in C$ be such that $\alpha_0 > \beta$. Then for each n let $\alpha_{2n+1}$ be the
least element of C′ that is greater than $\alpha_{2n}$, and $\alpha_{2n+2}$ be the least element of
C greater than $\alpha_{2n+1}$. These ordinals exist since C, C′ are unbounded. Then
$\sup\{\alpha_{2n} : n \in \omega\} = \sup\{\alpha_{2n+1} : n \in \omega\}$ is less than λ since cf(λ) > ω, and it is in C ∩ C′ since C and C′ are closed.
Indeed, a stronger version of this lemma is true.

Figure 7: The proof that the intersection of two clubs is unbounded.

Lemma 14.6. If λ is a cardinal with cf(λ) > ω, and $\langle C_\alpha : \alpha < \beta \rangle$ is a sequence
of clubs of length β < cf(λ), then $\bigcap_{\alpha<\beta} C_\alpha$ is a club.
Proof. It is clear that $\bigcap_{\alpha<\beta} C_\alpha$ is closed. To show it is unbounded, similarly
to Lemma 14.5, we can make an increasing sequence of length β · ω where the
(α, n)th element is an ordinal in $C_\alpha$ for every n, and hence the sup of this
sequence is in $\bigcap_{\alpha<\beta} C_\alpha$.
This lemma is best possible:
Exercise 14.7. If cf(λ) = ω, there exist disjoint clubs in λ. If cf(λ) > ω, then
there is a sequence of cf(λ) many clubs whose intersection is empty.
Since the intersection of two clubs is a club, the clubs generate a filter. We
will be most interested in this filter on regular cardinals:
Definition 14.8 (The club filter). The filter F generated by the club sets on a
regular cardinal κ is called the club filter on κ. That is, C ⊆ κ is in the club
filter on κ if and only if C contains a club set in κ. Note that the club filter on
κ is κ-complete by Lemma 14.6.
Caution: an element of the club filter need not be a club. It is a set that contains
a club.
One common source of clubs is that they are the closure points of operations
on ordinals:

Exercise 14.9. Suppose κ > ω is a regular cardinal, λ < κ and ⟨f_α : α < λ⟩
are functions where f_α : κ → κ. Let C = {γ : (∀α < λ)(∀β < γ) f_α(β) < γ} be
the set of ordinals γ so that γ is closed under all the functions f_α. Show that C
is club in κ.
If we think of an element of the club filter as being a “large set”, then a “not
small” set is a set not in the dual ideal (i.e. an I-positive set wrt the dual ideal
I). That is, a set S is “not small” if its complement does not contain a club,
which is true iff S intersects every club.
Definition 14.10. If κ > ω is a regular cardinal, then S ⊆ κ is stationary if
S intersects every club in κ.

The dual ideal to the club filter is called the nonstationary ideal.
Exercise 14.11. Suppose κ > ω is a regular cardinal.
1. Show that if C ⊆ κ is a club and S ⊆ κ is stationary, then C ∩ S is
stationary.

2. Suppose S ⊆ κ is stationary. Then show S is unbounded.


3. Suppose S ⊆ κ is stationary. Then show {λ ∈ S : λ is a limit ordinal} is
stationary.
Exercise 14.12. Suppose κ > ω is a regular cardinal. Show that if S ⊆ κ is
stationary, and we partition S into λ < κ many sets ⟨S_α : α < λ⟩ where the
S_α are pairwise disjoint and ⋃_{α<λ} S_α = S, then there is some α < λ such that
S_α is stationary. [Hint: a union of fewer than κ many nonstationary sets is
nonstationary since an intersection of fewer than κ many clubs is a club.]
Our next goal is to prove Fodor’s lemma. To do this, we’ll first discuss
a more refined type of intersection which is very useful when dealing with club
sets.
Definition 14.13 (Diagonal intersection). Let ⟨X_α : α < κ⟩ be a sequence of κ
many subsets of κ. Their diagonal intersection is defined to be

    △_{α<κ} X_α = {β < κ : β ∈ ⋂_{α<β} X_α}.
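To get a feel for this definition, here is a minimal sketch of a finite analogue in Python. The natural numbers below a fixed bound stand in for κ, which is an assumption made purely for illustration, and the name diagonal_intersection is ours. The example family has empty ordinary intersection, while its diagonal intersection is everything.

```python
def diagonal_intersection(sets, bound):
    """Finite analogue of the diagonal intersection.

    sets[a] plays the role of X_a, restricted to range(bound).
    Returns {b < bound : b is in X_a for every a < b}.
    """
    return {b for b in range(bound) if all(b in sets[a] for a in range(b))}


bound = 12
# X_a = {b : a < b < bound}, a final segment of range(bound).
tails = [set(range(a + 1, bound)) for a in range(bound)]

ordinary = set.intersection(*tails)             # b would have to exceed every a
diagonal = diagonal_intersection(tails, bound)  # b only has to exceed every a < b
print(ordinary)
print(diagonal)
```

The point of the design is that membership of β is only tested against the sets X_α with α < β, so a family of shrinking final segments does not wipe out the result.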

Theorem 14.14. Suppose κ > ω is a regular cardinal. Then the diagonal intersection
of κ many clubs in κ is club in κ.
Proof. Suppose ⟨C_α : α < κ⟩ is a sequence of clubs. We may assume that
the C_α are decreasing, that is, C_α ⊇ C_β whenever α ≤ β; letting
C′_α = ⋂_{β≤α} C_β it is easy to see that
△_{α<κ} C_α = △_{α<κ} C′_α. The sets C′_α are still club since the club filter is κ-
complete. Let C = △_{α<κ} C_α.
First we show that C is closed. Suppose ν < κ is a limit ordinal such that C ∩ ν is cofinal
in ν. Fix α < ν. We want to show ν ∈ C_α. This is true since for all β ∈ C ∩ ν
such that β > α, we have β ∈ ⋂_{ξ<β} C_ξ, hence β ∈ C_α. Thus, ν ∈ C_α since
C_α is closed, and C_α contains an increasing sequence whose limit is ν. So
ν ∈ ⋂_{α<ν} C_α, and ν ∈ C.
Next we show that C is unbounded. Let β0 ∈ C0 be an arbitrarily large
ordinal. Now let βn+1 ∈ Cβn be the least element of Cβn larger than βn . We
claim β = sup{βn : n ∈ ω} is in C. This is because for all α < β, there is some
n so that α ≤ βn , and hence βm+1 ∈ Cα for all m ≥ n since βm+1 ∈ Cβm ⊆ Cα
for all m ≥ n, since the Cα are decreasing. Thus, β ∈ Cα for all α < β and so
β ∈ C.
A filter on κ is called normal if it is closed under diagonal intersections.
Hence, Theorem 14.14 states that the club filter is normal.
A very important fact about stationary sets is Fodor’s lemma.
Lemma 14.15 (Fodor’s lemma). Suppose κ > ω is a regular cardinal, S ⊆ κ
is stationary and f : S → κ is such that f (α) < α for all α ∈ S. Then there is
some γ ∈ κ and some stationary set T ⊆ S such that f (α) = γ for all α ∈ T .

Proof. For a contradiction, suppose that for each γ < κ, the set {α ∈ S : f(α) =
γ} is nonstationary. Hence, there is a club C_γ such that f(α) ≠ γ for all
α ∈ S ∩ C_γ. Then the diagonal intersection △_{γ<κ} C_γ is club, and hence S ∩ △_{γ<κ} C_γ
is nonempty. Let α be a nonzero element of S ∩ △_{γ<κ} C_γ. Then f(α) = γ
for some γ < α, so by our choice of C_γ, α ∉ C_γ. But then α ∉ ⋂_{γ<α} C_γ,
contradicting the fact that α is in the diagonal intersection.

Fodor’s lemma for a filter F on κ is actually equivalent to normality of F .


Exercise 14.16. Suppose F is a filter on κ. Then the following are equivalent:
1. F is normal (i.e. closed under diagonal intersections).
2. If S ⊆ κ is F -positive and f : κ → κ is such that f (α) < α, then there is
some γ < κ and F -positive T ⊆ S so that f (α) = γ for all α ∈ T .
In fact, every normal filter on κ must extend the club filter.
Exercise 14.17. Suppose F is a nontrivial normal κ-complete filter on a regular
cardinal κ. Then every club set is in F .

15 Applications of Fodor: ∆-systems and Silver’s theorem
Anytime in set theory we can make an interesting function f on ordinals such
that f (α) < α, we can often gain a great deal of insight by applying Fodor’s
lemma.
We give a couple applications to illustrate this. Our first application is
the ∆-system lemma. The ∆-system lemma is the key combinatorial principle
behind Cohen’s proof of the consistency of ¬CH.
Definition 15.1. Suppose X is a collection of sets, r is a set, and for all
distinct A, B ∈ X, A ∩ B = r. Then we call X a ∆-system with root r.
For example, if the elements of X are pairwise disjoint, then X is a ∆-system
with root ∅.
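For finite families the definition can be checked mechanically. The following is a small sketch in Python; the function name delta_root and the convention for families with fewer than two sets are our own choices, not part of the notes.

```python
from itertools import combinations


def delta_root(family):
    """Return the root r if `family` is a Delta-system, else None.

    With fewer than two distinct sets the condition is vacuous, so every r
    works; this sketch then returns the empty set by convention.
    """
    sets = {frozenset(s) for s in family}   # X is a set of sets, so drop repeats
    pairs = list(combinations(sets, 2))
    if not pairs:
        return frozenset()
    root = pairs[0][0] & pairs[0][1]
    # Every pair of distinct members must meet in exactly the same set.
    return root if all(a & b == root for a, b in pairs) else None


print(delta_root([{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]))  # common root {1, 2}
print(delta_root([{1}, {2}, {3}]))                    # pairwise disjoint: root is empty
print(delta_root([{1, 2}, {2, 3}, {1, 3}]))           # not a Delta-system
```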
Lemma 15.2 (The ∆-system lemma). Suppose X is an uncountable set of finite
sets. Then there are an uncountable subset X′ ⊆ X and a finite set r such that
X′ is a ∆-system with root r.
Proof. We may assume |X| = ω1. Let ⟨X_α : α < ω1⟩ be an enumeration of
the elements of X. Since each X_α is finite, |⋃X| ≤ ω1, so by relabeling the
elements of ⋃X with countable ordinals we may also assume that ⋃X ⊆ ω1.
That is, we may assume each X_α consists of finitely many ordinals less than ω1.
We will use Fodor’s lemma to begin refining our collection of X_α. Let
f : ω1 → ω1 be the function

    f(α) = sup(X_α ∩ α)   if X_α ∩ α is nonempty,
    f(α) = 0              otherwise.

Since Xα ∩ α is a finite set of ordinals less than α, its sup is less than α if it
is nonempty, and so f (α) < α for α > 0. Hence, by Fodor’s lemma, there are
some γ and some stationary set S such that for every α ∈ S, f (α) = γ.
There are only countably many finite sets of ordinals less than or equal to
γ which could be possible values for X_α ∩ α. Hence, if we partition S into
countably many sets {α ∈ S : X_α ∩ α = r′} for each such r′ ⊆ γ + 1, one of these
sets is stationary by Exercise 14.12. Fix a finite r and stationary S′ ⊆ S so that
X_α ∩ α = r for every α ∈ S′.
Let C = {α : (∀β < α) X_β ⊆ α}. Then C is a club subset of ω1 by Exer-
cise 14.9 since it is the set of closure points of the function α ↦ sup ⋃_{β<α} X_β.
We claim {X_α : α ∈ C ∩ S′ ∧ α > γ} is a ∆-system with root r. This family is
uncountable since C ∩ S′ is stationary. Consider
X_β, X_α in this set where β < α. Then X_β ∩ X_α = X_β ∩ X_α ∩ α = X_β ∩ r = r, since X_β ⊆ α
by definition of C and r = X_β ∩ β ⊆ X_β.
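The two refinement steps of this proof have a simple finite analogue, sketched below in Python. This is only an illustration under loud assumptions: indices a < len(xs) stand in for countable ordinals, there is no Fodor’s lemma in the finite setting (grouping indices by the value of X_a ∩ a replaces the stationary set S′), and a greedy pass replaces the club C. The name refine_to_delta_system is hypothetical.

```python
from collections import defaultdict


def refine_to_delta_system(xs):
    """xs[a] is a finite set of natural numbers, playing the role of X_a.

    Returns (root, indices) where the sets xs[a] for a in indices form a
    Delta-system with the given root.
    """
    # Step 1: group the indices a by the candidate root X_a ∩ a.
    groups = defaultdict(list)
    for a, x in enumerate(xs):
        groups[frozenset(e for e in x if e < a)].append(a)

    best_root, best = frozenset(), []
    for root, indices in groups.items():
        # Step 2: keep a only if every previously kept X_b is a subset of a,
        # the analogue of a being a closure point in C.
        chosen, bound = [], -1          # bound = largest element kept so far
        for a in indices:               # indices are in increasing order
            if bound < a:
                chosen.append(a)
                bound = max(bound, max(xs[a], default=-1))
        if len(chosen) > len(best):
            best_root, best = root, chosen
    return best_root, best


xs = [{0, a + 1} for a in range(10)]
root, indices = refine_to_delta_system(xs)
print(root, indices)
```

As in the proof, two kept sets X_b and X_a with b < a satisfy X_b ⊆ a and X_a ∩ a = root ⊆ X_b, so they meet exactly in the root.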
Exercise 15.3. Give a different proof of the ∆-system lemma that does not use
Fodor’s lemma. First, show it suffices to prove the ∆-system lemma in the case
where every element of X has size n. Then prove the lemma by induction on n.

More generally, we have the following ∆-system lemma for κ size collections
of sets of size < λ where κ > λ are infinite regular cardinals:
Exercise 15.4. Suppose κ > λ are infinite regular cardinals. Assume that for
all δ < κ, δ^{<λ} < κ. Let X be a collection of κ many sets of cardinality less than
λ. Then there is a subset X′ ⊆ X of size κ so that X′ forms a ∆-system.
Determining the possible values of the powerset function 2^κ on singular cardinals κ is
a deep problem in set theory (see Section 11.2). One early theorem which
gives limitations on its possible values is Silver’s theorem from 1974 that GCH
cannot fail first at a singular cardinal of uncountable cofinality. The essential
ingredient in proving this theorem is stationary sets. Our proof below is due to
Baumgartner and Prikry [BP]. We begin with a few exercises:
Exercise 15.5. Suppose <P is a partial order on the set P . Then there is a
linear order <L on P extending <P . That is, for all a, b ∈ P , a <P b → a <L b.
[Hint: Use Zorn’s lemma]
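For a finite partial order no appeal to Zorn’s lemma is needed: a topological sort already produces a linear extension. The sketch below uses Python’s standard graphlib module; the function name linear_extension and the divisibility example are ours, and the code says nothing about the infinite case, which is the actual content of the exercise.

```python
from graphlib import TopologicalSorter


def linear_extension(elements, less_than):
    """Return a list ordering `elements` so that a precedes b whenever a <_P b.

    less_than is a set of pairs (a, b) meaning a <_P b.
    """
    sorter = TopologicalSorter({x: set() for x in elements})
    for a, b in less_than:
        sorter.add(b, a)          # a is a predecessor of b
    return list(sorter.static_order())


elements = [1, 2, 3, 4, 6, 12]
# Strict divisibility: a <_P b iff a properly divides b.
divides = {(a, b) for a in elements for b in elements if a != b and b % a == 0}
order = linear_extension(elements, divides)
print(order)
```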
Exercise 15.6. Suppose <L is a linear order on a set L so that for each a ∈ L,
|{b ∈ L : b < a}| ≤ κ. Then |L| ≤ κ^+. [Hint: Suppose that f is an injection from
κ^+ to L. Show that ran(f) must be unbounded in L. Hence, L = ⋃_{y∈ran(f)} {b ∈
L : b < y} is a union of κ^+ many sets of size at most κ.]


Theorem 15.7. Suppose κ is a singular cardinal of uncountable cofinality, and
2^λ = λ^+ for all λ < κ. Then 2^κ = κ^+.
Proof. To show that 2^κ = κ^+ we’ll put a linear order on 2^κ so that each point
has at most κ predecessors, and then apply Exercise 15.6. First, however, we’ll
replace 2^κ by a set of the same cardinality which is easier to work with.
Let ⟨µ_α : α < cf(κ)⟩ be a strictly increasing sequence of length cf(κ) cofinal
in κ such that if α is a limit ordinal, then µ_α = sup_{β<α} µ_β. For each α < cf(κ),
let g_α : P(µ_α) → µ_α^+ be a bijection (which exists since 2^{µ_α} = µ_α^+). For each
A ∈ P(κ), let f_A : cf(κ) → κ be the function f_A(α) = g_α(A ∩ µ_α). Note that
f_A(α) < µ_α^+ for every α. Let F = {f_A : A ∈ P(κ)} so the function A ↦ f_A is a
bijection from P(κ) to F.
The point of “coding” elements A ∈ P(κ) in terms of these functions f_A
is that we only need to remember a small amount of information about f_A
to recover what A is (see footnote 17). In particular, if S ⊆ cf(κ) is unbounded, then we
can recover A just from the values of f_A(α) for α ∈ S. This is since
A = ⋃_{α∈S} A ∩ µ_α = ⋃_{α∈S} g_α^{-1}(f_A(α)).
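The recovery step can be seen in miniature with the simpler coding from footnote 17, where A ⊆ ω and f_A(n) = A ∩ n. The Python sketch below works with a finite set and finitely many sample points, which is an assumption for illustration only; the names code and recover are ours.

```python
def code(a, n):
    """f_A(n) = A ∩ n, where n is identified with {0, ..., n-1}."""
    return frozenset(x for x in a if x < n)


def recover(values):
    """Union of the sampled values f_A(n).

    This equals A ∩ m, where m is the largest sample point, so it is all of
    A as soon as the sample points are unbounded in A.
    """
    return frozenset().union(*values)


a = {1, 4, 6, 9}
sparse_sample = [code(a, n) for n in (3, 10)]
dense_sample = [code(a, n) for n in (2, 5, 7, 10)]
print(recover(sparse_sample))
print(recover(dense_sample))
```

Both samples recover the same set, which is the sense in which the coding is robust: any unbounded set of sample points carries the same information.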

Claim. For A ∈ P(κ), |{B ∈ P(κ) : {α : fB (α) < fA (α)} is stationary}| ≤ κ.


Proof of Claim. Fix A. We’ll prove this by using Fodor’s lemma to show that
if {α : fB (α) < fA (α)} is stationary, then fB is determined by a small amount
of information, and there are only κ many possibilities for what it is.
17 These types of robust codings are commonly used in set theory and computability theory.

For example, if A ⊆ ω then the function fA (n) = A ∩ n has the property that we can recover
A just from knowing infinitely many values of fA . This is a trick used often in computability
theory.

Now f_A(α) < µ_α^+ for every α ∈ cf(κ). Hence, for each α ∈ cf(κ), we can
choose some injection h_α : {β : β < f_A(α)} → µ_α. So if f_B(α) < f_A(α), then
h_α(f_B(α)) < µ_α.
Now suppose B ∈ P(κ) is such that {α ∈ cf(κ) : f_B(α) < f_A(α)} is station-
ary. Consider the following function f′_B with domain

    {α : α is a limit ordinal and f_B(α) < f_A(α)},

which is a stationary subset of cf(κ) by Exercise 14.11.3. Let f′_B(α) be the least
β such that h_α(f_B(α)) < µ_β. Then h_α(f_B(α)) < µ_α, and if α is a limit (so
µ_α = sup_{β<α} µ_β) we must have f′_B(α) < α. Hence, by Fodor’s lemma, there is
a stationary set S_B and an ordinal γ_B such that f′_B(α) = γ_B for all α ∈ S_B.
By our discussion above, we can recover all the values of f_B from the function
f_B ↾ S_B, which is determined by the values h_α ∘ f_B ↾ S_B, where ran(h_α ∘ f_B ↾
S_B) ⊆ µ_{γ_B}. Now there are cf(κ) < κ many choices of γ_B, there are at most
2^{cf(κ)} = cf(κ)^+ < κ different stationary subsets S_B of cf(κ), and letting λ =
max(|µ_{γ_B}|, cf(κ)) < κ there are at most λ^λ = 2^λ = λ^+ ≤ κ many functions from
S_B to µ_{γ_B} which could be equal to h_α ∘ f_B ↾ S_B. Hence, there are at most κ
many B such that {α : f_B(α) < f_A(α)} is stationary. □ Claim.
Now consider the partial ordering on F where f_B < f_A if {α : f_B(α) <
f_A(α)} contains a club. This is a partial order since it is clearly irreflexive, and
it is transitive since the intersection of two clubs is a club. So by Exercise 15.5,
there is a linear order <* on F extending <. Now if A ≠ B, then if {α : f_B(α) <
f_A(α)} is not stationary, then {α : f_B(α) > f_A(α)} is in the club filter (since the
set {α : f_A(α) = f_B(α)} is bounded), so f_A <* f_B. Taking the contrapositive,
f_B <* f_A implies that {α : f_B(α) < f_A(α)} is stationary, and hence by the
claim, {f_B : f_B <* f_A} has size ≤ κ. Hence, by Exercise 15.6, |F| ≤ κ^+. Since
|F| = 2^κ > κ, we conclude 2^κ = κ^+.
Exercise 15.8. Suppose κ is a singular limit cardinal and 2κ = κ. Suppose F
is a set of functions from κ to κ such that:
• For all f ∈ F , f (α) < α for all α > 0. (The functions in F are regressive).
• For all f, g ∈ F , if f ≠ g, then there exists β < κ such that f (α) ≠ g(α)
for all α > β (F is an eventually different family of functions).

Show that |F | ≤ κ.
We remark that this proof actually gives the following stronger result:
Theorem 15.9 (Silver, 1974). If κ is a singular cardinal of uncountable cofinality
and 2^λ = λ^+ for a stationary set of λ < κ, then 2^κ = κ^+.

16 Trees
A tree is a partial order (<T , T ) so that for all x ∈ T , {y : y ≤T x} is wellordered
by <T . For each x ∈ T , rankT (x) is therefore the ordertype of {y : y <T x}. The
αth level of T is defined to be {x ∈ T : rankT (x) = α}. This is an example
of an antichain in T , a set A ⊆ T so that for all distinct x, y ∈ A, x ≮T y
and y ≮T x. A chain in T is a subset of T that is linearly ordered by <T . A
branch in T is a chain which is closed downwards. The height of the tree T is
sup{rank_T(x) + 1 : x ∈ T}, that is, the least ordinal greater than the rank of every element of T.
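For a finite tree these notions are easy to compute. The sketch below, in Python, assumes the tree is given by a parent map (each node maps to its immediate predecessor, or to None if it is minimal); this encoding and the function names are our own choices for illustration and only make sense for trees in which every non-minimal node has an immediate predecessor.

```python
from collections import defaultdict


def rank(parent, x):
    """Number of strict predecessors of x, which is rank_T(x) for finite trees."""
    r = 0
    while parent[x] is not None:
        x = parent[x]
        r += 1
    return r


def levels(parent):
    """Map each rank to the set of nodes on that level."""
    by_rank = defaultdict(set)
    for x in parent:
        by_rank[rank(parent, x)].add(x)
    return dict(by_rank)


def height(parent):
    """sup of rank(x) + 1 over all nodes x; 0 for the empty tree."""
    return max((rank(parent, x) + 1 for x in parent), default=0)


parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
print(levels(parent))
print(height(parent))
```

Each level is an antichain, and following parent pointers from any node down to a minimal node lists the chain of its predecessors.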

Figure 8: A tree

An example of a tree of height κ is the following. Let X be a set, and X^{<κ}
be the set of functions from ordinals less than κ to X. Then X^{<κ} is a tree under
the ordering ⊊.

Lemma 16.1 (König’s lemma). Suppose T is a tree of height ω and every level
of T is finite. Then T has an infinite branch.
To prove König’s lemma, we’ll repeatedly use the “pigeonhole principle” that
ω is regular: if X is an infinite set and we write it as a union of finitely many
sets X_i, so X = ⋃_{i<n} X_i, then some X_i is infinite.

Proof. We define an infinite branch {x_n : n ∈ ω} where x_n is at level n. Let
x_0 be a node at level 0 such that {y : y >T x_0} is infinite (which exists by the
above pigeonhole principle).
Inductively, let x_{n+1} >T x_n be an element of T at level n + 1 where {y ∈
T : y >T x_{n+1}} is infinite. Such an x_{n+1} must exist by the pigeonhole principle.
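The branch-building strategy of this proof can be imitated on a computer only in a bounded form, since deciding whether a node has infinitely many nodes above it is not something a program can do in general. The Python sketch below replaces that condition by “has a node above it at a fixed level depth”. The example tree (binary sequences that die as soon as they end in two consecutive 1s) and all function names are hypothetical, chosen just to have some dead ends to avoid.

```python
def children(node):
    """Immediate successors in a sample finitely branching tree.

    Nodes are tuples of 0s and 1s; a node ending in (1, 1) is a dead end.
    """
    if node[-2:] == (1, 1):
        return []
    return [node + (1,), node + (0,)]


def reaches(node, depth):
    """Bounded stand-in for 'infinitely many nodes above': True if node
    is at level depth or has some node above it at level depth."""
    if len(node) >= depth:
        return True
    return any(reaches(child, depth) for child in children(node))


def branch(depth):
    """Follow the proof: always step to a child that still reaches depth."""
    node = ()
    while len(node) < depth:
        node = next(child for child in children(node) if reaches(child, depth))
    return node


print(branch(6))
```

The search tries the child ending in 1 first, so at each step it must reject the dead end and move on, which mirrors how the proof discards nodes with only finitely many nodes above them.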

König’s lemma is a type of compactness phenomenon for ω. It has a precise
relationship to topological compactness:
Exercise 16.2. Suppose T is a tree of height ω and every level of T is finite. Let
X be the set of infinite branches in T . For each t ∈ T let Nt = {x ∈ X : t ∈ x}
be the set of all infinite branches that include t. Show that the topology on X
generated by the basic open sets Nt is compact.
The tree we considered in König’s lemma is called an ω-tree. More generally,
a tree T is a κ-tree iff the height of T is κ and every level has size less than κ.
Our next theorem is that the analogue of König’s lemma fails for ω1 .
Theorem 16.3 (Aronszajn). There is an ω1-tree with no branches of order type
ω1.
Proof. The tree T of all injections in ω^{<ω1} has height ω1. However, it has no
branches of length ω1, since there is no injection from ω1 to ω. We will construct
a subtree of T whose levels are countable which has height ω1.
Let =* be the equivalence relation of equality mod finite on ω^{<ω1}. That is,
s =* t if dom(s) = dom(t) and {α : s(α) ≠ t(α)} is finite. Note that for every
s ∈ ω^{<ω1} there are countably many t ∈ ω^{<ω1} so that s =* t.
We construct a sequence ⟨s_α : α < ω1⟩ of injections s_α : α → ω by transfinite
recursion so that for every α,
1. ω \ ran(s_α) is infinite, and
2. β < α → s_α ↾ β =* s_β.
Here condition 2 is what we really want at the end, and condition 1 is just an
extra hypothesis to ensure that our construction will work. Condition 1 makes
sure that as we construct larger and larger sα , there is enough empty space to
inject larger ordinals into ω, while satisfying 2.
At successor ordinals, let s_{α+1} = s_α ∪ {(α, n)} for some n ∉ ran(s_α). For
limit α, choose an increasing sequence ⟨α_n : n ∈ ω⟩ cofinal in α. To define s_α we
first will inductively define a sequence t_n of partial functions from α_n to ω, total
functions t′_n : α_n → ω so that t′_n ⊇ t_n, and finite sets x_n ⊆ ω where |x_n| = n
and x_n is disjoint from ran(t′_n). We’ll ensure that t′_n =* s_{α_n}. Let t_0 = t′_0 = s_{α_0}
and x_0 = ∅. Let

    t_{n+1} = t′_n ∪ s_{α_{n+1}} ↾ {β : β ≥ α_n ∧ s_{α_{n+1}}(β) ∉ (ran(t′_n) ∪ x_n)}.

Since t′_n =* s_{α_n} =* s_{α_{n+1}} ↾ α_n, we have that dom(t_{n+1}) is α_{n+1}
minus a finite set, and t_{n+1} is an injection by definition. Let t′_{n+1} be any injective exten-
sion of t_{n+1} so dom(t′_{n+1}) = α_{n+1} and ran(t′_{n+1}) is disjoint from x_n. Finally, let
x_{n+1} = x_n ∪ {inf(ω \ (ran(t′_{n+1}) ∪ x_n))}. Now let s_α = ⋃_n t_n. Now s_α satisfies (1), since
ran(s_α) is disjoint from ⋃_n x_n. Finally s_α satisfies (2) since each t′_n does.

Finally, let S be the set of injections t so that there exists α < ω1 so t =* s_α.
Then for β < α, t ↾ β =* s_α ↾ β =* s_β by (2). Hence, t ↾ β ∈ S. Hence, S is closed
downwards in ω^{<ω1}, and so levels in S agree with levels in ω^{<ω1}. So S is a tree
that has countable levels, and height ω1. Since S ⊆ T, it has no branches of length ω1.
The sequence of s_α we constructed in the above proof is a type of “coherent
sequence”; such sequences are important in many parts of set theory.
In general, a cardinal κ has the tree property if every κ-tree has a branch
of ordertype κ. So we have proved ω has the tree property and ω1 does not.
A κ-tree with no branch of ordertype κ (which is a counterexample to the tree
property at κ) is called a κ-Aronszajn tree. So we have proved that an ω1 -
Aronszajn tree exists.

Exercise 16.4. Assuming CH show that there is an ω2-tree with no branches
of length ω2. [Hint: use the fact that ω1^ω = 2^ω to help control the number of
countable subsets of ω1.]
It is a result of Mitchell that assuming large cardinals, it is consistent that
ω2 has the tree property.

16.1 Compactness and incompactness in set theory*


Compactness and incompactness phenomena in set theory are important topics
of modern research. Compactness here refers to a much broader notion than just
topological compactness. It refers to reflection-type principles which roughly
state that if every “smaller subobject” of some object has a property, then the
object has this property. For instance, we can rephrase a topological space X
being compact as follows: X is compact iff for all collections U of open sets in
X, if every finite subset of U does not cover X, then U does not cover X. So
for topological compactness, the types of objects we are considering are open
covers, and “smaller” means finite.
Just as there are many interesting compact and non-compact topological
spaces, there are many different examples of compactness and incompactness
phenomena in set theory which come from considering different types of objects
and notions of size. Myriad interesting open problems come from asking to what
extent compactness phenomena can hold throughout the universe of sets, and
especially when they can coexist with other incompactness phenomena.
We’ve already seen some incompactness that happens at the cardinal ω1 :
there is an ω1 -Aronszajn tree. At the level of ω1 there are many other interesting
examples of incompactness. Much research is motivated by trying to understand
to what extent types of incompactness that we find at ω1 can hold at other
cardinals.
We give some examples of compactness phenomena:
• The compactness theorem for first order logic: if every finite subset of a
first-order theory is satisfiable, then the theory is satisfiable.

• Silver’s theorem that if κ is a singular cardinal of uncountable cofinality
and GCH holds below κ, then GCH holds at κ.
• König’s lemma.
And some examples of incompactness phenomena:

• The failure of compactness for the infinitary logic L_{ω1,ω}: there are L_{ω1,ω}
theories all of whose finite subsets are consistent, but where the theory is
not (see footnote 18).
• Magidor’s theorem that GCH can first fail at ℵω .

• ω1 -Aronszajn trees
• Kurosh monsters: uncountable groups all of whose proper subgroups are
countable. These were first constructed by Shelah.
• The □ (square) principle, which follows from V = L.

Many large cardinal axioms are related to compactness. For example, some
simple large cardinal axioms are compactness principles which say that phenom-
ena which happen in V must happen at a particular Vα . More sophisticated
examples come from the type of reflection we get from elementary embeddings
of the universe into inner models.
Large cardinals are also an important way of measuring the strength of
other compactness principles. One important dividing line is whether a given
compactness principle at κ implies that κ is strongly inaccessible. One example
of a compactness principle for a cardinal κ is a higher analogue of Ramsey’s
theorem: does every function f : [κ]2 → 2 have a homogeneous set of size κ? If
κ has this property, we say κ is weakly compact. Weakly compact cardinals
are strongly inaccessible.
Even when it is not the case that a compactness principle at κ implies that κ
is strongly inaccessible, it is often true that compactness principles at κ imply
that κ has large cardinal properties in canonical inner models. For example, if
κ has the tree property, then κ is strongly inaccessible in L.
18 Consider the language with countably many constant symbols ⟨c_n : n ∈ ω⟩, and the theory
containing the sentences c_0 ≠ c_n for each n ≥ 1 and the sentence ∀x(⋁_{n≥1} x = c_n). Every
finite subset of this theory is consistent and has a model, but the whole theory is inconsistent.
(Note, however, that “higher” versions of compactness principles for L_{ω1,ω} are consistent from
large cardinal axioms.)

17 Suslin trees and ♦
The real numbers have a simple order-theoretic characterization. Recall that a
linear order < on X is dense if for all a, b ∈ X with a < b, there exists c ∈ X
such that a < c < b. A subset A ⊆ X is dense in X if for all a < b in X there
is a c ∈ A such that a < c < b. A linear order < on X is complete if any set
with an upper bound has a least upper bound, and any set with a lower bound
has a greatest lower bound. An endpoint of a linear order is an element that
is either greater than every other element, or less than every other element.
Exercise 17.1.
1. Show that any two countable dense linear orders without endpoints are
isomorphic. [Hint: back-and-forth]
2. Show that R is order-isomorphic to any complete dense linear order with-
out endpoints that has a countable dense subset.
Note that if we drop the requirement of having a countable dense subset, then
there are complete dense linear orders without endpoints that have arbitrarily
large cardinality. This is because there are dense linear orders without endpoints
of arbitrarily large cardinality (e.g. by Löwenheim-Skolem), and we can then
take their Dedekind completions.
In 1920, Suslin asked whether we can weaken the hypothesis in Exercise 17.1.2
of having a countable dense subset to instead say that any disjoint collection
of open intervals is countable. Recall that an open interval is a set of the form
(a, b) = {c : a < c < b}. In modern terminology, Suslin asked if there is a Suslin
line:
Definition 17.2. Say that a linear order < on a set X is a Suslin line if < is
a complete dense linear order without endpoints such that every set of disjoint
open intervals is countable, but < has no countable dense set (hence it is not
isomorphic to R).
There is a related type of object called a Suslin tree.
Definition 17.3. A Suslin tree is a tree T of height ω1 so that every chain
and antichain in T is countable.
So a Suslin tree is an ω1 -Aronszajn tree with the extra property that every
antichain is countable.
Definition 17.4. Say that a tree T is Hausdorff if for all limit ordinals λ,
if rankT (x) = rankT (y) = λ and {z : z <T x} = {z : z <T y}, then x = y. In
particular, if a tree is Hausdorff, then T has a unique least element (called the
root).
Exercise 17.5. Show that if there is a Suslin tree, then there is a Suslin tree
T so that T is Hausdorff and if x ∈ T is not maximal and x is on level α, then
there is a countably infinite set of y at level α + 1 such that y > x.

Lemma 17.6. There is a Suslin tree iff there is a Suslin line.
Proof. ⇐: Suppose (X, <) is a Suslin line. By transfinite recursion, construct
a set T = {(a_α, b_α) : α < ω1} of ω1 many open intervals such that for all
distinct I, J ∈ T, either I and J are disjoint, or I ⊊ J, or J ⊊ I. For each α < ω1, since
{a_β : β < α} ∪ {b_β : β < α} is countable and therefore not dense in X (since X
is a Suslin line), we can find an interval (a_α, b_α) in X which does not contain
any of the endpoints a_β or b_β for β < α.
Now we claim the set T ordered by reverse inclusion (so I lies below J iff J ⊊ I)
is a Suslin tree. Every chain
in T under this ordering is wellfounded. This is since the rank of (a_α, b_α) is ≤ α by
construction. T cannot have any uncountable antichain, since this would be an
uncountable set of disjoint open intervals in X. T cannot have an uncountable chain
⟨(c_α, d_α) : α < ω1⟩ since then the nonempty intervals among (c_α, c_{α+1}) and
(d_{α+1}, d_α) would be an uncountable set of disjoint open intervals.
⇒: Suppose T is a Suslin tree. By Exercise 17.5, assume T is a Suslin tree
with the two properties given in that exercise. For each x ∈ T of level α that
is not maximal, since {y : y >T x ∧ y is at level α + 1} is countably infinite, we
can define a dense linear order without endpoints <x on this set (using some
bijection with Q).
Now let X be the set of maximal branches through T. Let < be the linear
order on X defined as follows. Let a = {a_γ : γ < α} and b = {b_γ : γ < β} be
distinct maximal branches in X. Since a and b are both maximal branches neither is a
subset of the other, so there is a least γ such that a_γ ≠ b_γ. By the exercise, γ
cannot be a limit ordinal, so γ = ξ + 1 for some ξ, where a_ξ = b_ξ. Now define
a < b iff a_γ <_{a_ξ} b_γ using the ordering <_{a_ξ} on the successors of a_ξ.
It is clear that this ordering is dense and has no endpoints (since each or-
dering <x is dense and has no endpoints), and there is no countable dense set
(since the tree T has height ω1 ). Finally, if {(ai , bi ) : i ∈ I} is any set of dis-
joint open intervals in X, then choose xi ∈ T such that Nxi ⊆ (ai , bi ), where
Nxi = {c ∈ X : xi ∈ c}. Then {xi : i ∈ I} forms an antichain in T , since
the intervals (ai , bi ) are pairwise disjoint. Hence, since T has no uncountable
antichains, there is no uncountable set of disjoint open intervals.
If V = L, then there is a Suslin line. In 1970, Jensen isolated a combinatorial
principle called ♦ that follows from V = L which implies that there is a Suslin
tree.
Definition 17.7. A ♦-sequence is a sequence ⟨A_α : α < ω1⟩ where A_α ⊆ α such
that for all sets X ⊆ ω1, {α : X ∩ α = A_α} is stationary. ♦ is the statement
that there exists a ♦-sequence.
♦ is a powerful principle for constructing objects in ω1 many steps. Obviously
we cannot list all subsets of ω1 in ordertype ω1 since 2^{ω1} > ω1. However, a
♦ sequence lets us understand and anticipate all subsets of ω1 (via their bounded
subsets) in an ω1-length construction.
CH is an easy consequence of ♦:
Proposition 17.8. ♦ implies CH.

Proof. Fix a diamond sequence ⟨A_α : α < ω1⟩. We claim P(ω) ⊆ {A_α : α < ω1},
so that 2^ω ≤ ω1. This is since for all X ⊆ ω, there must be some α > ω such that
X = X ∩ α = A_α.
Next, we’ll prove the following theorem of Jensen.

Theorem 17.9 (Jensen). ♦ implies there is a Suslin tree.


The proof of this theorem is based on the following lemma:
Lemma 17.10. Suppose T is a Suslin tree. Define for each α < ω1, T_{<α} =
{x ∈ T : rank_T(x) < α}. If A is a maximal antichain in T, then the set
C = {α : A ∩ T_{<α} is a maximal antichain in T_{<α}} is club in ω1.

Proof. Define a function f : T → ω1 as follows. Given x ∈ T, let f(x) be the
least β such that there is a y of rank β so y is compatible with x (y ≤ x or x ≤ y) and
y ∈ A. Such a y exists since A is maximal. Now C = {α : ∀x ∈ T_{<α} f(x) < α}. So if α ∈ C, then A ∩ T_{<α} is a
maximal antichain in T_{<α}, and C is closed. C is unbounded since each T_{<α} is
countable, so we can iterate taking sups of values of f as in Lemma 14.5.
Proof of Theorem 17.9. Our tree T will be on the set ω1. Our tree will have the
property

    if x ∈ T has a successor in T, then x has infinitely many successors in T.   (*)

Hence, if there is an uncountable chain C in T, then the set of x ∉ C such that x
is a successor of some y ∈ C is an uncountable antichain. Thus, it suffices
to show there are no uncountable antichains in T.
For each countable limit ordinal α, we will define the ordering on α. Let
⟨A_α : α ∈ ω1⟩ be a ♦ sequence.
Given that we have defined T_{<α}, if A_α is a maximal antichain in T_{<α}, define
T_{<α+1} by requiring every node at height α to be compatible with an element of A_α.
Otherwise, extend T_{<α} to T_{<α+1} so that T_{<α+1} has height α + 1, and subject
to condition (*).
Now suppose A is a maximal antichain. Then C = {α : A ∩ T_{<α} is a maximal antichain in T_{<α}}
is club, and since ⟨A_α : α < ω1⟩ is a ♦ sequence, S = {α : A ∩ α = A_α} is sta-
tionary. Hence, C ∩ S is stationary, and is hence nonempty. Suppose α ∈ C ∩ S.
Then every node of rank α in T is above an element of A ∩ α. Hence, there can be no elements
of A of rank at least α, else this would contradict that A is an antichain.
Hence, A is countable.

18 Models of set theory and absoluteness
The majority of questions of set theory are independent from the ZFC axioms, so
much of modern set theory deals with building and studying models of ZFC with
diverse properties. In this section we’ll prove some basic facts about models of
ZFC before we move to the study of Gödel’s constructible universe L. Precisely,
by a model of ZFC we mean a set M and a relation E on M (which we interpret
as the ∈ relation) so that (M, E) satisfies the ZFC axioms. There are lots of
important definable sets, classes, and functions in ZF (e.g. ω, ORD, α ↦ V_α).
If M is a model of ZF (or some weaker theory) and x is a definable set which
ZF proves exists and is unique, we let x^M be the interpretation of x inside M.
For example, ω^M denotes the least nonzero ordinal in M that is not a successor
ordinal in M.
Theorem 18.1 (Skolem’s paradox). If ZFC is consistent, then there is a count-
able model of ZFC.
Proof. By completeness and the Löwenheim-Skolem theorem.
This is often called a paradox because ZFC proves there are uncountable sets.
So if (M, E) is a countable model of ZFC, then there must be x ∈ M so that
(M, E) ⊨ “there is no injection from x to ω”. However, since M is countable,
we know that x has only countably many elements in M. Of course, the resolution
to this paradox is that (M, E) doesn’t contain an injection from x to ω^M, but
this doesn’t mean that such an injection doesn’t exist outside the model. In
general, any set-size model of ZFC is of size κ for some cardinal κ, and thus
there is a proper class of sets that this model is “missing”.
Models of ZFC can in general be very strange. For example, even though the
axiom of foundation asserts that the ∈ relation is wellfounded, this only means
that if (M, E) is a model of ZFC, then (M, E) ⊨ “E is wellfounded”; that is, M
will not contain any nonempty sets that don’t have any E-minimal elements. It is quite
possible that from outside the model, we see that E is actually illfounded.
Theorem 18.2. If ZFC is consistent, then there is a model (M, E) of ZFC such
that the relation E is illfounded.
Proof. Let ⟨c_n : n ∈ ω⟩ be new constant symbols we add to the language of set
theory, and let ϕ_n be the formula c_{n+1} ∈ c_n. Then ZFC + {ϕ_n : n ∈ ω} is consistent by
the compactness theorem, since any finite subset is consistent. To see this, let
M be a model of ZFC. Note that given any number m, we can let c_k = (m − k)^M
for k ≤ m. Then M ⊨ ZFC + ϕ_0 ∧ . . . ∧ ϕ_{m−1}.
Now take a model (M, E) of ZFC + {ϕ_n : n ∈ ω}. The set {c_n^M : n ∈ ω} has no
E-minimal element, so E is illfounded.
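The finite-satisfiability step can be checked concretely with hereditarily finite sets. The Python sketch below builds the finite von Neumann ordinals as nested frozensets and verifies ϕ_0, …, ϕ_{m−1} under the interpretation c_k = m − k. The helper name von_neumann and the choice m = 5 are ours, and the code of course only illustrates one finite fragment of the theory, not the compactness argument itself.

```python
def von_neumann(n):
    """The finite von Neumann ordinal n = {0, 1, ..., n-1} as a frozenset."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | {ordinal}   # successor step: k + 1 = k ∪ {k}
    return ordinal


m = 5
c = [von_neumann(m - k) for k in range(m + 1)]   # interpretation c_k = m - k
phis = [c[n + 1] in c[n] for n in range(m)]      # phi_n says c_{n+1} ∈ c_n
print(phis)
```

Every finite fragment is satisfied by a descending chain that bottoms out at 0; only the full infinite set of sentences forces a chain with no E-minimal element.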
The Mostowski collapse lemma tells us some interesting information about
wellfounded models of ZFC.
Theorem 18.3. Suppose there is a model (M, E) of ZFC such that the relation
E on M is wellfounded. Then (M, E) is isomorphic to a model of the form
(N, ∈ ↾ N).

Proof. Apply the Mostowski collapse lemma to the relation E on M .
Illfounded models of ZFC are useful, but they can also think that all sorts
of crazy things are true. For example, by Gödel’s incompleteness theorem there
are models of ZFC + ¬ Con(ZFC). These models must be illfounded, and in fact
the ∈ relation on ω in such a model must be illfounded. In these notes we will
try to avoid illfounded models.
A model of the language of set theory of the form (N, ∈ ↾ N) where N is a
transitive set is called a transitive set model. We will also deal with models of
set theory that are proper classes. In order to formalize this, we first introduce
some notation. If N is a class, then we will write ϕ^N to denote the sentence
ϕ where we replace the quantifiers ∀x and ∃x in ϕ with ∀x ∈ N and ∃x ∈ N
respectively.
Exercise 18.4. If (N, ∈ ↾ N) is a transitive set model, then (N, ∈ ↾ N) ⊨ ϕ if
and only if ϕ^N is true.
A transitive class model is a transitive class N, equipped with the ∈
relation. A technical issue here is that the satisfaction relation ⊨ is only defined
for models which are sets. So if N is a transitive class and we write N ⊨ ϕ, we
mean ϕ^N. (There can’t be a single first-order formula¹⁹ defining the satisfaction
relation for the class V, by Tarski’s undefinability of truth.)
There is some meta-mathematical subtlety when dealing with transitive
class models. For example, when we prove that M ⊨ ZFC for a transitive class M,
what we really mean is that we can prove a theorem schema: for each
axiom ϕ of ZFC, ϕ^M is true. Another thing to be aware of is that Gödel’s
completeness theorem (that Con(T) iff T has a model) is not true for class
models, just set models.
The reason that we will spend so much time studying transitive models of
ZFC is that they have a nice sort of Goldilocks property. Arbitrary models of
ZFC are too hard to work with: it is hard to determine what is true in these
models, and they can disagree with V about almost anything (for example,
there are models of ZFC + ¬Con(ZFC)). Transitive models are just right: they
agree with V about some basic facts, but on more complicated questions like
CH or “there exists an inaccessible cardinal” they may differ. Models with strong types of
agreement with V (e.g. for each x ∈ M, P(x)^M = P(x)^V) are too restrictive
and are too similar to V to be an interesting type of model to study.
There are important facts which will remain true between the real universe
and transitive models. These are called absoluteness results. We say that a
formula ϕ(x1, …, xn) is absolute for transitive models if for all transitive class
models N, for all x1, …, xn ∈ N, we have ϕ^N(x1, …, xn) is true if and only
if ϕ(x1, …, xn) is true. Absoluteness is generally true for formulas with low
logical complexity.
¹⁹ though for each n, and each fixed transitive class model M (including V), the satisfaction
relation M ⊨_n ϕ(x1, …, xn) for Σn formulas ϕ is definable. For n = 0, M ⊨_0
ϕ(x1, …, xn) where ϕ is Σ0 and x1, …, xn ∈ M iff there exists a transitive set N such
that x1, …, xn ∈ N and N ⊨ ϕ(x1, …, xn). This is by ∆0 absoluteness. Now inductively,
M ⊨_{n+1} ∃y1, …, ym ¬ψ(x1, …, xn, y1, …, ym) where ψ is Σn iff there exist y1, …, ym ∈ M
such that M ⊭_n ψ(x1, …, xn, y1, …, ym).
Definition 18.5. The ∆0 formulas are the smallest class of formulas containing
the atomic formulas, and which are closed under ∧, ∨, ¬ and bounded quantifiers
(∀x ∈ y) and (∃x ∈ y).

Equivalently, a formula is a ∆0 formula if the only quantifiers it contains are
bounded quantifiers.
Proposition 18.6 (Absoluteness for ∆0 formulas). If M is a transitive class,
and ϕ is a ∆0 formula with n free variables, then for all x1 , . . . , xn ∈ M , we
have that ϕ(x1 , . . . , xn ) is true iff ϕM (x1 , . . . , xn ) is true.
Proof. By induction on formula complexity. Atomic formulas of the form x ∈ y
or x = y are clearly absolute, and a conjunction, disjunction, or negation of absolute
formulas is clearly absolute.
Finally, suppose ϕ(x1, …, xn) is absolute and consider the formula
θ(y, x2, …, xn) := (∃x1 ∈ y) ϕ(x1, x2, …, xn), where y, x2, …, xn ∈ M. If θ^M(y, x2, …, xn) is true, then
the x1 ∈ M that witnesses this formula in M also witnesses it in V, since
ϕ(x1, …, xn) is absolute. Conversely, if (∃x1 ∈ y) ϕ(x1, x2, …, xn) is true in V,
then the x1 ∈ y witnessing this formula must be in M, since M is transitive, and it
witnesses θ^M by the absoluteness of ϕ.
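To make ∆0 absoluteness concrete, here is a small Python sketch (not from the notes) that models hereditarily finite sets as nested frozensets. The helper names `is_transitive`, `is_ordinal`, and `von_neumann` are illustrative assumptions. Each predicate quantifies only over elements of its argument, mirroring bounded quantifiers, so its value cannot depend on which transitive universe the argument is viewed in.

```python
def is_transitive(x):
    # Delta_0: (forall y in x)(forall z in y)(z in x)
    return all(z in x for y in x for z in y)

def is_ordinal(x):
    # Delta_0 (using Foundation): x is transitive and so is every element of x
    return is_transitive(x) and all(is_transitive(y) for y in x)

def von_neumann(n):
    # the finite ordinal n = {0, 1, ..., n-1}
    alpha = frozenset()
    for _ in range(n):
        alpha = alpha | frozenset({alpha})
    return alpha

three = von_neumann(3)
not_an_ordinal = frozenset({frozenset({frozenset()})})  # the set {{∅}}
```

Neither function ever asks "which sets exist?"; both only inspect members of members of the input, which is the content of the induction step for bounded quantifiers.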
Exercise 18.7 (ZF). The following are all equivalent to ∆0 formulas:
1. x = ∅, x = ⋃y, x is a singleton, x is an ordered pair, x = {y, z},
x = (y, z), x ⊆ y, x is transitive, x is an ordinal, x is a limit ordinal, x is
a natural number, x = ω.
2. z = x × y, z = x \ y, z = x ∩ y,
3. R is a relation, f is a function, x = dom(f), y = ran(f), y = f(x),
g = f ↾ x.
More precisely, the definitions we gave of these concepts earlier in the notes
are not themselves all ∆0 , but we can prove that they are equivalent to ∆0
formulas. It is clear that if ψ is a formula, and ⊢ ψ ↔ ϕ, then if ϕ is absolute,
then ψ is absolute.
Recall that using prenex normal form, we can write any formula with all the
quantifiers at its beginning. The Levy hierarchy measures the complexity of for-
mulas in prenex normal form by the number of alternations between existential
and universal quantifiers.
Definition 18.8 (The Levy Hierarchy). A formula is Σ0 or Π0 if it is ∆0 . A
Σn+1 formula is a formula of the form ∃x1 ∃x2 . . . ∃xk ψ where ψ is Πn . A Πn+1
formula is a formula of the form ∀x1 . . . ∀xk θ where θ is Σn . A formula is ∆n
if it is equivalent to both Σn and Πn formulas. More generally, we say that a
formula ϕ is ∆n in a theory T if there are Σn and Πn formulas ψ and θ such
that T ⊢ ϕ ↔ ψ and T ⊢ ϕ ↔ θ.

Figure 9: A Σn formula has n blocks of alternating existential and universal
quantifiers. The last block of quantifiers is ∃ if n is odd, and ∀ if n is even.

The Levy hierarchy counts the number of alternations of unbounded quantifiers
in formulas.
It is clear that every Σn formula is Πn+1 , and every Πn formula is Σn+1 .

Figure 10: A picture of the Levy hierarchy

Exercise 18.9. Show that if ϕ, ψ are Σn formulas, then

• ¬ϕ is equivalent to a Πn formula.
• ϕ ∧ ψ and ϕ ∨ ψ are equivalent to Σn formulas.
• (∀x ∈ y)ϕ is equivalent to a Σn formula, assuming ZF.
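For the last item, the key step (for n ≥ 1) is to move the bounded universal quantifier inside the leading existential block; this is where ZF is used, via Replacement in the form of Collection. As a sketch under that assumption, writing the Σn formula as ∃z ψ(x, z) with ψ a Πn−1 formula:

```latex
\[
(\forall x \in y)\,\exists z\,\psi(x,z)
\;\longleftrightarrow\;
\exists w\,(\forall x \in y)(\exists z \in w)\,\psi(x,z).
\]
```

The right-to-left direction is immediate. The left-to-right direction collects one witness for each x ∈ y into a single set w. One still has to check, by induction on n, that the bounded quantifiers now sitting in front of ψ do not raise its complexity.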

If x is a set or class in V, we say that x is Σn/Πn definable (without
parameters) if there is a Σn/Πn formula ϕ with a single free variable such that
y ∈ x ↔ ϕ(y).
Lemma 18.10. Suppose G : V → V is a Σn function. Then the function F on
ORD such that F(α) = G(F ↾ α) is a Σn function.

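Proof.

\[
y = F(\alpha) \leftrightarrow \alpha \in \mathrm{ORD} \wedge \exists f \,\bigl( f \text{ is a function} \wedge \operatorname{dom}(f) = \alpha \wedge (\forall \beta < \alpha)\bigl(f(\beta) = G(f \restriction \beta)\bigr) \wedge y = G(f) \bigr).
\]
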
Now apply Exercise 18.9.
Lemma 18.11. Suppose F : V → V is a Σn function with a Πn domain. Then
F is ∆n .
Proof. Let ϕ be a Σn formula defining F . We give a Πn definition for F .
F(x) = y ↔ x ∈ dom(F) ∧ ∀y′ (y′ ≠ y → ¬ϕ(x, y′)).
Here are some examples of concepts at various levels of the Levy hierarchy.
Exercise 18.12. The following are ∆1 in ZF:
• R is a wellfounded relation on X.
• rank(x) = α.
• x is finite.
• TC(x) = y.
The following are Σ1 :
• x is countable.
• There is an injection from x to y.
The following are Π1 :
• κ is a cardinal.
• κ is a regular cardinal.
• κ is a limit cardinal.
The following are Σ2 :
• There exists a strongly inaccessible cardinal.
The following can be written as Π2 formulas:
• The continuum hypothesis 2^{ℵ_0} = ℵ_1.
Suppose N ⊆ M are transitive classes. We say that a formula ϕ with n free
variables is upwards absolute if for all x1, …, xn ∈ N, if N ⊨ ϕ(x1, …, xn),
then M ⊨ ϕ(x1, …, xn). Similarly, we say that a formula ϕ is downwards
absolute if for all x1, …, xn ∈ N, if M ⊨ ϕ(x1, …, xn), then N ⊨ ϕ(x1, …, xn).
An important class of formulas that are upwards absolute are the Σ1 formulas.
If a Σ1 formula is true in N , then the witness that this formula is true is also
included in M , and verifying that it is actually a witness is a ∆0 statement
which is absolute between N and M . By taking negations, we similarly see that
Π1 formulas are downwards absolute.

Proposition 18.13. Suppose N, M are transitive classes and N ⊆ M. If ϕ is
a Σ1 formula, then ϕ^N implies ϕ^M. If ψ is a Π1 formula then ψ^M implies ψ^N.
Proof. Suppose ϕ is a Σ1 formula, say ϕ = ∃x θ where θ is ∆0, and suppose ϕ^N holds.
Take a witness x ∈ N for ϕ. Since x ∈ M, we have that ϕ holds in M by ∆0 absoluteness
of θ. Downwards Π1 absoluteness then holds by contraposition, since the negation of a
Π1 formula is equivalent to a Σ1 formula.
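A toy illustration of why Σ1 formulas go up but need not come down, using the finite levels V_2 ⊆ V_3 as stand-ins for N ⊆ M. This is only a sketch; the function names and the choice of formula ("the singleton {x} exists") are assumptions made for the example, not part of the notes.

```python
from itertools import combinations

def powerset(s):
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    # the finite level V_n of the cumulative hierarchy
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def is_singleton_of(z, x):
    # Delta_0 matrix: x in z and (forall w in z)(w = x)
    return x in z and all(w == x for w in z)

def has_singleton(x, universe):
    # Sigma_1 formula relativized to `universe`: (exists z in universe) z = {x}
    return any(is_singleton_of(z, x) for z in universe)

N, M = V(2), V(3)
empty = frozenset()
one = frozenset({empty})
```

Here `has_singleton(empty, N)` holds and the same witness {∅} lies in M, so the statement persists upwards. For `one` the witness {{∅}} has rank 2, so it exists in M but not in N: the Σ1 statement is true in M and false in N.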
Similar types of results for absoluteness, and upwards and downwards ab-
soluteness exist in many other fields of mathematics. In the study of models
of PA, if M is an end extension of N, then we have upwards absoluteness for
Σ^0_1 formulas, and downwards absoluteness for Π^0_1 formulas in the arithmetical
hierarchy.
Another analogy comes from topology. Suppose we have a topological space
(X, τ ) and we enlarge the topology by adding more open sets. In this setting,
sufficiently simple topological properties of sets will be upwards/downwards
absolute.
Exercise 18.14. Suppose τ is a topology on a set X, and τ′ ⊇ τ is a topology
with more open sets (i.e. a finer topology).
1. If A ⊆ X is open in (X, τ), then A is also open in (X, τ′) (i.e. being an
open set is upwards absolute).
2. If A ⊆ X is dense in (X, τ′), then A ⊆ X is dense in (X, τ) (i.e. being
dense is downwards absolute).
3. Show that A ⊆ X being nowhere dense is neither upwards nor downwards
absolute.
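As a sanity check of items 1 and 2 on one small example (not a proof, and it says nothing about item 3), the following Python sketch compares a coarse and a finer topology on a three-point set. The particular topologies and the names `coarse`, `fine`, and `is_dense` are assumptions chosen for illustration.

```python
from itertools import combinations

X = frozenset({0, 1, 2})
coarse = {frozenset(), frozenset({0}), X}
fine = coarse | {frozenset({1}), frozenset({0, 1})}  # a finer topology on X

def is_dense(A, topology):
    # A is dense iff it meets every nonempty open set
    return all(A & U for U in topology if U)

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

open_is_upward = all(U in fine for U in coarse)
dense_is_downward = all(is_dense(A, coarse)
                        for A in subsets(X) if is_dense(A, fine))
A = frozenset({0})
```

In this example the set `A` is dense in the coarse topology but not in the fine one, so density does not transfer upwards.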
An important consequence of Proposition 18.13 is that if a theory T proves
that θ is equivalent to both a Σ1 and a Π1 formula (i.e. θ is ∆1 in T), and T holds in
two transitive models N ⊆ M, then θ^N ↔ θ^M. Wellfoundedness is a very important
example of a ∆1 property, which is therefore absolute for transitive models of ZFC.
Exercise 18.15. ZFC + Con(ZFC) does not prove there is a wellfounded model
of ZFC. [Hint: First show that if Con(ZFC) is true, then if N is any wellfounded
model of ZFC, then N ⊨ Con(ZFC). Then argue that if ZFC + Con(ZFC) implied
there was a wellfounded model of ZFC, we could find an infinite descending chain
of models of ZFC]
Theorem 18.16 (ZF). If α is a limit ordinal, then Vα is a model of Exten-
sionality, Foundation, Pairing, Union, Nullset, Separation, and Powerset. If
α > ω, then Vα is a model of the axiom of Infinity. If κ is a strongly inaccessible
cardinal, then Vκ is a model of Replacement, and hence Vκ is a model of ZFC.
Proof. The extensionality and foundation axioms are Π1 , hence downwards ab-
solute and hold in all transitive classes. ∅ is an element of Vα and witnesses that
the nullset axiom is true.
The rest of the axioms of ZFC are set existence axioms. To show that they
are true, we check that the sets the axioms state exist are in Vα .

Union: suppose y ∈ Vα. Then x = ⋃y is in Vα, since rank(⋃y) ≤ rank(y).
Now x = ⋃y is ∆0, and hence it holds in Vα.
Separation: suppose x, w1, …, wn ∈ Vα, and ϕ is a formula. Let y = {z ∈
x : ϕ^{Vα}(z, w1, …, wn)}. Then y ∈ Vα, since rank(y) ≤ rank(x), and y witnesses
this instance of the separation axiom. (Note that it is important that we use ϕ^{Vα}
and not the formula ϕ here.)
Infinity: if α > ω, then ω ∈ Vα , and x = ω is ∆0 .
Powerset: if rank(x) = β, then rank(P(x)) = β + 1. So if α is a limit ordinal
and x ∈ Vα , then P(x) ∈ Vα . Now y = P(x) is Π1 , so since it holds in V , it is
true in Vα by downwards absoluteness.
Pairing: since rank({x, y}) = sup(rank(x), rank(y)) + 1, if x, y ∈ Vα and α
is a limit ordinal, then {x, y} ∈ Vα , and z = {x, y} is absolute.
Now suppose κ is a strongly inaccessible cardinal. We verify that the axiom
of replacement is true in Vκ. Since κ is strongly inaccessible, every element of
Vκ has cardinality less than κ. Suppose X ∈ Vκ and F is a class function in
Vκ (i.e. there is some formula ϕ so that Vκ ⊨ ∀x∃!y ϕ(x, y)). Let F[X]^{Vκ} =
{y ∈ Vκ : (∃x ∈ X) Vκ ⊨ ϕ(x, y)}. Then rank(F[X]^{Vκ}) is at most the sup of fewer than κ
ordinals less than κ, which is less than κ since κ is regular. So F[X]^{Vκ} ∈ Vκ, and
this set witnesses this instance of the axiom of replacement.
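The rank computations used in this proof can be checked on hereditarily finite sets. The following Python sketch is only a finite sanity check under the assumption that sets are modeled as nested frozensets; `rank`, `powerset`, and `union` are illustrative helper names.

```python
from itertools import combinations

def rank(x):
    # rank(x) = sup{rank(y) + 1 : y in x}, with rank(∅) = 0
    return max((rank(y) + 1 for y in x), default=0)

def powerset(x):
    items = list(x)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def union(x):
    # the union of all elements of x
    return frozenset(z for y in x for z in y)

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
pair = frozenset({one, two})
```

On these inputs the rank of a powerset goes up by exactly one, the rank of a pair is one more than the larger of the two ranks, and taking a union never increases rank, matching the Powerset, Pairing, and Union cases above.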
Note that it is not true that Vκ ⊨ ZFC if and only if κ is inaccessible. We’ll
show this in the next section.
Recall that if κ is a cardinal, then Hκ is the collection of sets whose transitive
closure has size less than κ. Clearly Hκ is transitive.
Exercise 18.17. For every uncountable regular cardinal κ, Hκ is a model of ZFC − Powerset.
Two important questions we will often study about models of ZFC are the
following:
1. Does the same definition in two different transitive models of ZFC define
the same set?
2. Does the same set in two different transitive models of ZFC share the same
properties?
Much of what we prove in subsequent chapters will require detailed analysis
of both of the above kinds.
Here is an example of the first type of question. Recall that if M is a model of
ZFC, we let ω1^M indicate the least uncountable ordinal in M (which ZFC proves
exists and is unique). This first exercise shows that a model of the form Vα
must compute ω1 to be the same ordinal as ω1^V.
Exercise 18.18. Suppose α is an ordinal such that Vα ⊨ ZFC. Then ω1^{Vα} = ω1^V.
[Hint: First note that the ordinals in Vα are a subset of the ordinals in the real V.
Use the fact that “β is countable” is Σ1 definable to first show that ω1^{Vα} ≤ ω1^V.
Next, show that α ≥ ω + 3 and hence P(ω × ω)^V ∈ Vα. Finally, show that
for every ordinal β < ω1^V, Vα ⊨ “β is countable” by using the absoluteness of
wellfoundedness and the absoluteness of the Mostowski collapse.]

Here is an example of the second type of question.
Exercise 18.19. Suppose V ⊨ “κ is a cardinal”, and M ⊆ V is a transitive
model so that κ ∈ M. Then show that M ⊨ “κ is a cardinal”. [Hint: being a
cardinal is Π1.]
In a later section we will show that if a measurable cardinal exists, then
ω1^L ≠ ω1^V. Hence, by the above exercise, letting κ = ω1^V, we must have L ⊨
“κ is a cardinal”, and L ⊨ “κ is uncountable” since uncountability is downwards
absolute. Hence, since κ ≠ ω1^L, we must have ω1^L < κ, and so ω1^L must be
countable in V even though L ⊨ “ω1^L is a cardinal”. So the converse of the above
exercise is not true.

19 The reflection theorem
Before we move to studying Gödel’s constructible universe L, we will prove
the reflection theorem. It is a general fact about all cumulative hierarchies like
V or L. The following is a theorem schema; it is a theorem of ZFC that is
provable for each formula ϕ:
Theorem 19.1. Suppose ⟨Mα : α ∈ ORD⟩ is a cumulative hierarchy, so
• Mα is transitive,
• if α < β, then Mα ⊆ Mβ, and
• if λ is a limit, then Mλ = ⋃_{α<λ} Mα.
Let M be the class M = ⋃_{α∈ORD} Mα. If ϕ is a formula, then there is a closed
and unbounded class of ordinals α such that

for all x1, …, xn ∈ Mα, Mα ⊨ ϕ(x1, …, xn) ↔ M ⊨ ϕ(x1, …, xn). (*)

Proof. We prove this by induction on the complexity of ϕ. The theorem is true
for ∆0 formulas for the club of all ordinals by ∆0 absoluteness, since M and
Mα are transitive.
Logical connectives are easy to handle. If (*) is true for ϕ, then it is also true
for ¬ϕ. Suppose ϕ and ϕ′ are formulas and C, C′ ⊆ ORD are clubs such that
(*) holds for ϕ and ϕ′ on the clubs C and C′. Then (*) holds for the formulas
ϕ ∧ ϕ′ and ϕ ∨ ϕ′ on the club C ∩ C′.
Finally, suppose ϕ(x1, …, xn) is a formula and (*) holds for all α ∈ C
for some club C. We want to show that (*) holds for the formula ψ :=
∃x1 ϕ(x1, …, xn). Note that for all α ∈ C and all x2, …, xn ∈ Mα, if Mα ⊨
∃x1 ϕ(x1, …, xn), then M ⊨ ∃x1 ϕ(x1, …, xn) using the same witness x1 ∈ Mα
since (*) holds for ϕ. It is the other direction of the implication which will
require careful work (and refining our club).
First we show that the class of α ∈ C such that (*) holds for ψ is closed.
Suppose ⟨α_ξ : ξ < β⟩ is an increasing sequence of ordinals in C with limit
λ = sup{α_ξ : ξ < β}, and for all ξ < β and all x2, …, xn ∈ M_{α_ξ}, if M ⊨ ∃x1 ϕ(x1, …, xn),
then M_{α_ξ} ⊨ ∃x1 ϕ(x1, …, xn). Then we claim that for all x2, …, xn ∈ Mλ,
if M ⊨ ∃x1 ϕ(x1, …, xn), then Mλ ⊨ ∃x1 ϕ(x1, …, xn). This is because if
x2, …, xn ∈ Mλ, then there is some α_ξ such that x2, …, xn ∈ M_{α_ξ}, and hence
there is some x1 ∈ M_{α_ξ} such that M_{α_ξ} ⊨ ϕ(x1, …, xn). This same x1 is in Mλ, and
since (*) holds for ϕ at both α_ξ and λ, it witnesses Mλ ⊨ ∃x1 ϕ(x1, …, xn).
Finally, we show that the class of α ∈ C such that (*) holds for ψ is
unbounded. Our basic idea is that if M ⊨ ∃x1 ϕ(x1, …, xn), then the witness
x1 ∈ M must be in Mβ for some β. We will find closure points
for the β witnessing these statements. For all α ∈ C and x2, …, xn ∈ Mα, define
fα(x2, …, xn) to be the least β ∈ C with β ≥ α such that Mβ ⊨ ∃x1 ϕ(x1, …, xn), if
M ⊨ ∃x1 ϕ(x1, …, xn). Otherwise, let fα(x2, …, xn) = α. Hence, if M ⊨
∃x1 ϕ(x1, x2, …, xn) then Mβ ⊨ ∃x1 ϕ(x1, x2, …, xn) for β = fα(x2, …, xn).
Now let g(α) = sup{fα(x2, …, xn) : x2, …, xn ∈ Mα}. So if x2, …, xn ∈ Mα
and M ⊨ ∃x1 ϕ(x1, x2, …, xn), then M_{g(α)} ⊨ ∃x1 ϕ(x1, x2, …, xn).

Fix α_0 ∈ C, and let α_{k+1} = g(α_k). For all x2, …, xn ∈ M_{α_k}, if (∃x1 ∈
M) ϕ^M(x1, x2, …, xn) is true, then (∃x1 ∈ M_{α_{k+1}}) ϕ^{M_{α_{k+1}}}(x1, x2, …, xn). Hence,
if λ = sup{α_k : k ∈ ω}, then (*) holds for the formula ψ for α = λ.

Figure 11: A proof of the reflection theorem.

Note that as the formula ϕ gets larger, the club Cϕ where (*) holds requires
a larger and larger formula to define it. Note that we cannot say at the end
that the intersection ⋂_ϕ Cϕ of the countably many club classes Cϕ is a club, because
there won’t be a formula defining this intersection. The formula would
be the conjunction of the infinitely many formulas defining the classes Cϕ and
would be infinitely long.
If M is a transitive model of ZFC, then from outside this model, it may
indeed be the case that the intersection of the countably many clubs Cϕ is
empty (when M has a height that has countable cofinality). Indeed, we will
soon prove that if there is a transitive model of ZFC, then there is a minimal
transitive model of ZFC (one contained in all the others). This model must
exhibit this phenomenon.
To take a simpler example with the same flavor, since PA is consistent²⁰
we have that for every n ∈ ω, PA ⊢ “n does not code a proof of ¬Con(PA)”.
However, it is not true that PA ⊢ (∀n)“n does not code a proof of ¬Con(PA)”.
This would contradict the second incompleteness theorem.
²⁰ ZFC proves that N is a model of PA.
We have the following corollary of the reflection theorem:
Corollary 19.2. Given a finite set of axioms of ZFC, there is an α such that
Vα is a model of these finitely many axioms. There is also a countable transitive
model M of these finitely many axioms.
Proof. Apply the reflection theorem to the conjunction ϕ of these finitely many
axioms, to find some Vα such that Vα ⊨ ϕ. We can obtain a countable transitive
model by taking a countable elementary submodel of Vα and then taking its
transitive collapse.
The above corollary is also a “corollary schema”: we can
prove it for each finite set of axioms of ZFC. Of course ZFC does not prove “for
all finite subsets S ⊆ ZFC there is a model of S”; by the completeness theorem
this would contradict Gödel’s incompleteness theorem.
Here is another nice exercise which combines reflection, absoluteness, and
countable elementary submodels:
Exercise 19.3. Suppose ϕ is a Σ1 formula. Show that if there is a unique set
x such that ϕ(x) holds, then x is countable.
Corollary 19.2 is one way of formalizing some matters to do with forcing.
In general, it is most convenient to develop forcing over countable transitive
models of ZFC. However, ZFC cannot prove such models exist. Instead, we
can work with a countable transitive model of a “large enough” finite
subset of ZFC (large enough to prove the finitely many theorems that we use
in our forcing proof). If ϕ is a sentence of set theory (e.g. ¬CH) and we show
this way that for every large enough finite subset S of ZFC there is a model
of S + ϕ, then by the compactness theorem ϕ is consistent with ZFC.
Corollary 19.4. ZFC is not finitely axiomatizable.
Proof. By Gödel’s incompleteness theorem, ZFC cannot prove there is a model of
ZFC (assuming ZFC is consistent). But we have just shown that for any finitely many axioms
of ZFC, ZFC proves there is some Vα such that Vα models these axioms. So if ZFC
were finitely axiomatizable, ZFC would prove that it has a model.
The proof of Theorem 19.1 is really just constructing appropriate closure
points to witness the Tarski-Vaught test for being a Σn elementary substructure
(it is a little delicate to formalize the Tarski-Vaught test for class models,
so we haven’t done the above proof this way).
Exercise 19.5 (The Tarski-Vaught test). Suppose M is a structure, and N ⊆
M is a substructure. Then N is an elementary substructure of M iff for every
formula ϕ(x1, …, xn, y) and x1, …, xn ∈ N, if M ⊨ ∃y ϕ(x1, …, xn, y), then
there exists some y ∈ N so that M ⊨ ϕ(x1, …, xn, y).
For cumulative hierarchies that are set-length (instead of length all the or-
dinals), we have the following version of the reflection theorem using the full
Tarski-Vaught test:

Theorem 19.6. Suppose κ is a limit ordinal and ⟨Mα : α ≤ κ⟩ is a sequence of
sets such that
• Mα is transitive,
• if α < β, then Mα ⊆ Mβ,
• if λ ≤ κ is a limit, then Mλ = ⋃_{α<λ} Mα, and
• cf(κ) > |Mα| · ω for all α < κ.
Then there is a closed and unbounded set of ordinals α < κ
such that Mα is an elementary substructure of Mκ.

Proof. It is clear that the set of such α is closed. To see that it is unbounded,
suppose α_0 < κ. Then recursively define α_{k+1} to be the least ordinal β such
that for all formulas ϕ and all x1, …, xn ∈ M_{α_k}, if Mκ ⊨ ∃y ϕ(x1, …, xn, y), then
there exists y ∈ Mβ such that Mκ ⊨ ϕ(x1, …, xn, y). Since cf(κ) > |M_{α_k}| · ω, note
that this β is the sup of at most |M_{α_k}| · ω ordinals less than κ, and is hence less than κ.
Now let α = sup_k α_k. Then by the Tarski-Vaught test, we have that Mα is an
elementary substructure of Mκ.
For example, the above theorem applies in the case where κ is strongly
inaccessible and Mα = Vα by Exercise . So if κ is strongly inaccessible, then
there is a club of α in κ so that Vα is an elementary substructure of Vκ. (And
so it is not true that Vκ ⊨ ZFC iff κ is strongly inaccessible.)

Exercise 19.7. Suppose there exists some α such that Vα ⊨ ZFC. Show that
the least such α must have cf(α) = ω.
In general, we expect that the least ordinal α satisfying some “closure prop-
erty” will have cf(α) = ω.

20 Gödel’s constructible universe L
In this section we’ll define Gödel’s constructible universe L, and prove L is
a model of ZFC (more precisely, that for every axiom ϕ ∈ ZFC, ϕ^L is true).
We will do this carefully assuming only ZF without choice, since one of our
goals is to show that if there is a model M of ZF, then L^M is a model of ZFC.
Hence, Con(ZF) → Con(ZFC), and so while the axiom of choice can be used to
construct strange objects like nonmeasurable sets, we can prove it doesn’t create
any inconsistencies.
Gödel’s L is built up in a cumulative hierarchy where at each level, the next
level is the sets definable using parameters over the previous levels. We will
begin by carefully calculating the complexity of the map sending each set to its
definable subsets.
It is easy to check that the set of all formulas is a ∆1 definable subset of Hω,
the functions ∧, ¬, and ϕ ↦ ∃xϕ are ∆1 definable, and
the relation “ϕ is a subformula of ψ” is wellfounded.²¹
Exercise 20.1 (ZF). The relation M ⊨ ϕ(x1, …, xn) is a Σ1 definable relation
between transitive sets M, the set of formulas, and M^{<ω}. [Hint: Define this
relation using recursion on the relation of being a subformula,
• M ⊨ x ∈ y iff x ∈ y.
• M ⊨ x = y iff x = y.
• M ⊨ ϕ ∧ ψ iff M ⊨ ϕ and M ⊨ ψ.
• M ⊨ ¬ϕ iff M ⊭ ϕ.
• M ⊨ ∃xϕ(x) iff (∃x ∈ M) M ⊨ ϕ(x).
Each clause of this definition is ∆0 (note that the definition for ∃xϕ only uses a
bounded quantifier over M). Now apply the same ideas as Lemmas 18.10 and
18.11 to finish.]
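The recursion in the hint can be run directly when M is a finite transitive set. The following Python sketch is an illustration only: the tuple encoding of formulas, the names `satisfies` and `definable_subset`, and the use of a dict for the variable assignment are all assumptions made for the example, not the coding used in the notes.

```python
def satisfies(M, phi, env):
    """Return True iff M |= phi under the variable assignment env (a dict)."""
    op = phi[0]
    if op == "in":
        return env[phi[1]] in env[phi[2]]
    if op == "eq":
        return env[phi[1]] == env[phi[2]]
    if op == "and":
        return satisfies(M, phi[1], env) and satisfies(M, phi[2], env)
    if op == "not":
        return not satisfies(M, phi[1], env)
    if op == "exists":
        # the quantifier ranges only over elements of M (a bounded search)
        return any(satisfies(M, phi[2], {**env, phi[1]: a}) for a in M)
    raise ValueError(f"unknown connective: {op}")

def definable_subset(M, phi, var, params):
    # {x in M : M |= phi(x, params)}, a subset of M definable over M
    return frozenset(a for a in M if satisfies(M, phi, {**params, var: a}))

empty = frozenset()
one = frozenset({empty})
M = frozenset({empty, one})  # the transitive set V_2 = {∅, {∅}}

# "x is empty": not exists y (y in x)
is_empty = ("not", ("exists", "y", ("in", "y", "x")))
```

Each branch of `satisfies` corresponds to one bullet of the hint, and the only search performed is over the elements of M itself, which is why each clause is ∆0 in M.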
If M is a set, then let

Def(M) = {y : (∃ϕ ∃z1, …, zn ∈ M) y = {x ∈ M : M ⊨ ϕ(x, z1, …, zn)}}.

So Def(M) is all subsets of M that are definable using parameters in M. From
Lemmas 18.10, 18.11 and Exercise 20.1, we see M ↦ Def(M) is a ∆1 function.
Definition 20.2 (Gödel’s constructible universe L).
• L_0 = ∅.
• L_{α+1} = Def(Lα).
• Lλ = ⋃_{α<λ} Lα if λ is a limit.
• L = ⋃_α Lα.
Note that the map α ↦ Lα is ∆1 by Lemmas 18.10 and 18.11.
²¹ If we were being excruciatingly pedantic, we’d begin here defining what a formula is, and
engaging in other trivial and tedious syntactic discussions. If you want to engage in this kind
of pedantry, you’re on your own.


Lemma 20.3 (ZF). For all ordinals α,
1. Lα ⊆ Vα .
2. Lα and L are transitive
3. β ≤ α implies Lβ ⊆ Lα
4. Lα ∩ ORD = α. So, L contains all the ordinals.
Proof. These are all proved by transfinite induction on α. They are all easy to
check at limit steps.
(1) follows at successor stages since if Lα ⊆ Vα , then Lα+1 ⊆ P(Lα ) ⊆
P(Vα ) ⊆ Vα+1 .
(2) is true at successor steps: suppose Lα is transitive and x ∈ y ∈ L_{α+1}. Then
y ⊆ Lα, so x ∈ Lα. Since Lα is transitive, x ⊆ Lα and so x = {a ∈ Lα : a ∈ x} ∈
L_{α+1}. It follows that L is a transitive class.
(3) follows since each x ∈ Lα is definable by using itself as a parameter.
For (4), if Lα ∩ ORD = α, then {β ∈ Lα : β is an ordinal} = α since being an
ordinal is ∆0 and hence absolute. Hence Lα+1 ∩ORD = {β : β ≤ α} = α+1.
The first few levels of L are the same as V , but they quickly become very
different.
Exercise 20.4 (ZFC). For each infinite α, |Lα | = |α|.
Now we show that L is a model of ZF.
Lemma 20.5 (ZF). For every axiom ϕ of ZF, ϕ^L is true.
Proof. The extensionality and foundation axioms are true in all transitive classes.
∅ is an element of L and witnesses that the nullset axiom is true.
The rest of the axioms of ZF are set existence axioms. To show that they
are true, we check that the sets the axioms state exist are in L.
Union: suppose x ∈ L where x ∈ Lα. Then ⋃x is in L_{α+1} since it has a ∆0
definition using the parameter x.
Pairing: if x, y ∈ Lα , then {x, y} ∈ Lα+1 since it has a ∆0 definition from
the parameters x and y.
Infinity: ω ∈ Lω+1 and hence ω ∈ L, and so L witnesses the axiom of infinity
since the formula “x = ω” is ∆0 .
Separation: suppose x, w1, …, wn ∈ L, and ϕ is a formula. By the reflection
theorem, we can find some α such that x, w1, …, wn ∈ Lα and Lα ⊨
ϕ(z, w1, …, wn) ↔ L ⊨ ϕ(z, w1, …, wn) for all z ∈ Lα. Then clearly {z ∈
x : Lα ⊨ ϕ(z, w1, …, wn)} is in L_{α+1}, and witnesses the separation axiom is true
in L for this formula and these parameters since ϕ reflects at α.

Powerset: given x ∈ L, let β = sup{inf{α : y ∈ Lα} : y ⊆ x ∧ y ∈ L}. Now the set z =
P(x)∩L is in Lβ+1 since it has a ∆0 definition over Lβ ; it is the set of y such that
y ⊆ x. This set witnesses the powerset axiom is true in L since y ∈ z ↔ y ⊆ x
is ∆0 and hence true for every y ∈ L.
Replacement: suppose F is a class function in L and X ∈ L. Let β =
sup{inf{α : F(x) ∈ Lα} : x ∈ X}. By the reflection theorem, reflecting the definition
of F to some γ > β, we can define F[X] in Lγ, and this will witness the
replacement axiom is true.
Definition 20.6. Let V = L abbreviate the sentence ∀x ∃α (x ∈ Lα).
Lemma 20.7 (The absoluteness of constructibility).
1. If M is a transitive class model of ZF which contains all the ordinals, then
L^M = L.
2. (V = L)^L.
Proof. (1) follows from the fact that the map α ↦ Lα is ∆1 and hence absolute,
so (Lα)^M = Lα. (2) holds since (V = L)^L translates to (∀x ∈ L)(∃α ∈ L)[x ∈ (Lα)^L],
which is true since L contains all the ordinals, and the definition of
L is absolute.
The above lemma may seem trivial but it definitely isn’t. We’re heavily
using the nontrivial fact that the definition of L is ∆1 and that ∆1 formulas
are absolute. For example, if M is a transitive class inner model of ZF, then in
general V^M ≠ V.
We say that a transitive class M is a transitive class model of a theory
T if for every ϕ ∈ T, ϕ^M is true.
Corollary 20.8. L is the smallest transitive class inner model of ZF which
contains all the ordinals.
Proof. If M is any transitive class model of ZF which contains all the ordinals,
then L = L^M ⊆ M.
If ϕ is a formula, let Def_ϕ(M) be the subsets of M definable using the
formula ϕ and parameters from M.
Theorem 20.9 (ZF). L ⊨ AC.
Proof. Since L ⊨ V = L, we may assume V = L. We define a single class
wellordering of the whole universe by transfinite recursion, which we denote <L.
Let ≺ be a ∆1 wellordering of all formulas, and for every x, let rankL(x) =
inf{α : x ∈ Lα}. Note that rankL(x) is always a successor ordinal.
Let x <L y iff one of the following holds:
• rankL(x) < rankL(y), or
• rankL(x) = rankL(y) = α + 1 and inf{ϕ : x ∈ Def_ϕ(Lα)} ≺ inf{ϕ : y ∈ Def_ϕ(Lα)}
under our ordering ≺ on formulas, or
• rankL(x) = rankL(y) = α + 1, the ≺-least formula ψ defining x and y over Lα is the
same, and the <L-least parameters in Lα in lexicographic order defining x using ψ are
less than the <L-least parameters in Lα in lexicographic order defining y.
Exercise 20.10. This is a wellordering, and is ∆1 definable.

20.1 Gödel operations and fine structure*
When Gödel first defined L, he was worried that the logical nature of his defini-
tion would be viewed with suspicion. To assuage such fears, he defined 8 simple
functions F1, …, F8, such as F1(X, Y) = {X, Y}, F2(X) = {(a, b) ∈ X : a ∈ b},
F3(X, Y) = X − Y, F4(X, Y) = {(a, b) ∈ X : b ∈ Y}, and F5(X, Y) = {b ∈
X : ∃a (a, b) ∈ Y} (the remaining three operations permute the order of tuples).
Gödel then defined L_{α+1} to be the subsets of Lα one could obtain from
a composition of finitely many Gödel operations applied to Lα or its elements.
The above definition yields the same set as Def(Lα) and the proof is quite
easy. In one direction, each of the Gödel operations is first-order definable. In
the other direction, given any formula, we can write it in prenex normal form
ϕ(y1 , . . . , yn ) = ∃x1 . . . Qxm ψ(x1 , . . . , xm , y1 , . . . , yn ) where ψ is quantifier free.
Then it is easy to show by induction on formula complexity that the set of tuples
{(x1, …, xm, y1, …, yn) ∈ X1 × … × Xm × Y1 × … × Yn : ψ(x1, …, xm, y1, …, yn)}
satisfying the quantifier free formula ψ is given by a composition of Gödel
operations applied to X1, …, Xm, Y1, …, Yn. But then we can obtain the set
of y1 , . . . , yn satisfying ϕ since existential quantifiers correspond to projecting
(using operation F5 ), and universal quantifiers are ¬∃¬ (and we can perform
complementation using operation F3 ).
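To get a feel for how little the operations do, here is a Python sketch of the five operations named above, acting on finite sets. It assumes the two-argument reading F5(X, Y) = {b ∈ X : ∃a (a, b) ∈ Y}, and it represents ordered pairs as Python tuples rather than as sets, which is a simplification made for this example. The helper `is_pair` is hypothetical.

```python
def is_pair(p):
    # ordered pairs are modeled as 2-tuples (an assumption of this sketch)
    return isinstance(p, tuple) and len(p) == 2

def F1(X, Y):
    return frozenset({X, Y})                      # pairing

def F2(X):
    return frozenset(p for p in X if is_pair(p) and p[0] in p[1])  # membership

def F3(X, Y):
    return X - Y                                  # difference

def F4(X, Y):
    return frozenset(p for p in X if is_pair(p) and p[1] in Y)     # restriction

def F5(X, Y):
    # projection: elements of X occurring as a second coordinate in Y
    return frozenset(b for b in X if any(is_pair(p) and p[1] == b for p in Y))

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
pairs = frozenset({(empty, one), (one, empty), (one, two)})
B = frozenset({empty, one, two})
```

For instance, `F5(B, F2(pairs))` computes {b ∈ B : ∃a ((a, b) ∈ pairs ∧ a ∈ b)}, so composing the membership operation with projection handles an existential quantifier, as in the argument above.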
From a modern perspective, the Gödel operations aren’t useful; it’s easier
just to talk directly about first order definability. However, Gödel’s idea antici-
pated an even more refined way of building up L using very simple operations
due to Jensen²². Jensen introduced “rudimentary functions”, and defined a
hierarchy Jα where at each step we close under applying rudimentary functions.
Jensen’s J hierarchy still constructs the same universe L = ⋃_α Jα; in fact, the
hierarchies coincide at limit stages. The issue with the L hierarchy that the J
hierarchy fixes is that in general Def(Lα ) is difficult to analyze since Lα satisfies
very little of ZFC (e.g. it doesn’t even satisfy the pairing axiom). In contrast, in
Jensen’s fine structure, we can define Skolem functions over the Jα structures
in a nice way so that Σn definability over Jα is the same as Σ1 definability over
an associated structure.
The first major application of Jensen’s fine structure was the proof that it
is consistent that CH holds and there are no Suslin trees; the earlier result of
Solovay and Tennenbaum that it is consistent that there are no Suslin trees
constructed a model where CH failed. Jensen’s proof relied on the following
combinatorial principle □κ: There is a sequence ⟨Cα : α is a limit ordinal < κ+⟩
such that for α < κ+:
• Cα ⊆ α is club in α,
• For β a limit point of Cα, Cα ∩ β = Cβ, and
• For α such that cf(α) < κ, the order-type of Cα is less than κ.
22 and building on earlier fine-structure-style analysis of Boolos and Putnam

Jensen showed using his new fine structure that □κ holds in L for each κ. Square
is extremely useful for performing constructions of length κ+ using approximations
of cardinality less than κ.

21 Condensation in L and GCH
Our next goal is to prove that GCH is true in L. To do this, we’ll suppose x ⊆ κ
and x ∈ L, and figure out the first level Lα where x ∈ Lα . Our analysis will
consider elementary substructures of some Lβ such that x ∈ Lβ . Then we’ll use
the fact that if M is a transitive set such that M satisfies a sufficiently large
fragment of ZF and the sentence V = L, then M must actually be equal to Lα
for some α. This is because the construction of L is absolute. This fact is called
the condensation lemma:
Lemma 21.1 (The condensation lemma). There is a finite set S of axioms of
ZF − powerset so that if M is a transitive set where M ⊨ S and M ⊨ V = L,
then M = Lλ for some limit ordinal λ.
Proof. Let the axioms of S be pairing, union, and those axioms of ZF used
to prove all the theorems leading up to the fact that for all α, Lα exists
and the map α ↦ Lα is ∆1 definable (and hence absolute). Suppose M ⊨ S
and M ⊨ V = L. Let λ be the least ordinal not in M. We must have that
ORD^M = λ by absoluteness of being an ordinal. λ must be a limit ordinal since
for each α ∈ M, α + 1 = α ∪ {α} is in M by the pairing and union axioms.
Since M ⊨ ∀x ∃α ∈ ORD (x ∈ Lα), we have that ∀x ∈ M ∃α < λ (x ∈ Lα^M).
Since Lα^M = Lα by the absoluteness of Lα, we have M ⊆ ⋃_{α∈M} Lα = ⋃_{α<λ} Lα = Lλ.
Conversely, for each α < λ, Lα^M exists in M (since S is strong enough to
prove this), and Lα^M = Lα, so Lλ = ⋃_{α<λ} Lα ⊆ M.

A stronger form of condensation is true, but proving it is technical; one
proof uses fine structure. There is a Π2 sentence ϕ so that for transitive sets M,
M ⊨ ϕ if and only if M = Lλ for some limit ordinal λ. Hence if M is a transitive
set and M ≺Σ1 Lλ for some limit λ, then M = Lλ′ for some λ′ ≤ λ.
In what follows, we’ll heavily use the following version of the downward
Löwenheim-Skolem theorem due to Tarski and Vaught:
Exercise 21.2. Suppose M is an infinite structure in a language L, and A ⊆ M is
a subset of the universe of M. Then there is an elementary substructure N ≼ M
such that A ⊆ N, and N has cardinality at most |A| + |L| + ω.
We’re ready to calculate the levels of L at which new subsets of κ appear.
Lemma 21.3. Assume V = L. If κ is an infinite cardinal and x ⊆ κ, then x ∈ Lλ for
some λ < κ+ .
Proof. Let β be sufficiently large so that x ∈ Lβ, and Lβ ⊨ S + V = L. Such a
β exists by the reflection theorem.
By the downward Löwenheim-Skolem theorem, there is an elementary substructure
N ≺ Lβ so that κ ⊆ N, x ∈ N, and |N| = κ. Let π : N → M be the
Mostowski collapse of N to a transitive set model M.
Since κ ⊆ N, by transfinite induction π(α) = α for all α < κ. Now for α < κ,
α ∈ x ↔ π(α) ∈ π(x) ↔ α ∈ π(x), and every element of π(x) has the form π(α)
for some α ∈ x, so π(x) = x. Hence x ∈ M.

Now M ⊨ S and M ⊨ V = L since N is an elementary substructure of Lβ,
and N and M are isomorphic. So we must have that M = Lλ for some λ by
the condensation lemma. Since κ = |M| = |Lλ| and |Lα| = |α| for all infinite α,
we must have |λ| = κ, and hence λ < κ+.
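The collapsing step in this proof can be seen in miniature on hereditarily finite sets. The Python sketch below is an illustration only: the family N and the names `mostowski_collapse`, `x`, `junk`, and `big` are hypothetical. It computes π(a) = {π(b) : b ∈ a ∩ N} and shows that π fixes the ordinals 0, 1, 2 and a subset x of 3 = {0, 1, 2}, while an element of N having a member outside N is moved.

```python
def mostowski_collapse(N):
    """Collapse (N, ∈) for a finite extensional family N of frozensets.

    Returns a dict pi with pi[a] = {pi[b] : b in a and b in N}.
    """
    pi = {}

    def collapse(a):
        if a not in pi:
            pi[a] = frozenset(collapse(b) for b in a if b in N)
        return pi[a]

    for a in N:
        collapse(a)
    return pi

# Von Neumann ordinals 0, 1, 2 as hereditarily finite sets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

x = frozenset({zero, two})        # a subset of 3 = {0, 1, 2}
junk = frozenset({two})           # deliberately left out of N
big = frozenset({one, x, junk})   # has an element outside N

N = {zero, one, two, x, big}
pi = mostowski_collapse(N)
```

Here `pi[x] == x` because every element of x lies in the transitive part of N, which mirrors the argument that π(x) = x when x ⊆ κ ⊆ N.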
Corollary 21.4. L ⊨ GCH.

Proof. If V = L, then P(κ) ⊆ Lκ+ by the above lemma. But |Lκ+ | = κ+ .
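Spelled out, the cardinal arithmetic behind this proof is the following chain, written for an infinite cardinal κ under the assumption V = L. It uses only Cantor's theorem (for the first inequality), Lemma 21.3 (for the second), and the fact that |Lα| = |α| for infinite α.

```latex
\[
  \kappa^{+} \;\le\; 2^{\kappa} \;=\; |\mathcal{P}(\kappa)|
  \;\le\; |L_{\kappa^{+}}| \;=\; \kappa^{+},
  \qquad\text{hence}\qquad 2^{\kappa} = \kappa^{+}.
\]
```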


Corollary 21.5. Con(ZF) → Con(ZFC) → Con(ZFC + GCH)
In the above argument, we’ve used the reflection theorem to find an appro-
priate β to reflect from. It will be convenient to know that for regular cardinals
κ, Lκ satisfies ZF − Powerset.
Lemma 21.6. If κ is an uncountable regular cardinal, then Lκ ⊨ ZF − Powerset.
Proof Sketch. Copy the proof that L ⊨ ZFC, replacing the use of the reflection
theorem with Theorem 19.6 in the verification of the Replacement and Separation
axioms.
Recall that Hκ also satisfies ZF − Powerset. This isn’t a coincidence.
Theorem 21.7. If V = L, then for all infinite cardinals κ, Hκ = Lκ.
Proof. Vω = Hω = Lω , so it suffices to prove this when κ is uncountable. Indeed,
it suffices to prove this when κ is an uncountable successor cardinal, since at
limit cardinals both Hκ and Lκ are the union of the previous levels.
First we show Lκ+ ⊆ Hκ+. If x ∈ Lκ+, then the transitive closure of x
is in Lκ+ by Lemma 21.6 and the absoluteness of transitive closure. Hence
|TC(x)| < κ+ since |Lα| = |α| for every infinite ordinal α.
Now we show Hκ+ ⊆ Lκ+. Fix x ∈ Hκ+. Since V = L, TC({x}) is contained
in Lµ for some regular uncountable µ. By Löwenheim-Skolem, we can find an
elementary substructure M ≺ Lµ of cardinality κ such that TC({x}) ⊆ M.
Let π : M → N be the Mostowski collapse of M to a transitive set N. By
condensation, N = Lλ for some λ < κ+. By transfinite induction, for all
y ∈ TC({x}) we have π(y) = y, and so π(x) = x is in Lλ ⊆ Lκ+.
Exercise 21.8. Assume V = L. Show that for every countable ordinal α, there
is some countable β > α such that Lβ+1 \ Lβ contains a subset of ω.
Exercise 21.9. Show that there is a Π2 sentence ϕ so that Lω1 ⊨ ϕ, but L ⊨ ¬ϕ.

22 V = L implies ♦
In the early 1970s, Jensen proved that V = L implies there is a Suslin tree.
Jensen abstracted the ♦ principle as the key combinatorial tool used in his
proof.
We’ll recursively construct a ♦-sequence assuming V = L, making use of the
wellordering <L . We will prove this sequence is a ♦-sequence by showing there
cannot be a <L-least set X such that X ∩ α ≠ Aα for a club C of α in ω1. To
see this, we’ll analyze our construction in Lω2 , taking a countable elementary
submodel of Lω2 containing X and C, and obtaining a contradiction to our
definition of the ♦-sequence.
We’ll begin by proving a quick lemma about countable elementary submodels
of Lω2 . Our use of ω2 in this lemma (and in our proof of ♦) isn’t so important;
all that matters for this lemma is that Hω1 ⊆ Lω2 assuming V = L, and Lω2
satisfies a sufficiently large fragment of ZF + V = L.
Lemma 22.1. Assume V = L. Suppose M is a countable elementary submodel
of Lω2 . Then ω1 ∩ M = α for some countable ordinal α.
Proof. Suppose β is a countable ordinal such that β ∈ M . We need to show
that if γ < β, then γ ∈ M . Then since M is countable, M ∩ ω1 is a countable
set of ordinals which is downwards closed, and hence it is a countable ordinal.
Since β is countable, there is a surjection from ω to β. Let f be the <L -least
surjection from ω to β. Then f ∈ Hω1 , and hence f ∈ Lω2 . Now since <L is a
∆1 -definable linear order, the fact that f is the <L -least surjection from ω to
β is a Π1 fact, and hence it is true in Lω2 by downwards absoluteness. Hence,
f is definable in Lω2 , and so it must be in the elementary submodel M . Now
for each n ∈ ω, f (n) is also definable, and hence it must also be in M . But this
implies every ordinal less than β is in M .
A similar proof gives the following:
Exercise 22.2. Assume V = L, and suppose M is a countable elementary
submodel of Lω1 . Then M must be a transitive set and hence M = Lα for some
countable ordinal α.
Now we’re ready to prove V = L implies ♦.
Theorem 22.3 (Jensen). V = L implies ♦.
Proof. Assume V = L. We construct a sequence ⟨Aα : α < ω1⟩ where Aα ⊆ α
such that for all sets X ⊆ ω1, {α : X ∩ α = Aα} is stationary. We will also
construct a sequence ⟨Cα : α < ω1⟩ where Cα ⊆ α is club.
Let A0 = C0 = ∅ and at successor ordinals let Aα+1 = Aα and Cα+1 = Cα.
If α is a limit, let (Aα, Cα) be the <L-least pair such that Aα ⊆ α, Cα is a
club subset of α, and Aα ∩ β ≠ Aβ for all β ∈ Cα. If no such pair exists, let
Aα = Cα = α.
For a contradiction, assume that for some X ⊆ ω1, there is a club set C such
that X ∩ α ≠ Aα for all α ∈ C. Let (X, C) be the <L-least such pair.

Now ⟨Aα : α < ω1⟩, ⟨Cα : α < ω1⟩ and (X, C) are all hereditarily of cardinality
< ω2, and so they are in Hω2 and therefore in Lω2. By the Löwenheim-Skolem
theorem, let M be a countable elementary submodel of Lω2 which contains
⟨Aα : α < ω1⟩, ⟨Cα : α < ω1⟩ and (X, C). Let π be the Mostowski collapse
of M. By condensation, π(M) = Lλ for some countable ordinal λ.
By Lemma 22.1, ω1 ∩ M = δ for some countable ordinal δ. By transfinite
induction π(β) = β for all β < δ. Now π(ω1) must be an ordinal > β for all
β < δ, and hence π(ω1) ≥ δ. Conversely, every element of π(ω1) has the form
π(γ) for some γ ∈ ω1 ∩ M = δ, and π(γ) = γ < δ, so we must have π(ω1) = δ.
Similarly, π(X) = X ∩ δ, π(C) = C ∩ δ, π(⟨Aα : α < ω1⟩) = ⟨Aα : α < δ⟩,
and π(⟨Cα : α < ω1⟩) = ⟨Cα : α < δ⟩.
The statement that (X, C) is the <L-least pair such that C is club in ω1,
and X ∩ α ≠ Aα for all α ∈ C is ∆1, hence it is true in Lω2 by absoluteness.
(Here we’re using the fact that the ordering <L is ∆1, and if (X, C) ∈ Lβ, then
saying (X, C) is <L-least only requires quantifying over all (X′, C′) in Lβ).
Since M is an elementary submodel of Lω2 and π is an isomorphism, Lλ
models that (X ∩ δ, C ∩ δ) is <L-least such that C ∩ δ is club in δ, and X ∩ β ≠ Aβ
for all β ∈ C ∩ δ.
Hence by ∆1 absoluteness, this statement is true in L, and hence by the
definition of the sequences Aα and Cα, Aδ = X ∩ δ and Cδ = C ∩ δ. Since
C ∩ δ is club in δ and C is closed, δ ∈ C. But then X ∩ δ = Aδ and δ ∈ C,
which is a contradiction to our choice of (X, C).

23 L and large cardinals
Our goal in this section is to prove two theorems about the relationship between
large cardinals and L.
Recall a cardinal κ is weakly inaccessible if κ is a regular limit cardinal.
Let’s show first that weakly inaccessible cardinals are very large.
Lemma 23.1. Suppose C is club in a weakly inaccessible κ. Then the set
C′ = {α ∈ C : |C ∩ α| = α} is also club in κ.
Proof. C′ is closed: if |C ∩ αξ| = αξ for a sequence ⟨αξ : ξ < λ⟩ and β =
sup_{ξ<λ} αξ, then β ∈ C since C is closed, β is a cardinal since it is a sup of
cardinals, and C ∩ β has cardinality at least αξ for every ξ < λ, so |C ∩ β| = β.
C′ is unbounded: given α ∈ κ, let β0 ∈ C be such that β0 > α. Let βn+1 ∈ C
be least such that |C ∩ βn+1| > βn. Such a βn+1 exists since κ is a limit cardinal
(and |C| = κ since κ is regular). Then β = sup βn is in C since C is closed,
β is a cardinal, and |C ∩ β| > βn for each n, so |C ∩ β| = β.
For example if κ is weakly inaccessible and we set C = κ, then C′ = {α : |α| =
α} is the set of cardinals in κ, and C′′ = {α : ωα = α} is the set of fixed points
of the ℵ function, and these are both club in κ by the above lemma. So if κ is
weakly inaccessible, it is far larger than the first fixed point of the ℵ function.
We’ve shown that ZFC cannot prove there are strongly inaccessible cardinals;
if κ is strongly inaccessible, then Vκ ⊨ ZFC. Similarly, we can show that ZFC
cannot prove there are any weakly inaccessible cardinals, since if κ is weakly
inaccessible, then Lκ ⊨ ZFC.
Theorem 23.2. If κ is weakly inaccessible in V, then Lκ ⊨ ZFC. Hence, if ZFC
is consistent, then ZFC cannot prove there are weakly inaccessible cardinals.
Proof. In Lemma 21.6 we already proved that Lκ ⊨ ZF − Powerset. Lκ satisfies
Powerset since for each x ∈ Lκ, x ∈ Lµ for some cardinal µ < κ and so
P(x)^L ∈ Lµ+ by our work in Section 21. The statement y = P(x) is Π1 so
by downwards absoluteness, P(x)^L is the powerset of x in Lκ. Lκ ⊨ AC is an
exercise.
Exercise 23.3. Show that if κ is a regular cardinal, then Lκ ⊨ AC.
Our second goal is to show that L is incompatible with the existence of
measurable cardinals. Recall that a cardinal κ is measurable if there is a
nonprincipal κ-complete ultrafilter on κ.
One use of measurable cardinals is that they allow us to take an ultrapower of the
universe V of set theory, analogously to how we have defined ultrapowers of
structures whose universes are sets.
Definition 23.4. Suppose U is an ultrafilter on a set I. Then we define the
ultrapower ∏_U V of V by U as follows. Consider the equivalence relation ∼
on all functions from I to V where f ∼ g if {i ∈ I : f(i) = g(i)} ∈ U. Now
for each f, {g : g ∼ f} is a proper class in general, so using Scott’s trick we
define [f] = {g : g ∼ f ∧ (∀h ∼ f)(rank(g) ≤ rank(h))}, so [f] is the set of
representatives of the ∼-class of f of minimal rank. Now we define ∏_U V to be
the structure in the language of set theory whose universe is the class of the sets
[f] and where the ∈ relation is given by [f] ∈^{∏_U V} [g] if {i : f(i) ∈ g(i)} ∈ U. If
the relation ∈^{∏_U V} is wellfounded, we identify ∏_U V with its Mostowski collapse.
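To make the two defining relations concrete, here is a small Python sketch on a finite index set. This is an assumption-laden toy: every ultrafilter on a finite set is principal, so the construction collapses to evaluation at a single index i0, which is exactly the degenerate case that nonprincipal ultrafilters avoid. The names `principal_ultrafilter`, `equivalent`, `member`, and `j` are hypothetical.

```python
I = range(3)  # a finite index set, chosen only for the demo

def principal_ultrafilter(i0):
    """The ultrafilter of all subsets of I containing i0, as a membership test."""
    return lambda X: i0 in X

def equivalent(f, g, U):
    """f ~ g iff {i : f(i) = g(i)} is in U."""
    return U({i for i in I if f[i] == g[i]})

def member(f, g, U):
    """[f] is below [g] in the ultrapower iff {i : f(i) in g(i)} is in U."""
    return U({i for i in I if f[i] in g[i]})

def j(x):
    """The constant function i -> x, representing j(x)."""
    return tuple(x for _ in I)

# Von Neumann ordinals 0, 1, 2 as hereditarily finite sets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

U = principal_ultrafilter(1)
f = (zero, one, two)   # functions I -> V, written as tuples
g = (two, one, zero)
```

With U concentrated at index 1, f and g are identified with each other and with the constant function j(one), so here the embedding j is onto and the ultrapower adds nothing new.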
It is easy to check that the proof of Łoś’s theorem still works in this context,
and so for every sentence ϕ, ϕ^V is true iff ϕ^{∏_U V} is true. Indeed,
Exercise 23.5. Suppose U is an ultrafilter on an index set I. Show that the
function j : V → ∏_U V defined by setting j(x) to be the class of the constant
function i ↦ x is an elementary embedding of V to ∏_U V.
If U is ω1-complete, then ∏_U V will always be wellfounded:
Lemma 23.6. Suppose U is an ultrafilter on an index set I. Then if M is a
transitive set or class model, and U is ω1-complete, then the ultrapower ∏_U M
is wellfounded.
Proof. Suppose [f0], [f1], . . . was an infinite descending sequence. Then by Łoś’s
theorem, for each n, Xn = {i ∈ I : fn+1(i) ∈ fn(i)} is in the ultrafilter U. So
⋂_{n∈ω} Xn is in the ultrafilter, since U is ω1-complete. Pick x ∈ ⋂_{n∈ω} Xn.
Then f0(x), f1(x), . . . is an infinite descending sequence in V which is a
contradiction.
In fact ω1-completeness of U exactly characterizes when ∏_U V is wellfounded.
Exercise 23.7. Show that if U is an ultrafilter on I and U is not ω1-complete,
then the ultrapower ∏_U V is illfounded.
Lemma 23.8. Suppose κ is a measurable cardinal and U is a nonprincipal κ-complete
ultrafilter on κ, and j is the elementary embedding from V into M =
∏_U V given above. Then for every α < κ, j(α) = α. However j(κ) > κ. Thus
if there is a measurable cardinal, then there is a nontrivial (i.e. nonidentity)
elementary embedding j from V into a transitive class model M.
Proof. In this proof we will implicitly identify ∏_U V with its Mostowski collapse.
So for example, when we say that [f] is an ordinal for some f : κ → V,
we mean that the image of [f] under the collapse is an ordinal.
By Łoś’s theorem, ∏_U V ⊨ [f] is an ordinal iff {α : f(α) is an ordinal} ∈ U, and
[f] is an ordinal iff ∏_U V ⊨ [f] is an ordinal by ∆0 absoluteness.
First, we claim that j(β) = β for all β < κ. This is by transfinite induction.
j(0) = 0 since j is an elementary embedding, and 0 is definable in both V and
M as the least ordinal. Now fix β < κ and suppose that for all α < β, j(α) = α.
Then since α < β implies j(α) < j(β), and j(α) = α, we have j(β) > α for all
α < β, and so j(β) ≥ β. We need to show j(β) ≤ β. Suppose f : κ → V was
such that [f] < j(β). It suffices to show that [f] = α for some α < β. Now since
j(β) is the class of the constant function β, by definition of the ∈ relation in
∏_U V, we have that X = {α : f(α) < β} ∈ U. Now for each γ < β, let
Xγ = {α : f(α) = γ}. Now X = ⋃_{γ<β} Xγ, so since U is κ-complete, some Xγ
must be in U. For this Xγ ∈ U, we have {α : f(α) = γ} ∈ U, and so
[f] = j(γ) = γ < β.

Now we show that j(κ) > κ. Consider the function d : κ → κ where d(α) = α.
Then [d] is an ordinal, and [d] > α for each α < κ, since d is eventually greater
than α, and a set of size less than κ cannot be in U . So [d] ≥ κ. However,
[d] < j(κ), since d(α) < κ for each α ∈ κ. Hence, j(κ) > κ.
Exercise 23.9. Suppose that U is a normal κ-additive ultrafilter on a measur-
able cardinal κ. (Normal in the sense of Section 12, and hence by Exercise 14.17
it extends the club filter). Then show that for the function d defined in the proof
of Lemma 23.8, [d] = κ. Further, U is normal if and only if [d] = κ.
The converse of the above lemma is true.

Exercise 23.10. If there is a nontrivial elementary embedding j : V → M where
M is a transitive class model, then there is a measurable cardinal. [Hint: First
prove that if j(α) = α for all ordinals α, then j is trivial. To show this, argue that
if rank(j(x)) = rank(x) for all x ∈ V, then by induction on rank, we’d have
j(x) = x for all x. Now let κ be the least ordinal such that j(κ) ≠ κ. For each
subset X of κ, put X ∈ U iff κ ∈ j(X). Show that U is a nonprincipal κ-complete
ultrafilter.]

Now we see measurable cardinals cannot exist in L.
Theorem 23.11 (Scott). If there is a measurable cardinal, then V ≠ L.
Proof. Suppose there is a measurable cardinal. Let κ be the least measurable
cardinal, and let U be a nonprincipal κ-complete ultrafilter on κ. Let M =
∏_U V, and let j : V → M be the corresponding elementary embedding.
If V = L, then the only transitive class model of ZF containing all the ordinals
is L, so V = M = L. But V ⊨ κ is the least measurable cardinal, and since j
is an elementary embedding M ⊨ j(κ) is the least measurable cardinal, and by
Lemma 23.8, j(κ) > κ. So we cannot have M = V. Contradiction!

23.1 Finding the right universe of set theory*


L is an incredibly important object of study for set theorists. It reveals a huge
amount about ZFC. It has an intricate structure theory with myriad uses.
But if your goal is to find the “right” platonic realm of sets, L is a terrible
universe in which to live. We understand L well enough to answer most all
questions of set theory inside it. But the answers are almost always myopic. L
is like the tiny town where I grew up; people refused to acknowledge anything
outside the town borders and nothing interesting ever happened.
The standout deficiency of L is that V = L implies there are no measurable
cardinals. Large cardinals are an incredibly important part of set theory and
are intimately tied to most all its branches. From extensive work, we have deep
intuitions about how they behave, we have detailed fine-structural descriptions
of models containing them (from the inner model theory of large cardinals),
and myriad connections and uses of them which touch many other areas of
mathematics. The consistency statements for these large cardinal axioms (as
opposed to the set-theoretic statements that they exist) are also Π^0_1 statements
which make predictions about reality (we’ll never find a proof of their
inconsistency), which
have held true for more than a century. Since set theorists strongly believe these
large cardinal axioms are consistent for so many reasons, living in a universe
where they don’t exist feels very wrong.
An important focus of modern set theory is the inner model program: finding
L-like models which contain large cardinals, and help us understand their fine
structure, consistency strength, and what can be forced using them.
A research programme of Hugh Woodin is to build an inner model called
Ultimate L. It would help us answer the technical questions about forcing and
large cardinals alluded to above. But in addition, Woodin’s vision is for there
to be a convincing philosophical argument that this is the “right” universe of
sets, and that mathematicians should add the axiom V = Ultimate L to ZFC.

24 The basics of forcing
Forcing was introduced by Cohen in order to prove that CH is independent of
ZFC; if there is a model of ZFC, then there is a model of ZFC + ¬CH. Forcing is
a way of taking a countable transitive model M of ZFC, and constructing from
it a “generic extension” M [G] which will be another countable transitive model
of ZFC which contains the same ordinals. The model M[G] will contain M as
a submodel, M[G] will contain a set G where G ∈ V but G ∉ M, and M[G] will
contain all the other sets we can reasonably define from M and G.
By way of analogy, we can think of taking a field F , and forming an algebraic
extension of it. Our original field F must be missing roots of some polynomials
which exist in the extension, and we can understand possible algebraic field
extensions of F by studying polynomials in F . Similarly, any transitive set
model M of ZFC is missing some sets (since M is not a proper class) and we
can add some of these missing sets and then understand the model M [G] by
analyzing it from the perspective of M .
We can’t adjoin just any set G to a countable transitive model M and get
a model of ZFC. Indeed, it is consistent that there are a transitive set model
M of ZFC and a set X so that there is no transitive set model N of ZFC with
M ⊆ N and X ∈ N . In forcing, we require G to have good “approximations”
inside M , and we also require that G is sufficiently “generic”. To make a forcing
extension of a countable transitive model M , we need to choose a partial order
(P, ≤) to “guide” our construction. We will always assume our partial order has
a maximal element which we denote 1P . The set G will be a subset of P that
always includes 1P .
To define M [G], we’ll use the concept of P-names. Every element of M [G]
will have a “name” in M .
Definition 24.1. Fix a partial order (P, ≤). By recursion, say that a set τ is
a P-name if every element of τ is an ordered pair (σ, p) where σ is a P-name,
and p ∈ P.

Alternatively, we could define a hierarchy of P-names, where V0^P = ∅, Vα+1^P =
P(Vα^P × P) (the power set of Vα^P × P), Vλ^P = ⋃_{α<λ} Vα^P for limit λ, and
V^P = ⋃_α Vα^P is the class of P-names. Note
that being a P-name is absolute. So if M is a transitive model which contains
P, then the set of τ ∈ M such that M ⊨ “τ is a P-name” is equal to V^P ∩ M.
Now we describe how to interpret these names:
Now we describe how to interpret these names:
Definition 24.2. Suppose P is a partial order and G is a subset of P, and
τ is a P-name. Then the value of τ with respect to G, denoted τ [G], is
defined by recursion as τ [G] = {σ[G] : (σ, p) ∈ τ ∧ p ∈ G}. Finally, let M [G] =
{τ [G] : τ is a P-name and τ ∈ M }.
It is clear that if M is countable, then M[G] is countable; it is the image of
the set of P-names in M under the map τ ↦ τ[G]. We prove some basic facts about M[G].
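For hereditarily finite names the recursion in Definition 24.2 can be run directly. The following Python sketch is only an illustration under stated assumptions: names are modelled as frozensets of (name, condition) pairs, conditions are plain strings, and the identifiers `evaluate`, `EMPTY`, and `tau2` are hypothetical.

```python
def evaluate(tau, G):
    """tau[G] = {sigma[G] : (sigma, p) in tau and p in G}."""
    return frozenset(evaluate(sigma, G) for (sigma, p) in tau if p in G)

EMPTY = frozenset()                  # the name with no elements
tau2 = frozenset({(EMPTY, "p")})     # the name {(∅, p)}

with_p = evaluate(tau2, {"1", "p"})      # p is in G
without_p = evaluate(tau2, {"1", "q"})   # p is not in G
```

The same name takes the value {∅} or ∅ depending on whether the condition p belongs to G, which is the sense in which the value of a name depends on the generic.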
Lemma 24.3. Suppose M is a transitive model, (P, ≤) is a partial order in M ,
and G is a subset of P. Then

1. M [G] is transitive,
2. M ⊆ M [G],
3. G ∈ M [G], and
4. ORD ∩ M = ORD ∩ M [G].

Proof. (1) To see that M[G] is transitive, suppose x ∈ M[G] and y ∈ x. Then
x = τ[G] for some P-name τ ∈ M, so by definition of τ[G], y = σ[G] for some
P-name σ with (σ, p) ∈ τ for some p ∈ G. Since M is transitive, σ ∈ M. Hence,
y ∈ M[G].
(2) We define a map x ↦ x̌ by recursion so that for each x ∈ M, x̌[G] = x.
Let
x̌ = {(y̌, 1P) : y ∈ x}.
So for example, ∅̌ = ∅ always takes the value ∅ no matter what G is. Then by
induction on the rank of names, and since 1P ∈ G,
x̌[G] = {y̌[G] : (y̌, 1P) ∈ x̌}
which is equal to {y̌[G] : y ∈ x} = {y : y ∈ x} = x where the penultimate
equality is by our induction hypothesis.
(3) Consider the name τ = {(p̌, p) : p ∈ P }. Then τ [G] = {p̌[G] : p ∈ G} =
{p : p ∈ G} = G.
(4) By induction, it is easy to check rank(τ [G]) ≤ rank(τ ). So if τ is a P-
name such that τ [G] = α is an ordinal, then rank(τ [G]) ≤ rank(τ ) ∈ M . Hence,
M contains an ordinal greater than or equal to α, thus M contains α since M
is transitive. So M [G] ∩ ORD ⊆ M . We’ve already proved M ⊆ M [G]. Thus,
M ∩ ORD = M [G] ∩ ORD.
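Parts (2) and (3) of this proof can be checked mechanically on a small example. In the Python sketch below, which is an illustration rather than anything from the notes, conditions are themselves hereditarily finite sets so that p̌ makes sense; the three-element set P, the choice of ONE as 1P, and the helper names are all assumptions, and the ordering on P is left unspecified because evaluation only asks whether a condition lies in G.

```python
def evaluate(tau, G):
    """tau[G] = {sigma[G] : (sigma, p) in tau and p in G}."""
    return frozenset(evaluate(sigma, G) for (sigma, p) in tau if p in G)

def check(x, one):
    """The canonical name x̌ = {(y̌, 1_P) : y in x}."""
    return frozenset((check(y, one), one) for y in x)

def name_for_generic(P, one):
    """The name {(p̌, p) : p in P}, which evaluates to G."""
    return frozenset((check(p, one), p) for p in P)

ONE = frozenset()        # plays the role of 1_P
a = frozenset({ONE})
b = frozenset({a})
P = {ONE, a, b}          # a toy set of conditions
G = {ONE, a}             # a subset of P containing 1_P
```

Evaluating `check(b, ONE)` against a set that omits ONE returns the empty set rather than b, which shows where the standing assumption 1P ∈ G is used.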
We get a few other axioms of ZFC:
Lemma 24.4. Suppose M is a transitive model of ZFC, P is a partial order in
M , and G ⊆ P is nonempty. Then M [G] satisfies the axioms of infinity, and
pairing.

Proof. We have ω ∈ M[G] since M satisfies ZFC, so ω ∈ M and M ⊆ M[G].
This witnesses the axiom of infinity in M[G].
Pairing is true since if x, y ∈ M[G], where x = τ[G] and y = σ[G], then
{x, y} is the evaluation of the name {(τ, 1P), (σ, 1P)}.
Now if G ⊆ P is an arbitrary set, then we cannot prove that M [G] satisfies
ZFC. For example, consider the partial order P of finite partial functions from
ω × ω to 2 ordered under reverse inclusion. Fix a countable transitive model
M , and let α be a countable ordinal not in M . Then there is an ordering of
ω isomorphic to α, and the finite subsets of the characteristic function of this
ordering relation form a filter G in P. However, if M[G] satisfies ZFC, then the
Mostowski collapse of the ordering coded by ⋃G would be in M[G], but this would
be α, which is an ordinal not in M, contradicting the above lemma that
ORD ∩ M = ORD ∩ M[G].

(Indeed, it is consistent that there is no model of ZFC which contains both M
and α).
The problem in the previous paragraph is that the G we chose contains too
much information. In order to ensure that M [G] is a model of ZFC, we will
require that G is M -generic in the following sense:
Definition 24.5. Suppose (P, ≤) is a forcing partial order. If p, q ∈ P, and
p ≤ q, then we say that p refines q or p extends q. If p, q ∈ P we say p
and q are compatible if there exists r ∈ P such that r extends p and r extends q.
Otherwise, we say p and q are incompatible. Finally, we say that a subset F
of P is a filter if all p, q ∈ F are compatible, and p ∈ F and r ≥ p implies
r ∈ F. Note that every nonempty filter contains 1P. If (P, ≤) is a partial order,
and X ⊆ P then we say X is dense if for every p ∈ P there is some q ∈ X
such that q ≤ p. If M is a transitive model that contains P, a filter G ⊆ P is
M-generic if for every dense set D ⊆ P with D ∈ M, we have G ∩ D ≠ ∅.
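The notions in this definition are easy to test on a finite poset. The Python sketch below uses binary strings of length at most 2, ordered so that p ≤ q when q is an initial segment of p; this truncated poset and the function names are assumptions for illustration (a finite poset has minimal elements, so it is only a toy stand-in for a real forcing poset).

```python
from itertools import product

# Binary strings of length <= 2; the empty string is the maximal element.
P = ["".join(bits) for n in range(3) for bits in product("01", repeat=n)]

def leq(p, q):
    """p <= q (p extends q) iff q is an initial segment of p."""
    return p.startswith(q)

def compatible(p, q):
    """p and q are compatible iff some r in P extends both."""
    return any(leq(r, p) and leq(r, q) for r in P)

def is_filter(F):
    """Pairwise compatible and upward closed, as in the definition above."""
    pairwise = all(compatible(p, q) for p in F for q in F)
    upward = all(r in F for p in F for r in P if leq(p, r))
    return pairwise and upward

def is_dense(X):
    """Every p in P has an extension in X."""
    return all(any(leq(q, p) for q in X) for p in P)
```

For instance, {"", "0", "01"} is a filter, while {"", "0", "1"} is not because "0" and "1" have no common extension.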
M -genericity is a sort of Murphy’s law: anything that can happen will hap-
pen. We’ll use this type of genericity to show we can compute what’s true about
M [G] internally in M , and show that M [G] satisfies the axioms of ZFC.
It is important that M is countable in order to guarantee that M -generic
filters exist.
Lemma 24.6. Suppose M is a countable transitive model of ZFC, and P is a
forcing poset such that P ∈ M , and p ∈ P . Then there exists an M -generic
filter G such that p ∈ G.
Proof. Let ⟨Dn : n ∈ ω⟩ enumerate the dense subsets of P that are in M. Since
M is countable there are only countably many such Dn. Now inductively define
a sequence ⟨pn : n ∈ ω⟩ where p0 = p, and pn+1 ≤ pn is such that pn+1 ∈ Dn.
Such a pn+1 exists since Dn is dense. Finally, let G be {q ∈ P : ∃n (q ≥ pn)}. Then
G is an M-generic filter.
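The first few steps of this construction can be sketched in Python for Cohen-style conditions (finite binary strings, with longer strings being stronger). Everything here is a hedged illustration: only finitely many dense sets are met, each dense set is represented by a hypothetical "witness" function returning an extension inside it, and the reals `const0` and `alt` are stand-ins for reals of M.

```python
def differ_from(x):
    """Witness for the dense set D_x = {p : p disagrees with x somewhere}.

    x maps natural numbers to '0' or '1' and stands for a real in M.
    """
    def witness(p):
        if any(p[i] != x(i) for i in range(len(p))):
            return p  # p is already in D_x
        return p + ("1" if x(len(p)) == "0" else "0")
    return witness

def at_least_length(k):
    """Witness for the dense set {p : len(p) >= k}."""
    def witness(p):
        return p + "0" * max(0, k - len(p))
    return witness

def descending_chain(p, witnesses):
    """p_0 = p, and p_{n+1} <= p_n lies in the n-th dense set."""
    chain = [p]
    for w in witnesses:
        chain.append(w(chain[-1]))
    return chain

def upward_closure(chain):
    """{q : q >= p_n for some n}, i.e. all initial segments of the p_n."""
    return {p[:i] for p in chain for i in range(len(p) + 1)}

const0 = lambda n: "0"
alt = lambda n: "1" if n % 2 == 0 else "0"

chain = descending_chain(
    "", [differ_from(const0), differ_from(alt), at_least_length(4)]
)
G_part = upward_closure(chain)
```

Meeting each D_x is what guarantees that the sequence determined by the full generic differs from every real in M.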
The above lemma fails in general if M is not countable. For example, suppose
P ∈ M is non-atomic, meaning for every p ∈ P there exist incompatible q, r
such that q ≤ p and r ≤ p. Then for every filter F ⊆ P, DF = {p ∈ P : p ∉ F} is
a dense set. However, there cannot be any filter F meeting all of these dense
sets, since F doesn’t meet DF. So if P(P)^M = P(P)^V, then there do not exist
any M-generic filters.
In general, you should think of the elements of P as being “approximations”
to the generic filter G we want to add to M . In this way, from the forcing poset
P that we force with we can readily get an understanding of the generic G that
we will add. For example,

• PCohen is the partial order of all elements of 2^<ω, that is, functions from n to 2
for some natural number n, ordered by reverse inclusion. Here the empty
function is the maximal element of PCohen. We think of such a finite partial
function as an “approximation” to a total function from ω to {0, 1}. And
indeed, Cohen forcing adds an infinite binary sequence to a countable
model M of ZFC. (Since M is countable, it is missing uncountably many
such sequences).
• The random poset Prandom consists of all closed subsets of [0, 1] of positive
Lebesgue measure, ordered by inclusion. We think of these sets as being
approximations to their intersection which will be a single “random” real
number.
• The Lévy collapse PCol(κ,ω) of κ to ω is the partial order of finite partial
functions from ω to κ, ordered by extension. We think of an element of
this partial order as an approximation to a single function from ω to κ.
A generic for this poset will give a total function from ω to κ. For each
α ∈ κ, the set of finite partial functions p so that α ∈ ran(p) is a dense set.
Hence, a generic such function will be onto. Of course if M is a countable
transitive model, and κ ∈ M , then κ is really countable, and hence in the
real universe there will be some function from ω onto κ.
• The poset for “shooting a club through a stationary set S ⊆ ω1 ” is the set
of all closed countable sequences in S ordered under reverse inclusion. We
think of elements of this poset as approximating a closed subset of ω1 .

25 Forcing = Truth
For this section, fix a transitive model M of ZFC and a forcing poset P ∈
M . Our next goal is to introduce the forcing relation which gives us a way of
understanding what will be true in the generic extension M [G]. We’ll want to
understand when M[G] ⊨ ϕ(x1, . . . , xn) for a formula ϕ. Each xi ∈ M[G] comes
from some P-name τi in M, so we can rewrite this as M[G] ⊨ ϕ(τ1[G], . . . , τn[G]).
We will show there is a relation definable inside M called the forcing relation
and denoted ⊩_M so that
M[G] ⊨ ϕ(τ1[G], . . . , τn[G]) iff (∃p ∈ G) p ⊩_M ϕ(τ1, . . . , τn).
The generic G is not an element of M, but each element of P is, and so we can
understand part of what can possibly be true in M[G] using the forcing relation
inside M. The forcing relation ⊩_M depends on the poset P we choose, and if we
want to emphasize the particular poset we use the notation ⊩_{M,P}.
We caution that we think of ϕ(τ1 , . . . , τn ) as a formal expression, and we do
not interpret it using the usual relations ∈ and =. Technically it is part of what
is called the forcing language, and we think of this expression as something
that will fully gain meaning once we obtain a generic G and interpret each of
the names τ1 , . . . , τn . Furthermore, from more and more partial information
about G we will be able to discover more and more of what is true about these
types of sentences.
For example, let τ1 = ∅ (which is the canonical name for ∅), and τ2 = {(∅, p)}.
Then if p is an element of the generic G, then τ2[G] = {∅}, and so τ1[G] ∈ τ2[G].
In this way, p being in the generic G “forces” that τ1[G] ∈ τ2[G]. If however, p
is not an element of the generic G, then τ2[G] = ∅ and so τ1[G] ∉ τ2[G]. Hence,
if q ∈ G and q is incompatible with p (so p cannot be in G), then q “forces”
τ1[G] ∉ τ2[G]. Note that the set X = {q : q ≤ p or q is incompatible with p} is
dense, and so if G is M-generic, then it meets this set, and so some condition
in G must force whether or not τ1[G] ∈ τ2[G] in the above way.
We begin with a useful characterization of M -genericity:
Definition 25.1. If (P, ≤) is a partial order, and X ⊆ P then we say X is
dense below p if for all q ≤ p, there is some r ≤ q such that r ∈ X.
We more often deal with sets that are dense below a condition p in our
definitions (e.g. of the forcing relation). By the following lemma, this type of
density can also characterize when G is M -generic.
Lemma 25.2. A filter G ⊆ P is M -generic if and only if for every set X ∈ M
with X ⊆ P, if p ∈ G and X is dense below p, then G meets X.
Proof. The reverse implication follows since if X is dense, it is dense below 1P,
and every generic filter contains 1P.
The forward implication holds since if X is dense below p, then X ∪ {q ∈
P : q is incompatible with p} is dense and it is ∆1 definable and hence in M, so
G meets this set. Since p ∈ G and any two elements of a filter are compatible,
G contains no q incompatible with p, so G must meet X.
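The padding trick in the forward implication can be checked on a small poset. This Python sketch again uses binary strings (here of length at most 3) ordered by extension; the poset, the particular p and X, and the function names are assumptions chosen for illustration.

```python
from itertools import product

# Binary strings of length <= 3; p <= q iff q is an initial segment of p.
P = ["".join(bits) for n in range(4) for bits in product("01", repeat=n)]

def leq(p, q):
    return p.startswith(q)

def compatible(p, q):
    return any(leq(r, p) and leq(r, q) for r in P)

def is_dense(X):
    return all(any(leq(q, p) for q in X) for p in P)

def is_dense_below(X, p):
    """For all q <= p there is some r <= q with r in X."""
    return all(any(leq(r, q) for r in X) for q in P if leq(q, p))

p = "01"
X = {"010", "011"}   # dense below p, but not dense

# Pad X with every condition incompatible with p.
padded = X | {q for q in P if not compatible(q, p)}
```

A filter containing p cannot meet the padding, so any filter that meets `padded` and contains p must meet X itself.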

Note also that if X is dense below p and q ≤ p, then X is dense below q.
Definition 25.3 (The forcing relation for atomic formulas). The forcing relation
p ⊩_M ϕ(τ1, . . . , τn) is defined between elements p ∈ P, formulas ϕ, and
n-tuples of names as follows. For atomic formulas, we define p ⊩_M τ ∈ σ and
p ⊩_M τ = σ by simultaneous induction on the rank of σ and τ.
• p ⊩_M τ ∈ σ iff {p′ ∈ P : (∃q ≥ p′)(∃σ′ ∈ σ)((σ′, q) ∈ σ ∧ p′ ⊩_M τ = σ′)} is
dense below p.
• p ⊩_M τ = σ iff for all (τ′, q) ∈ τ, {p′ ∈ P : q ≥ p′ → p′ ⊩_M τ′ ∈ σ} is
dense below p and for all (σ′, q) ∈ σ, {p′ ∈ P : q ≥ p′ → p′ ⊩_M σ′ ∈ τ} is
dense below p.
Note that the definition of the forcing relation ⊩_M for atomic formulas is ∆1,
definable in M, and absolute between M and V. Another important technical
fact that follows from this definition is that if p ≤ q and q ⊩_M ϕ for some atomic
formula ϕ, then p ⊩_M ϕ.
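On a finite poset with hereditarily finite names, the simultaneous recursion above can be executed directly. The Python sketch below is a toy under stated assumptions: the poset is binary strings of length at most 2 ordered by extension, names are frozensets of (name, condition) pairs, and the helper `forces_not_in` borrows the negation clause of Definition 25.5. It reproduces the example with τ1 = ∅ and τ2 = {(∅, p)}, taking p to be the string "0".

```python
from itertools import product

P = ["".join(bits) for n in range(3) for bits in product("01", repeat=n)]

def leq(p, q):
    """p <= q iff q is an initial segment of p."""
    return p.startswith(q)

def dense_below(pred, p):
    """Is {r in P : pred(r)} dense below p?"""
    return all(any(pred(r) for r in P if leq(r, q)) for q in P if leq(q, p))

def forces_in(p, tau, sigma):
    """p forces tau ∈ sigma."""
    def good(p1):
        return any(leq(p1, q) and forces_eq(p1, tau, s1) for (s1, q) in sigma)
    return dense_below(good, p)

def forces_eq(p, tau, sigma):
    """p forces tau = sigma (both inclusions)."""
    def half(a, b):
        return all(
            dense_below(
                lambda p1, t1=t1, q=q: (not leq(p1, q)) or forces_in(p1, t1, b),
                p,
            )
            for (t1, q) in a
        )
    return half(tau, sigma) and half(sigma, tau)

def forces_not_in(p, tau, sigma):
    """p forces the negation: no q <= p forces tau ∈ sigma."""
    return not any(forces_in(q, tau, sigma) for q in P if leq(q, p))

EMPTY = frozenset()
tau1 = EMPTY
tau2 = frozenset({(EMPTY, "0")})
```

In this toy, "0" forces τ1 ∈ τ2, the incompatible condition "1" forces the negation, and the maximal element "" decides neither, in line with the informal discussion at the start of the section.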
Lemma 25.4 (Forcing = Truth for atomic formulas). Suppose M is a transitive
set model of ZFC and G is M-generic. Then
M[G] ⊨ τ[G] = σ[G] iff (∃p ∈ G) p ⊩_M τ = σ
and
M[G] ⊨ τ[G] ∈ σ[G] iff (∃p ∈ G) p ⊩_M τ ∈ σ.
Proof. We prove these simultaneously by induction on the rank of σ and τ.
Suppose G is M-generic.
If p ⊩^M τ ∈ σ and p ∈ G, then the set {p′ ∈ P : (∃q ≥ p′)(∃σ′ ∈ σ)((σ′, q) ∈
σ ∧ p′ ⊩^M τ = σ′)} is dense below p, and hence there is some p′ in the set such that
p′ ∈ G, by Lemma 25.2, since G is M-generic. Let (σ′, q) witness the membership
of p′ in this set. Then p′ ⊩^M τ = σ′, so by our induction hypothesis,
M[G] ⊨ τ[G] = σ′[G]. Since G is a filter and q ≥ p′, we have q ∈ G, and so
σ′[G] ∈ σ[G], so M[G] ⊨ τ[G] ∈ σ[G].
For the converse, suppose M[G] ⊨ τ[G] ∈ σ[G]. Then M[G] ⊨ τ[G] = σ′[G]
for some (σ′, q) ∈ σ with q ∈ G. Hence, by our induction hypothesis, there is
some p ∈ G such that p ⊩^M τ = σ′. Let r ∈ G extend both p and q. Then every
p′ ≤ r has q ≥ p′ and p′ ⊩^M τ = σ′, so by definition of the forcing relation,
r ⊩^M τ ∈ σ.
Now suppose p ⊩^M τ = σ and p ∈ G. By symmetry it suffices to show that
M[G] ⊨ τ[G] ⊆ σ[G]. Suppose (τ′, q) ∈ τ and q ∈ G, so τ′[G] ∈ τ[G]. Let r ∈ G
extend both p and q. Then {p′ ∈ P : p′ ⊩^M τ′ ∈ σ} is dense below r, and so G
meets this set at some p′ where p′ ⊩^M τ′ ∈ σ, and hence M[G] ⊨ τ′[G] ∈ σ[G]
by our induction hypothesis.
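Now suppose there is no p ∈ G such that p ⊩^M τ = σ. The set of p′ such
that p′ ⊩^M τ = σ, or ∃(τ′, q) ∈ τ such that q ≥ p′ and (∀r ≤ p′) r ⊮^M τ′ ∈ σ,
or ∃(σ′, q) ∈ σ such that q ≥ p′ and (∀r ≤ p′) r ⊮^M σ′ ∈ τ, is dense by the
definition of the forcing relation for equality, so G must meet this set. By
assumption no element of G forces τ = σ, so without loss of generality, assume
there is some (τ′, q) ∈ τ and p′ ∈ G such that q ≥ p′ and (∀r ≤ p′) r ⊮^M τ′ ∈ σ.
Since any element of G has a common extension with p′ in G, and forcing of
atomic formulas is preserved by extensions, no element of G forces τ′ ∈ σ. Then
by our induction hypothesis, M[G] ⊨ τ′[G] ∉ σ[G], and since q ∈ G,
M[G] ⊨ τ′[G] ∈ τ[G], so M[G] ⊨ τ[G] ≠ σ[G].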
Definition 25.5 (The forcing relation). We define the forcing relation
p ⊩^M ϕ(τ1, …, τn) on all formulas by induction as follows:
• p ⊩^M ϕ(τ1, …, τn) ∧ ψ(σ1, …, σn) iff p ⊩^M ϕ(τ1, …, τn) and
p ⊩^M ψ(σ1, …, σn).
• p ⊩^M ¬ϕ(τ1, …, τn) if there is no q ≤ p such that q ⊩^M ϕ(τ1, …, τn).
• p ⊩^M ∃xϕ(x, τ2, …, τn) if the set of p′ ≤ p such that
(∃τ1 ∈ M) p′ ⊩^M ϕ(τ1, τ2, …, τn) is dense below p.
Note that this is a definition scheme, and for each formula ϕ, the forcing
relation is definable, but there is no single definition of the forcing relation
for all formulas; the class of P-names is a proper class, and so the complexity
of the definition of the forcing relation increases in the Levy hierarchy as the
complexity of ϕ increases.
The only place where we have used M in the definition of the forcing
relation is when we quantify over names in M in the clause of the forcing
relation for an existential quantifier. In this sense, ⊩^M really is the
relativization to M of the definition of the full forcing relation ⊩, where we
restrict all the quantifiers to range over M. Often we think of working inside
the model M, and in this case we just write ⊩ instead of writing ⊩^M.
Lemma 25.6 (Forcing = Truth). Suppose M is a transitive set model of ZFC,
G is M-generic, and ϕ is a formula. Then
M[G] ⊨ ϕ(τ1[G], …, τn[G]) iff (∃p ∈ G) p ⊩^M ϕ(τ1, …, τn).
Proof. We prove this by induction on formula complexity. By Lemma 25.4, the
lemma is true for atomic formulas.
For conjunction,
M[G] ⊨ ϕ(τ1[G], …, τn[G]) ∧ ψ(σ1[G], …, σn[G])
↔ M[G] ⊨ ϕ(τ1[G], …, τn[G]) and M[G] ⊨ ψ(σ1[G], …, σn[G])
↔ (∃p1 ∈ G) p1 ⊩^M ϕ(τ1, …, τn) and (∃p2 ∈ G) p2 ⊩^M ψ(σ1, …, σn)
↔ (∃p ∈ G) p ⊩^M ϕ(τ1, …, τn) ∧ ψ(σ1, …, σn).
The last equivalence here is since given any p1, p2 ∈ G, there is some common
refinement p ∈ G with p ≤ p1 and p ≤ p2, since G is a filter, and forcing is
preserved when passing to stronger conditions (Exercise 25.9).
For negation, M[G] ⊨ ¬ϕ(τ1[G], …, τn[G]) iff it’s not the case that M[G] ⊨
ϕ(τ1[G], …, τn[G]) iff there is no p ∈ G such that p ⊩^M ϕ(τ1, …, τn), by our
induction hypothesis. We need to show this is equivalent to the existence of
some p ∈ G such that p ⊩^M ¬ϕ(τ1, …, τn).
If there is some p ∈ G so that no q ≤ p has q ⊩^M ϕ(τ1, …, τn), then
we cannot have q ⊩^M ϕ(τ1, …, τn) for any q ∈ G: otherwise p and q would have
a common extension r ∈ G, and r would force ϕ(τ1, …, τn) by Exercise 25.9,
contradicting r ≤ p. For the reverse direction
suppose there is no p ∈ G such that p ⊩^M ϕ(τ1, …, τn). Consider the set X
of q such that q ⊩^M ϕ(τ1, …, τn) or q ⊩^M ¬ϕ(τ1, …, τn). The set X is dense
by the definition of the forcing relation for a negation, and it is in M since
the forcing relation for ϕ is definable in M, so G meets X. Hence
there must be some q ∈ G such that q ⊩^M ¬ϕ(τ1, …, τn), since otherwise some
q ∈ G would force ϕ(τ1, …, τn), contradicting our assumption.
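The density of the set X used in the negation case can be checked on a small example. The sketch below assumes a hypothetical poset of binary strings of length at most 2 (longer strings are stronger) and a made-up set F of conditions forcing ϕ; only the shape of the negation clause of Definition 25.5 is taken from the notes.

```python
# Toy poset: binary strings of length <= 2, where p <= q iff p extends q.
P = ["", "0", "1", "00", "01", "10", "11"]

def leq(p, q):
    """p <= q iff the string p extends the string q."""
    return p.startswith(q)

def forces_neg(F):
    """Conditions forcing ¬ϕ: those with no extension in F (the conditions forcing ϕ)."""
    return {p for p in P if not any(leq(q, p) and q in F for q in P)}

def is_dense(X):
    """Every condition has an extension in X."""
    return all(any(leq(r, q) for r in X) for q in P)

F = {"00"}               # hypothetical: exactly the condition "00" forces ϕ
N = forces_neg(F)        # conditions with no extension forcing ϕ
decided = F | N          # the set X from the negation case
decided_is_dense = is_dense(decided)
```

In this example the conditions "" and "0" decide neither ϕ nor ¬ϕ, yet every condition has an extension that decides one of them, which is all the proof needs.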
We leave the existential case as an exercise.
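Lemma 25.7. Suppose M is a transitive model of ZFC and G is M-generic. Then
M[G] ⊨ ZFC.
Proof. We begin by showing that separation holds. Suppose ϕ(x, w1, …, wn) is
a formula, and τ1, …, τn and σ are names. We want to show M[G] ⊨ ∃y∀z(z ∈
y ↔ z ∈ σ[G] ∧ ϕ(z, τ1[G], …, τn[G])). We construct a name ν which will be a
witness for the y in this existential statement. Consider the name
ν = {(σ′, r) : (∃q ≥ r)((σ′, q) ∈ σ) ∧ r ⊩^M ϕ(σ′, τ1, …, τn)}, which is in M
since the forcing relation for ϕ is definable in M. If (σ′, r) ∈ ν and r ∈ G,
then σ′[G] ∈ σ[G] since G is upward closed, and M[G] ⊨ ϕ(σ′[G], τ1[G], …, τn[G])
by the forcing = truth lemma. Conversely, if (σ′, q) ∈ σ with q ∈ G and
M[G] ⊨ ϕ(σ′[G], τ1[G], …, τn[G]), then by the forcing = truth lemma some p ∈ G
has p ⊩^M ϕ(σ′, τ1, …, τn), and any r ∈ G extending both p and q has (σ′, r) ∈ ν
by Exercise 25.9, so σ′[G] ∈ ν[G]. Hence ν[G] is the set of z ∈ σ[G] such that
M[G] ⊨ ϕ(z, τ1[G], …, τn[G]).
We leave the remaining axioms as exercises.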
Exercise 25.8. Suppose M is a transitive model of ZFC, and G is M-generic.
Show that M[G] satisfies replacement and the axiom of choice.
Exercise 25.9. Suppose ϕ is a formula and τ1, …, τn are P-names.
1. If p ⊩^M ϕ(τ1, …, τn) and q ≤ p, then q ⊩^M ϕ(τ1, …, τn).
2. If the set of q ≤ p such that q ⊩^M ϕ(τ1, …, τn) is dense below p, then
p ⊩^M ϕ(τ1, …, τn).
Exercise 25.10. Suppose M is a countable transitive model of ZFC, and
P ∈ M is a forcing poset.
1. p ⊩^M ϕ(τ1, …, τn) if and only if for every M-generic G such that p ∈ G,
we have M[G] ⊨ ϕ(τ1[G], …, τn[G]).
2. The forcing relation in M is deductively closed: if p ⊩^M ϕ, and from ϕ
we can prove ψ, then p ⊩^M ψ.
26 The consistency of ¬CH
We’ll prove in this section that if there’s a countable transitive model of ZFC,
then there is a countable transitive model of ZFC + ¬CH.
Definition 26.1. Suppose P is a poset. We say that P is ccc (has the countable
chain condition) if every antichain (a set of pairwise incompatible elements) in
P is countable.
This should be called the countable antichain condition, but there is a very
long history of calling it the “countable chain condition” so we perpetuate this
badly chosen name.
Theorem 26.2 (ccc forcing preserves cardinals). Suppose M is a countable
transitive model of ZFC, P ∈ M is a forcing poset, G is an M-generic P-filter,
and M ⊨ “P is a ccc poset”. Then for every ordinal κ ∈ M, M[G] ⊨
“κ is a cardinal” iff M ⊨ “κ is a cardinal”.
Proof. Since M ⊆ M[G] are both transitive, if M[G] ⊨ “κ is a cardinal”, then
M ⊨ “κ is a cardinal” by downwards absoluteness since being a cardinal is Π1.
Finite ordinals and ω are cardinals in every transitive model, so suppose
κ > ω, M ⊨ “κ is a cardinal”, and for a contradiction, suppose that there is
some f ∈ M[G] and α < κ so M[G] ⊨ “f is a surjection from α to κ”. Let ḟ be a
P-name for f, so ḟ[G] = f, and let p_0 ∈ G be such that
p_0 ⊩^M “ḟ is a surjection from α̌ to κ̌”. Such a p_0 exists since forcing
= truth.
For each β < α, there is some γ ∈ κ such that M[G] ⊨ f(β) = γ, and hence
some p ≤ p_0 such that p ⊩^M ḟ(β̌) = γ̌.
From now on, work inside M. For each β < α, let X_β be the set of γ < κ such
that there is some p ≤ p_0 such that p ⊩ ḟ(β̌) = γ̌. Let X̌_β = {(γ̌, 1_P) : γ ∈ X_β}
be the canonical name for X_β, built from the canonical names γ̌ for the
elements of X_β.
Claim: p_0 ⊩ ḟ(β̌) ∈ X̌_β. If this was not true, then by Exercise 25.9 there
would be some q ≤ p_0 such that no q′ ≤ q has q′ ⊩ ḟ(β̌) ∈ X̌_β. Take a generic
G′ containing q (which exists by Lemma 24.6), and write f = ḟ[G′]. By our
choice of q, there can be no p ∈ G′ such that p ⊩ ḟ(β̌) ∈ X̌_β, so by forcing =
truth, M[G′] ⊨ f(β) ∉ X_β. Since p_0 ∈ G′, M[G′] ⊨ f is a surjection from α to
κ, hence M[G′] ⊨ f(β) = γ for some γ ∈ κ. So by forcing = truth, there must be
some p′ ∈ G′ such that p′ ⊩ ḟ(β̌) = γ̌. Since p′ and p_0 are compatible, there is
some p extending both of them, so p ⊩ ḟ(β̌) = γ̌. But then γ ∈ X_β by
definition, which is a contradiction. This proves the claim.
A similar proof shows p_0 ⊩ ran(ḟ) ⊆ ⋃_{β<α} X̌_β.
Claim: M ⊨ X_β is countable. Still working inside M, for each γ ∈ X_β, by
the axiom of choice, we can pick some p_γ ≤ p_0 such that p_γ ⊩ ḟ(β̌) = γ̌. Then for
any two distinct γ, γ′ ∈ X_β, we must have that p_γ, p_γ′ are incompatible, since p_0
forces ḟ to be a function, and hence we can’t have two compatible conditions
forcing two different values of ḟ(β̌). Thus, since P has the ccc in M, we have
that M ⊨ X_β is countable. But in M, ⋃_{β<α} X_β therefore has cardinality at
most ω · |α| < κ, and hence ⋃_{β<α} X_β is a proper subset of κ. This contradicts
that p_0 forces that ran(ḟ) = κ.
Lemma 26.3. Let P be the set of finite partial functions from ω × ω2 to 2
ordered under reverse inclusion, so p ≤_P q if dom(p) ⊇ dom(q) and for all x in
dom(q), q(x) = p(x). Then P has the ccc.
Proof. Suppose A is an uncountable set of conditions in P. We claim these
conditions cannot be pairwise incompatible.
Let D = {dom(p) : p ∈ A} be the set of domains of the conditions in A.
Since for each possible finite d ⊆ ω × ω2, there are only finitely many functions
from d to 2, we must have that D is also uncountable. By the ∆-system lemma,
there is an uncountable subset D′ ⊆ D which is a ∆-system with root r. Since
there are only finitely many functions from r to 2, there must be some q : r → 2
such that there are uncountably many p ∈ A with dom(p) ∈ D′ and p ↾ r = q. But
any two such p with distinct domains must be compatible in P: their domains
intersect exactly in r, where they agree, so their union is a function and is
a common extension of them both.
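The last step of this proof, that two finite partial functions agreeing on the root have their union as a common extension, can be sketched in Python. Conditions are modelled as dicts keyed by pairs (n, label); the string labels "a", "b", "c" are hypothetical stand-ins for ordinals below ω2, and the function names are illustrative.

```python
def compatible(p, q):
    """Finite partial functions are compatible iff they agree on their common domain."""
    return all(q[x] == v for x, v in p.items() if x in q)

def leq(p, q):
    """p <= q iff p extends q (reverse inclusion)."""
    return all(x in p and p[x] == v for x, v in q.items())

def common_extension(p, q):
    """The union of p and q, which lies below both when they are compatible."""
    if not compatible(p, q):
        raise ValueError("incompatible conditions have no common extension")
    return {**p, **q}

# The three domains form a ∆-system with root {(0, "a")}.
root = (0, "a")
p1 = {root: 1, (1, "b"): 0}
p2 = {root: 1, (2, "c"): 1}     # agrees with p1 on the root
p3 = {root: 0, (3, "b"): 1}     # disagrees with p1 on the root

u = common_extension(p1, p2)
```

Here p1 and p2 restrict to the same function on the root, so u extends both, while p1 and p3 are incompatible because they disagree on the root.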
Theorem 26.4. Suppose there is a countable transitive model M of ZFC. Then
there is a countable transitive model of ZFC + ¬CH.
Proof. Working inside M, let P be the partial order of finite partial functions
from ω × ω2 to 2 ordered under reverse inclusion. Let G be an M-generic filter
for P. By Theorem 26.2 and Lemma 26.3, M[G] has the same cardinals as M, so
ω_1^{M[G]} = ω_1^M and ω_2^{M[G]} = ω_2^M. We claim that M[G] ⊨ “there is an
injection from ω2 to P(ω)”, and hence |P(ω)|^{M[G]} > ω1, so M[G] ⊨ ¬CH.
Working inside M, for each (n, α) ∈ ω × ω2, the set D_(n,α) = {p ∈ P : (n, α) ∈
dom(p)} is dense in P. To see this, given any q ∈ P, let p ≤ q be defined by
p = q if (n, α) ∈ dom(q), and p = q ∪ {((n, α), 0)} otherwise. Now G meets each
of these dense sets D_(n,α). So in M[G], if we let g = ⋃G, then g is a function
from ω × ω2 to 2: it is a function since any two elements of the filter G are
compatible, and it is total since G meets every D_(n,α).
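The density argument for D_(n,α) and the passage from G to g = ⋃G can be sketched as follows. This is only a finite illustration: conditions are dicts keyed by pairs (n, label), the labels stand in for ordinals below ω2, and a two-element chain plays the role of the filter. All names are hypothetical.

```python
def extend_into_D(q, n, alpha):
    """Return some p <= q with (n, alpha) in dom(p), as in the density argument."""
    if (n, alpha) in q:
        return q
    return {**q, (n, alpha): 0}

def union_of_conditions(conditions):
    """Form the union of pairwise compatible conditions, checking that it is a function."""
    g = {}
    for p in conditions:
        for x, v in p.items():
            if g.get(x, v) != v:
                raise ValueError("conditions in a filter must agree")
            g[x] = v
    return g

q = {(0, "alpha"): 1}
p = extend_into_D(q, 3, "beta")          # adds the pair ((3, "beta"), 0)
g = union_of_conditions([q, p])          # finite stand-in for g = ⋃G
```

Calling extend_into_D on a condition that already has (n, alpha) in its domain returns it unchanged, which is the first case of the definition of p above.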
In M[G], we claim that the function f defined by f(α) = {n : g(n, α) = 1} is
an injection from ω_2^{M[G]} to P(ω). Working in M, suppose α, β < ω2 are
distinct. Then the set of p ∈ P such that there exists some n such that
(n, α) ∈ dom(p) ∧ (n, β) ∈ dom(p) ∧ p(n, α) ≠ p(n, β) is dense by a similar
argument to the above: any q ∈ P has finite domain, so we can extend q at some
n with neither (n, α) nor (n, β) in dom(q). Hence, G meets this dense set, and
so M[G] ⊨ f(α) ≠ f(β) for all distinct α, β < ω_2^M = ω_2^{M[G]}.
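The "similar argument" for this dense set amounts to picking a fresh n and giving the two columns different values there. A minimal Python sketch, again with dicts as conditions and hypothetical string labels in place of ordinals below ω2:

```python
def separate(q, alpha, beta):
    """For alpha != beta, return (p, n) with p <= q and p(n, alpha) != p(n, beta)."""
    used = {n for (n, _) in q}
    n = 0
    while n in used:            # dom(q) is finite, so some n is unused
        n += 1
    p = {**q, (n, alpha): 0, (n, beta): 1}
    return p, n

q = {(0, "alpha"): 1, (1, "beta"): 1, (2, "gamma"): 0}
p, n = separate(q, "alpha", "beta")
```

Choosing an n that does not occur at all in dom(q) is slightly more than the argument requires, but it keeps the sketch short.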
Exercise 26.5. Assume M in the above theorem has M ⊨ CH. Show that
M[G] ⊨ 2^ω = ω2. Show that if we replace ω2 with any regular cardinal κ ∈ M
of uncountable cofinality in M, then M[G] ⊨ 2^ω = κ.
Now as we proved in an exercise, Con(ZFC) does not prove that there is
a countable transitive model of ZFC. However, by the reflection theorem and
Löwenheim-Skolem, Con(ZFC) does prove that for any finite set S of axioms of
ZFC, there is a countable transitive model of S. Now let S be an arbitrarily large
finite set that includes all the finitely many axioms of ZFC that we have used in
these notes to prove the above theorem. Then the above theorem shows there
is a model of S + ¬CH, for arbitrarily large finite S ⊆ ZFC. By the compactness
theorem, this implies Con(ZFC + ¬CH).