Open Logic Text
Complete Build
Contents
1 Sets
    1.1 Extensionality
    1.2 Subsets and Power Sets
    1.3 Some Important Sets
    1.4 Unions and Intersections
    1.5 Pairs, Tuples, Cartesian Products
    1.6 Russell’s Paradox
    Problems
2 Relations
    2.1 Relations as Sets
    2.2 Philosophical Reflections
    2.3 Special Properties of Relations
    2.4 Equivalence Relations
    2.5 Orders
    2.6 Graphs
    2.7 Operations on Relations
    Problems
3 Functions
    3.1 Basics
    3.2 Kinds of Functions
    3.3 Functions as Relations
    3.4 Inverses of Functions
    3.5 Composition of Functions
    3.6 Partial Functions
    Problems
5 Arithmetization
    5.1 From N to Z
    5.2 From Z to Q
    5.3 The Real Line
    5.4 From Q to R
    5.5 Some Philosophical Reflections
    5.6 Ordered Rings and Fields
    5.7 Appendix: the Reals as Cauchy Sequences
    Problems
6 Infinite Sets
    6.1 Hilbert’s Hotel
    6.2 Dedekind Algebras
    6.3 Arithmetical Induction
    6.4 Dedekind’s “Proof”
    6.5 Appendix: Proving Schröder-Bernstein
11 Tableaux
    11.1 Rules and Tableaux
    11.2 Propositional Rules
    11.3 Tableaux
    11.4 Examples of Tableaux
    11.5 Proof-Theoretic Notions
    11.6 Derivability and Consistency
    11.7 Derivability and the Propositional Connectives
    11.8 Soundness
    Problems
21 Tableaux
    21.1 Rules and Tableaux
    21.2 Propositional Rules
    21.3 Quantifier Rules
    21.4 Tableaux
    21.5 Examples of Tableaux
    21.6 Tableaux with Quantifiers
    21.7 Proof-Theoretic Notions
    21.8 Derivability and Consistency
    21.9 Derivability and the Propositional Connectives
    21.10 Derivability and the Quantifiers
    21.11 Soundness
    21.12 Tableaux with Identity predicate
    21.13 Soundness with Identity predicate
    Problems
V Computability
32 Undecidability
    32.1 Introduction
    32.2 Enumerating Turing Machines
    32.3 Universal Turing Machines
    32.4 The Halting Problem
    32.5 The Decision Problem
35 Representability in Q
    35.1 Introduction
    35.2 Functions Representable in Q are Computable
    35.3 The Beta Function Lemma
    35.4 Simulating Primitive Recursion
    35.5 Basic Functions are Representable in Q
    35.6 Composition is Representable in Q
    35.7 Regular Minimization is Representable in Q
    35.8 Computable Functions are Representable in Q
    35.9 Representing Relations
    35.10 Undecidability
    Problems
41 Introduction
    41.1 Overview
    41.2 The Syntax of the Lambda Calculus
    41.3 Reduction of Lambda Terms
    41.4 The Church-Rosser Property
    41.5 Currying
    41.6 λ-Definable Arithmetical Functions
    41.7 λ-Definable Functions are Computable
    41.8 Computable Functions are λ-Definable
    41.9 The Basic Primitive Recursive Functions are λ-Definable
    41.10 The λ-Definable Functions are Closed under Composition
    41.11 λ-Definable Functions are Closed under Primitive Recursion
    41.12 Fixed-Point Combinators
    41.13 The λ-Definable Functions are Closed under Minimization
42 Syntax
    42.1 Terms
    42.2 Unique Readability
    42.3 Abbreviated Syntax
    42.4 Free Variables
    42.5 Substitution
    42.6 α-Conversion
    42.7 The De Bruijn Index
    42.8 Terms as α-Equivalence Classes
    42.9 β-reduction
    42.10 η-conversion
    Problems
55 Introduction
    55.1 Constructive Reasoning
    55.2 Syntax of Intuitionistic Logic
    55.3 The Brouwer-Heyting-Kolmogorov Interpretation
    55.4 Natural Deduction
    55.5 Axiomatic Derivations
    Problems
56 Semantics
59 Introduction
    59.1 The Material Conditional
    59.2 Paradoxes of the Material Conditional
    59.3 The Strict Conditional
    59.4 Counterfactuals
    Problems
63 Ordinals
    63.1 Introduction
    63.2 The General Idea of an Ordinal
    63.3 Well-Orderings
    63.4 Order-Isomorphisms
    63.5 Von Neumann’s Construction
    63.6 Basic Properties of the Ordinals
    63.7 Replacement
    63.8 ZF−: a milestone
    63.9 Ordinals as Order-Types
    63.10 Successor and Limit Ordinals
    Problems
65 Replacement
    65.1 Introduction
    65.2 The Strength of Replacement
    65.3 Extrinsic Considerations
    65.4 Limitation-of-size
    65.5 Replacement and “Absolute Infinity”
    65.6 Replacement and Reflection
    65.7 Appendix: Results surrounding Replacement
    65.8 Appendix: Finite axiomatizability
    Problems
67 Cardinals
    67.1 Cantor’s Principle
    67.2 Cardinals as Ordinals
    67.3 ZFC: A Milestone
    67.4 Finite, Enumerable, Non-enumerable
    67.5 Appendix: Hume’s Principle
69 Choice
    69.1 Introduction
    69.2 The Tarski-Scott Trick
    69.3 Comparability and Hartogs’ Lemma
    69.4 The Well-Ordering Problem
    69.5 Countable Choice
    69.6 Intrinsic Considerations about Choice
    69.7 The Banach-Tarski Paradox
    69.8 Appendix: Vitali’s Paradox
    Problems
XV Methods
70 Proofs
    70.1 Introduction
    70.2 Starting a Proof
    70.3 Using Definitions
    70.4 Inference Patterns
    70.5 An Example
    70.6 Another Example
    70.7 Proof by Contradiction
    70.8 Reading Proofs
    70.9 I Can’t Do It!
    70.10 Other Resources
    Problems
71 Induction
    71.1 Introduction
    71.2 Induction on N
    71.3 Strong Induction
    71.4 Inductive Definitions
    71.5 Structural Induction
    71.6 Relations and Functions
    Problems
72 Biographies
    72.1 Georg Cantor
    72.2 Alonzo Church
    72.3 Gerhard Gentzen
    72.4 Kurt Gödel
    72.5 Emmy Noether
    72.6 Rózsa Péter
    72.7 Julia Robinson
    72.8 Bertrand Russell
    72.9 Alfred Tarski
    72.10 Alan Turing
    72.11 Ernst Zermelo
Bibliography
This file loads all content included in the Open Logic Project. Editorial
notes like this, if displayed, indicate that the file was compiled without
any thought to how this material will be presented. If you can read this,
it is probably not advisable to teach or study from this PDF.
The Open Logic Project provides many mechanisms by which a text
can be generated that is more appropriate for teaching or self-study. For
instance, by default, the text will make all logical operators primitives and
carry out all cases for all operators in proofs. But it is much better to leave
some of these cases as exercises. The Open Logic Project is also a work in
progress. In an effort to stimulate collaboration and improvement, mate-
rial is included even if it is only in draft form, is missing exercises, etc. A
PDF produced for a course will exclude these sections.
To find PDFs more suitable for teaching and studying, have a look at
the sample courses available on the OLP website. To make your own,
you might start from the sample driver file or look at the sources of the
derived textbooks for fancier and more advanced examples.
Sets
1.1 Extensionality
A set is a collection of objects, considered as a single object. The objects making
up the set are called elements or members of the set. If x is an element of a set a,
we write x ∈ a; if not, we write x ∉ a. The set which has no elements is called
the empty set and denoted “∅”.
It does not matter how we specify the set, or how we order its elements, or
indeed how many times we count its elements. All that matters is what its
elements are. We codify this in the following principle.
Definition 1.1 (Extensionality). If A and B are sets, then A = B iff every ele-
ment of A is also an element of B, and vice versa.
{ a, a, b} = { a, b} = {b, a}.
This delivers on the point that, when we consider sets, we don’t care about
the order of their elements, or how many times they are specified.
Example 1.2. Whenever you have a bunch of objects, you can collect them
together in a set. The set of Richard’s siblings, for instance, is a set that con-
tains one person, and we could write it as S = {Ruth}. The set of positive
integers less than 4 is {1, 2, 3}, but it can also be written as {3, 2, 1} or even as
{1, 2, 1, 2, 3}. These are all the same set, by extensionality. For every element
of {1, 2, 3} is also an element of {3, 2, 1} (and of {1, 2, 1, 2, 3}), and vice versa.
Frequently we’ll specify a set by some property that its elements share.
We’ll use the following shorthand notation for that: { x : φ( x )}, where the
φ( x ) stands for the property that x has to have in order to be counted among
the elements of the set.
S = { x : x is a sibling of Richard}.
Example 1.4. A number is called perfect iff it is equal to the sum of its proper
divisors (i.e., numbers that evenly divide it but aren’t identical to the number).
For instance, 6 is perfect because its proper divisors are 1, 2, and 3, and 6 =
1 + 2 + 3. In fact, 6 is the only positive integer less than 10 that is perfect. So,
using extensionality, we can say:
{6} = { x : x is perfect and 0 ≤ x ≤ 10}
We read the notation on the right as “the set of x’s such that x is perfect and
0 ≤ x ≤ 10”. The identity here confirms that, when we consider sets, we don’t
care about how they are specified. And, more generally, extensionality guar-
antees that there is always only one set of x’s such that φ( x ). So, extensionality
justifies calling { x : φ( x )} the set of x’s such that φ( x ).
Extensionality gives us a way of showing that sets are identical: to show
that A = B, show that whenever x ∈ A then also x ∈ B, and whenever y ∈ B
then also y ∈ A.
Example 1.6. Every set is a subset of itself, and ∅ is a subset of every set. The
set of even numbers is a subset of the set of natural numbers. Also, { a, b} ⊆
{ a, b, c}. But { a, b, e} is not a subset of { a, b, c}.
Example 1.7. The number 2 is an element of the set of integers, whereas the
set of even numbers is a subset of the set of integers. However, a set may hap-
pen to both be an element and a subset of some other set, e.g., {0} ∈ {0, {0}}
and also {0} ⊆ {0, {0}}.
Definition 1.10 (Power Set). The set consisting of all subsets of a set A is called
the power set of A, written ℘( A).
℘( A) = { B : B ⊆ A}
Example 1.11. What are all the possible subsets of { a, b, c}? They are: ∅,
{ a}, {b}, {c}, { a, b}, { a, c}, {b, c}, { a, b, c}. The set of all these subsets is
℘({ a, b, c}):
℘({ a, b, c}) = {∅, { a}, {b}, {c}, { a, b}, {b, c}, { a, c}, { a, b, c}}
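For concreteness, here is a small Python sketch (illustrative only and not part of the text; the name power_set and the use of frozensets are our own choices) that computes ℘(A) for a finite set and confirms that ℘({a, b, c}) has eight elements:

from itertools import chain, combinations

def power_set(A):
    # ℘(A): the set of all subsets of A, each subset represented as a frozenset
    elems = list(A)
    all_subsets = chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1))
    return {frozenset(s) for s in all_subsets}

print(power_set({"a", "b", "c"}))        # the eight subsets listed above
print(len(power_set({"a", "b", "c"})))   # 8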
N = {0, 1, 2, 3, . . .}                    the set of natural numbers
Z = {. . . , −2, −1, 0, 1, 2, . . .}       the set of integers
Q = {m/n : m, n ∈ Z and n ≠ 0}             the set of rationals
R = (−∞, ∞)                                the set of real numbers (the continuum)
These are all infinite sets, that is, they each have infinitely many elements.
As we move through these sets, we are adding more numbers to our stock.
Indeed, it should be clear that N ⊆ Z ⊆ Q ⊆ R: after all, every natural
number is an integer; every integer is a rational; and every rational is a real.
Equally, it should be clear that N ⊊ Z ⊊ Q, since −1 is an integer but not
a natural number, and 1/2 is rational but not an integer. It is less obvious that
Q ⊊ R, i.e., that there are some real numbers which are not rational.
We’ll sometimes also use the set of positive integers Z+ = {1, 2, 3, . . . } and
the set containing just the first two natural numbers B = {0, 1}.
Example 1.14 (Infinite sequences). For any set A we may also consider the
set Aω of infinite sequences of elements of A. An infinite sequence a1 a2 a3 a4 . . .
consists of a one-way infinite list of objects, each one of which is an element
of A.
Figure 1.1: The union A ∪ B of two sets is the set of elements of A together with
those of B.
can mention sets we’ve already defined. So for instance, if A and B are sets,
the set { x : x ∈ A ∨ x ∈ B} consists of all those objects which are elements
of either A or B, i.e., it’s the set that combines the elements of A and B. We
can visualize this as in Figure 1.1, where the highlighted area indicates the
elements of the two sets A and B together.
This operation on sets—combining them—is very useful and common,
and so we give it a formal name and a symbol.
Definition 1.15 (Union). The union of two sets A and B, written A ∪ B, is the
set of all things which are elements of A, B, or both.
A ∪ B = { x : x ∈ A ∨ x ∈ B}
Example 1.16. Since the multiplicity of elements doesn’t matter, the union of
two sets which have an element in common contains that element only once,
e.g., { a, b, c} ∪ { a, 0, 1} = { a, b, c, 0, 1}.
The union of a set and one of its subsets is just the bigger set: { a, b, c} ∪
{ a} = { a, b, c}.
The union of a set with the empty set is identical to the set: { a, b, c} ∪ ∅ =
{ a, b, c}.
A ∩ B = { x : x ∈ A ∧ x ∈ B}
Two sets are called disjoint if their intersection is empty. This means they have
no elements in common.
Figure 1.2: The intersection A ∩ B of two sets is the set of elements they have
in common.
We can also form the union or intersection of more than two sets. An
elegant way of dealing with this in general is the following: suppose you
collect all the sets you want to form the union (or intersection) of into a single
set. Then we can define the union of all our original sets as the set of all objects
which belong to at least one element of the set, and the intersection as the set
of all objects which belong to every element of the set.
Definition 1.19. If A is a set of sets, then ⋃A is the set of elements of elements
of A:
⋃A = { x : x belongs to an element of A}, i.e.,
    = { x : there is a B ∈ A so that x ∈ B}
Definition 1.20. If A is a set of sets, then ⋂A is the set of objects which all
elements of A have in common:
⋂A = { x : x belongs to every element of A}, i.e.,
    = { x : for all B ∈ A, x ∈ B}
and ⋂A = { a}.
Figure 1.3: The difference A \ B of two sets is the set of those elements of A
which are not also elements of B.
When we have an index of sets, i.e., some set I such that we are considering
Ai for each i ∈ I, we may also use these abbreviations:
⋃i∈I Ai = ⋃{ Ai : i ∈ I }
⋂i∈I Ai = ⋂{ Ai : i ∈ I }
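As an illustration (a Python sketch, not part of the text; the names big_union and big_intersection are ours), the union and intersection of a finite family of finite sets can be computed directly from Definitions 1.19 and 1.20:

from functools import reduce

def big_union(family):
    # ⋃ family = {x : x belongs to at least one member of the family}
    return set().union(*family)

def big_intersection(family):
    # ⋂ family = {x : x belongs to every member of the family};
    # assumes the family is non-empty
    return reduce(lambda acc, B: acc & B, family)

family = [{1, 2, 3}, {2, 3, 4}, {2, 5}]
print(big_union(family))         # {1, 2, 3, 4, 5}
print(big_intersection(family))  # {2}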
Finally, we may want to think about the set of all elements in A which are
not in B. We can depict this as in Figure 1.3.
Definition 1.22 (Difference). The set difference A \ B is the set of all elements
of A which are not also elements of B, i.e.,
A \ B = { x : x ∈ A and x ∉ B }.
We can define ordered pairs in set theory using the Wiener-Kuratowski
definition (Definition 1.23): ⟨ a, b⟩ = {{ a}, { a, b}}.
Definition 1.24 (Cartesian product). Given sets A and B, their Cartesian prod-
uct A × B is defined by
A × B = {⟨ x, y⟩ : x ∈ A and y ∈ B}.
Example 1.25. If A = {0, 1}, and B = {1, a, b}, then their product is
A × B = {⟨0, 1⟩, ⟨0, a⟩, ⟨0, b⟩, ⟨1, 1⟩, ⟨1, a⟩, ⟨1, b⟩}.
A¹ = A
Aᵏ⁺¹ = Aᵏ × A
Bx₁ = {⟨ x₁, y₁⟩, ⟨ x₁, y₂⟩, . . . , ⟨ x₁, yₘ⟩}
Bx₂ = {⟨ x₂, y₁⟩, ⟨ x₂, y₂⟩, . . . , ⟨ x₂, yₘ⟩}
  ⋮
Bxₙ = {⟨ xₙ, y₁⟩, ⟨ xₙ, y₂⟩, . . . , ⟨ xₙ, yₘ⟩}
Since the xi are all different, and the y j are all different, no two of the pairs in
this grid are the same, and there are n · m of them.
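To make the counting concrete, here is a brief Python sketch (illustrative only; cartesian_product is our own name) that builds A × B as a set of ordered pairs and checks that it has n · m elements:

def cartesian_product(A, B):
    # A × B = {⟨x, y⟩ : x ∈ A and y ∈ B}, with tuples playing the role of pairs
    return {(x, y) for x in A for y in B}

A = {0, 1}
B = {1, "a", "b"}
AB = cartesian_product(A, B)
print(AB)                          # the six pairs from Example 1.25
print(len(AB) == len(A) * len(B))  # True: n · m pairs, as the grid shows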
A* = {∅} ∪ A ∪ A² ∪ A³ ∪ . . .
R = { x : x ∉ x }
Proof. If R = { x : x ∉ x } exists, then R ∈ R iff R ∉ R, which is a contradiction.
Let’s run through this proof more slowly. If R exists, it makes sense to ask
whether R ∈ R or not. Suppose that indeed R ∈ R. Now, R was defined as
the set of all sets that are not elements of themselves. So, if R ∈ R, then R does
not itself have R’s defining property. But only sets that have this property are
in R, hence, R cannot be an element of R, i.e., R ∉ R. But R can’t both be and
not be an element of R, so we have a contradiction.
Problems
Problem 1.1. Prove that there is at most one empty set, i.e., show that if A and
B are sets without elements, then A = B.
Problem 1.8. Using Definition 1.23, prove that ⟨ a, b⟩ = ⟨c, d⟩ iff both a = c
and b = d.
Relations
We have put the diagonal, here, in bold, since the subset of N2 consisting of
the pairs lying on the diagonal, i.e.,
{⟨0, 0⟩, ⟨1, 1⟩, ⟨2, 2⟩, . . . },
is the identity relation on N. (Since the identity relation is popular, let’s define
Id A = {⟨ x, x ⟩ : x ∈ A} for any set A.) The subset of all pairs lying above the
diagonal, i.e.,
L = {⟨0, 1⟩, ⟨0, 2⟩, . . . , ⟨1, 2⟩, ⟨1, 3⟩, . . . , ⟨2, 3⟩, ⟨2, 4⟩, . . .},
is the less than relation, i.e., Lnm iff n < m. The subset of pairs below the
diagonal, i.e.,
G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, . . . },
is the greater than relation, i.e., Gnm iff n > m. The union of L with I, which
we might call K = L ∪ I, is the less than or equal to relation: Knm iff n ≤ m.
Similarly, H = G ∪ I is the greater than or equal to relation. These relations L, G,
K, and H are special kinds of relations called orders. L and G have the property
that no number bears L or G to itself (i.e., for all n, neither Lnn nor Gnn).
Relations with this property are called irreflexive, and, if they also happen to
be orders, they are called strict orders.
Although orders and identity are important and natural relations, it should
be emphasized that according to our definition any subset of A2 is a relation
on A, regardless of how unnatural or contrived it seems. In particular, ∅ is a
relation on any set (the empty relation, which no pair of elements bears), and
A2 itself is a relation on A as well (one which every pair bears), called the
universal relation. But also something like E = {⟨n, m⟩ : n > 5 or m × n ≥ 34}
counts as a relation.
spirit in which that remark is made. We are not stating a metaphysical identity
fact. We are simply noting that, in certain contexts, we can (and will) treat
(certain) relations as certain sets.
2.5 Orders
Many of our comparisons involve describing some objects as being “less than”,
“equal to”, or “greater than” other objects, in a certain respect. These involve
order relations. But there are different kinds of order relations. For instance,
some require that any two objects be comparable, others don’t. Some include
identity (like ≤) and some exclude it (like <). It will help us to have a taxon-
omy here.
Definition 2.16 (Linear order). A partial order which is also connected is called
a total order or linear order.
Example 2.17. Every linear order is also a partial order, and every partial or-
der is also a preorder, but the converses don’t hold. The universal relation
on A is a preorder, since it is reflexive and transitive. But, if A has more than
one element, the universal relation is not anti-symmetric, and so not a partial
order.
Definition 2.22 (Strict linear order). A strict order which is also connected is
called a strict linear order.
Example 2.23. ≤ is the linear order corresponding to the strict linear order <.
⊆ is the partial order corresponding to the strict order ⊊.
Definition 2.24 (Total order). A strict order which is also connected is called
a total order. This is also sometimes called a strict linear order.
Any strict order R on A can be turned into a partial order by adding the
diagonal Id A , i.e., adding all the pairs ⟨ x, x ⟩. (This is called the reflexive closure
of R.) Conversely, starting from a partial order, one can get a strict order by
removing Id A . These next two results make this precise.
Example 2.27. ≤ is the linear order corresponding to the total order <. ⊆ is
the partial order corresponding to the strict order ⊊.
The following simple result establishes that total orders satisfy an
extensionality-like property:
Proof. Suppose (∀ x ∈ A)( x < a ↔ x < b). If a < b, then a < a, contradicting
the fact that < is irreflexive; so a ≮ b. Exactly similarly, b ≮ a. So a = b, as <
is connected.
2.6 Graphs
Example 2.30. The graph ⟨V, E⟩ with V = {1, 2, 3, 4} and E = {⟨1, 1⟩, ⟨1, 2⟩,
⟨1, 3⟩, ⟨2, 3⟩} looks like this:
[figure: diagram of the graph ⟨V, E⟩ on the vertices 1, 2, 3, 4]
This is a different graph than ⟨V ′ , E⟩ with V ′ = {1, 2, 3}, which looks like this:
[figure: diagram of the graph ⟨V′, E⟩ on the vertices 1, 2, 3]
R¹ = R and Rⁿ⁺¹ = Rⁿ | R.
The reflexive transitive closure of R is R* = R⁺ ∪ Id A.
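The following Python sketch (not part of the text; the function names are ours, and it assumes a finite relation on a finite set) computes the relative product and the closures just described:

def relative_product(R, S):
    # R | S = {⟨x, z⟩ : for some y, ⟨x, y⟩ ∈ R and ⟨y, z⟩ ∈ S}
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def transitive_closure(R):
    # R+ = R ∪ R² ∪ R³ ∪ . . . , reached after finitely many steps for finite R
    closure = set(R)
    while True:
        bigger = closure | relative_product(closure, R)
        if bigger == closure:
            return closure
        closure = bigger

def reflexive_transitive_closure(R, A):
    # R* = R+ ∪ Id_A
    return transitive_closure(R) | {(x, x) for x in A}

R = {(1, 2), (2, 3)}
print(transitive_closure(R))                       # adds the pair (1, 3)
print(reflexive_transitive_closure(R, {1, 2, 3}))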
Problems
Problem 2.1. List the elements of the relation ⊆ on the set ℘({ a, b, c}).
Problem 2.2. Give examples of relations that are (a) reflexive and symmetric
but not transitive, (b) reflexive and anti-symmetric, (c) anti-symmetric, transi-
tive, but not reflexive, and (d) reflexive, symmetric, and transitive. Do not use
relations on numbers or sets.
Functions
3.1 Basics
A function is a map which sends each element of a given set to a specific ele-
ment in some (other) given set. For instance, the operation of adding 1 defines
a function: each number n is mapped to a unique number n + 1.
More generally, functions may take pairs, triples, etc., as inputs and re-
turn some kind of output. Many functions are familiar to us from basic arith-
metic. For instance, addition and multiplication are functions. They take in
two numbers and return a third.
In this mathematical, abstract sense, a function is a black box: what matters
is only what output is paired with what input, not the method for calculating
the output.
The diagram in Figure 3.1 may help to think about functions. The ellipse
on the left represents the function’s domain; the ellipse on the right represents
the function’s codomain; and an arrow points from an argument in the domain
to the corresponding value in the codomain.
Example 3.2. Multiplication takes pairs of natural numbers as inputs and maps
them to natural numbers as outputs, so goes from N × N (the domain) to N
(the codomain). As it turns out, the range is also N, since every n ∈ N is
n × 1.
Example 3.4. The relation that pairs each student in a class with their final
grade is a function—no student can get two different final grades in the same
class. The relation that pairs each student in a class with their parents is not a
function: students can have zero, or two, or more parents.
We can define functions by specifying in some precise way what the value
of the function is for every possible argument. Different ways of doing this are
by giving a formula, describing a method for computing the value, or listing
the values for each argument. However functions are defined, we must make
sure that for each argument we specify one, and only one, value.
Figure 3.2: A surjective function has every element of the codomain as a value.
if ∀ x f ( x ) = g( x ), then f = g
Example 3.7. We can also define functions by cases. For instance, we could
define h : N → N by
h( x ) = x/2          if x is even
h( x ) = ( x + 1)/2   if x is odd.
Since every natural number is either even or odd, the output of this function
will always be a natural number. Just remember that if you define a function
by cases, every possible input must fall into exactly one case. In some cases,
this will require a proof that the cases are exhaustive and exclusive.
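As a quick illustration (a Python sketch, not part of the text), the function h of Example 3.7 can be written so that exactly one case applies to each input:

def h(x):
    # h : N → N, defined by cases on the parity of x
    if x % 2 == 0:
        return x // 2            # x even: h(x) = x/2
    else:
        return (x + 1) // 2      # x odd:  h(x) = (x + 1)/2

print([h(x) for x in range(8)])  # [0, 1, 1, 2, 2, 3, 3, 4]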
(∀y ∈ B)(∃ x ∈ A) f ( x ) = y.
If you want to show that f is a surjection, then you need to show that every
object in f ’s codomain is the value of f ( x ) for some input x.
Figure 3.3: An injective function never maps two different arguments to the
same value.
Note that any function induces a surjection. After all, given a function
f : A → B, let f ′ : A → ran( f ) be defined by f ′ ( x ) = f ( x ). Since ran( f ) is
defined as { f ( x ) ∈ B : x ∈ A}, this function f ′ is guaranteed to be a surjection.
Now, any function maps each possible input to a unique output. But there
are also functions which never map different inputs to the same outputs. Such
functions are called injective, and can be pictured as in Figure 3.3.
Definition 3.9 (Injective function). A function f : A → B is injective iff for
each y ∈ B there is at most one x ∈ A such that f ( x ) = y. We call such a
function an injection from A to B.
If you want to show that f is an injection, you need to show that for any
elements x and y of f ’s domain, if f ( x ) = f (y), then x = y.
Example 3.10. The constant function f : N → N given by f ( x ) = 1 is neither
injective, nor surjective.
The identity function f : N → N given by f ( x ) = x is both injective and
surjective.
The successor function f : N → N given by f ( x ) = x + 1 is injective but
not surjective.
The function f : N → N defined by:
f ( x ) = x/2          if x is even
f ( x ) = ( x + 1)/2   if x is odd
is surjective, but not injective.
Often enough, we want to consider functions which are both injective and
surjective. We call such functions bijective. They look like the function pic-
tured in Figure 3.4. Bijections are also sometimes called one-to-one correspon-
dences, since they uniquely pair elements of the codomain with elements of
the domain.
Definition 3.11 (Bijection). A function f : A → B is bijective iff it is both sur-
jective and injective. We call such a function a bijection from A to B (or be-
tween A and B).
Figure 3.4: A bijective function uniquely pairs the elements of the codomain
with those of the domain.
R f = {⟨ x, y⟩ : f ( x ) = y}.
Proof. Suppose there is a y such that Rxy. If there were another z ̸= y such
that Rxz, the condition on R would be violated. Hence, if there is a y such that
Rxy, this y is unique, and so f is well-defined. Obviously, R f = R.
function simply as its graph. In other words, functions can be identified with
certain relations, i.e., with certain sets of tuples. Note, though, that the spirit of
this “identification” is as in section 2.2: it is not a claim about the metaphysics
of functions, but an observation that it is convenient to treat functions as cer-
tain sets. One reason that this is so convenient, is that we can now consider
performing similar operations on functions as we performed on relations (see
section 2.7). In particular:
It follows from these definitions that ran( f ) = f [dom( f )], for any func-
tion f . These notions are exactly as one would expect, given the definitions
in section 2.7 and our identification of functions with relations. But two other
operations—inverses and relative products—require a little more detail. We
will provide that in section 3.4 and section 3.5.
But the scare quotes around “defined by” (and “the”) suggest that this is not
a definition. At least, it will not always work, with complete generality. For,
in order for this definition to specify a function, there has to be one and only
It is defined for all y ∈ B, since for each such y ∈ ran( f ) there is exactly one
x ∈ A such that f ( x ) = y. By definition, if y = f ( x ), then g(y) = x, i.e.,
g( f ( x )) = x.
By combining the ideas in the previous proof, we now get that every bijec-
tion has an inverse, i.e., there is a single function which is both a left and right
inverse of f .
1 Since f is surjective, for every y ∈ B the set { x : f ( x ) = y } is nonempty. Our definition
of h requires that we choose a single x from each of these sets. That this is always possible is
actually not obvious—the possibility of making these choices is simply assumed as an axiom. In
other words, this proposition assumes the so-called Axiom of Choice, an issue we will revisit in
chapter 69. However, in many specific cases, e.g., when A = N or is finite, or when f is bijective,
the Axiom of Choice is not required. (In the particular case when f is bijective, for each y ∈ B the
set { x : f ( x ) = y} has exactly one element, so that there is no choice to make.)
Proof. Exercise.
Proposition 3.19. If f : A → B has a left inverse g and a right inverse h, then
h = g.
Proof. Exercise.
R f = {⟨ x, y⟩ : f ( x ) = y}.
Proposition 3.27. Suppose R ⊆ A × B has the property that whenever Rxy and
Rxy′ then y = y′ . Then R is the graph of the partial function f : A ⇀ B defined by:
if there is a y such that Rxy, then f ( x ) = y, otherwise f ( x ) ↑. If R is also serial, i.e.,
for each x ∈ A there is a y ∈ B such that Rxy, then f is total.
Proof. Suppose there is a y such that Rxy. If there were another y′ ̸= y such
that Rxy′ , the condition on R would be violated. Hence, if there is a y such
that Rxy, that y is unique, and so f is well-defined. Obviously, R f = R and f
is total if R is serial.
Problems
Problem 3.1. Show that if f : A → B has a left inverse g, then f is injective.
Problem 3.3. Prove Proposition 3.18. You have to define f −1 , show that it
is a function, and show that it is an inverse of f , i.e., f −1 ( f ( x )) = x and
f ( f −1 (y)) = y for all x ∈ A and y ∈ B.
The Size of Sets
4.1 Introduction
When Georg Cantor developed set theory in the 1870s, one of his aims was
to make palatable the idea of an infinite collection—an actual infinity, as the
medievals would say. A key part of this was his treatment of the size of dif-
ferent sets. If a, b and c are all distinct, then the set { a, b, c} is intuitively larger
than { a, b}. But what about infinite sets? Are they all as large as each other?
It turns out that they are not.
The first important idea here is that of an enumeration. We can list every
finite set by listing all its elements. For some infinite sets, we can also list
all their elements if we allow the list itself to be infinite. Such sets are called
enumerable. Cantor’s surprising result, which we will fully understand by
the end of this chapter, was that some infinite sets are not enumerable.
We’ve already given examples of sets by listing their elements. Let’s discuss
in more general terms how and when we can list the elements of a set, even if
that set is infinite.
Definition 4.1 (Enumeration, informally). Informally, an enumeration of a set A
is a list (possibly infinite) of elements of A such that every element of A ap-
pears on the list at some finite position. If A has an enumeration, then A is
said to be enumerable.
The last argument shows that in order to get a good handle on enumera-
tions and enumerable sets and to prove things about them, we need a more
precise definition. The following provides it.
Definition 4.3 (Enumeration, formally). An enumeration of a set A ̸= ∅ is any
surjective function f : Z+ → A.
Let’s convince ourselves that the formal definition and the informal defini-
tion using a possibly infinite list are equivalent. First, any surjective function
from Z+ to a set A enumerates A. Such a function determines an enumeration
as defined informally above: the list f (1), f (2), f (3), . . . . Since f is surjective,
every element of A is guaranteed to be the value of f (n) for some n ∈ Z+ .
Hence, every element of A appears at some finite position in the list. Since the
function may not be injective, the list may be redundant, but that is acceptable
(as noted above).
On the other hand, given a list that enumerates all elements of A, we can
define a surjective function f : Z+ → A by letting f (n) be the nth element
of the list, or the final element of the list if there is no nth element. The only
case where this does not produce a surjective function is when A is empty,
and hence the list is empty. So, every non-empty list determines a surjective
function f : Z+ → A.
Definition 4.4. A set A is enumerable iff it is empty or has an enumeration.
Example 4.5. A function enumerating the positive integers (Z+ ) is simply the
identity function given by f (n) = n. A function enumerating the natural
numbers N is the function g(n) = n − 1.
−⌈0/2⌉   ⌈1/2⌉   −⌈2/2⌉   ⌈3/2⌉   −⌈4/2⌉   ⌈5/2⌉   −⌈6/2⌉   . . .
   0       1       −1       2       −2       3       −3     . . .
Proof. We define the function g recursively: Let g(1) = f (1). If g(i ) has al-
ready been defined, let g(i + 1) be the first value of f (1), f (2), . . . not already
among g(1), . . . , g(i ), if there is one. If A has just n elements, then g(1), . . . ,
g(n) are all defined, and so we have defined a function g : {1, . . . , n} → A. If
A has infinitely many elements, then for any i there must be an element of A
in the enumeration f (1), f (2), . . . , which is not already among g(1), . . . , g(i ).
In this case we have defined a function g : Z+ → A.
N × N = {⟨n, m⟩ : n, m ∈ N}
0 1 2 3 ...
0 ⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
1 ⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
2 ⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
3 ⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
⋮
Clearly, every ordered pair in N × N will appear exactly once in the array.
In particular, ⟨n, m⟩ will appear in the nth row and mth column. But how
do we organize the elements of such an array into a “one-dimensional” list?
The pattern in the array below demonstrates one way to do this (although of
course there are many other options):
0 1 2 3 4 ...
0 0 1 3 6 10 ...
1 2 4 7 11 ... ...
2 5 8 12 ... ... ...
3 9 13 ... ... ... ...
4 14 ... ... ... ... ...
⋮
⟨0, 0⟩, ⟨0, 1⟩, ⟨1, 0⟩, ⟨0, 2⟩, ⟨1, 1⟩, ⟨2, 0⟩, ⟨0, 3⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨3, 0⟩, . . .
This technique also generalises rather nicely. For example, we can use it to
enumerate the set of ordered triples of natural numbers, i.e.:
N × N × N = {⟨n, m, k⟩ : n, m, k ∈ N}
N3 = (N × N) × N = {⟨⟨n, m⟩, k ⟩ : n, m, k ∈ N}
and thus we can enumerate N3 with an array by labelling one axis with the
enumeration of N, and the other axis with the enumeration of N2 :
0 1 2 3 ...
⟨0, 0⟩ ⟨0, 0, 0⟩ ⟨0, 0, 1⟩ ⟨0, 0, 2⟩ ⟨0, 0, 3⟩ ...
⟨0, 1⟩ ⟨0, 1, 0⟩ ⟨0, 1, 1⟩ ⟨0, 1, 2⟩ ⟨0, 1, 3⟩ ...
⟨1, 0⟩ ⟨1, 0, 0⟩ ⟨1, 0, 1⟩ ⟨1, 0, 2⟩ ⟨1, 0, 3⟩ ...
⟨0, 2⟩ ⟨0, 2, 0⟩ ⟨0, 2, 1⟩ ⟨0, 2, 2⟩ ⟨0, 2, 3⟩ ...
⋮
Thus, by using a method like Cantor’s zig-zag method, we may similarly ob-
tain an enumeration of N3 . And we can keep going, obtaining enumerations
of Nn for any natural number n. So, we have:
This would enable us to calculate exactly where ⟨n, m⟩ will occur in our enu-
meration.
In fact, we can define g directly by making two observations. First: if the
nth row and mth column contains value v, then the (n + 1)st row and (m − 1)st
column contains value v + 1. Second: the first row of our enumeration con-
sists of the triangular numbers, starting with 0, 1, 3, 6, etc. The kth triangular
number is 1 + 2 + · · · + k, which can be computed as k (k + 1)/2. Putting
these two observations together, consider this function:
g(n, m) = (n + m + 1)(n + m)/2 + n
We often just write g(n, m) rather than g(⟨n, m⟩), since it is easier on the eyes.
This tells you first to determine the (n + m)th triangle number, and then add
n to it. And it populates the array in exactly the way we would like. So in
particular, the pair ⟨1, 2⟩ is sent to (4 · 3)/2 + 1 = 7.
This function g is the inverse of an enumeration of a set of pairs. Such
functions are called pairing functions.
We can use pairing functions to encode, e.g., pairs of natural numbers; or,
in other words, we can represent each pair of elements using a single number.
Using the inverse of the pairing function, we can decode the number, i.e., find
out which pair it represents.
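A small Python sketch may help here (illustrative only; the decoding procedure is one natural way to invert g, not something spelled out in the text):

def g(n, m):
    # g(n, m) = (n + m + 1)(n + m)/2 + n
    return (n + m + 1) * (n + m) // 2 + n

def decode(v):
    # Recover the unique pair ⟨n, m⟩ with g(n, m) = v:
    # find the largest d = n + m whose triangular number does not exceed v.
    d = 0
    while (d + 1) * (d + 2) // 2 <= v:
        d += 1
    n = v - d * (d + 1) // 2
    return (n, d - n)

print(g(1, 2))    # 7, as computed above
print(decode(7))  # (1, 2)
print(all(decode(g(n, m)) == (n, m) for n in range(20) for m in range(20)))  # True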
  1      2      3      4      5      6      7      8      9      10     . . .
⟨0, 0⟩         ⟨0, 1⟩         ⟨0, 2⟩         ⟨0, 3⟩         ⟨0, 4⟩           . . .
Repeat this with pairs ⟨1, m⟩ for the places that still remain empty, again skip-
ping every other empty place:
  1      2      3      4      5      6      7      8      9      10     . . .
⟨0, 0⟩ ⟨1, 0⟩ ⟨0, 1⟩         ⟨0, 2⟩ ⟨1, 1⟩ ⟨0, 3⟩         ⟨0, 4⟩ ⟨1, 2⟩   . . .
Enter pairs ⟨2, m⟩, ⟨3, m⟩, etc., in the same way. Our completed enumeration
thus starts like this:
1 2 3 4 5 6 7 8 9 10 ...
⟨0, 0⟩ ⟨1, 0⟩ ⟨0, 1⟩ ⟨2, 0⟩ ⟨0, 2⟩ ⟨1, 1⟩ ⟨0, 3⟩ ⟨3, 0⟩ ⟨0, 4⟩ ⟨1, 2⟩ ...
0 1 2 3 4 5 ...
0 1 3 5 7 9 11 ...
1 2 6 10 14 18 ... ...
2 4 12 20 28 ... ... ...
3 8 24 40 ... ... ... ...
4 16 48 ... ... ... ... ...
5 32 ... ... ... ... ... ...
⋮
We can see that the pairs in row 0 are in the odd numbered places of our
enumeration, i.e., pair ⟨0, m⟩ is in place 2m + 1; pairs in the second row, ⟨1, m⟩,
are in places whose number is the double of an odd number, specifically, 2 ·
(2m + 1); pairs in the third row, ⟨2, m⟩, are in places whose number is four
times an odd number, 4 · (2m + 1); and so on. The factors of (2m + 1) for
each row, 1, 2, 4, 8, . . . , are exactly the powers of 2: 1 = 2⁰, 2 = 2¹, 4 = 2²,
8 = 2³, . . . In fact, the relevant exponent is always the first member of the pair
in question. Thus, for pair ⟨n, m⟩ the factor is 2ⁿ. This gives us the general
formula: 2ⁿ · (2m + 1). However, this is a mapping of pairs to positive integers,
i.e., ⟨0, 0⟩ has position 1. If we want to begin at position 0 we must subtract 1
from the result. This gives us:
h(n, m) = 2ⁿ(2m + 1) − 1
j(n, m) = 2ⁿ3ᵐ
is an injective function N² → N.
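For comparison, here is a hedged Python sketch (not part of the text) of h together with a decoder that recovers ⟨n, m⟩ by factoring out powers of 2, as the discussion above suggests:

def h(n, m):
    # h(n, m) = 2^n · (2m + 1) − 1
    return 2 ** n * (2 * m + 1) - 1

def h_decode(v):
    # v + 1 = 2^n · (2m + 1) for exactly one n and one odd factor 2m + 1
    v += 1
    n = 0
    while v % 2 == 0:
        v //= 2
        n += 1
    return (n, (v - 1) // 2)

print(h(0, 0), h(1, 0), h(0, 1))  # 0 1 2, matching the enumeration above
print(all(h_decode(h(n, m)) == (n, m) for n in range(10) for m in range(10)))  # True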
Some sets, such as the set Z+ of positive integers, are infinite. So far we’ve
seen examples of infinite sets which were all enumerable. However, there are
also infinite sets which do not have this property. Such sets are called non-
enumerable.
First of all, it is perhaps already surprising that there are non-enumerable
sets. For any enumerable set A there is a surjective function f : Z+ → A. If a
set is non-enumerable there is no such function. That is, no function mapping
the infinitely many elements of Z+ to A can exhaust all of A. So there are
“more” elements of A than the infinitely many positive integers.
How would one prove that a set is non-enumerable? You have to show
that no such surjective function can exist. Equivalently, you have to show that
the elements of A cannot be enumerated in a one-way infinite list. The best
way to do this is to show that every list of elements of A must leave at least
one element out; or that no function f : Z+ → A can be surjective. We can
do this using Cantor’s diagonal method. Given a list of elements of A, say, x1 ,
x2 , . . . , we construct another element of A which, by its construction, cannot
possibly be on that list.
Our first example is the set Bω of all infinite, non-gappy sequences of 0’s
and 1’s.
Theorem 4.17. Bω is non-enumerable.
We may arrange this list, and the elements of each sequence si in it, in an
array:
       1       2       3       4     . . .
 1   s1(1)   s1(2)   s1(3)   s1(4)   . . .
 2   s2(1)   s2(2)   s2(3)   s2(4)   . . .
 3   s3(1)   s3(2)   s3(3)   s3(4)   . . .
 4   s4(1)   s4(2)   s4(3)   s4(4)   . . .
 ⋮     ⋮       ⋮       ⋮       ⋮      ⋱
The labels down the side give the number of the sequence in the list s1 , s2 , . . . ;
the numbers across the top label the elements of the individual sequences. For
instance, s1 (1) is a name for whatever number, a 0 or a 1, is the first element
in the sequence s1 , and so on.
Now we construct an infinite sequence, s, of 0’s and 1’s which cannot pos-
sibly be on this list. The definition of s will depend on the list s1 , s2 , . . . .
Any infinite list of infinite sequences of 0’s and 1’s gives rise to an infinite
sequence s which is guaranteed to not appear on the list.
To define s, we specify what all its elements are, i.e., we specify s(n) for all
n ∈ Z+ . We do this by reading down the diagonal of the array above (hence
the name “diagonal method”) and then changing every 1 to a 0 and every 0 to
a 1. More abstractly, we define s(n) to be 0 or 1 according to whether the n-th
element of the diagonal, sn (n), is 1 or 0.
s(n) = 1   if sn(n) = 0
s(n) = 0   if sn(n) = 1.
If you like formulas better than definitions by cases, you could also define
s(n) = 1 − sn(n).
Clearly s is an infinite sequence of 0’s and 1’s, since it is just the mirror
sequence to the sequence of 0’s and 1’s that appear on the diagonal of our
array. So s is an element of Bω . But it cannot be on the list s1 , s2 , . . . Why not?
It can’t be the first sequence in the list, s1 , because it differs from s1 in the
first element. Whatever s1 (1) is, we defined s(1) to be the opposite. It can’t be
the second sequence in the list, because s differs from s2 in the second element:
if s2 (2) is 0, s(2) is 1, and vice versa. And so on.
More precisely: if s were on the list, there would be some k so that s = sk .
Two sequences are identical iff they agree at every place, i.e., for any n, s(n) =
sk (n). So in particular, taking n = k as a special case, s(k) = sk (k) would
have to hold. sk (k) is either 0 or 1. If it is 0 then s(k ) must be 1—that’s how
we defined s. But if sk (k ) = 1 then, again because of the way we defined s,
s(k) = 0. In either case s(k) ̸= sk (k ).
We started by assuming that there is a list of elements of Bω , s1 , s2 , . . .
From this list we constructed a sequence s which we proved cannot be on the
list. But it definitely is a sequence of 0’s and 1’s if all the si are sequences of
0’s and 1’s, i.e., s ∈ Bω . This shows in particular that there can be no list of
all elements of Bω , since for any such list we could also construct a sequence s
guaranteed to not be on the list, so the assumption that there is a list of all
sequences in Bω leads to a contradiction.
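The diagonal construction itself is entirely mechanical, as the following Python sketch shows (illustrative only: it works on a finite list of finite 0/1-sequences, indexed from 0, whereas the theorem concerns infinite lists of infinite sequences):

def diagonal(sequences):
    # Flip the diagonal: the nth entry of the output differs from the nth
    # entry of the nth sequence.
    return [1 - sequences[n][n] for n in range(len(sequences))]

seqs = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
s = diagonal(seqs)
print(s)  # [1, 0, 1, 1]
print(all(s[n] != seqs[n][n] for n in range(len(seqs))))  # s differs from each s_n at place n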
Proof. We proceed in the same way, by showing that for every list of subsets
of Z+ there is a subset of Z+ which cannot be on the list. Suppose the follow-
ing is a given list of subsets of Z+ :
Z1 , Z2 , Z3 , . . .
Z = { n ∈ Z+ : n ∉ Zn }
Z is clearly a set of positive integers, since by assumption each Zn is, and thus
Z ∈ ℘(Z+ ). But Z cannot be on the list. To show this, we’ll establish that for
each k ∈ Z+ , Z ̸= Zk .
So let k ∈ Z+ be arbitrary. We’ve defined Z so that for any n ∈ Z+, n ∈ Z
iff n ∉ Zn. In particular, taking n = k, k ∈ Z iff k ∉ Zk. But this shows that
Z ≠ Zk, since k is an element of one but not the other, and so Z and Zk have
different elements. Since k was arbitrary, Z is not on the list Z1, Z2, . . .
The preceding proof did not mention a diagonal, but you can think of it
as involving a diagonal if you picture it this way: Imagine the sets Z1 , Z2 , . . . ,
written in an array, where each element j ∈ Zi is listed in the j-th column.
Say the first four sets on that list are {1, 2, 3, . . . }, {2, 4, 6, . . . }, {1, 2, 5}, and
{3, 4, 5, . . . }. Then the array would begin with
Z1 = {1, 2, 3, 4, 5, 6, . . .}
Z2 = {   2,    4,    6, . . .}
Z3 = {1, 2,       5          }
Z4 = {      3, 4, 5, 6, . . .}
 ⋮
Then Z is the set obtained by going down the diagonal, leaving out any num-
bers that appear along the diagonal and including those j where the array has a
gap in the j-th row/column. In the above case, we would leave out 1 and 2,
include 3, leave out 4, etc.
4.7 Reduction
Z = { n ∈ Z+ : s ( n ) = 1 }
f ( Z1 ), f ( Z2 ), f ( Z3 ), . . .
It is easy to be confused about the direction the reduction goes in. For
instance, a surjective function g : Bω → B does not establish that B is non-
enumerable. (Consider g : Bω → B defined by g(s) = s(1), the function that
maps a sequence of 0’s and 1’s to its first element. It is surjective, because
some sequences start with 0 and some start with 1. But B is finite.) Note also
that the function f must be surjective, or otherwise the argument does not go
through: f ( x1 ), f ( x2 ), . . . would then not be guaranteed to include all the
elements of B. For instance,
h(n) = 00 . . . 0   (a string of n 0’s)
4.8 Equinumerosity
We have an intuitive notion of “size” of sets, which works fine for finite sets.
But what about infinite sets? If we want to come up with a formal way of
comparing the sizes of two sets of any size, it is a good idea to start by defining
when sets are the same size. Here is Frege:
The insight of this passage can be brought out through a formal definition:
The following proof uses Definition 4.4 if section 4.2 is included and
Definition 4.27 otherwise.
( f ◦ g)(n) = f ( g(n)) = f ( x ) = y
and thus f ◦ g is surjective. We have that f ◦ g is an enumeration of B, and so
B is enumerable.
If B is enumerable, we obtain that A is enumerable by repeating the argu-
ment with the bijection f −1 : B → A instead of f .
It is clear that this is a reflexive and transitive relation, but that it is not
symmetric (this is left as an exercise). We can also introduce a notion, which
states that one set is (strictly) smaller than another.
Definition 4.23. A is smaller than B, written A ≺ B, iff there is an injection f : A →
B but no bijection g : A → B, i.e., A ⪯ B and A ̸≈ B.
It is clear that this relation is irreflexive and transitive. (This is left as an ex-
ercise.) Using this notation, we can say that a set A is enumerable iff A ⪯ N,
and that A is non-enumerable iff N ≺ A. This allows us to restate Theo-
rem 4.32 as the observation that N ≺ ℘(N). In fact, Cantor (1892) proved that
this last point is perfectly general:
B = { x ∈ A : x ∉ g( x )}.
It’s instructive to compare the proof of Theorem 4.24 to that of Theorem 4.18.
There we showed that for any list Z1 , Z2 , . . . , of subsets of Z+ one can con-
struct a set Z of numbers guaranteed not to be on the list. It was guaranteed
not to be on the list because, for every n ∈ Z+ , n ∈ Zn iff n ∈ / Z. This way,
there is always some number that is an element of one of Zn or Z but not the
other. We follow the same idea here, except the indices n are now elements
of A instead of Z+ . The set B is defined so that it is different from g( x ) for
each x ∈ A, because x ∈ g( x ) iff x ∈ / B. Again, there is always an element
of A which is an element of one of g( x ) and B but not the other. And just as Z
therefore cannot be on the list Z1 , Z2 , . . . , B cannot be in the range of g.
It’s instructive to compare the proof of Theorem 4.24 to that of Theorem 4.32.
There we showed that for any list N0 , N1 , N2 , . . . , of subsets of N we can con-
struct a set D of numbers guaranteed not to be on the list. It was guaranteed
not to be on the list because, for every n ∈ N, n ∈ Nn iff n ∉ D.
The following section 4.11, section 4.12, section 4.13 are alternative
versions of section 4.2, section 4.6, section 4.7 due to Tim Button for use
in his Open Set Theory text. They are slightly more advanced and use a
different definition of enumerability more suitable in a set theory context
(i.e., bijection with N or an initial segment, rather than being listable or
being the range of a surjective function from Z+ ).
1 For more on the history, see e.g., Potter (2004, pp. 165–6).
A = { a1 , a2 , . . . , a n }.
Assuming that the elements a1 , . . . , an are all distinct, this gives us a bijection
between A and the first n natural numbers 0, . . . , n − 1. Conversely, since
every finite set has only finitely many elements, every finite set can be put
into such a correspondence. In other words, if A is finite, there is a bijection
between A and {0, . . . , n − 1}, where n is the number of elements of A.
If we allow for certain kinds of infinite sets, then we will also allow some
infinite sets to be enumerated. We can make this precise by saying that an
infinite set is enumerated by a bijection between it and all of N.
Example 4.28. A function enumerating the natural numbers is simply the iden-
tity function IdN : N → N given by IdN (n) = n. A function enumerating the
positive natural numbers, N+ = N \ {0}, is the function g(n) = n + 1, i.e., the
successor function.
2 Yes, we count from 0. Of course we could also start with 1. This would make no big difference.
f (n) = 2n and
g(n) = 2n + 1
respectively enumerate the even natural numbers and the odd natural num-
bers. But neither is surjective, so neither is an enumeration of N.
Example 4.30. Let ⌈ x ⌉ be the ceiling function, which rounds x up to the nearest
integer. Then the function f : N → Z given by:
f (n) = (−1)ⁿ ⌈n/2⌉
0 −1 1 −2 2 −3 3 ...
Notice how f generates the values of Z by “hopping” back and forth between
positive and negative integers. You can also think of f as defined by cases as
follows: (
n
if n is even
f ( n ) = 2 n +1
− 2 if n is odd
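A quick way to check the formula (a Python sketch, not part of the text):

import math

def f(n):
    # f(n) = (−1)^n · ⌈n/2⌉
    return (-1) ** n * math.ceil(n / 2)

print([f(n) for n in range(8)])  # [0, -1, 1, -2, 2, -3, 3, -4]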
0 1 2 3 ...
0 s0 (0) s0 (1) s0 (2) s0 (3) ...
1 s1 (0) s1 (1) s1 (2) s1 (3) ...
2 s2 (0) s2 (1) s2 (2) s2 (3) ...
3 s3 (0) s3 (1) s3 (2) s3 (3) ...
⋮
We will now construct an infinite string, d, of 0’s and 1’s which is not on this
list. We will do this by specifying each of its entries, i.e., we specify d(n) for
all n ∈ N. Intuitively, we do this by reading down the diagonal of the array
above (hence the name “diagonal method”) and then changing every 1 to a 0
and every 0 to a 1. More abstractly, we define d(n) to be 0 or 1 according to
whether the n-th element of the diagonal, sn (n), is 1 or 0, that is:
d(n) = 1   if sn(n) = 0
d(n) = 0   if sn(n) = 1
Clearly d ∈ Bω , since it is an infinite string of 0’s and 1’s. But we have con-
structed d so that d(n) ̸= sn (n) for any n ∈ N. That is, d differs from sn in its
nth entry. So d ̸= sn for any n ∈ N. So d cannot be on the list s0 , s1 , s2 , . . .
We have shown, given an arbitrary enumeration of some subset of Bω , that
it will omit some element of Bω . So there is no enumeration of the set Bω , i.e.,
Bω is non-enumerable.
Proof. We proceed in the same way, by showing that every list of subsets of N
omits some subset of N. So, suppose that we have some list N0 , N1 , N2 , . . . of
subsets of N. We define a set D as follows: n ∈ D iff n ∉ Nn:
D = {n ∈ N : n ∉ Nn}
The preceding proof did not mention a diagonal. Still, you can think of
it as involving a diagonal if you picture it this way: Imagine the sets N0 , N1 ,
. . . , written in an array, where we write Nn on the nth row by writing m in
the mth column iff m ∈ Nn. For example, say the first four sets on that list
are {0, 1, 2, . . . }, {1, 3, 5, . . . }, {0, 1, 4}, and {2, 3, 4, . . . }; then our array would
begin with
N0 = {0, 1, 2, . . .}
N1 = {   1,    3,    5, . . .}
N2 = {0, 1,       4          }
N3 = {      2, 3, 4, . . .}
 ⋮
Then D is the set obtained by going down the diagonal, placing n ∈ D iff n
is not on the diagonal. So in the above case, we would leave out 0 and 1, we
would include 2, we would leave out 3, etc.
4.13 Reduction
Proof of Theorem 4.32 by reduction. For reductio, suppose that ℘(N) is enumer-
able, and thus that there is an enumeration of it, N1 , N2 , N3 , . . .
Define the function f : ℘(N) → Bω by letting f ( N ) be the string sk such
that sk (n) = 1 iff n ∈ N, and sk (n) = 0 otherwise.
This clearly defines a function, since whenever N ⊆ N, any n ∈ N either
is an element of N or isn’t. For instance, the set 2N = {2n : n ∈ N} =
{0, 2, 4, 6, . . . } of even naturals gets mapped to the string 1010101 . . . ; ∅ gets
mapped to 0000 . . . ; N gets mapped to 1111 . . . .
It is also surjective: every string of 0s and 1s corresponds to some set of nat-
ural numbers, namely the one which has as its members those natural num-
bers corresponding to the places where the string has 1s. More precisely, if
s ∈ Bω , then define N ⊆ N by:
N = { n ∈ N : s ( n ) = 1}
Then f(N) = s, so every string is in the range of f. But now consider the list

    f(N1), f(N2), f(N3), . . .

Since f is surjective, every element of Bω occurs on this list; i.e., it would be an enumeration of Bω. But we have already shown that Bω is non-enumerable. Contradiction; so ℘(N) is not enumerable after all.
Problems
Problem 4.1. Define an enumeration of the positive squares 1, 4, 9, 16, . . .
Problem 4.7. Show that (Z+ )∗ is enumerable. You may assume problem 4.6.
Problem 4.8. Give an enumeration of the set of all non-negative rational num-
bers.
Problem 4.9. Show that Q is enumerable. Recall that any rational number can
be written as a fraction z/m with z ∈ Z, m ∈ N+ .
Problem 4.11. Recall from your introductory logic course that each possible
truth table expresses a truth function. In other words, the truth functions are
all functions from Bk → B for some k. Prove that the set of all truth functions
is enumerable.
Problem 4.12. Show that the set of all finite subsets of an arbitrary infinite
enumerable set is enumerable.
Problem 4.14. Show that the enumerable union of enumerable sets is enumerable. That is, whenever A1, A2, . . . are sets, and each Ai is enumerable, then the union ⋃∞i=1 Ai of all of them is also enumerable. [NB: this is hard!]
Problem 4.20. Show that the set of all sets of pairs of positive integers is non-
enumerable by a reduction argument.
Problem 4.22. Show that Nω , the set of infinite sequences of natural numbers,
is non-enumerable by a reduction argument.
Problem 4.23. Let P be the set of functions from the set of positive integers
to the set {0}, and let Q be the set of partial functions from the set of positive
integers to the set {0}. Show that P is enumerable and Q is not. (Hint: reduce
the problem of enumerating Bω to enumerating Q).
Problem 4.24. Let S be the set of all surjective functions from the set of posi-
tive integers to the set {0,1}, i.e., S consists of all surjective f : Z+ → B. Show
that S is non-enumerable.
Problem 4.25. Show that the set R of all real numbers is non-enumerable.
Problem 4.36. Show that the set of all sets of pairs of natural numbers, i.e.,
℘(N × N), is non-enumerable by a reduction argument.
Problem 4.37. Show that Nω , the set of infinite sequences of natural numbers,
is non-enumerable by a reduction argument.
Problem 4.38. Let S be the set of all surjections from N to the set {0, 1}, i.e., S
consists of all surjections f : N → B. Show that S is non-enumerable.
Problem 4.39. Show that the set R of all real numbers is non-enumerable.
Arithmetization
5.1 From N to Z
Here are two basic realisations:
1. Every integer can be written in the form n − m, with n, m ∈ N.

2. a − b = c − d iff a + d = c + b.

(It should be obvious that this is how integers are meant to behave: just add b and d to both sides.) And the easy way to guarantee this behaviour is just to define an equivalence relation between ordered pairs, ∼, as follows:

    ⟨a, b⟩ ∼ ⟨c, d⟩ iff a + d = c + b
Definition 5.2. The integers are the equivalence classes, under ∼, of ordered
pairs of natural numbers; that is, Z = N²/∼.
Now, one might have plenty of different philosophical reactions to this stip-
ulative definition. Before we consider those reactions, though, it is worth con-
tinuing with some of the technicalities.
Having said what the integers are, we shall need to define basic functions
and relations on them. Let’s write [m, n]∼ for the equivalence class under ∼
with ⟨m, n⟩ as an element.1 That is:
(As is common, I’m using ‘ab’ to stand for ‘(a × b)’, just to make the axioms
easier to read.) Now, we need to make sure that these definitions behave
as they ought to. Spelling out what this means, and checking it through, is
rather laborious; we relegate the details to section 5.6. But the short point is:
everything works!
One final thing remains. We have constructed the integers using natural
numbers. But this will mean that the natural numbers are not themselves inte-
gers. We will return to the philosophical significance of this in section 5.5. On
a purely technical front, though, we will need some way to be able to treat
natural numbers as integers. The idea is quite easy: for each n ∈ N, we just
1 Note: using the notation introduced in Definition 2.11, we would have written [⟨m, n⟩]∼ for the same thing. But that’s just a bit harder to read.
stipulate that nZ = [n, 0]∼ . We need to confirm that this definition is well-
behaved, i.e., that for any m, n ∈ N
( m + n )Z = mZ + nZ
( m × n )Z = mZ × nZ
m ≤ n ↔ mZ ≤ nZ
But this is all pretty straightforward. For example, to show that the second
of these obtains, we can simply help ourselves to the behaviour of the natural
numbers and reason as follows:
(m × n)Z = [m × n, 0]∼
= [m × n + 0 × 0, m × 0 + 0 × n]∼
= [m, 0]∼ × [n, 0]∼
= mZ × nZ
5.2 From Z to Q
We just saw how to construct the integers from the natural numbers, using
some naı̈ve set theory. We shall now see how to construct the rationals from
the integers in a very similar way. Our initial realisations are:
1. Every rational can be written in the form i/j, where both i and j are integers but j is non-zero.

2. i/j = m/n iff i × n = m × j.

So, just as before, we define an equivalence relation on pairs, this time written ∽:

    ⟨a, b⟩ ∽ ⟨c, d⟩ iff a × d = b × c
We must check that this is an equivalence relation. This is very much like the
case of ∼, and we will leave it as an exercise. But it allows us to say:
Definition 5.3. The rationals are the equivalence classes, under ∽, of pairs
of integers (whose second element is non-zero). That is, Q = (Z × (Z \
{0Z }))/∽ .
As with the integers, we also want to define some basic operations. Where
[i, j]∽ is the equivalence class under ∽ with ⟨i, j⟩ as an element, we say:
Since m² = 2n², the region where the two squares of side n overlap has the same area as the region which neither of the two squares cover; i.e., the area of the orange square equals the sum of the areas of the two unshaded squares. So where the orange square has side p, and each unshaded square has side q, p² = 2q². But now √2 = p/q, with p < m and q < n and p, q ∈ N. This contradicts the assumption that m and n were the smallest naturals with m² = 2n².
    {p ∈ Q : p² < 2 or p < 0}

This has an upper bound in the rationals; its elements are all smaller than 3, for example. But what is its least upper bound? We want to say ‘√2’; but we have just seen that √2 is not rational. And there is no least rational number greater than √2. So the set has an upper bound but no least upper bound. Hence the rationals lack the Completeness Property.
By contrast, the continuum “morally ought” to have the Completeness Property. We do not just want √2 to be a real number; we want to fill all the “gaps” in the rational line. Indeed, we want the continuum itself to have no “gaps” in it. That is just what we will get via Completeness.
5.4 From Q to R
In essence, the Completeness Property shows that any point α of the real line
divides that line into two halves perfectly: those for which α is the least upper
bound, and those for which α is the greatest lower bound. To construct the
real numbers from the rational numbers, Dedekind suggested that we simply think of the reals as the cuts that partition the rationals. That is, we identify √2 with the cut which separates the rationals less than √2 from the rationals greater than √2.
Let’s tidy this up. If we cut the rational numbers into two halves, we can
uniquely identify the partition we made just by considering its bottom half. So,
getting precise, we offer the following definition:
Definition 5.5 (Cut). A cut α is any non-empty proper initial segment of the
rationals with no greatest element. That is, α is a cut iff:
1. non-empty, proper: ∅ ̸= α ⊊ Q
α ≤ β iff α ⊆ β
This definition of an order allows us to state the central result, that the set of cuts
has the Completeness Property. Spelled out fully, the statement has this shape.
If S is a non-empty set of cuts with an upper bound, then S has a least upper
bound. In more detail: there is a cut, λ, which is an upper bound for S, i.e.
(∀α ∈ S)α ⊆ λ, and λ is the least such cut, i.e. (∀ β ∈ R)((∀α ∈ S)α ⊆ β → λ ⊆
β). Now here is the proof of the result:
    α + β = {p + q : p ∈ α ∧ q ∈ β}
    α × β = {p × q : 0 ≤ p ∈ α ∧ 0 ≤ q ∈ β} ∪ 0R,   if α, β ≥ 0R
    −α = {p − q : p < 0 ∧ q ∉ α}
We then need to check that each of these definitions always yields a cut. And
finally, we need to go through an easy (but long-winded) demonstration that
the cuts, so defined, behave exactly as they should. But we relegate all of this
to section 5.6.
It is even less clear that the (much easier) arithmetization of the integers,
or of the rationals, increases rigour in those areas. Here, it is worth making
a simple observation. Having constructed the integers as equivalence classes
of ordered pairs of naturals, and then constructed the rationals as equivalence
classes of ordered pairs of integers, and then constructed the reals as sets of
rationals, we immediately forget about the constructions. In particular: no one
would ever want to invoke these constructions during a mathematical proof
(excepting, of course, a proof that the constructions behaved as they were
supposed to). It’s much easier to speak about a real, directly, than to speak
about some set of sets of sets of sets of sets of sets of sets of naturals.
It is most doubtful of all that these definitions tell us what the integers,
rationals, or reals are, metaphysically speaking. That is, it is doubtful that the
reals (say) are certain sets (of sets of sets. . . ). The main barrier to such a view
is that the construction could have been done in many different ways. In the
case of the reals, there are some genuinely interestingly different construc-
tions (see section 5.7). But here is a really trivial way to obtain some different
constructions: as in section 2.2, we could have defined ordered pairs slightly
differently; if we had used this alternative notion of an ordered pair, then
our constructions would have worked precisely as well as they did, but we
would have ended up with different objects. As such, there are many rival
set-theoretic constructions of the integers, the rationals, and the reals. And
now it would just be arbitrary (and embarrassing) to claim that the integers
(say) are these sets, rather than those. (As in section 2.2, this is an instance of
an argument made famous by Benacerraf 1965.)
A further point is worth raising: there is something quite odd about our
constructions. We started with the natural numbers. We then construct the
integers, and construct “the 0 of the integers”, i.e., [0, 0]∼ . But 0 ̸= [0, 0]∼ .
Indeed, given our constructions, no natural number is an integer. But that
seems extremely counter-intuitive. Indeed, in section 1.3, we claimed without
much argument that N ⊆ Q. If the constructions tell us exactly what the
numbers are, this claim was trivially false.
Standing back, then, where do we get to? Working in a naı̈ve set theory,
and helping ourselves to the naturals, we are able to treat integers, rationals,
and reals as certain sets. In that sense, we can embed the theories of these
entities within a set theory. But the philosophical import of this embedding is
just not that straightforward.
Of course, none of this is the last word! The point is only this. Showing that
the arithmetization of the reals is of deep philosophical significance would
require some additional philosophical argument.
Associativity a + (b + c) = ( a + b) + c
( a × b) × c = a × (b × c)
Commutativity a+b = b+a
a×b = b×a
Identities a+0 = a
a×1 = a
Additive Inverse (∃b ∈ S)0 = a + b
Distributivity a × (b + c) = ( a × b) + ( a × c)
Implicitly, these are all bound with universal quantifiers restricted to S. And
note that the elements 0 and 1 here need not be the natural numbers with the
same name.
So, to check that the integers form a commutative ring, we just need to
check that we meet these eight conditions. None of the conditions is difficult
to establish, but this is a bit laborious. For example, here is how to prove
Associativity, in the case of addition:
i + ( j + k) = [ a1 , b1 ] + ([ a2 , b2 ] + [ a3 , b3 ])
= [ a1 , b1 ] + [ a2 + a3 , b2 + b3 ]
= [ a1 + ( a2 + a3 ), b1 + (b2 + b3 )]
= [( a1 + a2 ) + a3 , (b1 + b2 ) + b3 ]
= [ a1 + a2 , b1 + b2 ] + [ a3 , b3 ]
= ([ a1 , b1 ] + [ a2 , b2 ]) + [ a3 , b3 ]
= (i + j ) + k
Similarly, here is how to prove Distributivity:

i × (j + k) = [ a1 , b1 ] × ([ a2 , b2 ] + [ a3 , b3 ])
= [ a1 , b1 ] × [ a2 + a3 , b2 + b3 ]
= [ a1 ( a2 + a3 ) + b1 (b2 + b3 ), a1 (b2 + b3 ) + b1 ( a2 + a3 )]
= [ a1 a2 + a1 a3 + b1 b2 + b1 b3 , a1 b2 + a1 b3 + a2 b1 + a3 b1 ]
= [ a1 a2 + b1 b2 , a1 b2 + a2 b1 ] + [ a1 a3 + b1 b3 , a1 b3 + a3 b1 ]
= ([ a1 , b1 ] × [ a2 , b2 ]) + ([ a1 , b1 ] × [ a3 , b3 ])
= (i × j ) + (i × k )
a ≤ b→a+c ≤ b+c
( a ≤ b ∧ 0 ≤ c) → a × c ≤ b × c
and connected. In the context of order relations, connectedness is sometimes called trichotomy,
since for any a and b we have a ≤ b ∨ a = b ∨ a ≥ b.
Once you have shown that Z constitutes an ordered ring, it is easy but
laborious to show that Q constitutes an ordered field.
Having dealt with the integers and the rationals, it only remains to deal
with the reals. In particular, we need to show that R constitutes a complete
ordered field, i.e., an ordered field with the Completeness Property. Now,
Theorem 5.6 established that R has the Completeness Property. However, it
remains to run through the (tedious) business of checking that R is an ordered field.
Before tearing off into that laborious exercise, we need to check some more
“immediate” things. For example, we need a guarantee that α + β, as defined,
is indeed a cut, for any cuts α and β. Here is a proof of that fact:
Proof. Since α and β are both cuts, α + β = { p + q : p ∈ α ∧ q ∈ β} is a non-
empty proper subset of Q. Now suppose x < p + q for some p ∈ α and q ∈ β.
Then x − p < q, so x − p ∈ β, and x = p + ( x − p) ∈ α + β. So α + β is an initial
segment of Q. Finally, for any p + q ∈ α + β, since α and β are both cuts, there
are p1 ∈ α and q1 ∈ β such that p < p1 and q < q1 ; so p + q < p1 + q1 ∈ α + β;
so α + β has no maximum.
Similar efforts will allow you to check that α − β and α × β and α ÷ β are
cuts (in the last case, ignoring the case where β is the zero-cut). Again, though,
we will simply leave this to you.
But here is a small loose end to tidy up. In section 5.4, we suggested that we can take √2 = {p ∈ Q : p < 0 or p² < 2}. But we do need to show that this
set is a cut. Here is a proof of that fact:
Proof. Clearly this is a nonempty proper initial segment of the rationals; so
it suffices to show that it has no maximum. In particular, it suffices to show
that, where p is a positive rational with p² < 2 and q = (2p + 2)/(p + 2), both p < q and q² < 2. To see that p < q, just note:

    p² < 2
    p² + 2p < 2 + 2p
    p(p + 2) < 2 + 2p
    p < (2 + 2p)/(p + 2) = q

To see that q² < 2, note that

    q² − 2 = (2p + 2)²/(p + 2)² − 2 = 2(p² − 2)/(p + 2)² < 0,

since p² < 2.
1.41421356237 . . .
The idea that reals can be considered via “increasingly good approximations”
provides us with the basis for another sequence of insights (akin to the reali-
sations that we used when constructing Q from Z, or Z from N). The basic
insights are these:
Of course, not just any function from N to Q will give us a real number. For
instance, consider this function:
    f(n) = 1   if n is odd,
           0   if n is even
The general idea of a limit is the same as before: if you want a certain
level of precision (measured by ε), there is a “region” to look in (any input
greater than ℓ). And it is easy to see that our sequence
√ 1, 1.4, 1.414, 1.4142,
1.41421. . . has a limit: if you want to approximate 2 to within an error of
1/10n , then just look to any entry after the nth.
The obvious thought, then, would be to say that a real number just is any
Cauchy sequence. But, as in the constructions of Z and Q, this would be
too naı̈ve: for any given real number, multiple different Cauchy sequences
indicate that real number. A simple way to see this is as follows. Given a Cauchy
sequence f , define g to be exactly the same function as f , except that g(0) ̸=
f (0). Since the two sequences agree everywhere after the first number, we will
(ultimately) want to say that they have the same limit, in the sense employed
in Definition 5.10, and so should be thought of as “defining” the same real. So,
we should really think of these Cauchy sequences as the same real number.
Consequently, we again need to define an equivalence relation on the Cauchy sequences, and identify real numbers with equivalence classes. First we
need the idea of a function which tends to 0 in the limit. For any function
h : N → Q, say that h tends to 0 iff for any positive ε ∈ Q we have that
(∃ℓ ∈ N)(∀n > ℓ)|h(n)| < ε. Further, where f and g are functions N → Q,
let ( f − g)(n) = f (n) − g(n). Now define:
f ≎ g iff ( f − g) tends to 0.
Where f is a Cauchy sequence, we write [f]≎ for its equivalence class under ≎; to keep things readable, we will drop the subscript and write just [f]. We also stipulate that, for each q ∈ Q, we have qR = [cq], where cq is the constant function cq(n) = q for all n ∈ N. We then define basic relations and operations on the reals, e.g.:
[ f ] + [ g] = [( f + g)]
[ f ] × [ g] = [( f × g)]
where ( f + g)(n) = f (n) + g(n) and ( f × g)(n) = f (n) × g(n). Of course,
we also need to check that each of ( f + g), ( f − g) and ( f × g) are Cauchy
sequences when f and g are; but they are, and we leave this to you.
Finally, we define a notion of order. Say [f] is positive iff both [f] ̸= 0R and (∃ℓ ∈ N)(∀n > ℓ) 0 < f(n). Then say [f] < [g] iff [(g − f)] is positive. We
have to check that this is well-defined (i.e., that it does not depend upon choice
of “representative” function from the equivalence class). But having done this,
it is quite easy to show that these yield the right algebraic properties; that is:
Theorem 5.11. The Cauchy sequences constitute an ordered field.
Proof. Exercise.
Proof sketch. Let S be any non-empty set of Cauchy sequences with an upper
bound. So there is some p ∈ Q such that pR is an upper bound for S. Let
r ∈ S; then there is some q ∈ Q such that qR < r. So if a least upper bound on
S exists, it is between qR and pR (inclusive).
We will hone in on the l.u.b., by approaching it simultaneously from below
and above. In particular, we define two functions, f , g : N → Q, with the aim
that f will hone in on the l.u.b. from above, and g will hone in on it from
below. We start by defining:
f (0) = p
g (0) = q
Then, where an = (f(n) + g(n))/2, let:5

    f(n + 1) = an     if (∀h ∈ S) [h] ≤ (an)R
               f(n)   otherwise

    g(n + 1) = an     if (∃h ∈ S) [h] ≥ (an)R
               g(n)   otherwise
5 This is a recursive definition. But we have not yet given any reason to think that recursive definitions of this kind are legitimate; we will simply take that for granted here.
Both f and g are Cauchy sequences. (This can be checked fairly easily; but
we leave it as an exercise.) Note that the function ( f − g) tends to 0, since the
difference between f and g halves at every step. Hence [ f ] = [ g] .
We will show that (∀h ∈ S)[h] ≤ [ f ] , invoking Theorem 5.11 as we go. Let
h ∈ S and suppose, for reductio, that [ f ] < [h] , so that 0R < [(h − f )] . Since
f is a monotonically decreasing Cauchy sequence, there is some n ∈ N such
that [(c f (n) − f )] < [(h − f )] . So:
Problems
Problem 5.1. Show that (m + n)Z = mZ + nZ and m ≤ n ↔ mZ ≤ nZ , for
any m, n ∈ N.
Problem 5.8. Let f(n) = 0 for every n. Let g(n) = 1/(n + 1)². Show that both are Cauchy sequences, and indeed that the limit of both functions is 0, so that also f ≎ g.
Problem 5.9. Prove that the Cauchy sequences constitute an ordered field.
Infinite Sets
This chapter on infinite sets is taken from Tim Button’s Open Set The-
ory.
6.2 Dedekind Algebras
1. o ∈ clo f (o ); and
Proof. Note that there is at least one f -closed set with o as an element, namely
ran( f ) ∪ {o }. So clo f (o ), the intersection of all such sets, exists. We must now
check (1)–(3).
Concerning (1): o ∈ clo f (o ) as it is an intersection of sets which all have o
as an element.
Concerning (2): suppose x ∈ clo f (o ). So if o ∈ X and X is f -closed, then
x ∈ X, and now f ( x ) ∈ X as X is f -closed. So f ( x ) ∈ clo f (o ).
Concerning (3): quite generally, if X ∈ C then C ⊆ X.
T
1. o ∈
/ ran( f )
2. f is an injection
3. A = clo f (o )
Since A = clo f (o ), our earlier result tells us that A is the smallest f -closed
set with o as an element. Clearly a Dedekind algebra is Dedekind infinite; just
look at clauses (1) and (2) of the definition. But the more exciting fact is that
any Dedekind infinite set can be turned into a Dedekind algebra.
Theorem 6.5. If there is a Dedekind infinite set, then there is a Dedekind algebra.
Corollary 6.7. Let N, s, o comprise a Dedekind algebra. Then for any formula φ( x ),
which may have parameters:
this after chapter 66, or perhaps read an alternative treatment, such as Pot-
ter 2004, pp. 95–8.) But, where N, s, o comprise a Dedekind algebra, we will
ultimately be able to stipulate the following:
Dedekind’s bold idea is this. We have just shown how to build the natural
numbers using (naı̈ve) set theory alone. In chapter 5, we saw how to con-
struct the reals given the natural numbers and some set theory. So, perhaps,
“arithmetic (algebra, analysis)” turn out to be “merely a part of logic” (in
Dedekind’s extended sense of the word “logic”).
That’s the idea. But hold on for a moment. Our construction of a Dedekind
algebra (our surrogate for the natural numbers) is conditional on the existence
of a Dedekind infinite set. (Just look back to Theorem 6.5.) Unless the exis-
tence of a Dedekind infinite set can be established via “logic” or “the pure
laws of thought”, the project stalls.
So, can the existence of a Dedekind infinite set be established by “the pure
laws of thought”? Here was Dedekind’s effort:
This is quite an astonishing thing to find in the middle of a book which largely
consists of highly rigorous mathematical proofs. Two remarks are worth mak-
ing.
First: this “proof” scarcely has what we would now recognize as a “math-
ematical” character. It speaks of psychological objects (thoughts), and merely
possible ones at that.
Second: at least as we have presented Dedekind algebras, this “proof”
has a straightforward technical shortcoming. If Dedekind’s argument is suc-
cessful, it establishes only that there are infinitely many things (specifically,
infinitely many thoughts). But Dedekind also needs to give us a reason to re-
gard S as a single set, with infinitely many elements, rather than thinking of S
as some things (in the plural).
The fact that Dedekind did not see a gap here might suggest that his use
of the word “totality” does not precisely track our use of the word “set”.1 But
this would not be too surprising. The project we have pursued in the last two
chapters—a “construction” of the naturals, and from them a “construction”
of the integers, reals and rationals—has all been carried out naı̈vely. We have
helped ourselves to this set, or that set, as and when we have needed them,
without laying down many general principles concerning exactly which sets
1 Indeed, we have other reasons to think it did not; see Potter (2004, p. 23).
exist, and when. But we know that we need some general principles, for oth-
erwise we will fall into Russell’s Paradox.
The time has come for us to outgrow our naı̈vety.
for each set B and function f . Defined thus, Clo f ( B) is the smallest f -closed
set containing B, in that:
We’ll show that g is a bijection from C → B, from which it will follow that
g ◦ f −1 : A → B is a bijection, completing the proof.
Finally, here is the proof of the main result. Recall that given a function h
and set D, we define h[ D ] = {h( x ) : x ∈ D }.
Propositional Logic
7.1 Introduction
Propositional logic deals with formulas that are built from propositional vari-
ables using the propositional connectives ¬, ∧, ∨, →, and ↔. Intuitively,
a propositional variable p stands for a sentence or proposition that is true or
false. Whenever the “truth values” of the propositional variables in a formula are determined, so is the truth value of any formula formed from them using
propositional connectives. We say that propositional logic is truth functional,
because its semantics is given by functions of truth values. In particular, in
propositional logic we leave out of consideration any further determination
of truth and falsity, e.g., whether something is necessarily true rather than
just contingently true, or whether something is known to be true, or whether
something is true now rather than was true or will be true. We only consider
two truth values true (T) and false (F), and so exclude from discussion the
possibility that a statement may be neither true nor false, or only half true. We
also concentrate only on connectives where the truth value of a formula built
from them is completely determined by the truth values of its parts (and not,
say, on its meaning). In particular, whether the truth value of conditionals in
English is truth functional in this sense is contentious. The material condi-
tional → is; other logics deal with conditionals that are not truth functional.
In order to develop the theory and metatheory of truth-functional propo-
sitional logic, we must first define the syntax and semantics of its expressions.
We will describe one way of constructing formulas from propositional vari-
ables using the connectives. Alternative definitions are possible. Other sys-
tems will choose different symbols, will select different sets of connectives
as primitive, and will use parentheses differently (or even not at all, as in
the case of so-called Polish notation). What all approaches have in common,
though, is that the formation rules define the set of formulas inductively. If
done properly, every expression can result essentially in only one way accord-
ing to the formation rules. The inductive definition resulting in expressions
that are uniquely readable means we can give meanings to these expressions
using the same method—inductive definition.
Giving the meaning of expressions is the domain of semantics. The central
concept in semantics for propositional logic is that of satisfaction in a valua-
tion. A valuation v assigns truth values T, F to the propositional variables.
Any valuation determines a truth value v( φ) for any formula φ. A formula is
satisfied in a valuation v iff v( φ) = T—we write this as v ⊨ φ. This relation
can also be defined by induction on the structure of φ, using the truth func-
tions for the logical connectives to define, say, satisfaction of φ ∧ ψ in terms of
satisfaction (or not) of φ and ψ.
On the basis of the satisfaction relation v ⊨ φ for sentences we can then
define the basic semantic notions of tautology, entailment, and satisfiability.
A formula is a tautology, ⊨ φ, if every valuation satisfies it, i.e., v( φ) = T for
any v. It is entailed by a set of formulas, Γ ⊨ φ, if every valuation that satisfies
all the formulas in Γ also satisfies φ. And a set of formulas is satisfiable if
some valuation satisfies all formulas in it at the same time. Because formulas
are inductively defined, and satisfaction is in turn defined by induction on
the structure of formulas, we can use induction to prove properties of our
semantics and to relate the semantic notions defined.
if we only used primitive symbols, get quite long. This is obviously an ad-
vantage. The bigger advantage, however, is that proofs become shorter. If a
symbol is primitive, it has to be treated separately in proofs. The more primi-
tive symbols, therefore, the longer our proofs.
You may be familiar with different terminology and symbols than the ones
we use above. Logic texts (and teachers) commonly use ∼, ¬, or ! for “negation”, and ∧, ·, or & for “conjunction”. Commonly used symbols for the
“conditional” or “implication” are →, ⇒, and ⊃. Symbols for “biconditional,”
“bi-implication,” or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ sym-
bol is variously called “falsity,” “falsum,” “absurdity,” or “bottom.” The ⊤
symbol is variously called “truth,” “verum,” or “top.”
1. ⊥ is an atomic formula.
1. ⊤ abbreviates ¬⊥.
2. φ ↔ ψ abbreviates ( φ → ψ) ∧ (ψ → φ).
7.3 Preliminaries
Theorem 7.4 (Principle of induction on formulas). If some property P holds for
all the atomic formulas and is such that
Proposition 7.5. Any formula in Frm(L0 ) is balanced, in that it has as many left
parentheses as right ones.
1. ⊥.
Proof. By induction on φ. For instance, suppose that φ has two distinct read-
ings as (ψ → χ) and (ψ′ → χ′ ). Then ψ and ψ′ must be the same (or else one
would be a proper initial segment of the other); so if the two readings of φ are
distinct it must be because χ and χ′ are distinct readings of the same sequence
of symbols, which is impossible by the inductive hypothesis.
1. φi ≡ ¬ φ j .
2. φi ≡ ( φ j ∧ φk ).
3. φi ≡ ( φ j ∨ φk ).
4. φi ≡ ( φ j → φk ).
Example 7.10.
As can be seen from the second example, formation sequences may contain
‘junk’: formulas which are redundant or do not contribute to the construction.
We can also prove the converse. This is important because it shows that
our two ways of defining formulas are equivalent: they give the same results.
It also means that we can prove theorems about formulas by using ordinary
induction on the length of formation sequences.
Theorem 7.13. Frm(L0 ) is the set of all expressions (strings of symbols) in the lan-
guage L0 with a formation sequence.
Proof. Let F be the set of all strings of symbols in the language L0 that have a
formation sequence. We have seen in Proposition 7.11 that Frm(L0 ) ⊆ F, so
now we prove the converse.
Suppose φ has a formation sequence ⟨ φ0 , . . . , φn ⟩. We prove that φ ∈
Frm(L0 ) by strong induction on n. Our induction hypothesis is that every
string of symbols with a formation sequence of length m < n is in Frm(L0 ).
By the definition of a formation sequence, either φn is atomic or there must
exist j, k < n such that one of the following is the case:
1. φn ≡ ¬φj.
2. φn ≡ (φj ∧ φk).
3. φn ≡ (φj ∨ φk).
4. φn ≡ (φj → φk).
φ ψ φ→ψ
T T T
T F F
F T T
F F T
Theorem 7.16 (Local Determination). Suppose that v1 and v2 are valuations that
agree on the propositional letters occurring in φ, i.e., v1(pn) = v2(pn) whenever pn occurs in φ. Then v1 and v2 also agree on φ, i.e., v1(φ) = v2(φ).
Proof. By induction on φ.
1. φ ≡ ⊥: v ⊭ φ.
2. φ ≡ pi : v ⊨ φ iff v(pi ) = T.
3. φ ≡ ¬ψ: v ⊨ φ iff v ⊭ ψ.
Proof. By induction on φ.
2. If Γ ⊨ φ and Γ ⊨ φ → ψ then Γ ⊨ ψ;
Proof. Exercise.
Proof. Exercise.
Proof. Exercise.
Problems
Problem 7.1. Prove Proposition 7.5
Problem 7.3. For each of the five formulas below determine whether the for-
mula can be expressed as a substitution φ[ψ/pi ] where φ is (i) p0 ; (ii) (¬p0 ∧
p1 ); and (iii) ((¬p0 → p1 ) ∧ p2 ). In each case specify the relevant substitution.
1. p1
2. (¬p0 ∧ p0 )
3. ((p0 ∨ p1 ) ∧ p2 )
4. ¬((p0 → p1 ) ∧ p2 )
Problem 7.7. For each of the following four formulas determine whether it is
(a) satisfiable, (b) a tautology, and (c) contingent.
Derivation Systems
8.1 Introduction
Logics commonly have both a semantics and a derivation system. The seman-
tics concerns concepts such as truth, satisfiability, validity, and entailment.
The purpose of derivation systems is to provide a purely syntactic method
of establishing entailment and validity. They are purely syntactic in the sense
that a derivation in such a system is a finite syntactic object, usually a sequence
(or other finite arrangement) of sentences or formulas. Good derivation sys-
tems have the property that any given sequence or arrangement of sentences
or formulas can be verified mechanically to be “correct.”
The simplest (and historically first) derivation systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation in such a sys-
tem if each individual formula in it is either among a fixed set of “axioms”
or follows from formulas coming before it in the sequence by one of a fixed
number of “inference rules”—and it can be mechanically verified whether a formula is an axiom and whether it follows correctly from other formulas by one of the
inference rules. Axiomatic derivation systems are easy to describe—and also
easy to handle meta-theoretically—but derivations in them are hard to read
and understand, and are also hard to produce.
Other derivation systems have been developed with the aim of making it
easier to construct derivations or easier to understand derivations once they
are complete. Examples are natural deduction, truth trees, also known as
tableaux proofs, and the sequent calculus. Some derivation systems are de-
1. ⊢ φ if and only if ⊨ φ
2. Γ ⊢ φ if and only if Γ ⊨ φ
The “only if” direction of the above is called soundness. A derivation system is
sound if derivability guarantees entailment (or validity). Every decent deriva-
tion system has to be sound; unsound derivation systems are not useful at all.
After all, the entire purpose of a derivation is to provide a syntactic guarantee
of validity or entailment. We’ll prove soundness for the derivation systems
we present.
The converse “if” direction is also important: it is called completeness. A
complete derivation system is strong enough to show that φ is a theorem
whenever φ is valid, and that Γ ⊢ φ whenever Γ ⊨ φ. Completeness is harder
to establish, and some logics have no complete derivation systems. First-order
logic does. Kurt Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is that of consis-
tency. A set of sentences is called inconsistent if anything whatsoever can be
derived from it, and consistent otherwise. Inconsistency is the syntactic coun-
terpart to unsatisfiablity: like unsatisfiable sets, inconsistent sets of sentences
do not make good theories, they are defective in a fundamental way. Con-
sistent sets of sentences may not be true or useful, but at least they pass that
minimal threshold of logical usefulness. For different derivation systems the
specific definition of consistency of sets of sentences might differ, but like ⊢,
we want consistency to coincide with its semantic counterpart, satisfiability.
We want it to always be the case that Γ is consistent if and only if it is satis-
fiable. Here, the “if” direction amounts to completeness (consistency guaran-
tees satisfiability), and the “only if” direction amounts to soundness (satisfi-
ability guarantees consistency). In fact, for classical first-order logic, the two
versions of soundness and completeness are equivalent.
In the sequent calculus:

         φ ⇒ φ
      ------------ ∧L
      φ ∧ ψ ⇒ φ
    ------------------ →R
     ⇒ (φ ∧ ψ) → φ

In natural deduction:

     [φ ∧ ψ]1
    ---------- ∧Elim
         φ
    -------------- 1 →Intro
    (φ ∧ ψ) → φ
inference.
A set Γ is inconsistent iff Γ ⊢ ⊥ in natural deduction. The rule ⊥ I makes
it so that from an inconsistent set, any sentence can be derived.
Natural deduction systems were developed by Gerhard Gentzen and Sta-
nisław Jaśkowski in the 1930s, and later developed by Dag Prawitz and Fred-
eric Fitch. Because its inferences mirror natural methods of proof, it is favored
by philosophers. The versions developed by Fitch are often used in introduc-
tory logic textbooks. In the philosophy of logic, the rules of natural deduc-
tion have sometimes been taken to give the meanings of the logical operators
(“proof-theoretic semantics”).
8.4 Tableaux
T φ or F φ.
{F φ, Tψ1 , . . . , Tψn }
1. F (φ ∧ ψ) → φ     Assumption
2. T φ ∧ ψ           →F 1
3. F φ               →F 1
4. T φ               ∧T 2
5. T ψ               ∧T 2
        ⊗
1. φ is an axiom, or
φ → (ψ → φ) ψ → (ψ ∨ χ) (ψ ∧ χ) → ψ
are common axioms that govern →, ∨ and ∧. Some axiom systems aim at a
minimal number of axioms. Depending on the connectives that are taken as
primitives, it is even possible to find axiom systems that consist of a single
axiom.
A rule of inference is a conditional statement that gives a sufficient condi-
tion for a sentence in a derivation to be justified. Modus ponens is one very
common such rule: it says that if φ and φ → ψ are already justified, then ψ is
justified. This means that a line in a derivation containing the sentence ψ is
justified, provided that both φ and φ → ψ (for some sentence φ) appear in the
derivation before ψ.
The ⊢ relation based on axiomatic derivations is defined as follows: Γ ⊢ φ
iff there is a derivation with the sentence φ as its last formula (and Γ is taken
as the set of sentences in that derivation which are justified by (2) above). φ
is a theorem if φ has a derivation where Γ is empty, i.e., every sentence in the
derivation is justified either by (1) or (3). For instance, here is a derivation that
shows that ⊢ φ → (ψ → (ψ ∨ φ)):
1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → ( φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))
The sentence on line 1 is of the form of the axiom φ → ( φ ∨ ψ) (with the roles
of φ and ψ reversed). The sentence on line 2 is of the form of the axiom φ →
(ψ → φ). Thus, both lines are justified. Line 3 is justified by modus ponens: if
we abbreviate it as θ, then line 2 has the form χ → θ, where χ is ψ → (ψ ∨ φ),
i.e., line 1.
A set Γ is inconsistent if Γ ⊢ ⊥. A complete axiom system will also prove
that ⊥ → φ for any φ, and so if Γ is inconsistent, then Γ ⊢ φ for any φ.
Systems of axiomatic derivations for logic were first given by Gottlob Frege
in his 1879 Begriffsschrift, which for this reason is often considered the first
work of modern logic. They were perfected in Alfred North Whitehead and
Bertrand Russell’s Principia Mathematica and by David Hilbert and his stu-
dents in the 1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find an axiomatic
system for a logic. Because derivations have a very simple structure and only
one or two inference rules, it is also relatively easy to prove things about them.
However, they are very hard to use in practice, i.e., it is difficult to find and
write proofs.
Γ⇒∆
where Γ and ∆ are finite (possibly empty) sequences of sentences of the lan-
guage L. Γ is called the antecedent, while ∆ is the succedent.
The intuitive idea behind a sequent is: if all of the sentences in the an-
tecedent hold, then at least one of the sentences in the succedent holds. That
is, if Γ = ⟨ φ1 , . . . , φm ⟩ and ∆ = ⟨ψ1 , . . . , ψn ⟩, then Γ ⇒ ∆ holds iff
( φ1 ∧ · · · ∧ φm ) → (ψ1 ∨ · · · ∨ ψn )
holds. There are two special cases: where Γ is empty and when ∆ is empty.
When Γ is empty, i.e., m = 0, ⇒ ∆ holds iff ψ1 ∨ · · · ∨ ψn holds. When ∆ is
empty, i.e., n = 0, Γ ⇒ holds iff ¬( φ1 ∧ · · · ∧ φm ) does. We say a sequent is
valid iff the corresponding sentence is valid.
If Γ is a sequence of sentences, we write Γ, φ for the result of appending
φ to the right end of Γ (and φ, Γ for the result of appending φ to the left end
of Γ). If ∆ is a sequence of sentences also, then Γ, ∆ is the concatenation of the
two sequences.
1. φ ⇒ φ
2. ⊥ ⇒
Rules for ¬

    Γ ⇒ ∆, φ               φ, Γ ⇒ ∆
   ----------- ¬L         ----------- ¬R
   ¬φ, Γ ⇒ ∆               Γ ⇒ ∆, ¬φ

Rules for ∧

    φ, Γ ⇒ ∆               ψ, Γ ⇒ ∆               Γ ⇒ ∆, φ    Γ ⇒ ∆, ψ
   ------------- ∧L       ------------- ∧L       ----------------------- ∧R
   φ ∧ ψ, Γ ⇒ ∆           φ ∧ ψ, Γ ⇒ ∆                Γ ⇒ ∆, φ ∧ ψ

Rules for ∨

    φ, Γ ⇒ ∆    ψ, Γ ⇒ ∆              Γ ⇒ ∆, φ               Γ ⇒ ∆, ψ
   ------------------------ ∨L       -------------- ∨R      -------------- ∨R
        φ ∨ ψ, Γ ⇒ ∆                 Γ ⇒ ∆, φ ∨ ψ           Γ ⇒ ∆, φ ∨ ψ

Rules for →

    Γ ⇒ ∆, φ    ψ, Π ⇒ Λ               φ, Γ ⇒ ∆, ψ
   ------------------------ →L        --------------- →R
    φ → ψ, Γ, Π ⇒ ∆, Λ                 Γ ⇒ ∆, φ → ψ

Weakening

    Γ ⇒ ∆               Γ ⇒ ∆
   ----------- WL      ----------- WR
   φ, Γ ⇒ ∆            Γ ⇒ ∆, φ

Contraction

    φ, φ, Γ ⇒ ∆            Γ ⇒ ∆, φ, φ
   -------------- CL      -------------- CR
    φ, Γ ⇒ ∆               Γ ⇒ ∆, φ

Exchange

    Γ, φ, ψ, Π ⇒ ∆             Γ ⇒ ∆, φ, ψ, Λ
   ------------------ XL      ------------------ XR
    Γ, ψ, φ, Π ⇒ ∆             Γ ⇒ ∆, ψ, φ, Λ

    Γ ⇒ ∆, φ    φ, Π ⇒ Λ
   ------------------------ Cut
        Γ, Π ⇒ ∆, Λ
9.4 Derivations
We’ve said what an initial sequent looks like, and we’ve given the rules of
inference. Derivations in the sequent calculus are inductively generated from
these: each derivation either is an initial sequent on its own, or consists of one
or two derivations followed by an inference.
We then say that S is the end-sequent of the derivation and that S is derivable in
LK (or LK-derivable).
The rule, however, is meant to be general: we can replace the φ in the rule
with any sentence, e.g., also with θ. If the premise matches our initial sequent
χ ⇒ χ, that means that both Γ and ∆ are just χ, and the conclusion would
then be θ, χ ⇒ χ. So, the following is a derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
We can now apply another rule, say XL, which allows us to switch two sen-
tences on the left. So, the following is also a correct derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
XL
χ, θ ⇒ χ
both Γ and Π were empty, ∆ is χ, and the roles of φ and ψ are played by θ
and χ, respectively. In much the same way, we also see that
θ ⇒ θ
WL
χ, θ ⇒ θ
is a derivation. Now we can take these two derivations, and combine them
using ∧R. That rule was
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ
In our case, the premises must match the last sequents of the derivations end-
ing in the premises. That means that Γ is χ, θ, ∆ is empty, φ is χ and ψ is θ. So
the conclusion, if the inference should be correct, is χ, θ ⇒ χ ∧ θ.
χ ⇒ χ
WL
θ, χ ⇒ χ θ ⇒ θ
XL WL
χ, θ ⇒ χ χ, θ ⇒ θ
∧R
χ, θ ⇒ χ ∧ θ
Of course, we can also reverse the premises, then φ would be θ and ψ would
be χ.
χ ⇒ χ
WL
θ ⇒ θ θ, χ ⇒ χ
WL XL
χ, θ ⇒ θ χ, θ ⇒ χ
∧R
χ, θ ⇒ θ ∧ χ
φ∧ψ ⇒ φ
Next, we need to figure out what kind of inference could have a lower sequent
of this form. This could be a structural rule, but it is a good idea to start by
looking for a logical rule. The only logical connective occurring in the lower
sequent is ∧, so we’re looking for an ∧ rule, and since the ∧ symbol occurs in
the antecedent, we’re looking at the ∧L rule.
φ∧ψ ⇒ φ
∧L
There are two options for what could have been the upper sequent of the ∧L
inference: we could have an upper sequent of φ ⇒ φ, or of ψ ⇒ φ. Clearly,
φ ⇒ φ is an initial sequent (which is a good thing), while ψ ⇒ φ is not
derivable in general. We fill in the upper sequent:
φ ⇒ φ
φ∧ψ ⇒ φ
∧L
¬φ ∨ ψ ⇒ φ → ψ
To find a logical rule that could give us this end-sequent, we look at the log-
ical connectives in the end-sequent: ¬, ∨, and →. We only care at the mo-
ment about ∨ and → because they are main operators of sentences in the end-
sequent, while ¬ is inside the scope of another connective, so we will take care
of it later. Our options for logical rules for the final inference are therefore the
∨L rule and the →R rule. We could pick either rule, really, but let’s pick the
→R rule (if for no reason other than it allows us to put off splitting into two
branches). According to the form of →R inferences which can yield the lower
sequent, this must look like:
φ, ¬ φ ∨ ψ ⇒ ψ
¬ φ ∨ ψ ⇒ φ → ψ →R
¬ φ, φ ⇒ ψ ψ, φ ⇒ ψ
¬ φ ∨ ψ, φ ⇒ ψ ∨L
φ, ¬ φ ∨ ψ ⇒ ψ XR
¬φ ∨ ψ ⇒ φ → ψ →R
Remember that we are trying to wind our way up to initial sequents; we seem
to be pretty close! The right branch is just one weakening and one exchange
away from an initial sequent and then it is done:
ψ ⇒ ψ
WL
φ, ψ ⇒ ψ
XL
¬ φ, φ ⇒ ψ ψ, φ ⇒ ψ
¬ φ ∨ ψ, φ ⇒ ψ ∨L
XR
φ, ¬ φ ∨ ψ ⇒ ψ
¬φ ∨ ψ ⇒ φ → ψ →R
Now looking at the left branch, the only logical connective in any sentence
is the ¬ symbol in the antecedent sentences, so we’re looking at an instance of
the ¬L rule.
ψ ⇒ ψ
WL
φ ⇒ ψ, φ φ, ψ ⇒ ψ
¬ φ, φ ⇒ ψ ¬L ψ, φ ⇒ ψ
XL
¬ φ ∨ ψ, φ ⇒ ψ
∨L
XR
φ, ¬ φ ∨ ψ ⇒ ψ
¬φ ∨ ψ ⇒ φ→ψ
→R
Similarly to how we finished off the right branch, we are just one weakening
and one exchange away from finishing off this left branch as well.
φ ⇒ φ
φ ⇒ φ, ψ WR ψ ⇒ ψ
φ ⇒ ψ, φ XR φ, ψ ⇒ ψ
WL
¬ φ, φ ⇒ ψ ¬L ψ, φ ⇒ ψ
XL
¬ φ ∨ ψ, φ ⇒ ψ
∨L
XR
φ, ¬ φ ∨ ψ ⇒ ψ
¬φ ∨ ψ ⇒ φ→ψ
→R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
The available main connectives of sentences in the end-sequent are the ∨ sym-
bol and the ¬ symbol. It would work to apply either the ∨L or the ¬R rule
here, but we start with the ¬R rule because it avoids splitting up into two
branches for a moment:
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
Now we have a choice of whether to look at the ∧L or the ∨L rule. Let’s see
what happens when we apply the ∧L rule: we have a choice to start with
either the sequent φ, ¬φ ∨ ¬ψ ⇒ or the sequent ψ, ¬φ ∨ ¬ψ ⇒ . Since the
derivation is symmetric with regards to φ and ψ, let’s go with the former:
φ, ¬ φ ∨ ¬ψ ⇒
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒
∧L
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
?
φ ⇒ φ φ ⇒ ψ
¬ φ, φ ⇒ ¬L ¬ψ, φ ⇒ ¬L
¬ φ ∨ ¬ψ, φ ⇒ ∨L
XL
φ, ¬ φ ∨ ¬ψ ⇒
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒ ∧L
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
The top of the right branch cannot be reduced any further, and it cannot be
brought by way of structural inferences to an initial sequent, so this is not the
right path to take. So clearly, it was a mistake to apply the ∧L rule above.
Going back to what we had before and carrying out the ∨L rule instead, we
get
¬ φ, φ ∧ ψ ⇒ ¬ψ, φ ∧ ψ ⇒
¬ φ ∨ ¬ψ, φ ∧ ψ ⇒ ∨L
XL
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
φ ⇒ φ ψ ⇒ ψ
φ∧ψ ⇒ φ
∧L φ∧ψ ⇒ ψ
∧L
¬ φ, φ ∧ ψ ⇒ ¬ L
¬ψ, φ ∧ ψ ⇒ ¬L
¬ φ ∨ ¬ψ, φ ∧ ψ ⇒ ∨L
XL
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
(We could have carried out the ∧ rules lower than the ¬ rules in these steps
and still obtained a correct derivation).
Example 9.8. So far we haven’t used the contraction rule, but it is sometimes
required. Here’s an example where that happens. Suppose we want to prove
⇒ φ ∨ ¬ φ. Applying ∨R backwards would give us one of these two deriva-
tions:
φ ⇒
⇒ φ ⇒ ¬ φ ¬R
⇒ φ ∨ ¬ φ ∨R ⇒ φ ∨ ¬ φ ∨R
Neither of these of course ends in an initial sequent. The trick is to realize that
the contraction rule allows us to combine two copies of a sentence into one—
and when we’re searching for a proof, i.e., going from bottom to top, we can
keep a copy of φ ∨ ¬ φ in the premise, e.g.,
⇒ φ ∨ ¬ φ, φ
⇒ φ ∨ ¬ φ, φ ∨ ¬ φ ∨R
⇒ φ ∨ ¬φ CR
Now we can apply ∨R a second time, and also get ¬ φ, which leads to a com-
plete derivation.
φ ⇒ φ
⇒ φ, ¬ φ ¬R
⇒ φ, φ ∨ ¬ φ ∨R
⇒ φ ∨ ¬ φ, φ XR
⇒ φ ∨ ¬ φ, φ ∨ ¬ φ ∨R
⇒ φ ∨ ¬φ CR
This section collects the definitions of the provability relation and consistency for the sequent calculus.
Because of the contraction, weakening, and exchange rules, the order and
number of sentences in Γ0′ does not matter: if a sequent Γ0′ ⇒ φ is deriv-
able, then so is Γ0′′ ⇒ φ for any Γ0′′ that contains the same sentences as Γ0′ .
For instance, if Γ0 = {ψ, χ} then both Γ0′ = ⟨ψ, ψ, χ⟩ and Γ0′′ = ⟨χ, χ, ψ⟩ are
sequences containing just the sentences in Γ0 . If a sequent containing one is
derivable, so is the other, e.g.:
ψ, ψ, χ ⇒ φ
CL
ψ, χ ⇒ φ
XL
χ, ψ ⇒ φ
WL
χ, χ, ψ ⇒ φ
From now on we’ll say that if Γ0 is a finite set of sentences then Γ0 ⇒ φ is
any sequent where the antecedent is a sequence of sentences in Γ0 and tacitly
include contractions, exchanges, and weakenings if necessary.
Definition 9.11 (Consistency). A set of sentences Γ is inconsistent iff there is a
finite subset Γ0 ⊆ Γ such that LK derives Γ0 ⇒ . If Γ is not inconsistent, i.e.,
if for every finite Γ0 ⊆ Γ, LK does not derive Γ0 ⇒ , we say it is consistent.
π0 π1
Γ0 ⇒ φ φ, ∆ 0 ⇒ ψ
Cut
Γ0 , ∆ 0 ⇒ ψ
Since Γ0 ∪ ∆ 0 ⊆ Γ ∪ ∆, this shows Γ ∪ ∆ ⊢ ψ.
Proof. Exercise.
π0 π1
Γ0 ⇒ φ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
π1
φ ⇒ φ
⇒ φ, ¬ φ ¬R ¬ φ, Γ ⇒
Cut
Γ ⇒ φ
π φ ⇒ φ
¬ φ, φ ⇒ ¬L
Γ0 ⇒ φ φ, ¬ φ ⇒ XL
Cut
Γ, ¬ φ ⇒
π0
π1
φ, Γ0 ⇒
¬R
Γ0 ⇒ ¬ φ ¬ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
2. φ, ψ ⊢ φ ∧ ψ.
φ ⇒ φ ψ ⇒ ψ
φ∧ψ ⇒ φ
∧L φ∧ψ ⇒ ψ
∧L
φ ⇒ φ ψ ⇒ ψ
φ, ψ ⇒ φ ∧ ψ
∧R
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
φ ⇒ φ ψ ⇒ ψ
¬ φ, φ ⇒ ¬L ¬ψ, ψ ⇒ ¬L
φ, ¬ φ, ¬ψ ⇒ ψ, ¬ φ, ¬ψ ⇒
φ ∨ ψ, ¬ φ, ¬ψ ⇒
∨L
φ ⇒ φ ψ ⇒ ψ
φ ⇒ φ∨ψ
∨R ψ ⇒ φ∨ψ
∨R
Proposition 9.23. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
φ ⇒ φ ψ ⇒ ψ
φ → ψ, φ ⇒ ψ
→L
φ ⇒ φ
¬ φ, φ ⇒ ¬L
φ, ¬ φ ⇒ XL ψ ⇒ ψ
φ, ¬ φ ⇒ ψ WR φ, ψ ⇒ ψ
WL
¬φ ⇒ φ → ψ →R ψ ⇒ φ→ψ
→R
9.9 Soundness
A derivation system, such as the sequent calculus, is sound if it cannot de-
rive things that do not actually hold. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know for instance, that
Γ ⇒ ∆ Γ ⇒ ∆
WL WR
φ, Γ ⇒ ∆ Γ ⇒ ∆, φ
2. The last inference is ¬L: Then the premise of the last inference is Γ ⇒
∆, φ and the conclusion is ¬ φ, Γ ⇒ ∆, i.e., the derivation ends in
Γ ⇒ ∆, φ
¬L
¬ φ, Γ ⇒ ∆
and Θ = ¬ φ, Γ while Ξ = ∆.
The induction hypothesis tells us that Γ ⇒ ∆, φ is valid, i.e., for every v,
either (a) for some χ ∈ Γ, v ⊭ χ, or (b) for some χ ∈ ∆, v ⊨ χ, or (c) v ⊨ φ.
We want to show that Θ ⇒ Ξ is also valid. Let v be a valuation. If (a)
holds, then there is χ ∈ Γ so that v ⊭ χ, but χ ∈ Θ as well. If (b) holds,
there is χ ∈ ∆ such that v ⊨ χ, but χ ∈ Ξ as well. Finally, if v ⊨ φ, then
v ⊭ ¬ φ. Since ¬ φ ∈ Θ, there is χ ∈ Θ such that v ⊭ χ. Consequently,
Θ ⇒ Ξ is valid.
4. The last inference is ∧L: There are two variants: φ ∧ ψ may be inferred
on the left from φ or from ψ on the left side of the premise. In the first
case, the π ends in
φ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆
5. The last inference is ∨R: There are two variants: φ ∨ ψ may be inferred
on the right from φ or from ψ on the right side of the premise. In the first
case, π ends in
Γ ⇒ ∆, φ
∨R
Γ ⇒ ∆, φ ∨ ψ
φ, Γ ⇒ ∆, ψ
→R
Γ ⇒ ∆, φ → ψ
Again, the induction hypothesis says that the premise is valid; we want
to show that the conclusion is valid as well. Let v be arbitrary. Since
φ, Γ ⇒ ∆, ψ is valid, at least one of the following cases obtains: (a) v ⊭ φ,
(b) v ⊨ ψ, (c) v ⊭ χ for some χ ∈ Γ, or (d) v ⊨ χ for some χ ∈ ∆. In cases
(a) and (b), v ⊨ φ → ψ and so there is a χ ∈ ∆, φ → ψ such that v ⊨ χ. In
case (c), for some χ ∈ Γ, v ⊭ χ. In case (d), for some χ ∈ ∆, v ⊨ χ. In
each case, v satisfies Γ ⇒ ∆, φ → ψ. Since v was arbitrary, Γ ⇒ ∆, φ → ψ
is valid.
Now let’s consider the possible inferences with two premises.
1. The last inference is a cut: then π ends in
Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ
Γ ⇒ ∆, φ ψ, Π ⇒ Λ
→L
φ → ψ, Γ, Π ⇒ ∆, Λ
Problems
Problem 9.1. Give derivations of the following sequents:
1. φ ∧ (ψ ∧ χ) ⇒ ( φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⇒ ( φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⇒ ψ → ( φ → χ).
4. φ ⇒ ¬¬ φ.
1. ( φ ∨ ψ) → χ ⇒ φ → χ.
2. ( φ → χ) ∧ (ψ → χ) ⇒ ( φ ∨ ψ) → χ.
3. ⇒ ¬( φ ∧ ¬ φ).
4. ψ → φ ⇒ ¬ φ → ¬ψ.
5. ⇒ ( φ → ¬ φ) → ¬ φ.
6. ⇒ ¬( φ → ψ) → ¬ψ.
7. φ → χ ⇒ ¬( φ ∧ ¬χ).
8. φ ∧ ¬χ ⇒ ¬( φ → χ).
9. φ ∨ ψ, ¬ψ ⇒ φ.
10. ¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ).
12. ⇒ ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. ¬( φ → ψ) ⇒ φ.
2. ¬( φ ∧ ψ) ⇒ ¬ φ ∨ ¬ψ.
3. φ → ψ ⇒ ¬ φ ∨ ψ.
4. ⇒ ¬¬ φ → φ.
5. φ → ψ, ¬ φ → ψ ⇒ ψ.
6. ( φ ∧ ψ) → χ ⇒ ( φ → χ) ∨ (ψ → χ).
7. ( φ → ψ) → φ ⇒ φ.
8. ⇒ ( φ → ψ) ∨ (ψ → χ).
Natural Deduction
Rules for ∧

    φ   ψ               φ ∧ ψ              φ ∧ ψ
   -------- ∧Intro     ------- ∧Elim      ------- ∧Elim
    φ ∧ ψ                 φ                  ψ

Rules for ∨

      φ                   ψ                          [φ]n    [ψ]n
   -------- ∨Intro     -------- ∨Intro                ⋮       ⋮
    φ ∨ ψ               φ ∨ ψ               φ ∨ ψ     χ       χ
                                           ------------------------ n ∨Elim
                                                      χ

Rules for →

    [φ]n
     ⋮                      φ → ψ    φ
     ψ                     ------------ →Elim
   -------- n →Intro            ψ
    φ → ψ

Rules for ¬

    [φ]n
     ⋮                      ¬φ    φ
     ⊥                     ---------- ¬Elim
   ------ n ¬Intro             ⊥
     ¬φ

Rules for ⊥

     ⊥              [¬φ]n
   ----- ⊥I          ⋮
     φ               ⊥
                   ------ n ⊥C
                     φ
Note that ¬Intro and ⊥C are very similar: The difference is that ¬Intro derives
a negated sentence ¬ φ but ⊥C a positive sentence φ.
Whenever a rule indicates that some assumption may be discharged, we
take this to be a permission, but not a requirement. E.g., in the →Intro rule,
we may discharge any number of assumptions of the form φ in the derivation
of the premise ψ, including zero.
10.3 Derivations
We’ve said what an assumption is, and we’ve given the rules of inference.
Derivations in natural deduction are inductively generated from these: each
derivation either is an assumption on its own, or consists of one, two, or three
derivations followed by a correct inference.
3. Every sentence in the tree except the sentence φ at the bottom is a premise
of a correct application of an inference rule whose conclusion stands di-
rectly below that sentence in the tree.
We then say that φ is the conclusion of the derivation and Γ its undischarged
assumptions.
If a derivation of φ from Γ exists, we say that φ is derivable from Γ, or in
symbols: Γ ⊢ φ. If there is a derivation of φ in which every assumption is
discharged, we write ⊢ φ.
φ ψ
∧Intro
φ∧ψ
These rules are meant to be general: we can replace the φ and ψ in it with any
sentences, e.g., by χ and θ. Then the conclusion would be χ ∧ θ, and so
χ θ
∧Intro
χ∧θ
θ χ
∧Intro
θ∧χ
( φ ∧ ψ) → φ
Next, we need to figure out what kind of inference could result in a sen-
tence of this form. The main operator of the conclusion is →, so we’ll try to
arrive at the conclusion using the →Intro rule. It is best to write down the as-
sumptions involved and label the inference rules as you progress, so it is easy
to see whether all assumptions have been discharged at the end of the proof.
[ φ ∧ ψ ]1
φ
1 →Intro
( φ ∧ ψ) → φ
[ φ ∧ ψ ]1
φ ∧Elim
1 →Intro
( φ ∧ ψ) → φ
(¬ φ ∨ ψ) → ( φ → ψ)
To find a logical rule that could give us this conclusion, we look at the logical
connectives in the conclusion: ¬, ∨, and →. We only care at the moment about
the first occurrence of → because it is the main operator of the sentence in the
end-sequent, while ¬, ∨ and the second occurrence of → are inside the scope
of another connective, so we will take care of those later. We therefore start
with the →Intro rule. A correct application must look like this:
[¬ φ ∨ ψ]1
φ→ψ
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
This leaves us with two possibilities to continue. Either we can keep working
from the bottom up and look for another application of the →Intro rule, or we
can work from the top down and apply a ∨Elim rule. Let us apply the latter.
We will use the assumption ¬ φ ∨ ψ as the leftmost premise of ∨Elim. For a
valid application of ∨Elim, the other two premises must be identical to the
conclusion φ → ψ, but each may be derived in turn from another assumption,
namely one of the two disjuncts of ¬ φ ∨ ψ. So our derivation will look like
this:
[¬ φ]2 [ ψ ]2
[¬ φ]2 , [ φ]3 [ ψ ]2 , [ φ ]4
ψ ψ
3 →Intro 4 →Intro
[¬ φ ∨ ψ]1 φ→ψ φ→ψ
2
φ→ψ
∨Elim
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
For the two missing parts of the derivation, we need derivations of ψ from
¬ φ and φ in the middle, and from φ and ψ on the left. Let’s take the former
first. ¬ φ and φ are the two premises of ¬Elim:
[¬ φ]2 [ φ ]3
¬Elim
⊥
[ ψ ]2 , [ φ ]4
[¬ φ]2 [ φ ]3
⊥Intro
⊥ ⊥
I
ψ ψ
3 →Intro 4 →Intro
[¬ φ ∨ ψ]1 φ→ψ φ→ψ
2
φ→ψ
∨Elim
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
Let’s now look at the rightmost branch. Here it’s important to realize that
the definition of derivation allows assumptions to be discharged but does not re-
quire them to be. In other words, if we can derive ψ from one of the assump-
tions φ and ψ without using the other, that’s ok. And to derive ψ from ψ is
trivial: ψ by itself is such a derivation, and no inferences are needed. So we
can simply delete the assumption φ.
[¬ φ]2 [ φ ]3
¬Elim
⊥ ⊥
I
ψ [ ψ ]2
3 →Intro →Intro
[¬ φ ∨ ψ]1 φ→ψ φ→ψ
2
φ→ψ
∨Elim
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
Note that in the finished derivation, the rightmost →Intro inference does not
actually discharge any assumptions.
Example 10.6. So far we have not needed the ⊥C rule. It is special in that it al-
lows us to discharge an assumption that isn’t a sub-formula of the conclusion
of the rule. It is closely related to the ⊥ I rule. In fact, the ⊥ I rule is a special
case of the ⊥C rule—there is a logic called “intuitionistic logic” in which only
⊥ I is allowed. The ⊥C rule is a last resort when nothing else works. For in-
stance, suppose we want to derive φ ∨ ¬ φ. Our usual strategy would be to
attempt to derive φ ∨ ¬ φ using ∨Intro. But this would require us to derive
either φ or ¬ φ from no assumptions, and this can’t be done. ⊥C to the rescue!
[¬( φ ∨ ¬ φ)]1
1
⊥ ⊥C
φ ∨ ¬φ
¬φ φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
⊥
2
¬ φ ¬Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
[ φ ]2 [¬( φ ∨ ¬ φ)]1
[¬( φ ∨ ¬ φ)]1 φ ∨ ¬ φ ∨Intro
¬Elim
⊥
2
¬φ ¬ Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
[ φ ]2 [¬ φ]3
[¬( φ ∨ ¬ φ)]1 φ ∨ ¬φ ∨ Intro [¬( φ ∨ ¬ φ)]1 φ ∨ ¬ φ ∨Intro
¬Elim ¬Elim
⊥ ⊥ ⊥
2
¬ φ ¬Intro 3
φ C
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
This section collects the definitions of the provability relation and consistency for natural deduction.
δ1 Γ
δ0
ψ
1 →Intro
φ→ψ φ
ψ
→Elim
Proof. Exercise.
Γ, [¬ φ]1
δ1
1
⊥ ⊥
φ C
δ
¬φ φ
¬Elim
⊥
Since ¬ φ ∈ Γ, all undischarged assumptions are in Γ, this shows that Γ ⊢ ⊥.
Γ, [¬ φ]2 Γ, [ φ]1
δ2 δ1
⊥ ⊥
2
¬¬ φ ¬Intro 1
¬ φ ¬Intro
¬Elim
⊥
Since the assumptions φ and ¬ φ are discharged, this is a derivation of ⊥
from Γ alone. Hence Γ is inconsistent.
2. φ, ψ ⊢ φ ∧ ψ.
φ∧ψ φ∧ψ
φ ∧Elim ψ
∧Elim
2. We can derive:
φ ψ
∧Intro
φ∧ψ
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
¬φ [ φ ]1 ¬ψ [ ψ ]1
¬Elim ¬Elim
φ∨ψ ⊥ ⊥
1 ∨Elim
⊥
φ ψ
∨Intro ∨Intro
φ∨ψ φ∨ψ
Proposition 10.21. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
φ→ψ φ
ψ
→Elim
¬φ [ φ ]1
¬Elim
⊥ ⊥
I
ψ ψ
1 →Intro →Intro
φ→ψ φ→ψ
Note that →Intro may, but does not have to, discharge the assumption φ.
10.8 Soundness
A derivation system, such as natural deduction, is sound if it cannot derive
things that do not actually follow. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know for instance, that
1. Suppose that the last inference is ¬Intro: The derivation has the form
  Γ, [φ]ⁿ
     δ1
     ⊥
  ────── ¬Introⁿ
    ¬φ
2. The last inference is ∧Elim: There are two variants: φ or ψ may be in-
ferred from the premise φ ∧ ψ. Consider the first case. The derivation δ
looks like this:
     Γ
     δ1
   φ ∧ ψ
  ─────── ∧Elim
     φ
3. The last inference is ∨Intro: There are two variants: φ ∨ ψ may be in-
ferred from the premise φ or the premise ψ. Consider the first case. The
derivation has the form
     Γ
     δ1
     φ
  ─────── ∨Intro
   φ ∨ ψ
  Γ, [φ]ⁿ
     δ1
     ψ
  ─────── →Introⁿ
   φ → ψ
     Γ
     δ1
     ⊥
    ─── ⊥I
     φ
Now let’s consider the possible inferences with several premises: ∨Elim,
∧Intro, and →Elim.
1. The last inference is ∧Intro. φ ∧ ψ is inferred from the premises φ and ψ
and δ has the form
  Γ1        Γ2
  δ1        δ2
  φ         ψ
  ───────────── ∧Intro
     φ ∧ ψ
  Γ1          Γ2
  δ1          δ2
  φ → ψ       φ
  ─────────────── →Elim
        ψ
Problems
Problem 10.1. Give derivations that show the following:
1. φ ∧ (ψ ∧ χ) ⊢ ( φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⊢ ( φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⊢ ψ → ( φ → χ).
4. φ ⊢ ¬¬ φ.
1. ( φ ∨ ψ) → χ ⊢ φ → χ.
2. ( φ → χ) ∧ (ψ → χ) ⊢ ( φ ∨ ψ) → χ.
3. ⊢ ¬( φ ∧ ¬ φ).
4. ψ → φ ⊢ ¬ φ → ¬ψ.
5. ⊢ ( φ → ¬ φ) → ¬ φ.
6. ⊢ ¬( φ → ψ) → ¬ψ.
7. φ → χ ⊢ ¬( φ ∧ ¬χ).
8. φ ∧ ¬χ ⊢ ¬( φ → χ).
9. φ ∨ ψ, ¬ψ ⊢ φ.
10. ¬ φ ∨ ¬ψ ⊢ ¬( φ ∧ ψ).
12. ⊢ ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. ¬( φ → ψ) ⊢ φ.
2. ¬( φ ∧ ψ) ⊢ ¬ φ ∨ ¬ψ.
3. φ → ψ ⊢ ¬ φ ∨ ψ.
4. ⊢ ¬¬ φ → φ.
5. φ → ψ, ¬ φ → ψ ⊢ ψ.
6. ( φ ∧ ψ) → χ ⊢ ( φ → χ) ∨ (ψ → χ).
7. ( φ → ψ) → φ ⊢ φ.
8. ⊢ ( φ → ψ) ∨ (ψ → χ).
Tableaux
Definition 11.1. A signed formula is a pair consisting of a truth value and a sen-
tence, i.e., either:
T φ or F φ.
other words, if a branch is closed, the possibility it describes has been ruled
out. In particular, that means that a closed tableau rules out all possibilities
of simultaneously making every assumption of the form T φ true and every
assumption of the form F φ false.
A closed tableau for φ is a closed tableau with root F φ. If such a closed
tableau exists, all possibilities for φ being false have been ruled out; i.e., φ
must be true in every structure.
Rules for ¬

   T ¬φ            F ¬φ
   ───── ¬T        ───── ¬F
    Fφ              Tφ

Rules for ∧

   T φ ∧ ψ         F φ ∧ ψ
   ─────── ∧T      ───────── ∧F
     Tφ            Fφ | Fψ
     Tψ

Rules for ∨

   T φ ∨ ψ         F φ ∨ ψ
   ───────── ∨T    ─────── ∨F
   Tφ | Tψ           Fφ
                     Fψ

Rules for →

   T φ → ψ         F φ → ψ
   ───────── →T    ─────── →F
   Fφ | Tψ           Tφ
                     Fψ

Cut

   ───────
   Tφ | Fφ
The Cut rule is not applied “to” a previous signed formula; rather, it allows
every branch in a tableau to be split in two, one branch containing T φ, the
other F φ. It is not necessary—any set of signed formulas with a closed tableau
has one not using Cut—but it allows us to combine tableaux in a convenient
way.
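To see the propositional rules in action, here is a minimal Python sketch of a decision procedure built directly on them: it applies the branching and non-branching rules until every branch either closes or contains only atomic signed formulas. The encoding of signed formulas as pairs and the function name closes are illustrative assumptions, not part of the official definitions; the Cut rule (which is not needed) and the quantifier rules are omitted.

# Signed formulas as (sign, formula), with sign True for T and False for F;
# formulas as nested tuples: ('atom', 'p'), ('not', A), ('and', A, B),
# ('or', A, B), ('imp', A, B).

def closes(branch):
    """Return True iff the branch of signed formulas has a closed tableau."""
    # Closure: the branch contains both T φ and F φ for some φ.
    for sign, f in branch:
        if (not sign, f) in branch:
            return True
    # Otherwise, expand the first non-atomic signed formula on the branch.
    for i, (sign, f) in enumerate(branch):
        op = f[0]
        if op == 'atom':
            continue
        rest = branch[:i] + branch[i+1:]
        if op == 'not':                      # ¬T / ¬F: switch the sign
            return closes(rest + [(not sign, f[1])])
        if op == 'and':
            if sign:                         # ∧T: both conjuncts, same branch
                return closes(rest + [(True, f[1]), (True, f[2])])
            return (closes(rest + [(False, f[1])]) and      # ∧F: split
                    closes(rest + [(False, f[2])]))
        if op == 'or':
            if sign:                         # ∨T: split
                return (closes(rest + [(True, f[1])]) and
                        closes(rest + [(True, f[2])]))
            return closes(rest + [(False, f[1]), (False, f[2])])   # ∨F
        if op == 'imp':
            if sign:                         # →T: split into Fφ | Tψ
                return (closes(rest + [(False, f[1])]) and
                        closes(rest + [(True, f[2])]))
            return closes(rest + [(True, f[1]), (False, f[2])])    # →F
    return False                             # only atoms left: branch is open

p, q = ('atom', 'p'), ('atom', 'q')
# A closed tableau for (p ∧ q) → p exists: start from F (p ∧ q) → p,
# mirroring the tableau constructed by hand in the examples below.
print(closes([(False, ('imp', ('and', p, q), p))]))  # True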
11.3 Tableaux
We’ve said what an assumption is, and we’ve given the rules of inference.
Tableaux are inductively generated from these: each tableau either is a single
branch consisting of one or more assumptions, or it results from a tableau by
applying one of the rules of inference on a branch.
1. The n topmost signed formulas of the tree are Si φi , one below the other.
2. Every signed formula in the tree that is not one of the assumptions re-
sults from a correct application of an inference rule to a signed formula
in the branch above it.
A branch of a tableau is closed iff it contains both T φ and F φ, and open other-
wise. A tableau in which every branch is closed is a closed tableau (for its set
of assumptions). If a tableau is not closed, i.e., if it contains at least one open
branch, it is open.
Example 11.3. Every set of assumptions on its own is a tableau, but it will
generally not be closed. (Obviously, it is closed only if the assumptions al-
ready contain a pair of signed formulas T φ and F φ.)
From a tableau (open or closed) we can obtain a new, larger one by ap-
plying one of the rules of inference to a signed formula φ in it. The rule will
append one or more signed formulas to the end of any branch containing the
occurrence of φ to which we apply the rule.
For instance, consider the assumption T φ ∧ ¬ φ. Here is the (open) tableau
consisting of just that assumption:
1. T φ ∧ ¬φ Assumption
1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T¬ φ ∧T 1
When we write down tableaux, we record the rules we’ve applied on the right
(e.g., ∧T1 means that the signed formula on that line is the result of applying
the ∧T rule to the signed formula on line 1). This new tableau now contains
additional signed formulas, but to only one (T ¬ φ) can we apply a rule (in this
case, the ¬T rule). This results in the closed tableau
1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T¬ φ ∧T 1
4. Fφ ¬T 3
⊗
1. F ( φ ∧ ψ) → φ Assumption
There is only one assumption, so only one signed formula to which we can
apply a rule. (For every signed formula, there is always at most one rule that
can be applied: it’s the rule for the corresponding sign and main operator of
the sentence.) In this case, this means we must apply →F.
1. F ( φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ →F 1
3. Fφ →F 1
1. F ( φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ ✓ →F 1
3. Fφ →F 1
4. Tφ ∧T 2
5. Tψ ∧T 2
⊗
Since the branch now contains both T φ (on line 4) and F φ (on line 3), the
branch is closed. Since it is the only branch, the tableau is closed. We have
found a closed tableau for ( φ ∧ ψ) → φ.
1. F (¬ φ ∨ ψ) → ( φ → ψ) Assumption
The one signed formula in this tableau has main operator → and sign F, so
we apply the →F rule to it to obtain:
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ →F 1
3. F ( φ → ψ) →F 1
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) →F 1
4. T¬ φ Tψ ∨T 2
We have not applied the →F rule to line 3 yet: let’s do that now. To save
time, we apply it to both branches. Recall that we write a checkmark next
to a signed formula only if we have applied the corresponding rule in every
open branch. So it’s a good idea to apply a rule at the end of every branch that
contains the signed formula the rule applies to. That way we won’t have to
return to that signed formula lower down in the various branches.
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) ✓ →F 1
4. T¬ φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
⊗
The right branch is now closed. On the left branch, we can still apply the ¬T
rule to line 4. This results in F φ and closes the left branch:
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) ✓ →F 1
4. T¬ φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
7. Fφ ⊗ ¬T 4
⊗
Example 11.6. We can give tableaux for any number of signed formulas as
assumptions. Often it is also necessary to apply more than one rule that allows
branching; and in general a tableau can have any number of branches. For
instance, consider a tableau for {T φ ∨ (ψ ∧ χ), F ( φ ∨ ψ) ∧ ( φ ∨ χ)}. We start
by applying the ∨T to the first assumption:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) Assumption
3. Tφ Tψ ∧ χ ∨T 1
Now we can apply the ∧F rule to line 2. We do this on both branches simul-
taneously, and can therefore check off line 2:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ Fφ ∨ χ Fφ ∨ ψ Fφ ∨ χ ∧F 2
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ Fφ ∨ ψ ✓ Fφ ∨ χ ∧F 2
5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
⊗
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
7. ⊗ Fφ Fφ ∨F 4
8. Fχ Fχ ∨F 4
⊗
Note that we moved the result of applying ∨F a second time below for clarity.
In this instance it would not have been needed, since the justifications would
have been the same.
Two branches remain open, and Tψ ∧ χ on line 3 remains unchecked. We
apply ∧T to it to obtain a closed tableau:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ✓ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
5. Fφ Fφ Fφ Fφ ∨F 4
6. Fψ Fχ Fψ Fχ ∨F 4
7. ⊗ ⊗ Tψ Tψ ∧T 3
8. Tχ Tχ ∧T 3
⊗ ⊗
For comparison, here’s a closed tableau for the same set of assumptions in
which the rules are applied in a different order:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
4. Fφ Fφ ∨F 3
5. Fψ Fχ ∨F 3
6. Tφ Tψ ∧ χ ✓ Tφ Tψ ∧ χ ✓ ∨T 1
7. ⊗ Tψ ⊗ Tψ ∧T 6
8. Tχ Tχ ∧T 6
⊗ ⊗
This section collects the definitions of the provability relation and con-
sistency for tableaux.
{F φ, Tψ1 , . . . , Tψn }.
{Tψ1 , . . . , Tψn }.
1. Fφ Assumption
2. Tφ Assumption
⊗
is closed.
{F φ,Tθ1 , . . . , Tθm }
Apply the Cut rule on φ. This generates two branches, one has T φ in it, the
other F φ. Thus, on the one branch, all of
{F ψ, T φ, Tχ1 , . . . , Tχn }
are available. Since there is a closed tableau for these assumptions, we can
attach it to that branch; every branch through T φ closes. On the other branch,
all of
{F φ, Tθ1 , . . . , Tθm }
are available, so we can also complete the other side to obtain a closed tableau.
This shows Γ ∪ ∆ ⊢ ψ.
Proof. Exercise.
{F φ,Tψ1 , . . . , Tψn }
{T φ,Tχ1 , . . . , Tχm }
have closed tableaux. Using the Cut rule on φ we can combine these into a
single closed tableau that shows Γ0 ∪ Γ1 is inconsistent. Since Γ0 ⊆ Γ and
Γ1 ⊆ Γ, Γ0 ∪ Γ1 ⊆ Γ, hence Γ is inconsistent.
{F φ, Tψ1 , . . . , Tψn }
Using the ¬T rule, this can be turned into a closed tableau for
{T ¬ φ, Tψ1 , . . . , Tψn }.
On the other hand, if there is a closed tableau for the latter, we can turn it
into a closed tableau of the former by removing every formula that results
from ¬T applied to the first assumption T ¬ φ as well as that assumption,
and adding the assumption F φ. For if a branch was closed before because
it contained the conclusion of ¬T applied to T ¬ φ, i.e., F φ, the corresponding
branch in the new tableau is also closed. If a branch in the old tableau was
closed because it contained the assumption T ¬ φ as well as F ¬ φ we can turn
it into a closed branch by applying ¬F to F ¬ φ to obtain T φ. This closes the
branch since we added F φ as an assumption.
{F φ, Tψ1 , . . . , Tψn }
1. Fφ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2
⊗
1. Fψ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2
⊗
1. Fφ ∧ ψ Assumption
2. Tφ Assumption
3. Tψ Assumption
4. Fφ Fψ ∧F 1
⊗ ⊗
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
1. Tφ ∨ ψ Assumption
2. T¬ φ Assumption
3. T ¬ψ Assumption
4. Fφ ¬T 2
5. Fψ ¬T 3
6. Tφ Tψ ∨T 1
⊗ ⊗
1. Fφ ∨ ψ Assumption
2. Tφ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1
⊗
1. Fφ ∨ ψ Assumption
2. Tψ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1
⊗
Proposition 11.21. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
1. Fψ Assumption
2. Tφ → ψ Assumption
3. Tφ Assumption
4. Fφ Tψ →T 2
⊗ ⊗
1. Fφ → ψ Assumption
2. T¬ φ Assumption
3. Tφ →F 1
4. Fψ →F 1
5. Fφ ¬T 2
⊗
1. Fφ → ψ Assumption
2. Tψ Assumption
3. Tφ →F 1
4. Fψ →F 1
⊗
11.8 Soundness
A derivation system, such as tableaux, is sound if it cannot derive things that
do not actually hold. Soundness is thus a kind of guaranteed safety property
for derivation systems. Depending on which proof-theoretic property is in
question, we would like to know, for instance, that
Proof. Let’s call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules
of inference always results in a satisfiable tableau. This will prove the theo-
rem: any closed tableau results by applying rules of inference to the tableau
consisting only of assumptions from Γ. So if Γ were satisfiable, any tableau
for it would be satisfiable. A closed tableau, however, is clearly not satisfiable:
every branch contains both T φ and F φ, and no structure can both satisfy and
not satisfy φ.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one sat-
isfiable branch. Applying a rule of inference either adds signed formulas to a
branch, or splits a branch in two. If the tableau has a satisfiable branch which
is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let S φ ∈ Γ be the
signed formula to which the rule is applied. If the rule does not result in a split
branch, we have to show that the extended branch, i.e., Γ together with the
conclusions of the rule, is still satisfiable. If the rule results in a split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences that do not result in a split branch.
1. The branch is expanded by applying ¬T to T ¬ψ ∈ Γ. Then the ex-
tended branch contains the signed formulas Γ ∪ {F ψ}. Suppose v ⊨ Γ.
In particular, v ⊨ ¬ψ. Thus, v ⊭ ψ, i.e., v satisfies F ψ.
2. The branch is expanded by applying ¬F to F ¬ψ ∈ Γ: Exercise.
3. The branch is expanded by applying ∧T to Tψ ∧ χ ∈ Γ, which results
in two new signed formulas on the branch: Tψ and Tχ. Suppose v ⊨ Γ,
in particular v ⊨ ψ ∧ χ. Then v ⊨ ψ and v ⊨ χ. This means that v satisfies
both Tψ and Tχ.
4. The branch is expanded by applying ∨F to F ψ ∨ χ ∈ Γ: Exercise.
5. The branch is expanded by applying →F to F ψ → χ ∈ Γ: This results in
two new signed formulas on the branch: Tψ and F χ. Suppose v ⊨ Γ, in
particular v ⊭ ψ → χ. Then v ⊨ ψ and v ⊭ χ. This means that v satisfies
both Tψ and F χ.
Now let’s consider the possible inferences that result in a split branch.
1. The branch is expanded by applying ∧F to F ψ ∧ χ ∈ Γ, which results in
two branches, a left one continuing through F ψ and a right one through
F χ. Suppose v ⊨ Γ, in particular v ⊭ ψ ∧ χ. Then v ⊭ ψ or v ⊭ χ. In
the former case, v satisfies F ψ, i.e., v satisfies the formulas on the left
branch. In the latter, v satisfies F χ, i.e., v satisfies the formulas on the
right branch.
2. The branch is expanded by applying ∨T to Tψ ∨ χ ∈ Γ: Exercise.
3. The branch is expanded by applying →T to Tψ → χ ∈ Γ: Exercise.
4. The branch is expanded by Cut: This results in two branches, one con-
taining Tψ, the other containing F ψ. Since v ⊨ Γ and either v ⊨ ψ or
v ⊭ ψ, v satisfies either the left or the right branch.
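Soundness says that a set of signed formulas with a closed tableau cannot be satisfiable. For concrete propositional examples this can be double-checked by brute force. The following Python sketch (with an illustrative tuple encoding of formulas, not part of the text) confirms that the assumptions of Example 11.6, for which we constructed a closed tableau, are indeed unsatisfiable.

from itertools import product

# Signed formulas as (sign, formula); satisfiable() checks, by brute force
# over valuations, whether some valuation makes every T-signed formula true
# and every F-signed formula false -- the property soundness is about.

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return not value(f[1], v)
    if op == 'and':  return value(f[1], v) and value(f[2], v)
    if op == 'or':   return value(f[1], v) or value(f[2], v)
    if op == 'imp':  return (not value(f[1], v)) or value(f[2], v)

def satisfiable(signed):
    letters = sorted(set().union(*(atoms(f) for _, f in signed)))
    return any(all(value(f, dict(zip(letters, vals))) == sign
                   for sign, f in signed)
               for vals in product([True, False], repeat=len(letters)))

p, q, r = ('atom', 'p'), ('atom', 'q'), ('atom', 'r')
# The assumptions {T φ ∨ (ψ ∧ χ), F (φ ∨ ψ) ∧ (φ ∨ χ)} of Example 11.6:
assumptions = [(True, ('or', p, ('and', q, r))),
               (False, ('and', ('or', p, q), ('or', p, r)))]
print(satisfiable(assumptions))  # False, as soundness requires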
Problems
Problem 11.1. Give closed tableaux of the following:
1. T φ ∧ (ψ ∧ χ), F ( φ ∧ ψ) ∧ χ.
2. T φ ∨ (ψ ∨ χ), F ( φ ∨ ψ) ∨ χ.
3. T φ → (ψ → χ), F ψ → ( φ → χ).
4. T φ, F ¬¬ φ.
1. T ( φ ∨ ψ) → χ, F φ → χ.
2. T ( φ → χ) ∧ (ψ → χ), F ( φ ∨ ψ) → χ.
3. F ¬( φ ∧ ¬ φ).
4. Tψ → φ, F ¬ φ → ¬ψ.
5. F ( φ → ¬ φ) → ¬ φ.
6. F ¬( φ → ψ) → ¬ψ.
7. T φ → χ, F ¬( φ ∧ ¬χ).
8. T φ ∧ ¬χ, F ¬( φ → χ).
9. T φ ∨ ψ, T ¬ψ, F φ.
12. F ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. T ¬( φ → ψ), F φ.
2. T ¬( φ ∧ ψ), F ¬ φ ∨ ¬ψ.
3. T φ → ψ, F ¬ φ ∨ ψ.
4. F ¬¬ φ → φ.
5. T φ → ψ, T ¬ φ → ψ, F ψ.
6. T ( φ ∧ ψ) → χ, F ( φ → χ) ∨ (ψ → χ).
7. T ( φ → ψ) → φ, F φ.
8. F ( φ → ψ) ∨ (ψ → χ).
Axiomatic Derivations
No effort has been made yet to ensure that the material in this chap-
ter respects various tags indicating which connectives and quantifiers are
primitive or defined: all are assumed to be primitive, except ↔ which is
assumed to be defined. If the FOL tag is true, we produce a version with
quantifiers, otherwise without.
1. φi ∈ Γ; or
2. φi is an axiom; or
It gets more interesting if the rule of inference appeals to formulas that appear
before the step considered. The following rule is called modus ponens: if ψ and ψ → φ already occur in the derivation, then φ is a correct inference step.
If this is the only rule of inference, then our definition of derivation above
amounts to this: φ1 , . . . , φn is a derivation iff for each i ≤ n one of the follow-
ing holds:
1. φi ∈ Γ; or
2. φi is an axiom; or
3. there are j, k < i such that φi follows from φ j and φk by modus ponens, i.e., φk ≡ φ j → φi .
The last clause says that φi follows from φ j (ψ) and φk (ψ → φi ) by modus
ponens. If we can go from 1 to n, and each time we find a formula φi that is
either in Γ, an axiom, or which a rule of inference tells us that it is a correct
inference step, then the entire sequence counts as a correct derivation.
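The definition just given is easy to turn into a mechanical check. The following Python sketch verifies an annotated derivation line by line; the representation of formulas as nested tuples and the labels 'HYP', 'AX' and ('MP', j, k) are illustrative conventions of the sketch, not part of the official definition.

# Formulas as nested tuples, with ('imp', A, B) for A → B.  A derivation is a
# list of (formula, justification) pairs; justifications are 'HYP', 'AX', or
# ('MP', j, k) naming two earlier lines (0-based indices).

def check_derivation(lines, hypotheses, is_axiom):
    """Check that every line is a hypothesis, an axiom, or follows by MP."""
    for i, (phi, just) in enumerate(lines):
        if just == 'HYP':
            ok = phi in hypotheses
        elif just == 'AX':
            ok = is_axiom(phi)
        else:
            _, j, k = just
            # MP: line k must be (line j) → (line i), with j, k < i.
            ok = (j < i and k < i and
                  lines[k][0] == ('imp', lines[j][0], phi))
        if not ok:
            return False
    return True

p, q = ('atom', 'p'), ('atom', 'q')
derivation = [
    (p, 'HYP'),
    (('imp', p, q), 'HYP'),
    (q, ('MP', 0, 1)),
]
print(check_derivation(derivation, hypotheses=[p, ('imp', p, q)],
                       is_axiom=lambda f: False))  # True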
Definition 12.5 (Axioms). The set Ax0 of axioms for the propositional con-
nectives comprises all formulas of the following forms:
( φ ∧ ψ) → φ (12.1)
( φ ∧ ψ) → ψ (12.2)
φ → (ψ → ( φ ∧ ψ)) (12.3)
φ → ( φ ∨ ψ) (12.4)
φ → (ψ ∨ φ) (12.5)
( φ → χ) → ((ψ → χ) → (( φ ∨ ψ) → χ)) (12.6)
φ → (ψ → φ) (12.7)
( φ → (ψ → χ)) → (( φ → ψ) → ( φ → χ)) (12.8)
( φ → ψ) → (( φ → ¬ψ) → ¬ φ) (12.9)
¬ φ → ( φ → ψ) (12.10)
⊤ (12.11)
⊥→φ (12.12)
( φ → ⊥) → ¬ φ (12.13)
¬¬ φ → φ (12.14)
Why? Two applications of MP yield the last part, which is what we want. And
we easily see that ¬θ → (θ → α) is an instance of eq. (12.10), and α → (θ → α)
is an instance of eq. (12.7). So our derivation is:
1. ¬θ → (θ → α) eq. (12.10)
2. (¬θ → (θ → α)) →
((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))) eq. (12.6)
3. (α → (θ → α)) → ((¬θ ∨ α) → (θ → α)) 1, 2, MP
4. α → (θ → α) eq. (12.7)
5. (¬θ ∨ α) → (θ → α) 3, 4, MP
θ → (θ → θ )
In order to apply MP, we would also need to justify the corresponding second
premise, namely φ. But in our case, that would be θ, and we won’t be able to
derive θ by itself. So we need a different strategy.
The other axiom involving just → is eq. (12.8), i.e.,
( φ → (ψ → χ)) → (( φ → ψ) → ( φ → χ))
We could get to the last nested conditional by applying MP twice. Again, that
would mean that we want an instance of eq. (12.8) where φ → χ is θ → θ, the
formula we are aiming for. Then of course, φ and χ are both θ. How should
we pick ψ so that both φ → (ψ → χ) and φ → ψ, i.e., in our case θ → (ψ → θ )
and θ → ψ, are also derivable? Well, the first of these is already an instance of
eq. (12.7), whatever we decide ψ to be. And θ → ψ would be another instance
of eq. (12.7) if ψ were (θ → θ ). So, our derivation is:
1. φ → ψ HYP
2. ψ → χ HYP
3. (ψ → χ) → ( φ → (ψ → χ)) eq. (12.7)
4. φ → (ψ → χ) 2, 3, MP
5. ( φ → (ψ → χ)) →
(( φ → ψ) → ( φ → χ)) eq. (12.8)
6. (( φ → ψ) → ( φ → χ)) 4, 5, MP
7. φ→χ 1, 6, MP
The lines labelled “HYP” (for “hypothesis”) indicate that the formula on that
line is an element of Γ.
φ1 , . . . , φk = φ, ψ1 , . . . , ψl = ψ.
Proof. Exercise.
1. φ Hyp.
2. φ→ψ Hyp.
3. ψ 1, 2, MP
By Proposition 12.16, Γ ⊢ ψ.
The most important result we’ll use in this context is the deduction theo-
rem:
Γ ⊢ φ → ( χ → ψ );
Γ ⊢ φ → χ.
But also
Γ ⊢ ( φ → (χ → ψ)) → (( φ → χ) → ( φ → ψ)),
by eq. (12.8), and two applications of Proposition 12.19 give Γ ⊢ φ → ψ, as
required.
Notice how eq. (12.7) and eq. (12.8) were chosen precisely so that the De-
duction Theorem would hold.
The following are some useful facts about derivability, which we leave as
exercises.
5. If Γ ⊢ ¬¬ φ then Γ ⊢ φ;
Proof. Exercise.
2. φ, ψ ⊢ φ ∧ ψ.
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
Proposition 12.28. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
1. φ HYP
2. φ → ψ HYP
3. ψ 1, 2, MP
2. By eq. (12.10) and eq. (12.7) and the deduction theorem, respectively.
12.8 Soundness
A derivation system, such as axiomatic deduction, is sound if it cannot de-
rive things that do not actually hold. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof-theoretic
property is in question, we would like to know, for instance, that
Proof. Do truth tables for each axiom to verify that they are tautologies.
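For a propositional language this proof is entirely mechanical. The sketch below (using an illustrative tuple encoding of formulas, not part of the text) runs the truth tables for instances of a few of the axioms of Definition 12.5 and confirms that they are tautologies.

from itertools import product

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return not value(f[1], v)
    if op == 'and':  return value(f[1], v) and value(f[2], v)
    if op == 'or':   return value(f[1], v) or value(f[2], v)
    if op == 'imp':  return (not value(f[1], v)) or value(f[2], v)

def is_tautology(f):
    letters = sorted(atoms(f))
    return all(value(f, dict(zip(letters, vals)))
               for vals in product([True, False], repeat=len(letters)))

p, q, r = ('atom', 'p'), ('atom', 'q'), ('atom', 'r')
instances = [
    ('imp', ('and', p, q), p),                                   # eq. (12.1)
    ('imp', p, ('imp', q, p)),                                   # eq. (12.7)
    ('imp', ('imp', p, ('imp', q, r)),
            ('imp', ('imp', p, q), ('imp', p, r))),              # eq. (12.8)
    ('imp', ('not', p), ('imp', p, q)),                          # eq. (12.10)
]
print(all(is_tautology(f) for f in instances))  # True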
Problems
Problem 12.1. Show that the following hold by exhibiting derivations from
the axioms:
1. ( φ ∧ ψ) → (ψ ∧ φ)
2. (( φ ∧ ψ) → χ) → ( φ → (ψ → χ))
3. ¬( φ ∨ ψ) → ¬ φ
13.1 Introduction
The completeness theorem is one of the most fundamental results about logic.
It comes in two formulations, the equivalence of which we’ll prove. In its first
formulation it says something fundamental about the relationship between
semantic consequence and our derivation system: if a sentence φ follows from
some sentences Γ, then there is also a derivation that establishes Γ ⊢ φ. Thus,
the derivation system is as strong as it can possibly be without proving things
that don’t actually follow.
In its second formulation, it can be stated as a model existence result: ev-
ery consistent set of sentences is satisfiable. Consistency is a proof-theoretic
notion: it says that our derivation system is unable to produce certain deriva-
tions. But who’s to say that just because there are no derivations of a certain
sort from Γ, it’s guaranteed that there is valuation v with v ⊨ Γ? Before the
completeness theorem was first proved—in fact before we had the derivation
systems we now do—the great German mathematician David Hilbert held the
view that consistency of mathematical theories guarantees the existence of the
objects they are about. He put it as follows in a letter to Gottlob Frege:
set of all formulas so added Γ ∗ . Then our construction above would provide
us with a valuation v for which we could prove, by induction, that it satisfies
all sentences in Γ ∗ , and hence also all sentences in Γ since Γ ⊆ Γ ∗ . It turns
out that guaranteeing (a) and (b) is enough. A set of sentences for which (b)
holds is called complete. So our task will be to extend the consistent set Γ to a
consistent and complete set Γ ∗ .
So here’s what we’ll do. First we investigate the properties of complete
consistent sets, in particular we prove that a complete consistent set contains
φ ∧ ψ iff it contains both φ and ψ, φ ∨ ψ iff it contains at least one of them,
etc. (Proposition 13.2). We’ll then take the consistent set Γ and show that it
can be extended to a consistent and complete set Γ ∗ (Lemma 13.3). This set Γ ∗
is what we’ll use to define our valuation v( Γ ∗ ). The valuation is determined
by the propositional variables in Γ ∗ (Definition 13.4). We’ll use the proper-
ties of complete consistent sets to show that indeed v( Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗
(Lemma 13.5), and thus in particular, v( Γ ∗ ) ⊨ Γ.
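The extension step can be illustrated, for a finite propositional fragment, by the following Python sketch. It goes through a list of candidate sentences and adds each one, or its negation, whichever keeps the growing set from becoming inconsistent. Since we cannot implement the derivation system's consistency check directly here, the sketch substitutes a brute-force satisfiability test, which is legitimate only because the derivation system is sound and complete; the encoding and names are illustrative assumptions.

from itertools import product

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return not value(f[1], v)
    if op == 'and':  return value(f[1], v) and value(f[2], v)
    if op == 'or':   return value(f[1], v) or value(f[2], v)
    if op == 'imp':  return (not value(f[1], v)) or value(f[2], v)

def satisfiable(gamma):
    letters = sorted(set().union(*(atoms(f) for f in gamma)))
    return any(all(value(f, dict(zip(letters, vals))) for f in gamma)
               for vals in product([True, False], repeat=len(letters)))

def extend(gamma, candidates):
    """Add each candidate, or its negation, keeping the set satisfiable."""
    gamma = list(gamma)
    for phi in candidates:
        if satisfiable(gamma + [phi]):
            gamma.append(phi)
        else:
            gamma.append(('not', phi))
    return gamma

p, q = ('atom', 'p'), ('atom', 'q')
print(extend([('imp', p, q), p], [q, ('not', p)]))
# q is added; adding ¬p would make the set unsatisfiable, so ¬¬p is added instead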
1. If Γ ⊢ φ, then φ ∈ Γ.
3. φ ∨ ψ ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.
4. φ → ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.
Proof. Let us suppose for all of the following that Γ is complete and consistent.
1. If Γ ⊢ φ, then φ ∈ Γ.
Suppose that Γ ⊢ φ. Suppose to the contrary that φ ∉ Γ. Since Γ is
complete, ¬ φ ∈ Γ. By Propositions 10.17, 11.17, 9.19 and 12.24, Γ is in-
consistent. This contradicts the assumption that Γ is consistent. Hence,
it cannot be the case that φ ∉ Γ, so φ ∈ Γ.
2. Exercise.
4. Exercise.
Let Γ ∗ = ⋃n≥0 Γn .
complete.
6. φ ≡ ψ → χ: exercise.
Corollary 13.7 (Completeness Theorem, Second Version). For all Γ and sen-
tences φ: if Γ ⊨ φ then Γ ⊢ φ.
Proof. Note that the Γ’s in Corollary 13.7 and Theorem 13.6 are universally
quantified. To make sure we do not confuse ourselves, let us restate Theo-
rem 13.6 using a different variable: for any set of sentences ∆, if ∆ is consistent,
it is satisfiable. By contraposition, if ∆ is not satisfiable, then ∆ is inconsistent.
We will use this to prove the corollary.
Suppose that Γ ⊨ φ. Then Γ ∪ {¬ φ} is unsatisfiable by Proposition 7.21.
Taking Γ ∪ {¬ φ} as our ∆, the previous version of Theorem 13.6 gives us that
Γ ∪ {¬ φ} is inconsistent. By Propositions 10.16, 11.16, 9.18 and 12.23, Γ ⊢ φ.
Theorem 13.9 (Compactness Theorem). The following hold for any set of sentences Γ
and sentence φ:
Lemma 13.11. Every finitely satisfiable set Γ can be extended to a complete and
finitely satisfiable set Γ ∗ .
Problems
Problem 13.1. Complete the proof of Proposition 13.2.
Problem 13.3. Use Corollary 13.7 to prove Theorem 13.6, thus showing that
the two formulations of the completeness theorem are equivalent.
Problem 13.4. In order for a derivation system to be complete, its rules must
be strong enough to prove every unsatisfiable set inconsistent. Which of the
rules of derivation were necessary to prove completeness? Are any of these
rules not used anywhere in the proof? In order to answer these questions,
make a list or diagram that shows which of the rules of derivation were used
in which results that lead up to the proof of Theorem 13.6. Be sure to note any
tacit uses of rules in these proofs.
Problem 13.7. Prove Lemma 13.11. (Hint: the crucial step is to show that if Γn
is finitely satisfiable, then either Γn ∪ { φn } or Γn ∪ {¬ φn } is finitely satisfiable.)
Problem 13.8. Write out the complete proof of the Truth Lemma (Lemma 13.5)
in the version required for the proof of Theorem 13.12.
First-order Logic
M ⊨ φ) for sentences φ and structures M. Once this is done, we can also give
precise definitions of the other semantical terms such as “follows from” or “is
logically true.” These definitions will make it possible to settle, again with
mathematical precision, whether, e.g., ∀ x ( φ( x ) → ψ( x )), ∃ x φ( x ) ⊨ ∃ x ψ( x ).
The answer will, of course, be “yes.” If you’ve already been trained to sym-
bolize sentences of English in first-order logic, you will recognize this as, e.g.,
the symbolizations of, say, “All ants are insects, there are ants, therefore there
are insects.” That is obviously a valid argument, and so our mathematical
model of “follows from” for our formal language should give the same an-
swer.
Another topic you probably remember from your first introduction to for-
mal logic is that there are derivations. If you have taken a first formal logic
course, your instructor will have made you practice finding such derivations,
perhaps even a derivation that shows that the above entailment holds. There
are many different ways to give derivations: you may have done something
called “natural deduction” or “truth trees,” but there are many others. The
purpose of derivation systems is to provide tools using which the logicians’
questions above can be answered: e.g., a natural deduction derivation in which
∀ x ( φ( x ) → ψ( x )) and ∃ x φ( x ) are premises and ∃ x ψ( x ) is the conclusion (last
line) verifies that ∃ x ψ( x ) logically follows from ∀ x ( φ( x ) → ψ( x )) and ∃ x φ( x ).
But why is that? On the face of it, derivation systems have nothing to do
with semantics: giving a formal derivation merely involves arranging sym-
bols in certain rule-governed ways; they don’t mention “cases” or “true in” at
all. The connection between derivation systems and semantics has to be estab-
lished by a meta-logical investigation. What’s needed is a mathematical proof,
e.g., that a formal derivation of ∃ x ψ( x ) from premises ∀ x ( φ( x ) → ψ( x )) and
∃ x φ( x ) is possible, if, and only if, ∀ x ( φ( x ) → ψ( x )) and ∃ x φ( x ) together en-
tail ∃ x ψ( x ). Before this can be done, however, a lot of painstaking work has
to be carried out to get the definitions of syntax and semantics correct.
14.2 Syntax
We first must make precise what strings of symbols count as sentences of first-
order logic. We’ll do this later; for now we’ll just proceed by example. The
basic building blocks—the vocabulary—of first-order logic divides into two
parts. The first part is the symbols we use to say specific things or to pick out
specific things. We pick out things using constant symbols, and we say stuff
about the things we pick out using predicate symbols. E.g., we might use a as
a constant symbol to pick out a single thing, and then say something about
it using the sentence P (a). If you have meanings for “a” and “P ” in mind,
you can read P (a) as a sentence of English (and you probably have done so
when you first learned formal logic). Once you have such simple sentences
of first-order logic, you can build more complex ones using the second part
of the vocabulary: the logical symbols (connectives and quantifiers). So, for
instance, we can form expressions like (P (a) ∧ Q(b )) or ∃x P (x ).
In order to provide the precise definitions of semantics and the rules of
our derivation systems required for rigorous meta-logical study, we first of
all have to give a precise definition of what counts as a sentence of first-order
logic. The basic idea is easy enough to understand: there are some simple sen-
tences we can form from just predicate symbols and constant symbols, such
as P (a). And then from these we form more complex ones using the connec-
tives and quantifiers. But what exactly are the rules by which we are allowed
to form more complex sentences? These must be specified, otherwise we have
not defined “sentence of first-order logic” precisely enough. There are a few
issues. The first one is to get the right strings to count as sentences. The sec-
ond one is to do this in such a way that we can give mathematical proofs about
all sentences. Finally, we’ll have to also give precise definitions of some rudi-
mentary operations with sentences, such as “replace every x in φ by b.” The
trouble is that the quantifiers and variables we have in first-order logic make
it not entirely obvious how this should be done. E.g., should ∃x P (a) count as
a sentence? What about ∃x ∃x P (x )? What should the result of “replace x by b
in (P (x ) ∧ ∃x P (x ))” be?
14.3 Formulas
Here is the approach we will use to rigorously specify sentences of first-order
logic and to deal with the issues arising from the use of variables. We first
define a different set of expressions: formulas. Once we’ve done that, we can
consider the role variables play in them—and on the basis of some other ideas,
namely those of “free” and “bound” variables, we can define what a sentence
is (namely, a formula without free variables). We do this not just because it
makes the definition of “sentence” more manageable, but also because it will
be crucial to the way we define the semantic notion of satisfaction.
Let’s define “formula” for a simple first-order language, one containing
only a single predicate symbol P and a single constant symbol a, and only the
logical symbols ¬, ∧, and ∃. Our full definitions will be much more general:
we’ll allow infinitely many predicate symbols and constant symbols. In fact,
we will also consider function symbols which can be combined with constant
symbols and variables to form “terms.” For now, a and the variables will be
our only terms. We do need infinitely many variables. We’ll officially use the
symbols v0 , v1 , . . . , as variables.
(1) tells us that P (a) and P (vi ) are formulas, for any i ∈ N. These are the
so-called atomic formulas. They give us something to start from. The other
clauses give us ways of forming new formulas from ones we have already
formed. So for instance, by (2), we get that ¬P (v2 ) is a formula, since P (v2 )
is already a formula by (1). Then, by (4), we get that ∃v2 ¬P (v2 ) is another
formula, and so on. (5) tells us that only strings we can form in this way count
as formulas. In particular, ∃v0 P (a) and ∃v0 ∃v0 P (a) do count as formulas, and
(¬P (a)) does not, because of the extraneous outer parentheses.
This way of defining formulas is called an inductive definition, and it allows
us to prove things about formulas using a version of proof by induction called
structural induction. These are discussed in a general way in section 71.4 and
section 71.5, which you should review before delving into the proofs later on.
Basically, the idea is that if you want to give a proof that something is true for
all formulas, you show first that it is true for the atomic formulas, and then
that if it’s true for any formula φ (and ψ), it’s also true for ¬ φ, ( φ ∧ ψ), and
∃ x φ. For instance, this proves that it’s true for ∃v2 ¬P (v2 ): from the first part
you know that it’s true for the atomic formula P (v2 ). Then you get that it’s
true for ¬P (v2 ) by the second part, and then again that it’s true for ∃v2 ¬P (v2 )
itself. Since all formulas are inductively generated from atomic formulas, this
works for any of them.
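One way to make the inductive definition concrete is to represent formulas of this simple language as nested tuples and to write functions by structural recursion, one clause per formation rule. The following Python sketch is only an illustration; the representation is an assumption of the sketch, not part of the text.

# ('P', 'a') and ('P', 'v0'), ('P', 'v1'), ... are atomic formulas; then
# ('not', A), ('and', A, B) and ('exists', 'v2', A) build complex formulas.
# depth() is a typical proof by structural induction turned into a recursion.

def depth(phi):
    """Nesting depth of connectives and quantifiers in a formula."""
    op = phi[0]
    if op == 'P':                       # atomic formula, e.g. P(a) or P(v2)
        return 0
    if op == 'not':
        return 1 + depth(phi[1])
    if op == 'and':
        return 1 + max(depth(phi[1]), depth(phi[2]))
    if op == 'exists':
        return 1 + depth(phi[2])
    raise ValueError(f"not a formula: {phi!r}")

# ∃v2 ¬P(v2), built up exactly as in the text: P(v2), then ¬P(v2), then ∃v2 ¬P(v2)
phi = ('exists', 'v2', ('not', ('P', 'v2')))
print(depth(phi))  # 2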
14.4 Satisfaction
We can already skip ahead to the semantics of first-order logic once we know
what formulas are: here, the basic definition is that of a structure. For our
simple language, a structure M has just three components: a non-empty set
|M| called the domain, what a picks out in M, and what P is true of in M.
The object picked out by a is denoted aM and the set of things P is true of
by P M . A structure M consists of just these three things: |M|, aM ∈ |M|
and P M ⊆ |M|. The general case will be more complicated, since there will
be many predicate symbols and constant symbols, the constant symbols can
have more than one place, and there will also be function symbols.
This is enough to give a definition of satisfaction for formulas that don’t
contain variables. The idea is to give an inductive definition that mirrors the
way we have defined formulas. We specify when an atomic formula is satis-
fied in M, and then when, e.g., ¬ φ is satisfied in M on the basis of whether or
not φ is satisfied in M. E.g., we could define:
Let’s say that |M| = {0, 1, 2}, aM = 1, and P M = {1, 2}. This definition
would tell us that P (a) is satisfied in M (since aM = 1 ∈ {1, 2} = P M ). It
tells us further that ¬P (a) is not satisfied in M, and that in turn ¬¬P (a) is
and (¬P (a) ∧ P (a)) is not satisfied, and so on.
The trouble comes when we want to give a definition for the quantifiers:
we’d like to say something like, “∃v0 P (v0 ) is satisfied iff P (v0 ) is satisfied.”
But the structure M doesn’t tell us what to do about variables. What we ac-
tually want to say is that P (v0 ) is satisfied for some value of v0 . To make this
precise we need a way to assign elements of |M| not just to a but also to v0 . To
this end, we introduce variable assignments. A variable assignment is simply
a function s that maps variables to elements of |M| (in our example, to one
of 0, 1, or 2). Since we don't know beforehand which variables might appear
in a formula we can’t limit which variables s assigns values to. The simple
solution is to require that s assigns values to all variables v0 , v1 , . . . We’ll just
use only the ones we need.
Instead of defining satisfaction of formulas just relative to a structure, we’ll
define it relative to a structure M and a variable assignment s, and write M, s ⊨
φ for short. Our definition will now include an additional clause to deal with
atomic formulas containing variables:
1. M, s ⊨ P (a) iff aM ∈ P M .
2. M, s ⊨ P ( x ) iff s( x ) ∈ P M .
3. M, s ⊨ ¬ φ iff not M, s ⊨ φ.
4. M, s ⊨ ( φ ∧ ψ) iff M, s ⊨ φ and M, s ⊨ ψ.
Ok, this solves one problem: we can now say when M satisfies P (v0 ) for the
value s(v0 ). To get the definition right for ∃v0 P (v0 ) we have to do one more
thing: We want to have that M, s ⊨ ∃v0 P (v0 ) iff M, s′ ⊨ P (v0 ) for some way
s′ of assigning a value to v0 . But the value assigned to v0 does not necessarily
have to be the value that s(v0 ) picks out. We’ll introduce a notation for that:
if m ∈ |M|, then we let s[m/v0 ] be the assignment that is just like s (for all
variables other than v0 ), except to v0 it assigns m. Now our definition can be:

M, s ⊨ ∃v0 φ iff M, s[m/v0 ] ⊨ φ for at least one m ∈ |M|.
Does it work out? Let’s say we let s(vi ) = 0 for all i ∈ N. M, s ⊨ ∃v0 P (v0 ) iff
there is an m ∈ |M| so that M, s[m/v0 ] ⊨ P (v0 ). And there is: we can choose
m = 1 or m = 2. Note that this is true even if the value s(v0 ) assigned to v0 by
s itself—in this case, 0—doesn’t do the job. We have M, s[1/v0 ] ⊨ P (v0 ) but
not M, s ⊨ P (v0 ).
If this looks confusing and cumbersome: it is. But the added complexity is
required to give a precise, inductive definition of satisfaction for all formulas,
and we need something like it to precisely define the semantic notions. There
are other ways of doing it, but they are all equally (in)elegant.
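The definition sketched above translates almost verbatim into code. The following Python sketch implements satisfaction for the simple language, using the structure M of the example with |M| = {0, 1, 2}, aM = 1 and P M = {1, 2}; the dictionary encoding of M and of assignments s is an illustrative assumption, not the official definition.

M = {'domain': {0, 1, 2}, 'a': 1, 'P': {1, 2}}

def satisfies(M, s, phi):
    """M, s ⊨ phi for formulas built from P(a)/P(x), ¬, ∧ and ∃."""
    op = phi[0]
    if op == 'P':
        t = phi[1]
        val = M['a'] if t == 'a' else s[t]   # value of the term: a or a variable
        return val in M['P']
    if op == 'not':
        return not satisfies(M, s, phi[1])
    if op == 'and':
        return satisfies(M, s, phi[1]) and satisfies(M, s, phi[2])
    if op == 'exists':
        x, psi = phi[1], phi[2]
        # s[m/x]: just like s, except that x is assigned m
        return any(satisfies(M, {**s, x: m}, psi) for m in M['domain'])
    raise ValueError(phi)

s = {'v0': 0, 'v1': 0}                       # s(v_i) = 0, as in the text
print(satisfies(M, s, ('P', 'v0')))                    # False: s(v0) = 0 ∉ P^M
print(satisfies(M, s, ('exists', 'v0', ('P', 'v0'))))  # True: m = 1 or m = 2 works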
14.5 Sentences
Ok, now we have a (sketch of a) definition of satisfaction (“true in”) for struc-
tures and formulas. But it needs this additional bit—a variable assignment—
and what we wanted is a definition of sentences. How do we get rid of as-
signments, and what are sentences?
You probably remember a discussion in your first introduction to formal
logic about the relation between variables and quantifiers. A quantifier is al-
ways followed by a variable, and then in the part of the sentence to which that
quantifier applies (its “scope”), we understand that the variable is “bound”
by that quantifier. In formulas it was not required that every variable has a
matching quantifier, and variables without matching quantifiers are “free” or
“unbound.” We will take sentences to be all those formulas that have no free
variables.
Again, the intuitive idea of when an occurrence of a variable in a formula φ
is bound, which quantifier binds it, and when it is free, is not difficult to get.
You may have learned a method for testing this, perhaps involving counting
parentheses. We have to insist on a precise definition—and because we have
defined formulas by induction, we can give a definition of the free and bound
occurrences of a variable x in a formula φ also by induction. E.g., it might look
like this for our simplified language:
1. If φ is atomic, all occurrences of x in it are free (that is, the occurrence of
x in P ( x ) is free).
2. If φ is of the form ¬ψ, then an occurrence of x in ¬ψ is free iff the cor-
responding occurrence of x is free in ψ (that is, the free occurrences of
variables in ψ are exactly the corresponding occurrences in ¬ψ).
3. If φ is of the form (ψ ∧ χ), then an occurrence of x in (ψ ∧ χ) is free iff
the corresponding occurrence of x is free in ψ or in χ.
4. If φ is of the form ∃ x ψ, then no occurrence of x in φ is free; if it is of the
form ∃y ψ where y is a different variable than x, then an occurrence of x
in ∃y ψ is free iff the corresponding occurrence of x is free in ψ.
Once we have a precise definition of free and bound occurrences of vari-
ables, we can simply say: a sentence is any formula without free occurrences
of variables.
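The inductive clauses above can again be read as a recursive program. The sketch below computes the set of free variables of a formula of the simple language (tuple encoding as before, for illustration only); a sentence is then a formula whose set of free variables is empty.

def free_vars(phi):
    op = phi[0]
    if op == 'P':                                   # atomic: P(a) or P(x)
        return set() if phi[1] == 'a' else {phi[1]}
    if op == 'not':
        return free_vars(phi[1])
    if op == 'and':
        return free_vars(phi[1]) | free_vars(phi[2])
    if op == 'exists':                              # ∃x binds x in its scope
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError(phi)

print(free_vars(('exists', 'v0', ('P', 'v0'))))                        # set(): a sentence
print(free_vars(('and', ('P', 'v0'), ('exists', 'v0', ('P', 'v0')))))  # {'v0'}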
14.7 Substitution
We’ll discuss an example to illustrate how things hang together, and how the
development of syntax and semantics lays the foundation for our more ad-
vanced investigations later. Our derivation systems should let us derive P (a)
from ∀v0 P (v0 ). Maybe we even want to state this as a rule of inference. How-
ever, to do so, we must be able to state it in the most general terms: not just
for P , a, and v0 , but for any formula φ, and term t, and variable x. (Recall
that constant symbols are terms, but we’ll consider also more complicated
terms built from constant symbols and function symbols.) So we want to be
able to say something like, “whenever you have derived ∀ x φ( x ) you are jus-
tified in inferring φ(t)—the result of removing ∀ x and replacing x by t.” But
what exactly does “replacing x by t” mean? What is the relation between φ( x )
and φ(t)? Does this always work?
∀ v 0 P ( v0 , v0 )
∀v0 ∀v1 ∀v2 ((P (v0 , v1 ) ∧ P (v1 , v2 )) → P (v0 , v2 ))
These sentences are just the symbolizations of “for any x, Rxx” (R is reflexive)
and “whenever Rxy and Ryz then also Rxz” (R is transitive). We see that
a structure M is a model of these two sentences Γ iff R (i.e., P M ), is a preorder
on A (i.e., |M|). In other words, the models of Γ are exactly the preorders. Any
property of all preorders that can be expressed in the first-order language with
15.1 Introduction
In order to develop the theory and metatheory of first-order logic, we must
first define the syntax and semantics of its expressions. The expressions of
first-order logic are terms and formulas. Terms are formed from variables,
constant symbols, and function symbols. Formulas, in turn, are formed from
predicate symbols together with terms (these form the smallest, “atomic” for-
mulas), and then from atomic formulas we can form more complex ones us-
ing logical connectives and quantifiers. There are many different ways to set
down the formation rules; we give just one possible one. Other systems will
choose different symbols, will select different sets of connectives as primitive,
will use parentheses differently (or even not at all, as in the case of so-called
Polish notation). What all approaches have in common, though, is that the
formation rules define the set of terms and formulas inductively. If done prop-
erly, every expression can result essentially in only one way according to the
formation rules. The inductive definition resulting in expressions that are
uniquely readable means we can give meanings to these expressions using the
same method—inductive definition.
1. Logical symbols
Most of our definitions and results will be formulated for the full standard
language of first-order logic. However, depending on the application, we may
also restrict the language to only a few predicate symbols, constant symbols,
and function symbols.
Example 15.2. The language of set theory L Z contains only the single two-
place predicate symbol ∈.
Example 15.3. The language of orders L≤ contains only the two-place predi-
cate symbol ≤.
Again, these are conventions: officially, these are just aliases, e.g., <, ∈,
and ≤ are aliases for A²₀, 0 for c₀, ′ for f¹₀, + for f²₀, and × for f²₁.
In addition to the primitive connectives and quantifiers introduced above,
we also use the following defined symbols: ↔ (biconditional) and ⊤ (truth).
A defined symbol is not officially part of the language, but is introduced
as an informal abbreviation: it allows us to abbreviate formulas which would,
if we only used primitive symbols, get quite long. This is obviously an ad-
vantage. The bigger advantage, however, is that proofs become shorter. If a
symbol is primitive, it has to be treated separately in proofs. The more primi-
tive symbols, therefore, the longer our proofs.
You may be familiar with different terminology and symbols than the ones
we use above. Logic texts (and teachers) commonly use ∼, ¬, or ! for “nega-
tion”, ∧, ·, or & for “conjunction”. Commonly used symbols for the “condi-
tional” or “implication” are →, ⇒, and ⊃. Symbols for “biconditional,” “bi-
implication,” or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ symbol is
variously called “falsity,” “falsum,”, “absurdity,” or “bottom.” The ⊤ symbol
is variously called “truth,” “verum,” or “top.”
It is conventional to use lower case letters (e.g., a, b, c) from the begin-
ning of the Latin alphabet for constant symbols (sometimes called names),
and lower case letters from the end (e.g., x, y, z) for variables. Quantifiers
combine with variables, e.g., x; notational variations include ∀ x, (∀ x ), ( x ),
Πx, ⋀x for the universal quantifier and ∃ x, (∃ x ), ( Ex ), Σx, ⋁x for the existen-
tial quantifier.
We might treat all the propositional operators and both quantifiers as prim-
itive symbols of the language. We might instead choose a smaller stock of
primitive symbols and treat the other logical operators as defined. “Truth
functionally complete” sets of Boolean operators include {¬, ∨}, {¬, ∧}, and
{¬, →}—these can be combined with either quantifier for an expressively
complete first-order language.
You may be familiar with two other logical operators: the Sheffer stroke |
(named after Henry Sheffer), and Peirce’s arrow ↓, also known as Quine’s
dagger. When given their usual readings of “nand” and “nor” (respectively),
these operators are truth functionally complete by themselves.
The constant symbols appear in our specification of the language and the
terms as a separate category of symbols, but they could instead have been in-
cluded as zero-place function symbols. We could then do without the second
clause in the definition of terms. We just have to understand f (t1 , . . . , tn ) as
just f by itself if n = 0.
Definition 15.5 (Formulas). The set of formulas Frm(L) of the language L is
defined inductively as follows:
1. ⊥ is an atomic formula.
2. If R is an n-place predicate symbol of L and t1 , . . . , tn are terms of L,
then R(t1 , . . . , tn ) is an atomic formula.
3. If t1 and t2 are terms of L, then =(t1 , t2 ) is an atomic formula.
4. If φ is a formula, then ¬ φ is formula.
5. If φ and ψ are formulas, then ( φ ∧ ψ) is a formula.
6. If φ and ψ are formulas, then ( φ ∨ ψ) is a formula.
7. If φ and ψ are formulas, then ( φ → ψ) is a formula.
8. If φ is a formula and x is a variable, then ∀ x φ is a formula.
9. If φ is a formula and x is a variable, then ∃ x φ is a formula.
10. Nothing else is a formula.
The definitions of the set of terms and that of formulas are inductive defini-
tions. Essentially, we construct the set of formulas in infinitely many stages. In
the initial stage, we pronounce all atomic formulas to be formulas; this corre-
sponds to the first few cases of the definition, i.e., the cases for ⊥, R(t1 , . . . , tn )
and =(t1 , t2 ). “Atomic formula” thus means any formula of this form.
The other cases of the definition give rules for constructing new formulas
out of formulas already constructed. At the second stage, we can use them to
construct formulas out of atomic formulas. At the third stage, we construct
new formulas from the atomic formulas and those obtained in the second
stage, and so on. A formula is anything that is eventually constructed at such
a stage, and nothing else.
By convention, we write = between its arguments and leave out the paren-
theses: t1 = t2 is an abbreviation for =(t1 , t2 ). Moreover, ¬=(t1 , t2 ) is abbre-
viated as t1 ̸= t2 . When writing a formula (ψ ∗ χ) constructed from ψ, χ
using a two-place connective ∗, we will often leave out the outermost pair of
parentheses and write simply ψ ∗ χ.
Some logic texts require that the variable x must occur in φ in order for
∃ x φ and ∀ x φ to count as formulas. Nothing bad happens if you don’t require
this, and it makes things easier.
1. ⊤ abbreviates ¬⊥.
2. φ ↔ ψ abbreviates ( φ → ψ) ∧ (ψ → φ).
1. φ is an atomic formula.
1. We take θ to be φ and θ → θ to be ψ.
2. We take φ to be θ → θ and ψ is θ.
Lemma 15.10. The number of left and right parentheses in a formula φ are equal.
7. φ ≡ ∃ x ψ: Similarly.
Proof. Exercise.
Proposition 15.13. If φ is an atomic formula, then it satisfies one, and only one of
the following conditions.
1. φ ≡ ⊥.
Proof. Exercise.
Proposition 15.14 (Unique Readability). Every formula satisfies one, and only
one of the following conditions.
1. φ is atomic.
6. φ is of the form ∀ x ψ.
7. φ is of the form ∃ x ψ.
Moreover, in each case ψ, or ψ and χ, are uniquely determined. This means that, e.g.,
there are no different pairs ψ, χ and ψ′ , χ′ so that φ is both of the form (ψ → χ) and
( ψ ′ → χ ′ ).
Proof. The formation rules require that if a formula is not atomic, it must start
with an opening parenthesis (, ¬, or a quantifier. On the other hand, every for-
mula that starts with one of the following symbols must be atomic: a predicate
symbol, a function symbol, a constant symbol, ⊥.
So we really only have to show that if φ is of the form (ψ ∗ χ) and also of
the form (ψ′ ∗′ χ′ ), then ψ ≡ ψ′ , χ ≡ χ′ , and ∗ = ∗′ .
So suppose both φ ≡ (ψ ∗ χ) and φ ≡ (ψ′ ∗′ χ′ ). Then either ψ ≡ ψ′ or not.
If it is, clearly ∗ = ∗′ and χ ≡ χ′ , since they then are substrings of φ that begin
in the same place and are of the same length. The other case is ψ ̸≡ ψ′ . Since
ψ and ψ′ are both substrings of φ that begin at the same place, one must be a
proper prefix of the other. But this is impossible by Lemma 15.12.
In each case, we intend the specific indicated occurrence of the main oper-
ator in the formula. For instance, since the formula ((θ → α) → (α → θ )) is of
the form (ψ → χ) where ψ is (θ → α) and χ is (α → θ ), the second occurrence
of → is the main operator.
This is a recursive definition of a function which maps all non-atomic for-
mulas to their main operator occurrence. Because of the way formulas are de-
fined inductively, every formula φ satisfies one of the cases in Definition 15.15.
This guarantees that for each non-atomic formula φ a main operator exists.
Because each formula satisfies only one of these conditions, and because the
smaller formulas from which φ is constructed are uniquely determined in each
case, the main operator occurrence of φ is unique, and so we have defined a
function.
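On a parsed representation of formulas, reading off the main operator is a one-line matter; unique readability is what guarantees that such a parsed tree is determined by the written formula in only one way. A small illustrative Python sketch (the tuple encoding is an assumption of the sketch):

def main_operator(phi):
    op = phi[0]
    if op in ('bot', 'R', 'eq'):        # atomic formulas have no main operator
        return None
    return op                           # 'not', 'and', 'or', 'imp', 'all', 'exists'

print(main_operator(('imp', ('imp', 'theta', 'alpha'), ('imp', 'alpha', 'theta'))))
# 'imp' -- the second → in ((θ → α) → (α → θ)) is the main operator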
We call formulas by the names in Table 15.1 depending on which symbol
their main operator is.Recall, however, that defined operators do not officially
appear in formulas. They are just abbreviations, so officially they cannot be
the main operator of a formula. In proofs about all formulas they therefore do
not have to be treated separately.
Main operator Type of formula Example
none atomic (formula) ⊥, R ( t1 , . . . , t n ), t1 = t2
¬ negation ¬φ
∧ conjunction ( φ ∧ ψ)
∨ disjunction ( φ ∨ ψ)
→ conditional ( φ → ψ)
↔ biconditional ( φ ↔ ψ)
∀ universal (formula) ∀x φ
∃ existential (formula) ∃x φ
Table 15.1: Main operator and names of formulas
15.6 Subformulas
It is often useful to talk about the formulas that “make up” a given formula.
We call these its subformulas. Any formula counts as a subformula of itself; a
subformula of φ other than φ itself is a proper subformula.
Example 15.22. For any first-order language L, all L-formulas are L-strings,
but not conversely. For example,
)(v0 → ∃
1. φi ≡ ¬ φ j .
2. φi ≡ ( φ j ∧ φk ).
3. φi ≡ ( φ j ∨ φk ).
4. φi ≡ ( φ j → φk ).
5. φi ≡ ∀ x φ j .
6. φi ≡ ∃ x φ j .
Example 15.26.
⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) ∧ A10 (v0 )), ∃v0 (A11 (c1 ) ∧ A10 (v0 ))⟩
⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) ∧ A10 (v0 )), A11 (c1 ),
∀v1 A10 (v0 ), ∃v0 (A11 (c1 ) ∧ A10 (v0 ))⟩.
As can be seen from the second example, formation sequences may contain
“junk”: formulas which are redundant or do not contribute to the construc-
tion.
We can also prove the converse. This is important because it shows that
our two ways of defining formulas are equivalent: they give the same results.
It also means that we can prove theorems about formulas by using ordinary
induction on the length of formation sequences.
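One direction of this equivalence is easy to make concrete: a post-order traversal of a formula's parse tree produces a formation sequence for it. The following Python sketch (with an illustrative tuple encoding, not part of the text) does exactly that.

def formation_sequence(phi):
    """A formation sequence for phi: each formula follows the ones it is built from."""
    op = phi[0]
    if op in ('bot', 'R', 'eq'):                 # atomic
        return [phi]
    if op == 'not':
        return formation_sequence(phi[1]) + [phi]
    if op in ('and', 'or', 'imp'):
        return formation_sequence(phi[1]) + formation_sequence(phi[2]) + [phi]
    if op in ('all', 'exists'):                  # e.g. ('exists', 'v0', psi)
        return formation_sequence(phi[2]) + [phi]
    raise ValueError(phi)

phi = ('exists', 'v0', ('and', ('R', 'c1'), ('R', 'v0')))
for step in formation_sequence(phi):
    print(step)
# R(c1), R(v0), (R(c1) ∧ R(v0)), ∃v0 (R(c1) ∧ R(v0)) -- compare Example 15.26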
Proof. Exercise.
Theorem 15.29. Frm(L) is the set of all expressions (strings of symbols) in the lan-
guage L with a formation sequence.
Proof. Let F be the set of all strings of symbols in the language L that have a
formation sequence. We have seen in Proposition 15.27 that Frm(L) ⊆ F, so
now we prove the converse.
Suppose φ has a formation sequence ⟨ φ0 , . . . , φn ⟩. We prove that φ ∈
Frm(L) by strong induction on n. Our induction hypothesis is that every
string of symbols with a formation sequence of length m < n is in Frm(L). By
the definition of a formation sequence, either φn is atomic or there must exist
j, k < n such that one of the following is the case:
1. φn ≡ ¬ φ j .
2. φn ≡ ( φ j ∧ φk ).
3. φn ≡ ( φ j ∨ φk ).
4. φn ≡ ( φ j → φk ).
5. φn ≡ ∀ x φ j .
6. φn ≡ ∃ x φ j .
Formation sequences for terms have similar properties to those for formu-
las.
Proposition 15.30. Trm(L) is the set of all expressions t in the language L such
that there exists a (term) formation sequence for t.
Proof. Exercise.
There are two types of “junk” that can appear in formation sequences: re-
peated elements, and elements that are irrelevant to the construction of the
formula or term. We can eliminate both by looking at minimal formation
sequences.
1. ψ is a sub-formula of φ.
Proof. Exercise.
ψ is the scope of the first ∀v0 , χ is the scope of ∃v1 , and θ is the scope of
the second ∀v0 . The first ∀v0 binds the occurrences of v0 in ψ, ∃v1 binds the
occurrence of v1 in χ, and the second ∀v0 binds the occurrence of v0 in θ. The
first occurrence of v1 and the fourth occurrence of v0 are free in φ. The last
occurrence of v0 is free in θ, but bound in χ and φ.
15.9 Substitution
Definition 15.38 (Substitution in a term). We define s[t/x ], the result of sub-
stituting t for every occurrence of x in s, recursively:
1. s ≡ c: s[t/x ] is just s.
3. s ≡ x: s[t/x ] is t.
Example 15.40.
1. φ ≡ ⊥: φ[t/x ] is ⊥.
Note that substitution may be vacuous: If x does not occur in φ at all, then
φ[t/x ] is just φ.
The restriction that t must be free for x in φ is necessary to exclude cases
like the following. If φ ≡ ∃y x < y and t ≡ y, then φ[t/x ] would be ∃y y <
y. In this case the free variable y is “captured” by the quantifier ∃y upon
substitution, and that is undesirable. For instance, we would like it to be the
case that whenever ∀ x ψ holds, so does ψ[t/x ]. But consider ∀ x ∃y x < y (here
ψ is ∃y x < y). It is a sentence that is true about, e.g., the natural numbers:
for every number x there is a number y greater than it. If we allowed y as a
possible substitution for x, we would end up with ψ[y/x ] ≡ ∃y y < y, which
is false. We prevent this by requiring that none of the free variables in t would
end up being bound by a quantifier in φ.
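The capture problem is easy to detect mechanically. The following Python sketch implements substitution for a small fragment in which terms are just variables, and raises an error when the substituted variable would be captured, i.e., when t is not free for x in φ; the encoding and names are illustrative assumptions, not the official definition.

def occurs_free(x, phi):
    op = phi[0]
    if op == '<':
        return x in phi[1:]
    if op == 'not':
        return occurs_free(x, phi[1])
    if op == 'and':
        return occurs_free(x, phi[1]) or occurs_free(x, phi[2])
    if op == 'exists':
        return phi[1] != x and occurs_free(x, phi[2])
    raise ValueError(phi)

def subst(phi, t, x):
    """phi[t/x]; raises ValueError if t would be captured by a quantifier."""
    op = phi[0]
    if op == '<':
        return ('<', t if phi[1] == x else phi[1], t if phi[2] == x else phi[2])
    if op == 'not':
        return ('not', subst(phi[1], t, x))
    if op == 'and':
        return ('and', subst(phi[1], t, x), subst(phi[2], t, x))
    if op == 'exists':
        y, psi = phi[1], phi[2]
        if y == x:                        # x is bound here: nothing to replace
            return phi
        if y == t and occurs_free(x, psi):
            raise ValueError(f"{t} is not free for {x}: it would be captured")
        return ('exists', y, subst(psi, t, x))
    raise ValueError(phi)

phi = ('exists', 'y', ('<', 'x', 'y'))        # ∃y x < y
print(subst(phi, 'z', 'x'))                   # ('exists', 'y', ('<', 'z', 'y'))
# subst(phi, 'y', 'x') raises: y would be captured by ∃y, as discussed above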
We often use the following convention to avoid cumbersome notation: If
φ is a formula which may contain the variable x free, we also write φ( x ) to
indicate this. When it is clear which φ and x we have in mind, and t is a term
(assumed to be free for x in φ( x )), then we write φ(t) as short for φ[t/x ]. So
for instance, we might say, “we call φ(t) an instance of ∀ x φ( x ).” By this we
mean that if φ is any formula, x a variable, and t a term that’s free for x in φ,
then φ[t/x ] is an instance of ∀ x φ.
Problems
Problem 15.1. Prove Lemma 15.8.
Problem 15.4. Prove Proposition 15.13 (Hint: Formulate and prove a version
of Lemma 15.12 for terms.)
Problem 15.8. Prove Proposition 15.30. Hint: use a similar strategy to that
used in the proof of Theorem 15.29.
16.1 Introduction
1. |N| = N
2. 0N = 0
However, there are many other possible structures for L A . For instance,
we might take as the domain the set Z of integers instead of N, and define the
interpretations of 0, ′, +, ×, < accordingly. But we can also define structures
for L A which have nothing even remotely to do with numbers.
Example 16.3. A structure M for the language L Z of set theory requires just a
set and a single two-place relation. So technically, e.g., the set of people plus
the relation “x is older than y” could be used as a structure for L Z , as well as
N together with n ≥ m for n, m ∈ N.
A particularly interesting structure for L Z in which the elements of the
domain are actually sets, and the interpretation of ∈ actually is the relation “x
is an element of y” is the structure HF of hereditarily finite sets:
Example 16.6. Let L be the language with constant symbols zero, one, two,
. . . , the binary predicate symbol <, and the binary function symbols + and
×. Then a structure M for L is the one with domain |M| = {0, 1, 2, . . .} and
to 5, and similarly for the binary function symbol ×. Hence, the value of
four is just 4, and the value of ×(two, +(three, zero)) (or in infix notation,
two × (three + zero)) is
name elements of the domain. For this we define the value of terms induc-
tively. For constant symbols and variables the value is just as the structure or
the variable assignment specifies it; for more complex terms it is computed
recursively using the functions the structure assigns to the function symbols.
1. t ≡ c: Val^M_s(t) = c^M.
2. t ≡ x: Val^M_s(t) = s(x).
3. t ≡ f (t1 , . . . , tn ): Val^M_s(t) = f^M(Val^M_s(t1 ), . . . , Val^M_s(tn )).
1. φ ≡ ⊥: M, s ⊭ φ.
3. φ ≡ t1 = t2 : M, s ⊨ φ iff Val^M_s(t1 ) = Val^M_s(t2 ).
4. φ ≡ ¬ψ: M, s ⊨ φ iff M, s ⊭ ψ.
The variable assignments are important in the last two clauses. We cannot
define satisfaction of ∀ x ψ( x ) by “for all m ∈ |M|, M ⊨ ψ(m).” We cannot
define satisfaction of ∃ x ψ( x ) by “for at least one m ∈ |M|, M ⊨ ψ(m).” The
reason is that if m ∈ |M|, it is not a symbol of the language, and so ψ(m) is not
a formula (that is, ψ[m/x ] is undefined). We also cannot assume that we have
constant symbols or terms available that name every element of M, since there
is nothing in the definition of structures that requires it. In the standard lan-
guage, the set of constant symbols is denumerable, so if |M| is not enumerable
there aren’t even enough constant symbols to name every object.
We solve this problem by introducing variable assignments, which allow
us to link variables directly with elements of the domain. Then instead of
saying that, e.g., ∃ x ψ( x ) is satisfied in M iff for at least one m ∈ |M|, we say
it is satisfied in M relative to s iff ψ( x ) is satisfied relative to s[m/x ] for at least
one m ∈ |M|.
1. |M| = {1, 2, 3, 4}
2. aM = 1
3. bM = 2
4. f^M( x, y) = x + y if x + y ≤ 3 and = 3 otherwise.

Val^M_s( f ( a, b)) = f^M(Val^M_s( a), Val^M_s(b)).
Val^M_s( f ( a, b)) = f^M(1, 2) = 1 + 2 = 3.
Val^M_s( f ( f ( a, b), a)) = f^M(Val^M_s( f ( a, b)), Val^M_s( a)) = f^M(3, 1) = 3,
Val^M_s( f ( f ( a, b), x )) = f^M(Val^M_s( f ( a, b)), Val^M_s( x )) = f^M(3, 1) = 3,
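The computation just carried out can be reproduced by a short recursive function. The Python sketch below encodes the example structure (domain {1, 2, 3, 4}, aM = 1, bM = 2, and f^M as above) in a dictionary, an illustrative choice, and computes the same values.

M = {
    'domain': {1, 2, 3, 4},
    'const': {'a': 1, 'b': 2},
    'f': lambda x, y: x + y if x + y <= 3 else 3,
}

def val(M, s, t):
    """Val^M_s(t) for terms built from constants, variables, and f."""
    if isinstance(t, str):
        return M['const'][t] if t in M['const'] else s[t]   # constant or variable
    _, t1, t2 = t
    return M['f'](val(M, s, t1), val(M, s, t2))

s = {'x': 1, 'y': 1}                  # a variable assignment with s(x) = 1
print(val(M, s, ('f', 'a', 'b')))                  # 3
print(val(M, s, ('f', ('f', 'a', 'b'), 'a')))      # 3
print(val(M, s, ('f', ('f', 'a', 'b'), 'x')))      # 3, since s(x) = 1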
M, s ⊨ R(b, x ) ∨ R( x, b) iff
M, s ⊨ R(b, x ) or M, s ⊨ R( x, b)
s1 = s[1/x ], s2 = s[2/x ],
s3 = s[3/x ], s4 = s[4/x ].
So, e.g., s2 ( x ) = 2 and s2 (y) = s(y) = 1 for all variables y other than x. These
are all the x-variants of s for the structure M, since |M| = {1, 2, 3, 4}. Note, in
particular, that s1 = s (s is always an x-variant of itself).
To determine if an existentially quantified formula ∃ x φ( x ) is satisfied, we
have to determine if M, s[m/x ] ⊨ φ( x ) for at least one m ∈ |M|. So,
M, s ⊨ ∃ x ( R(b, x ) ∨ R( x, b)),
since M, s[1/x ] ⊨ R(b, x ) ∨ R( x, b) (s[3/x ] would also fit the bill). But,
M, s ⊭ ∃ x ( R(b, x ) ∧ R( x, b))
M, s ⊨ ∀ x ( R( x, a) → R( a, x )),
M, s ⊭ ∀ x ( R( a, x ) → R( x, a))
∀ x ( R( a, x ) → ∃y R( x, y)).
M, s ⊨ ∀ x ( R( a, x ) → ∃y R( x, y)).
M, s ⊭ ∃ x ( R( a, x ) ∧ ∀y R( x, y)).
Proof. By induction on the complexity of t. For the base case, t can be a constant
symbol or one of the variables x1, . . . , xn. If t ≡ c, then Val^M_{s1}(t) = c^M =
Val^M_{s2}(t). If t ≡ xi, then s1(xi) = s2(xi) by the hypothesis of the proposition,
and so Val^M_{s1}(t) = s1(xi) = s2(xi) = Val^M_{s2}(t).
For the inductive step, assume that t ≡ f(t1, . . . , tk) and that the claim
holds for t1, . . . , tk. Then

    Val^M_{s1}(t) = Val^M_{s1}(f(t1, . . . , tk))
                  = f^M(Val^M_{s1}(t1), . . . , Val^M_{s1}(tk)).

For i = 1, . . . , k, Val^M_{s1}(ti) = Val^M_{s2}(ti) by induction hypothesis. So

    Val^M_{s1}(t) = f^M(Val^M_{s1}(t1), . . . , Val^M_{s1}(tk))
                  = f^M(Val^M_{s2}(t1), . . . , Val^M_{s2}(tk))
                  = Val^M_{s2}(f(t1, . . . , tk)) = Val^M_{s2}(t).
Proof. We use induction on the complexity of φ. For the base case, where φ is
atomic, φ can be: ⊥, R(t1 , . . . , tk ) for a k-place predicate R and terms t1 , . . . , tk ,
or t1 = t2 for terms t1 and t2 .
1. φ ≡ ⊥: both M, s1 ⊭ φ and M, s2 ⊭ φ.
2. φ ≡ R(t1, . . . , tk): if M, s1 ⊨ φ, then

    ⟨Val^M_{s1}(t1), . . . , Val^M_{s1}(tk)⟩ ∈ R^M.

For i = 1, . . . , k, Val^M_{s1}(ti) = Val^M_{s2}(ti) by Proposition 16.13. So we also
have ⟨Val^M_{s2}(t1), . . . , Val^M_{s2}(tk)⟩ ∈ R^M, i.e., M, s2 ⊨ φ.

3. φ ≡ t1 = t2: if M, s1 ⊨ φ, then Val^M_{s1}(t1) = Val^M_{s1}(t2), so

    Val^M_{s2}(t1) = Val^M_{s1}(t1)    (by Proposition 16.13)
                   = Val^M_{s1}(t2)    (since M, s1 ⊨ t1 = t2)
                   = Val^M_{s2}(t2)    (by Proposition 16.13),

so M, s2 ⊨ t1 = t2.
2. φ ≡ ψ ∧ χ: exercise.
3. φ ≡ ψ ∨ χ: if M, s1 ⊨ φ, then M, s1 ⊨ ψ or M, s1 ⊨ χ. By induction
hypothesis, M, s2 ⊨ ψ or M, s2 ⊨ χ, so M, s2 ⊨ φ.
4. φ ≡ ψ → χ: exercise.
6. φ ≡ ∀ x ψ: exercise.
Proof. Exercise.
Proof. Exercise.
16.6 Extensionality
Extensionality, sometimes called relevance, can be expressed informally as follows:
the only factors that bear upon the satisfaction of a formula φ in a structure M
relative to a variable assignment s are the size of the domain and the assignments
made by M and s to the elements of the language that actually appear in φ.
One immediate consequence of extensionality is that where two struc-
tures M and M′ agree on all the elements of the language appearing in a sen-
tence φ and have the same domain, M and M′ must also agree on whether or
not φ itself is true.
Then prove the proposition by induction on φ, making use of the claim just
proved for the induction basis (where φ is atomic).
Proof. By induction on t.

    Val^M_s(t[t′/x])
      = Val^M_s(f(t1[t′/x], . . . , tn[t′/x]))                    by definition of t[t′/x]
      = f^M(Val^M_s(t1[t′/x]), . . . , Val^M_s(tn[t′/x]))          by definition of Val^M_s(f(. . . ))
      = f^M(Val^M_{s[Val^M_s(t′)/x]}(t1), . . . , Val^M_{s[Val^M_s(t′)/x]}(tn))   by induction hypothesis
      = Val^M_{s[Val^M_s(t′)/x]}(t)                                by definition of Val^M_{s[Val^M_s(t′)/x]}(f(. . . ))
Proof. Exercise.
Proof. For the forward direction, let φ be valid, and let Γ be a set of sentences.
Let M be a structure so that M ⊨ Γ. Since φ is valid, M ⊨ φ, hence Γ ⊨ φ.
For the contrapositive of the reverse direction, let φ be invalid, so there is
a structure M with M ⊭ φ. Take Γ = {⊤}: since ⊤ is valid, M ⊨ Γ. Hence,
there is a structure M so that M ⊨ Γ but M ⊭ φ, so Γ does not entail φ.
Proof. For the forward direction, suppose Γ ⊨ φ and suppose to the contrary
that there is a structure M so that M ⊨ Γ ∪ {¬ φ}. Since M ⊨ Γ and Γ ⊨ φ,
M ⊨ φ. Also, since M ⊨ Γ ∪ {¬ φ}, M ⊨ ¬ φ, so we have both M ⊨ φ and
M ⊭ φ, a contradiction. Hence, there can be no such structure M, so Γ ∪ {¬ φ}
is unsatisfiable.
For the reverse direction, suppose Γ ∪ {¬φ} is unsatisfiable. Then for every
structure M, either M ⊭ Γ or M ⊭ ¬φ, i.e., M ⊨ φ. Hence, for every structure M
with M ⊨ Γ, M ⊨ φ, so Γ ⊨ φ.
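In the propositional case this equivalence is easy to check mechanically: one can test Γ ⊨ φ by looking for a valuation that satisfies Γ ∪ {¬φ}. The following is just an illustration of that idea, not part of the text; formulas are nested tuples.

    from itertools import product

    def holds(v, f):
        """Truth value of a propositional formula f under valuation v."""
        op = f[0]
        if op == 'atom': return v[f[1]]
        if op == 'not':  return not holds(v, f[1])
        if op == 'and':  return holds(v, f[1]) and holds(v, f[2])
        if op == 'or':   return holds(v, f[1]) or holds(v, f[2])
        if op == 'imp':  return (not holds(v, f[1])) or holds(v, f[2])

    def atoms(f):
        return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

    def entails(Gamma, phi):
        """Γ ⊨ φ iff Γ ∪ {¬φ} has no satisfying valuation."""
        fs = list(Gamma) + [('not', phi)]
        ats = sorted(set().union(*(atoms(f) for f in fs)))
        return not any(all(holds(dict(zip(ats, bits)), f) for f in fs)
                       for bits in product([True, False], repeat=len(ats)))

    p, q = ('atom', 'p'), ('atom', 'q')
    print(entails([('imp', p, q), p], q))   # True: {p -> q, p} entails q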
Proposition 16.30. Let M be a structure, and φ( x ) a formula with one free vari-
able x, and t a closed term. Then:
1. φ(t) ⊨ ∃ x φ( x )
2. ∀ x φ( x ) ⊨ φ(t)
2. Exercise.
Problems
Problem 16.1. Is N, the standard model of arithmetic, covered? Explain.
Problem 16.2. Let L = {c, f , A} with one constant symbol, one one-place
function symbol and one two-place predicate symbol, and let the structure M
be given by
1. |M| = {1, 2, 3}
2. cM = 3
1. φ ≡ ⊥: not M ||= φ.
3. φ ≡ d1 = d2: M ||= φ iff d1^M = d2^M.
8. φ ≡ ∀ x ψ: M ||= φ iff for all a ∈ |M|, M[ a/c] ||= ψ[c/x ], if c does not
occur in ψ.
Problem 16.7. Suppose that f is a function symbol not in φ( x, y). Show that
there is a structure M such that M ⊨ ∀ x ∃y φ( x, y) iff there is an M′ such that
M′ ⊨ ∀ x φ( x, f ( x )).
(This problem is a special case of what’s known as Skolem’s Theorem;
∀ x φ( x, f ( x )) is called a Skolem normal form of ∀ x ∃y φ( x, y).)
17.1 Introduction
The development of the axiomatic method is a significant achievement in the
history of science, and is of special importance in the history of mathemat-
ics. An axiomatic development of a field involves the clarification of many
questions: What is the field about? What are the most fundamental concepts?
How are they related? Can all the concepts of the field be defined in terms of
these fundamental concepts? What laws do, and must, these concepts obey?
The axiomatic method and logic were made for each other. Formal logic
provides the tools for formulating axiomatic theories, for proving theorems
from the axioms of the theory in a precisely specified way, and for studying the
properties of all systems satisfying the axioms in a systematic way.
2. We may fail in this respect because there are M such that M ⊨ Γ, but M
is not one of the structures we intend. This may lead us to add axioms
which are not true in M.
3. If we are successful at least in the respect that Γ is true in all the intended
structures, then a sentence φ is true in all intended structures whenever
Γ ⊨ φ. Thus we can use logical tools (such as derivation methods) to
show that sentences are true in all intended structures simply by show-
ing that they are entailed by the axioms.
( φ(0) ∧ ∀ x ( φ( x ) → φ( x ′ ))) → ∀ x φ( x )
Since there are infinitely many sentences of the latter form, this axiom sys-
tem is infinite. The latter form is called the induction schema. (Actually, the
induction schema is a bit more complicated than we let on here.)
The last axiom is an explicit definition of <.
Example 17.7. The theory of pure sets plays an important role in the founda-
tions (and in the philosophy) of mathematics. A set is pure if all its elements
are also pure sets. The empty set counts therefore as pure, but a set that has
something as an element that is not a set would not be pure. So the pure sets
are those that are formed just from the empty set and no “urelements,” i.e.,
objects that are not themselves sets.
The following might be considered as an axiom system for a theory of pure
sets:
∃ x ¬∃y y ∈ x
∀ x ∀y (∀z(z ∈ x ↔ z ∈ y) → x = y)
∀ x ∀y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y))
∀ x ∃y ∀z (z ∈ y ↔ ∃u (z ∈ u ∧ u ∈ x ))
∃ x ∀y (y ∈ x ↔ φ(y))
The first axiom says that there is a set with no elements (i.e., ∅ exists); the
second says that sets are extensional; the third that for any sets X and Y, the
set { X, Y } exists; the fourth that for any set X, the set ∪ X exists, where ∪ X is
the union of all the elements of X.
The sentences mentioned last are collectively called the naive comprehension
scheme. It essentially says that for every φ( x ), the set { x : φ( x )} exists—so
at first glance a true, useful, and perhaps even necessary axiom. It is called
“naive” because, as it turns out, it makes this theory unsatisfiable: if you take
φ(y) to be ¬y ∈ y, you get the sentence
∃ x ∀y (y ∈ x ↔ ¬y ∈ y)
∀ x P ( x, x )
∀ x ∀y ((P ( x, y) ∧ P (y, x )) → x = y)
∀ x ∀y ∀z ((P ( x, y) ∧ P (y, z)) → P ( x, z))
Moreover, any two objects have a mereological sum (an object that has these
two objects as parts, and is minimal in this respect).
These are only some of the basic principles of parthood considered by meta-
physicians. Further principles, however, quickly become hard to formulate or
write down without first introducing some defined relations. For instance,
Note that we have to involve variable assignments here: we can’t just say “Rab
iff M ⊨ A20 ( a, b)” because a and b are not symbols of our language: they are
elements of |M|.
Since we don’t just have atomic formulas, but can combine them using
the logical connectives and the quantifiers, more complex formulas can define
other relations which aren’t directly built into M. We’re interested in how to
do that, and specifically, which relations we can define in a structure.
This idea is not just interesting in specific structures, but generally when-
ever we use a language to describe an intended model or models, i.e., when
we consider theories. These theories often only contain a few predicate sym-
bols as basic symbols, but in the domain they are used to describe often many
other relations play an important role. If these other relations can be system-
atically expressed by the relations that interpret the basic predicate symbols
of the language, we say we can define them in the language.
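For a concrete, made-up illustration of this idea (the relation and names below are assumptions for the example, not from the text): take a finite structure whose only basic relation is a two-place relation P, and consider the formula ∃z (P(x, z) ∧ P(z, y)). The relation it defines can be computed by running through the domain.

    # A small structure: a domain and the interpretation of the basic predicate P.
    domain = {1, 2, 3, 4}
    P = {(1, 2), (2, 3), (3, 4)}          # the basic relation P^M

    # The relation defined by the formula  ∃z (P(x, z) ∧ P(z, y)):
    defined = {(x, y)
               for x in domain for y in domain
               if any((x, z) in P and (z, y) in P for z in domain)}

    print(defined)   # {(1, 3), (2, 4)}: definable from P, though not itself a basic relation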
∀z (z ∈ x → z ∈ y)
∀ x ∀y ((∀z (z ∈ x → z ∈ y) ∧ ∀z (z ∈ y → z ∈ x )) → x = y).
For example, to express the fact that ∅ is a subset of every set, we could write
∃ x (¬∃y y ∈ x ∧ ∀z x ⊆ z)
∀u ((u ∈ x ∨ u ∈ y) ↔ u ∈ z)
∀u (u ⊆ x ↔ u ∈ y)
since the elements of X ∪ Y are exactly the sets that are either elements of X or
elements of Y, and the elements of ℘( X ) are exactly the subsets of X. However,
this doesn’t allow us to use x ∪ y or ℘( x ) as if they were terms: we can only
use the entire formulas that define the relations X ∪ Y = Z and ℘( X ) = Y.
In fact, we do not know that these relations are ever satisfied, i.e., we do not
know that unions and power sets always exist. For instance, the sentence
∀ x ∃y ℘( x ) = y is another axiom of ZFC (the power set axiom).
Now what about talk of ordered pairs or functions? Here we have to ex-
plain how we can think of ordered pairs and functions as special kinds of sets.
One way to define the ordered pair ⟨ x, y⟩ is as the set {{ x }, { x, y}}. But like
before, we cannot introduce a function symbol that names this set; we can
only define the relation ⟨ x, y⟩ = z, i.e., {{ x }, { x, y}} = z:
∀u (u ∈ z ↔ (∀v (v ∈ u ↔ v = x ) ∨ ∀v (v ∈ u ↔ (v = x ∨ v = y))))
This says that the elements u of z are exactly those sets which either have x
as their only element or have x and y as their only elements (in other words, those
sets that are either identical to { x } or identical to { x, y}). Once we have this,
we can say further things, e.g., that X × Y = Z:
∀z (z ∈ Z ↔ ∃ x ∃y ( x ∈ X ∧ y ∈ Y ∧ ⟨ x, y⟩ = z))
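As a sanity check on this coding of pairs (a side illustration, not part of the text), one can verify on finite sets that {{x}, {x, y}} = {{u}, {u, v}} holds exactly when x = u and y = v:

    def kpair(x, y):
        """The ordered pair ⟨x, y⟩ coded as the set {{x}, {x, y}}."""
        return frozenset({frozenset({x}), frozenset({x, y})})

    # Check the characteristic property of ordered pairs on a small test domain.
    dom = range(4)
    assert all((kpair(x, y) == kpair(u, v)) == (x == u and y == v)
               for x in dom for y in dom for u in dom for v in dom)
    print("pair property holds on the test domain")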
∀u (u ∈ f → ∃ x ∃y ( x ∈ X ∧ y ∈ Y ∧ ⟨ x, y⟩ = u)) ∧
∀ x ( x ∈ X → (∃y (y ∈ Y ∧ maps( f , x, y)) ∧
(∀y ∀y′ ((maps( f , x, y) ∧ maps( f , x, y′ )) → y = y′ )))
f : X → Y ∧ ∀ x ∀ x ′ (( x ∈ X ∧ x ′ ∈ X ∧
∃y (maps( f , x, y) ∧ maps( f , x ′ , y))) → x = x ′ )
A function f : X → Y is injective iff, whenever f maps x, x ′ ∈ X to a single y,
x = x ′ . If we abbreviate this formula as inj( f , X, Y ), we’re already in a position
to state in the language of set theory something as non-trivial as Cantor’s
theorem: there is no injective function from ℘( X ) to X:
∀ X ∀Y (℘( X ) = Y → ¬∃ f inj( f , Y, X ))
One might think that set theory requires another axiom that guarantees
the existence of a set for every defining property. If φ( x ) is a formula of set
theory with the variable x free, we can consider the sentence
∃y ∀ x ( x ∈ y ↔ φ( x )).
This sentence states that there is a set y whose elements are all and only those
x that satisfy φ( x ). This schema is called the “comprehension principle.” It
looks very useful; unfortunately it is inconsistent. Take φ( x ) ≡ ¬ x ∈ x, then
the comprehension principle states
∃y ∀ x ( x ∈ y ↔ x ∉ x ),
i.e., it states the existence of a set of all sets that are not elements of them-
selves. No such set can exist—this is Russell’s Paradox. ZFC, in fact, contains
a restricted—and consistent—version of this principle, the separation princi-
ple:
∀z ∃y ∀ x ( x ∈ y ↔ ( x ∈ z ∧ φ( x ))).
φ≥n ≡ ∃x1 ∃x2 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ · · · ∧ x1 ≠ xn ∧
          x2 ≠ x3 ∧ x2 ≠ x4 ∧ · · · ∧ x2 ≠ xn ∧
          . . .
          xn−1 ≠ xn)

φ=n ≡ ∃x1 ∃x2 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x1 ≠ x4 ∧ · · · ∧ x1 ≠ xn ∧
          x2 ≠ x3 ∧ x2 ≠ x4 ∧ · · · ∧ x2 ≠ xn ∧
          . . .
          xn−1 ≠ xn ∧
          ∀y (y = x1 ∨ · · · ∨ y = xn))

{φ≥1, φ≥2, φ≥3, . . . }.
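The sentences φ≥n are entirely schematic, so they can be generated mechanically. The following small sketch (an aside, not part of the text) produces φ≥n as a string for any n; the stand-in for the degenerate case n = 1 is our own choice, not the book's.

    def phi_geq(n):
        """The sentence φ≥n: there are at least n distinct objects."""
        xs = [f"x{i}" for i in range(1, n + 1)]
        prefix = " ".join(f"∃{x}" for x in xs)
        diffs = [f"{xs[i]} ≠ {xs[j]}" for i in range(n) for j in range(i + 1, n)]
        matrix = " ∧ ".join(diffs) if diffs else "⊤"   # n = 1 has an empty matrix; ⊤ is a stand-in
        return f"{prefix} ({matrix})"

    print(phi_geq(3))   # ∃x1 ∃x2 ∃x3 (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x2 ≠ x3)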
Problems
Problem 17.1. Find formulas in L A which define the following relations:
1. n is between i and j;
Problem 17.2. Suppose the formula φ(v1 , v2 ) expresses the relation R ⊆ |M|2
in a structure M. Find formulas that express the following relations:
2. {1} is definable in N;
3. {2} is definable in N;
∃y ∀ x ( x ∈ y ↔ x ∉ x ) ⊢ ⊥.
Derivation Systems
18.1 Introduction
Logics commonly have both a semantics and a derivation system. The seman-
tics concerns concepts such as truth, satisfiability, validity, and entailment.
The purpose of derivation systems is to provide a purely syntactic method
of establishing entailment and validity. They are purely syntactic in the sense
that a derivation in such a system is a finite syntactic object, usually a sequence
(or other finite arrangement) of sentences or formulas. Good derivation sys-
tems have the property that any given sequence or arrangement of sentences
or formulas can be verified mechanically to be “correct.”
The simplest (and historically first) derivation systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation in such a sys-
tem if each individual formula in it is either among a fixed set of “axioms”
or follows from formulas coming before it in the sequence by one of a fixed
number of “inference rules”—and it can be mechanically verified whether a formula
is an axiom and whether it follows correctly from other formulas by one of the
inference rules. Axiomatic derivation systems are easy to describe—and also
easy to handle meta-theoretically—but derivations in them are hard to read
and understand, and are also hard to produce.
Other derivation systems have been developed with the aim of making it
easier to construct derivations or easier to understand derivations once they
are complete. Examples are natural deduction, truth trees, also known as
tableaux proofs, and the sequent calculus. Some derivation systems are de-
1. ⊢ φ if and only if ⊨ φ
2. Γ ⊢ φ if and only if Γ ⊨ φ
The “only if” direction of the above is called soundness. A derivation system is
sound if derivability guarantees entailment (or validity). Every decent deriva-
tion system has to be sound; unsound derivation systems are not useful at all.
After all, the entire purpose of a derivation is to provide a syntactic guarantee
of validity or entailment. We’ll prove soundness for the derivation systems
we present.
The converse “if” direction is also important: it is called completeness. A
complete derivation system is strong enough to show that φ is a theorem
whenever φ is valid, and that Γ ⊢ φ whenever Γ ⊨ φ. Completeness is harder
to establish, and some logics have no complete derivation systems. First-order
logic does. Kurt Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is that of consis-
tency. A set of sentences is called inconsistent if anything whatsoever can be
derived from it, and consistent otherwise. Inconsistency is the syntactic coun-
terpart to unsatisfiablity: like unsatisfiable sets, inconsistent sets of sentences
do not make good theories, they are defective in a fundamental way. Con-
sistent sets of sentences may not be true or useful, but at least they pass that
minimal threshold of logical usefulness. For different derivation systems the
specific definition of consistency of sets of sentences might differ, but like ⊢,
we want consistency to coincide with its semantic counterpart, satisfiability.
We want it to always be the case that Γ is consistent if and only if it is satis-
fiable. Here, the “if” direction amounts to completeness (consistency guaran-
tees satisfiability), and the “only if” direction amounts to soundness (satisfi-
ability guarantees consistency). In fact, for classical first-order logic, the two
versions of soundness and completeness are equivalent.
    φ ⇒ φ
    -------------- ∧L
    φ ∧ ψ ⇒ φ
    -------------- →R
    ⇒ (φ ∧ ψ) → φ

    [φ ∧ ψ]^1
    --------- ∧Elim
        φ
    ------------- 1 →Intro
    (φ ∧ ψ) → φ
inference.
A set Γ is inconsistent iff Γ ⊢ ⊥ in natural deduction. The rule ⊥ I makes
it so that from an inconsistent set, any sentence can be derived.
Natural deduction systems were developed by Gerhard Gentzen and Sta-
nisław Jaśkowski in the 1930s, and later developed by Dag Prawitz and Fred-
eric Fitch. Because its inferences mirror natural methods of proof, it is favored
by philosophers. The versions developed by Fitch are often used in introduc-
tory logic textbooks. In the philosophy of logic, the rules of natural deduc-
tion have sometimes been taken to give the meanings of the logical operators
(“proof-theoretic semantics”).
18.4 Tableaux
T φ or F φ.
{F φ, Tψ1 , . . . , Tψn }
1. F (φ ∧ ψ) → φ    Assumption
2. T φ ∧ ψ          →F 1
3. F φ              →F 1
4. T φ              ∧T 2
5. T ψ              ∧T 2
⊗
1. φ is an axiom, or
φ → (ψ → φ) ψ → (ψ ∨ χ) (ψ ∧ χ) → ψ
are common axioms that govern →, ∨ and ∧. Some axiom systems aim at a
minimal number of axioms. Depending on the connectives that are taken as
primitives, it is even possible to find axiom systems that consist of a single
axiom.
A rule of inference is a conditional statement that gives a sufficient condi-
tion for a sentence in a derivation to be justified. Modus ponens is one very
common such rule: it says that if φ and φ → ψ are already justified, then ψ is
justified. This means that a line in a derivation containing the sentence ψ is
justified, provided that both φ and φ → ψ (for some sentence φ) appear in the
derivation before ψ.
The ⊢ relation based on axiomatic derivations is defined as follows: Γ ⊢ φ
iff there is a derivation with the sentence φ as its last formula (and Γ is taken
as the set of sentences in that derivation which are justified by (2) above). φ
is a theorem if φ has a derivation where Γ is empty, i.e., every sentence in the
derivation is justified either by (1) or (3). For instance, here is a derivation that
shows that ⊢ φ → (ψ → (ψ ∨ φ)):
1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → ( φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))
The sentence on line 1 is of the form of the axiom φ → ( φ ∨ ψ) (with the roles
of φ and ψ reversed). The sentence on line 2 is of the form of the axiom φ →
(ψ → φ). Thus, both lines are justified. Line 3 is justified by modus ponens: if
we abbreviate it as θ, then line 2 has the form χ → θ, where χ is ψ → (ψ ∨ φ),
i.e., line 1.
A set Γ is inconsistent if Γ ⊢ ⊥. A complete axiom system will also prove
that ⊥ → φ for any φ, and so if Γ is inconsistent, then Γ ⊢ φ for any φ.
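To make the definition of ⊢ concrete, here is a small sketch (not from the text, and only covering a fragment of the job) of how a purported axiomatic derivation might be checked. Formulas are nested tuples with ('imp', φ, ψ) for φ → ψ; each line carries a justification, and the checker verifies the modus ponens steps and collects the premises used, treating "is an axiom instance" as given.

    def check_derivation(lines):
        """lines: list of (formula, justification); a justification is 'axiom',
        'premise', or ('mp', i, j) with i, j earlier 0-based line numbers.
        Returns the set of premises used, or raises ValueError."""
        premises = set()
        for k, (phi, just) in enumerate(lines):
            if just == 'axiom':
                continue                        # taken on trust in this sketch
            if just == 'premise':
                premises.add(phi)
                continue
            rule, i, j = just                   # modus ponens from lines i and j
            if rule != 'mp' or i >= k or j >= k:
                raise ValueError(f"bad justification on line {k}")
            if lines[j][0] != ('imp', lines[i][0], phi):
                raise ValueError(f"line {k} does not follow by modus ponens")
        return premises

    # The derivation of φ → (ψ → (ψ ∨ φ)) given above, with φ and ψ left schematic:
    phi, psi = ('atom', 'φ'), ('atom', 'ψ')
    l1 = ('imp', psi, ('or', psi, phi))
    l2 = ('imp', l1, ('imp', phi, l1))
    l3 = ('imp', phi, l1)
    print(check_derivation([(l1, 'axiom'), (l2, 'axiom'), (l3, ('mp', 0, 1))]))  # set(): a theorem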
Systems of axiomatic derivations for logic were first given by Gottlob Frege
in his 1879 Begriffsschrift, which for this reason is often considered the first
work of modern logic. They were perfected in Alfred North Whitehead and
Bertrand Russell’s Principia Mathematica and by David Hilbert and his stu-
dents in the 1920s. They are thus often called “Frege systems” or “Hilbert
systems.” They are very versatile in that it is often easy to find an axiomatic
system for a logic. Because derivations have a very simple structure and only
one or two inference rules, it is also relatively easy to prove things about them.
However, they are very hard to use in practice, i.e., it is difficult to find and
write proofs.
Γ⇒∆
where Γ and ∆ are finite (possibly empty) sequences of sentences of the lan-
guage L. Γ is called the antecedent, while ∆ is the succedent.
The intuitive idea behind a sequent is: if all of the sentences in the an-
tecedent hold, then at least one of the sentences in the succedent holds. That
is, if Γ = ⟨ φ1 , . . . , φm ⟩ and ∆ = ⟨ψ1 , . . . , ψn ⟩, then Γ ⇒ ∆ holds iff
( φ1 ∧ · · · ∧ φm ) → (ψ1 ∨ · · · ∨ ψn )
holds. There are two special cases: where Γ is empty and when ∆ is empty.
When Γ is empty, i.e., m = 0, ⇒ ∆ holds iff ψ1 ∨ · · · ∨ ψn holds. When ∆ is
empty, i.e., n = 0, Γ ⇒ holds iff ¬( φ1 ∧ · · · ∧ φm ) does. We say a sequent is
valid iff the corresponding sentence is valid.
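The correspondence just described is easy to state computationally. In the following sketch (an illustration, not part of the text), a sequent is given by two lists of formulas and a test for truth under a fixed valuation; note that the two special cases for empty Γ and empty ∆ come out automatically, since an empty conjunction of conditions is trivially met and an empty disjunction is not.

    def sequent_holds(antecedent, succedent, true_under_v):
        """true_under_v: a function deciding truth of a formula under a fixed valuation."""
        if all(true_under_v(f) for f in antecedent):
            return any(true_under_v(f) for f in succedent)
        return True

    # e.g., with sentence letters valued by a dict v:
    v = {'p': True, 'q': False}
    print(sequent_holds(['p', 'q'], ['q'], lambda f: v[f]))   # True (antecedent not all true)
    print(sequent_holds(['p'], [], lambda f: v[f]))           # False: "p ⇒" fails when p is true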
If Γ is a sequence of sentences, we write Γ, φ for the result of appending
φ to the right end of Γ (and φ, Γ for the result of appending φ to the left end
of Γ). If ∆ is a sequence of sentences also, then Γ, ∆ is the concatenation of the
two sequences.
19.2 Propositional Rules
1. φ ⇒ φ
2. ⊥ ⇒
Rules for ¬

    Γ ⇒ ∆, φ                  φ, Γ ⇒ ∆
    ----------- ¬L            ----------- ¬R
    ¬φ, Γ ⇒ ∆                 Γ ⇒ ∆, ¬φ

Rules for ∧

    φ, Γ ⇒ ∆                  ψ, Γ ⇒ ∆                 Γ ⇒ ∆, φ    Γ ⇒ ∆, ψ
    ------------- ∧L          ------------- ∧L         --------------------- ∧R
    φ ∧ ψ, Γ ⇒ ∆              φ ∧ ψ, Γ ⇒ ∆             Γ ⇒ ∆, φ ∧ ψ

Rules for ∨

    φ, Γ ⇒ ∆    ψ, Γ ⇒ ∆                Γ ⇒ ∆, φ                  Γ ⇒ ∆, ψ
    --------------------- ∨L            ------------- ∨R          ------------- ∨R
    φ ∨ ψ, Γ ⇒ ∆                        Γ ⇒ ∆, φ ∨ ψ              Γ ⇒ ∆, φ ∨ ψ

Rules for →

    Γ ⇒ ∆, φ    ψ, Π ⇒ Λ                φ, Γ ⇒ ∆, ψ
    ---------------------- →L           -------------- →R
    φ → ψ, Γ, Π ⇒ ∆, Λ                  Γ ⇒ ∆, φ → ψ

Rules for ∀

    φ(t), Γ ⇒ ∆                Γ ⇒ ∆, φ(a)
    --------------- ∀L         ---------------- ∀R
    ∀x φ(x), Γ ⇒ ∆             Γ ⇒ ∆, ∀x φ(x)

Rules for ∃

    φ(a), Γ ⇒ ∆                Γ ⇒ ∆, φ(t)
    --------------- ∃L         ---------------- ∃R
    ∃x φ(x), Γ ⇒ ∆             Γ ⇒ ∆, ∃x φ(x)
Again, t is a closed term, and a is a constant symbol which does not occur in
the lower sequent of the ∃L rule. We call a the eigenvariable of the ∃L inference.
The condition that an eigenvariable not occur in the lower sequent of the
∀R or ∃L inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we
indicate this by writing φ( x ). In the same context, φ(t) then is short for φ[t/x ].
So we could also write the ∃R rule as:
    Γ ⇒ ∆, φ[t/x]
    -------------- ∃R
    Γ ⇒ ∆, ∃x φ
Note that t may already occur in φ, e.g., φ might be P (t, x ). Thus, inferring
Γ ⇒ ∆, ∃ x P (t, x ) from Γ ⇒ ∆, P (t, t) is a correct application of ∃R—you
may “replace” one or more, and not necessarily all, occurrences of t in the
premise by the bound variable x. However, the eigenvariable conditions in
∀R and ∃L require that the constant symbol a does not occur in φ. So, you
cannot correctly infer Γ ⇒ ∆, ∀x P(a, x) from Γ ⇒ ∆, P(a, a) using ∀R. (We
use the term “eigenvariable” even though a in the rules above is a constant
symbol. This has historical reasons.)
In ∃R and ∀L there are no restrictions on the term t. On the other hand,
in the ∃L and ∀R rules, the eigenvariable condition requires that the constant
symbol a does not occur anywhere outside of φ( a) in the upper sequent. It is
necessary to ensure that the system is sound, i.e., only derives sequents that
are valid. Without this condition, the following would be allowed:
    φ(a) ⇒ φ(a)                       φ(a) ⇒ φ(a)
    ------------------ *∃L            ------------------ *∀R
    ∃x φ(x) ⇒ φ(a)                    φ(a) ⇒ ∀x φ(x)
    ------------------ ∀R             ------------------ ∃L
    ∃x φ(x) ⇒ ∀x φ(x)                 ∃x φ(x) ⇒ ∀x φ(x)
Weakening

    Γ ⇒ ∆                     Γ ⇒ ∆
    ----------- WL            ----------- WR
    φ, Γ ⇒ ∆                  Γ ⇒ ∆, φ

Contraction

    φ, φ, Γ ⇒ ∆               Γ ⇒ ∆, φ, φ
    ------------- CL          ------------- CR
    φ, Γ ⇒ ∆                  Γ ⇒ ∆, φ

Exchange

    Γ, φ, ψ, Π ⇒ ∆            Γ ⇒ ∆, φ, ψ, Λ
    ---------------- XL       ---------------- XR
    Γ, ψ, φ, Π ⇒ ∆            Γ ⇒ ∆, ψ, φ, Λ

Cut

    Γ ⇒ ∆, φ    φ, Π ⇒ Λ
    ---------------------- Cut
    Γ, Π ⇒ ∆, Λ
19.5 Derivations
We’ve said what an initial sequent looks like, and we’ve given the rules of
inference. Derivations in the sequent calculus are inductively generated from
these: each derivation either is an initial sequent on its own, or consists of one
or two derivations followed by an inference.
We then say that S is the end-sequent of the derivation and that S is derivable in
LK (or LK-derivable).
The rule, however, is meant to be general: we can replace the φ in the rule
with any sentence, e.g., also with θ. If the premise matches our initial sequent
χ ⇒ χ, that means that both Γ and ∆ are just χ, and the conclusion would
then be θ, χ ⇒ χ. So, the following is a derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
We can now apply another rule, say XL, which allows us to switch two sen-
tences on the left. So, the following is also a correct derivation:
χ ⇒ χ
WL
θ, χ ⇒ χ
XL
χ, θ ⇒ χ
Γ, φ, ψ, Π ⇒ ∆
XL
Γ, ψ, φ, Π ⇒ ∆,
both Γ and Π were empty, ∆ is χ, and the roles of φ and ψ are played by θ
and χ, respectively. In much the same way, we also see that
θ ⇒ θ
WL
χ, θ ⇒ θ
is a derivation. Now we can take these two derivations, and combine them
using ∧R. That rule was
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ
In our case, the premises must match the last sequents of the derivations end-
ing in the premises. That means that Γ is χ, θ, ∆ is empty, φ is χ and ψ is θ. So
the conclusion, if the inference should be correct, is χ, θ ⇒ χ ∧ θ.
    χ ⇒ χ
    ---------- WL
    θ, χ ⇒ χ                  θ ⇒ θ
    ---------- XL             ---------- WL
    χ, θ ⇒ χ                  χ, θ ⇒ θ
    --------------------------------- ∧R
    χ, θ ⇒ χ ∧ θ
Of course, we can also reverse the premises, then φ would be θ and ψ would
be χ.
                              χ ⇒ χ
                              ---------- WL
    θ ⇒ θ                     θ, χ ⇒ χ
    ---------- WL             ---------- XL
    χ, θ ⇒ θ                  χ, θ ⇒ χ
    --------------------------------- ∧R
    χ, θ ⇒ θ ∧ χ
φ∧ψ ⇒ φ
Next, we need to figure out what kind of inference could have a lower sequent
of this form. This could be a structural rule, but it is a good idea to start by
looking for a logical rule. The only logical connective occurring in the lower
sequent is ∧, so we’re looking for an ∧ rule, and since the ∧ symbol occurs in
the antecedent, we’re looking at the ∧L rule.
φ∧ψ ⇒ φ
∧L
There are two options for what could have been the upper sequent of the ∧L
inference: we could have an upper sequent of φ ⇒ φ, or of ψ ⇒ φ. Clearly,
φ ⇒ φ is an initial sequent (which is a good thing), while ψ ⇒ φ is not
derivable in general. We fill in the upper sequent:
    φ ⇒ φ
    ----------- ∧L
    φ ∧ ψ ⇒ φ
¬φ ∨ ψ ⇒ φ → ψ
To find a logical rule that could give us this end-sequent, we look at the log-
ical connectives in the end-sequent: ¬, ∨, and →. We only care at the mo-
ment about ∨ and → because they are main operators of sentences in the end-
sequent, while ¬ is inside the scope of another connective, so we will take care
of it later. Our options for logical rules for the final inference are therefore the
∨L rule and the →R rule. We could pick either rule, really, but let’s pick the
→R rule (if for no reason other than it allows us to put off splitting into two
branches). According to the form of →R inferences which can yield the lower
sequent, this must look like:
φ, ¬ φ ∨ ψ ⇒ ψ
¬ φ ∨ ψ ⇒ φ → ψ →R
If we move ¬ φ ∨ ψ to the outside of the antecedent, we can apply the ∨L
rule. According to the schema, this must split into two upper sequents as
follows:
    ¬φ, φ ⇒ ψ    ψ, φ ⇒ ψ
    ----------------------- ∨L
    ¬φ ∨ ψ, φ ⇒ ψ
    ----------------------- XL
    φ, ¬φ ∨ ψ ⇒ ψ
    ----------------------- →R
    ¬φ ∨ ψ ⇒ φ → ψ
Remember that we are trying to wind our way up to initial sequents; we seem
to be pretty close! The right branch is just one weakening and one exchange
away from an initial sequent and then it is done:
                            ψ ⇒ ψ
                            ---------- WL
                            φ, ψ ⇒ ψ
                            ---------- XL
    ¬φ, φ ⇒ ψ               ψ, φ ⇒ ψ
    ----------------------------------- ∨L
    ¬φ ∨ ψ, φ ⇒ ψ
    ----------------------------------- XL
    φ, ¬φ ∨ ψ ⇒ ψ
    ----------------------------------- →R
    ¬φ ∨ ψ ⇒ φ → ψ
Now looking at the left branch, the only logical connective in any sentence
is the ¬ symbol in the antecedent sentences, so we’re looking at an instance of
the ¬L rule.
                            ψ ⇒ ψ
                            ---------- WL
    φ ⇒ ψ, φ                φ, ψ ⇒ ψ
    ---------- ¬L           ---------- XL
    ¬φ, φ ⇒ ψ               ψ, φ ⇒ ψ
    ----------------------------------- ∨L
    ¬φ ∨ ψ, φ ⇒ ψ
    ----------------------------------- XL
    φ, ¬φ ∨ ψ ⇒ ψ
    ----------------------------------- →R
    ¬φ ∨ ψ ⇒ φ → ψ
Similarly to how we finished off the right branch, we are just one weakening
and one exchange away from finishing off this left branch as well.
    φ ⇒ φ
    ---------- WR
    φ ⇒ φ, ψ                ψ ⇒ ψ
    ---------- XR           ---------- WL
    φ ⇒ ψ, φ                φ, ψ ⇒ ψ
    ---------- ¬L           ---------- XL
    ¬φ, φ ⇒ ψ               ψ, φ ⇒ ψ
    ----------------------------------- ∨L
    ¬φ ∨ ψ, φ ⇒ ψ
    ----------------------------------- XL
    φ, ¬φ ∨ ψ ⇒ ψ
    ----------------------------------- →R
    ¬φ ∨ ψ ⇒ φ → ψ
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
The available main connectives of sentences in the end-sequent are the ∨ sym-
bol and the ¬ symbol. It would work to apply either the ∨L or the ¬R rule
here, but we start with the ¬R rule because it avoids splitting up into two
branches for a moment:
φ ∧ ψ, ¬ φ ∨ ¬ψ ⇒
¬R
¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ)
Now we have a choice of whether to look at the ∧L or the ∨L rule. Let’s see
what happens when we apply the ∧L rule: we have a choice to start with
either the sequent φ, ¬φ ∨ ¬ψ ⇒ or the sequent ψ, ¬φ ∨ ¬ψ ⇒ . Since the
derivation is symmetric with regards to φ and ψ, let’s go with the former:
    φ, ¬φ ∨ ¬ψ ⇒
    ------------------- ∧L
    φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
    ------------------- ¬R
    ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
The top of the right branch cannot be reduced any further, and it cannot be
brought by way of structural inferences to an initial sequent, so this is not the
right path to take. So clearly, it was a mistake to apply the ∧L rule above.
Going back to what we had before and carrying out the ∨L rule instead, we
get
    ¬φ, φ ∧ ψ ⇒    ¬ψ, φ ∧ ψ ⇒
    ----------------------------- ∨L
    ¬φ ∨ ¬ψ, φ ∧ ψ ⇒
    ----------------------------- XL
    φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
    ----------------------------- ¬R
    ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
    φ ⇒ φ                    ψ ⇒ ψ
    -------------- ∧L        -------------- ∧L
    φ ∧ ψ ⇒ φ                φ ∧ ψ ⇒ ψ
    -------------- ¬L        -------------- ¬L
    ¬φ, φ ∧ ψ ⇒              ¬ψ, φ ∧ ψ ⇒
    --------------------------------------- ∨L
    ¬φ ∨ ¬ψ, φ ∧ ψ ⇒
    --------------------------------------- XL
    φ ∧ ψ, ¬φ ∨ ¬ψ ⇒
    --------------------------------------- ¬R
    ¬φ ∨ ¬ψ ⇒ ¬(φ ∧ ψ)
(We could have carried out the ∧ rules lower than the ¬ rules in these steps
and still obtained a correct derivation).
Example 19.8. So far we haven’t used the contraction rule, but it is sometimes
required. Here’s an example where that happens. Suppose we want to prove
⇒ φ ∨ ¬ φ. Applying ∨R backwards would give us one of these two deriva-
tions:
    ⇒ φ                            φ ⇒
    ------------ ∨R                -------- ¬R
    ⇒ φ ∨ ¬φ                       ⇒ ¬φ
                                   ------------ ∨R
                                   ⇒ φ ∨ ¬φ
Neither of these of course ends in an initial sequent. The trick is to realize that
the contraction rule allows us to combine two copies of a sentence into one—
and when we’re searching for a proof, i.e., going from bottom to top, we can
keep a copy of φ ∨ ¬ φ in the premise, e.g.,
    ⇒ φ ∨ ¬φ, φ
    --------------------- ∨R
    ⇒ φ ∨ ¬φ, φ ∨ ¬φ
    --------------------- CR
    ⇒ φ ∨ ¬φ
Now we can apply ∨R a second time, and also get ¬ φ, which leads to a com-
plete derivation.
    φ ⇒ φ
    --------------------- ¬R
    ⇒ φ, ¬φ
    --------------------- ∨R
    ⇒ φ, φ ∨ ¬φ
    --------------------- XR
    ⇒ φ ∨ ¬φ, φ
    --------------------- ∨R
    ⇒ φ ∨ ¬φ, φ ∨ ¬φ
    --------------------- CR
    ⇒ φ ∨ ¬φ
∃ x ¬ φ( x ) ⇒ ¬∀ x φ( x )
We could either carry out the ∃L rule or the ¬R rule. Since the ∃L rule is
subject to the eigenvariable condition, it’s a good idea to take care of it sooner
rather than later, so we’ll do that one first.
¬ φ( a) ⇒ ¬∀ x φ( x )
∃L
∃ x ¬ φ( x ) ⇒ ¬∀ x φ( x )
∀ x φ( x ) ⇒ φ( a)
¬L
¬ φ ( a ), ∀ x φ ( x ) ⇒
XL
∀ x φ ( x ), ¬ φ ( a ) ⇒
¬R
¬ φ( a) ⇒ ¬∀ xφ( x )
∃L
∃ x ¬ φ( x ) ⇒ ¬∀ xφ( x )
At this point, our only option is to carry out the ∀L rule. Since this rule is not
subject to the eigenvariable restriction, we’re in the clear. Remember, we want
to try and obtain an initial sequent (of the form φ( a) ⇒ φ( a)), so we should
choose a as our argument for φ when we apply the rule.
φ( a) ⇒ φ( a)
∀L
∀ x φ( x ) ⇒ φ( a)
¬L
¬ φ ( a ), ∀ x φ ( x ) ⇒
XL
∀ x φ ( x ), ¬ φ ( a ) ⇒
¬R
¬ φ( a) ⇒ ¬∀ x φ( x )
∃L
∃ x ¬ φ( x ) ⇒ ¬∀ x φ( x )
This section collects the definitions of the provability relation and con-
sistency for the sequent calculus.
Because of the contraction, weakening, and exchange rules, the order and
number of sentences in Γ0′ does not matter: if a sequent Γ0′ ⇒ φ is deriv-
able, then so is Γ0′′ ⇒ φ for any Γ0′′ that contains the same sentences as Γ0′ .
For instance, if Γ0 = {ψ, χ} then both Γ0′ = ⟨ψ, ψ, χ⟩ and Γ0′′ = ⟨χ, χ, ψ⟩ are
sequences containing just the sentences in Γ0 . If a sequent containing one is
derivable, so is the other, e.g.:
ψ, ψ, χ ⇒ φ
CL
ψ, χ ⇒ φ
XL
χ, ψ ⇒ φ
WL
χ, χ, ψ ⇒ φ
π0 π1
Γ0 ⇒ φ φ, ∆ 0 ⇒ ψ
Cut
Γ0 , ∆ 0 ⇒ ψ
Proof. Exercise.
π0 π1
Γ0 ⇒ φ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
π1
φ ⇒ φ
⇒ φ, ¬ φ ¬R ¬ φ, Γ ⇒
Cut
Γ ⇒ φ
π φ ⇒ φ
¬ φ, φ ⇒ ¬L
Γ0 ⇒ φ φ, ¬ φ ⇒ XL
Cut
Γ, ¬ φ ⇒
π0
π1
φ, Γ0 ⇒
¬R
Γ0 ⇒ ¬ φ ¬ φ, Γ1 ⇒
Cut
Γ0 , Γ1 ⇒
2. φ, ψ ⊢ φ ∧ ψ.
φ ⇒ φ ψ ⇒ ψ
φ∧ψ ⇒ φ
∧L φ∧ψ ⇒ ψ
∧L
φ ⇒ φ ψ ⇒ ψ
φ, ψ ⇒ φ ∧ ψ
∧R
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
φ ⇒ φ ψ ⇒ ψ
¬ φ, φ ⇒ ¬L ¬ψ, ψ ⇒ ¬L
φ, ¬ φ, ¬ψ ⇒ ψ, ¬ φ, ¬ψ ⇒
φ ∨ ψ, ¬ φ, ¬ψ ⇒
∨L
φ ⇒ φ ψ ⇒ ψ
φ ⇒ φ∨ψ
∨R ψ ⇒ φ∨ψ
∨R
Proposition 19.24. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
φ ⇒ φ ψ ⇒ ψ
φ → ψ, φ ⇒ ψ
→L
φ ⇒ φ
¬ φ, φ ⇒ ¬L
φ, ¬ φ ⇒ XL ψ ⇒ ψ
φ, ¬ φ ⇒ ψ WR φ, ψ ⇒ ψ
WL
¬φ ⇒ φ → ψ →R ψ ⇒ φ→ψ
→R
2. ∀ x φ( x ) ⊢ φ(t).
φ(t) ⇒ φ(t)
∃R
φ(t) ⇒ ∃ x φ( x )
φ(t) ⇒ φ(t)
∀L
∀ x φ( x ) ⇒ φ(t)
19.12 Soundness
A derivation system, such as the sequent calculus, is sound if it cannot de-
rive things that do not actually hold. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know for instance, that
Γ ⇒ ∆ Γ ⇒ ∆
WL WR
φ, Γ ⇒ ∆ Γ ⇒ ∆, φ
Γ ⇒ ∆, φ
¬L
¬ φ, Γ ⇒ ∆
and Θ = ¬ φ, Γ while Ξ = ∆.
The induction hypothesis tells us that Γ ⇒ ∆, φ is valid, i.e., for every
M, either (a) for some χ ∈ Γ, M ⊭ χ, or (b) for some χ ∈ ∆, M ⊨ χ, or (c)
M ⊨ φ. We want to show that Θ ⇒ Ξ is also valid. Let M be a structure.
If (a) holds, then there is χ ∈ Γ so that M ⊭ χ, but χ ∈ Θ as well. If
(b) holds, there is χ ∈ ∆ such that M ⊨ χ, but χ ∈ Ξ as well. Finally, if
M ⊨ φ, then M ⊭ ¬ φ. Since ¬ φ ∈ Θ, there is χ ∈ Θ such that M ⊭ χ.
Consequently, Θ ⇒ Ξ is valid.
3. The last inference is ¬R: Exercise.
4. The last inference is ∧L: There are two variants: φ ∧ ψ may be inferred
on the left from φ or from ψ on the left side of the premise. In the first
case, the π ends in
φ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆
Γ ⇒ ∆, φ
∨R
Γ ⇒ ∆, φ ∨ ψ
φ, Γ ⇒ ∆, ψ
→R
Γ ⇒ ∆, φ → ψ
Again, the induction hypothesis says that the premise is valid; we want
to show that the conclusion is valid as well. Let M be arbitrary. Since
φ, Γ ⇒ ∆, ψ is valid, at least one of the following cases obtains: (a) M ⊭
φ, (b) M ⊨ ψ, (c) M ⊭ χ for some χ ∈ Γ, or (d) M ⊨ χ for some χ ∈ ∆.
In cases (a) and (b), M ⊨ φ → ψ and so there is a χ ∈ ∆, φ → ψ such that
M ⊨ χ. In case (c), for some χ ∈ Γ, M ⊭ χ. In case (d), for some χ ∈ ∆,
M ⊨ χ. In each case, M satisfies Γ ⇒ ∆, φ → ψ. Since M was arbitrary,
Γ ⇒ ∆, φ → ψ is valid.
7. The last inference is ∀L: Then there is a formula φ( x ) and a closed term t
such that π ends in
φ ( t ), Γ ⇒ ∆
∀L
∀ x φ ( x ), Γ ⇒ ∆
Γ ⇒ ∆, φ( a)
∀R
Γ ⇒ ∆, ∀ x φ( x )
is valid. We have to show that the conclusion is valid as well, i.e., that
for any structure M, (a) M ⊨ ∀ x φ( x ), (b) M ⊭ χ for some χ ∈ Γ, or
(c) M ⊨ χ for some χ ∈ ∆.
Suppose M is an arbitrary structure. If (b) or (c) holds, we are done, so
suppose neither holds: for all χ ∈ Γ, M ⊨ χ, and for all χ ∈ ∆, M ⊭ χ.
We have to show that (a) holds, i.e., M ⊨ ∀x φ(x). By Proposition 16.18,
it suffices to show that M, s ⊨ φ(x) for all variable assignments s. So let s
be an arbitrary variable assignment. Consider the structure M′ which is
just like M except a^{M′} = s(x). By Corollary 16.20, for any χ ∈ Γ, M′ ⊨ χ
since a does not occur in Γ, and for any χ ∈ ∆, M′ ⊭ χ. But the premise
is valid, so M′ ⊨ φ(a). By Proposition 16.17, M′, s ⊨ φ(a), since φ(a) is
a sentence. Now s ∼x s with s(x) = Val^{M′}_s(a), since we’ve defined M′
in just this way. So Proposition 16.22 applies, and we get M′, s ⊨ φ(x).
Since a does not occur in φ(x), by Proposition 16.19, M, s ⊨ φ(x). Since s
was arbitrary, we’ve completed the proof that M, s ⊨ φ(x) for all variable
assignments.
10. The last inference is ∃L: Exercise.
Now let’s consider the possible inferences with two premises.
1. The last inference is a cut: then π ends in
Γ ⇒ ∆, φ φ, Π ⇒ Λ
Cut
Γ, Π ⇒ ∆, Λ
Γ ⇒ ∆, φ Γ ⇒ ∆, ψ
∧R
Γ ⇒ ∆, φ ∧ ψ
Γ ⇒ ∆, φ ψ, Π ⇒ Λ
→L
φ → ψ, Γ, Π ⇒ ∆, Λ
    t1 = t2, Γ ⇒ ∆, φ(t1)                 t1 = t2, Γ ⇒ ∆, φ(t2)
    ----------------------- =             ----------------------- =
    t1 = t2, Γ ⇒ ∆, φ(t2)                 t1 = t2, Γ ⇒ ∆, φ(t1)
Proof. Initial sequents of the form ⇒ t = t are valid, since for every struc-
ture M, M ⊨ t = t. (Note that we assume the term t to be closed, i.e., it
contains no variables, so variable assignments are irrelevant).
Suppose the last inference in a derivation is =. Then the premise is t1 =
t2 , Γ ⇒ ∆, φ(t1 ) and the conclusion is t1 = t2 , Γ ⇒ ∆, φ(t2 ). Consider a struc-
ture M. We need to show that the conclusion is valid, i.e., if M ⊨ t1 = t2 and
M ⊨ Γ, then either M ⊨ χ for some χ ∈ ∆ or M ⊨ φ(t2 ).
By induction hypothesis, the premise is valid. This means that if M ⊨
t1 = t2 and M ⊨ Γ either (a) for some χ ∈ ∆, M ⊨ χ or (b) M ⊨ φ(t1 ).
In case (a) we are done. Consider case (b). Let s be a variable assignment
with s(x) = Val^M(t1). By Proposition 16.17, M, s ⊨ φ(t1). Since s ∼x s,
by Proposition 16.22, M, s ⊨ φ(x). Since M ⊨ t1 = t2, we have Val^M(t1) =
Val^M(t2), and hence s(x) = Val^M(t2). By applying Proposition 16.22 again,
we also have M, s ⊨ φ(t2). By Proposition 16.17, M ⊨ φ(t2).
Problems
Problem 19.1. Give derivations of the following sequents:
1. φ ∧ (ψ ∧ χ) ⇒ ( φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⇒ ( φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⇒ ψ → ( φ → χ).
4. φ ⇒ ¬¬ φ.
1. ( φ ∨ ψ) → χ ⇒ φ → χ.
2. ( φ → χ) ∧ (ψ → χ) ⇒ ( φ ∨ ψ) → χ.
3. ⇒ ¬( φ ∧ ¬ φ).
4. ψ → φ ⇒ ¬ φ → ¬ψ.
5. ⇒ ( φ → ¬ φ) → ¬ φ.
6. ⇒ ¬( φ → ψ) → ¬ψ.
7. φ → χ ⇒ ¬( φ ∧ ¬χ).
8. φ ∧ ¬χ ⇒ ¬( φ → χ).
9. φ ∨ ψ, ¬ψ ⇒ φ.
10. ¬ φ ∨ ¬ψ ⇒ ¬( φ ∧ ψ).
12. ⇒ ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. ¬( φ → ψ) ⇒ φ.
2. ¬( φ ∧ ψ) ⇒ ¬ φ ∨ ¬ψ.
3. φ → ψ ⇒ ¬ φ ∨ ψ.
4. ⇒ ¬¬ φ → φ.
5. φ → ψ, ¬ φ → ψ ⇒ ψ.
6. ( φ ∧ ψ) → χ ⇒ ( φ → χ) ∨ (ψ → χ).
7. ( φ → ψ) → φ ⇒ φ.
8. ⇒ ( φ → ψ) ∨ (ψ → χ).
3. ∀ x ( φ( x ) → ψ) ⇒ ∃y φ(y) → ψ.
4. ∀ x ¬ φ( x ) ⇒ ¬∃ x φ( x ).
5. ⇒ ¬∃ x φ( x ) → ∀ x ¬ φ( x ).
1. ⇒ ¬∀ x φ( x ) → ∃ x ¬ φ( x ).
2. (∀ x φ( x ) → ψ) ⇒ ∃y ( φ(y) → ψ).
3. ⇒ ∃ x ( φ( x ) → ∀y φ(y)).
1. ⇒ ∀ x ∀y (( x = y ∧ φ( x )) → φ(y))
Natural Deduction
20.2 Propositional Rules
Rules for ∧

    φ    ψ                   φ ∧ ψ                φ ∧ ψ
    --------- ∧Intro         ------ ∧Elim         ------ ∧Elim
     φ ∧ ψ                     φ                    ψ

Rules for ∨

      φ                   ψ                           [φ]^n    [ψ]^n
    ------ ∨Intro       ------ ∨Intro                   :        :
    φ ∨ ψ               φ ∨ ψ                φ ∨ ψ      χ        χ
                                             ------------------------ n ∨Elim
                                                        χ

Rules for →

    [φ]^n
      :
      ψ                          φ → ψ    φ
    ------- n →Intro             ------------ →Elim
    φ → ψ                             ψ

Rules for ¬

    [φ]^n
      :
      ⊥                          ¬φ    φ
    ------ n ¬Intro              --------- ¬Elim
     ¬φ                              ⊥

Rules for ⊥

                              [¬φ]^n
      ⊥                          :
    ----- ⊥I                     ⊥
      φ                        ----- n ⊥C
                                 φ
Note that ¬Intro and ⊥C are very similar: The difference is that ¬Intro derives
a negated sentence ¬ φ but ⊥C a positive sentence φ.
Whenever a rule indicates that some assumption may be discharged, we
take this to be a permission, but not a requirement. E.g., in the →Intro rule,
we may discharge any number of assumptions of the form φ in the derivation
of the premise ψ, including zero.
Rules for ∀

    φ(a)                       ∀x φ(x)
    --------- ∀Intro           --------- ∀Elim
    ∀x φ(x)                     φ(t)
In the rules for ∀, t is a closed term (a term that does not contain any variables),
and a is a constant symbol which does not occur in the conclusion ∀ x φ( x ), or
in any assumption which is undischarged in the derivation ending with the
premise φ( a). We call a the eigenvariable of the ∀Intro inference.1
Rules for ∃

                               [φ(a)]^n
                                   :
     φ(t)                 ∃x φ(x)  χ
    --------- ∃Intro      ------------------ n ∃Elim
    ∃x φ(x)                      χ
1 We use the term “eigenvariable” even though a in the above rule is a constant. This has
historical reasons.
Again, t is a closed term, and a is a constant which does not occur in the
premise ∃ x φ( x ), in the conclusion χ, or any assumption which is undischarged
in the derivations ending with the two premises (other than the assumptions
φ( a)). We call a the eigenvariable of the ∃Elim inference.
The condition that an eigenvariable neither occur in the premises nor in
any assumption that is undischarged in the derivations leading to the premises
for the ∀Intro or ∃Elim inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we
indicate this by writing φ( x ). In the same context, φ(t) then is short for φ[t/x ].
So we could also write the ∃Intro rule as:
φ[t/x ]
∃Intro
∃x φ
Note that t may already occur in φ, e.g., φ might be P (t, x ). Thus, inferring
∃ x P (t, x ) from P (t, t) is a correct application of ∃Intro—you may “replace”
one or more, and not necessarily all, occurrences of t in the premise by the
bound variable x. However, the eigenvariable conditions in ∀Intro and ∃Elim
require that the constant symbol a does not occur in φ. So, you cannot cor-
rectly infer ∀ x P ( a, x ) from P ( a, a) using ∀Intro.
In ∃Intro and ∀Elim there are no restrictions, and the term t can be any-
thing, so we do not have to worry about any conditions. On the other hand,
in the ∃Elim and ∀Intro rules, the eigenvariable condition requires that the
constant symbol a does not occur anywhere in the conclusion or in an undis-
charged assumption. The condition is necessary to ensure that the system
is sound, i.e., only derives sentences from undischarged assumptions from
which they follow. Without this condition, the following would be allowed:
                [φ(a)]^1
                ---------- *∀Intro
    ∃x φ(x)     ∀x φ(x)
    ------------------------ ∃Elim
           ∀x φ(x)
However, ∃ x φ( x ) ⊭ ∀ x φ( x ).
As the elimination rules for quantifiers only allow substituting closed terms
for variables, it follows that any formula that can be derived from a set of sen-
tences is itself a sentence.
20.4 Derivations
We’ve said what an assumption is, and we’ve given the rules of inference.
Derivations in natural deduction are inductively generated from these: each
derivation either is an assumption on its own, or consists of one, two, or three
derivations followed by a correct inference.
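To see what "inductively generated by correct inferences" amounts to, here is a small sketch (not from the text, and covering only a fragment of the rules) of a checker for derivations represented as nested tuples. Each node records the rule used, its conclusion, and its subderivations; the checker returns the conclusion together with the set of undischarged assumptions.

    def check(d):
        """Return (conclusion, undischarged assumptions) of derivation d, or raise."""
        rule = d[0]
        if rule == 'assumption':
            return d[1], frozenset([d[1]])
        if rule == 'and_intro':                   # from φ and ψ infer φ ∧ ψ
            _, concl, d1, d2 = d
            c1, a1 = check(d1); c2, a2 = check(d2)
            assert concl == ('and', c1, c2)
            return concl, a1 | a2
        if rule == 'and_elim':                    # from φ ∧ ψ infer φ (or ψ)
            _, concl, d1 = d
            c1, a1 = check(d1)
            assert c1[0] == 'and' and concl in (c1[1], c1[2])
            return concl, a1
        if rule == 'imp_elim':                    # from φ → ψ and φ infer ψ
            _, concl, d1, d2 = d
            c1, a1 = check(d1); c2, a2 = check(d2)
            assert c1 == ('imp', c2, concl)
            return concl, a1 | a2
        if rule == 'imp_intro':                   # infer φ → ψ, discharging assumptions φ
            _, concl, d1 = d
            c1, a1 = check(d1)
            assert concl[0] == 'imp' and concl[2] == c1
            return concl, a1 - {concl[1]}         # discharging is permitted, not required
        raise ValueError("rule not covered in this sketch")

    # A derivation of (φ ∧ ψ) → φ in this representation:
    phi, psi = ('atom', 'φ'), ('atom', 'ψ')
    d = ('imp_intro', ('imp', ('and', phi, psi), phi),
             ('and_elim', phi, ('assumption', ('and', phi, psi))))
    print(check(d))   # conclusion plus frozenset(): no undischarged assumptions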
Definition 20.2 (Derivation). A derivation of a sentence φ from assumptions Γ
is a finite tree of sentences satisfying the following conditions:
ψ
1 →Intro
φ→ψ
( φ ∧ ψ) → φ
Next, we need to figure out what kind of inference could result in a sen-
tence of this form. The main operator of the conclusion is →, so we’ll try to
arrive at the conclusion using the →Intro rule. It is best to write down the as-
sumptions involved and label the inference rules as you progress, so it is easy
to see whether all assumptions have been discharged at the end of the proof.
    [φ ∧ ψ]^1
        :
        φ
    ------------- 1 →Intro
    (φ ∧ ψ) → φ

    [φ ∧ ψ]^1
    ---------- ∧Elim
        φ
    ------------- 1 →Intro
    (φ ∧ ψ) → φ
(¬ φ ∨ ψ) → ( φ → ψ)
To find a logical rule that could give us this conclusion, we look at the logical
connectives in the conclusion: ¬, ∨, and →. We only care at the moment about
the first occurrence of → because it is the main operator of the sentence in the
end-sequent, while ¬, ∨ and the second occurrence of → are inside the scope
of another connective, so we will take care of those later. We therefore start
with the →Intro rule. A correct application must look like this:
[¬ φ ∨ ψ]1
φ→ψ
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
This leaves us with two possibilities to continue. Either we can keep working
from the bottom up and look for another application of the →Intro rule, or we
can work from the top down and apply a ∨Elim rule. Let us apply the latter.
We will use the assumption ¬ φ ∨ ψ as the leftmost premise of ∨Elim. For a
valid application of ∨Elim, the other two premises must be identical to the
conclusion φ → ψ, but each may be derived in turn from another assumption,
namely one of the two disjuncts of ¬ φ ∨ ψ. So our derivation will look like
this:
[¬ φ]2 [ ψ ]2
[¬ φ]2 , [ φ]3 [ ψ ]2 , [ φ ]4
ψ ψ
3 →Intro 4 →Intro
[¬ φ ∨ ψ]1 φ→ψ φ→ψ
2
φ→ψ
∨Elim
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
For the two missing parts of the derivation, we need derivations of ψ from
¬φ and φ in the middle, and from φ and ψ on the right. Let’s take the former
first. ¬φ and φ are the two premises of ¬Elim:
[¬ φ]2 [ φ ]3
¬Elim
⊥
[ ψ ]2 , [ φ ]4
[¬ φ]2 [ φ ]3
⊥Intro
⊥ ⊥
I
ψ ψ
3 →Intro 4 →Intro
[¬ φ ∨ ψ]1 φ→ψ φ→ψ
2
φ→ψ
∨Elim
1 →Intro
(¬ φ ∨ ψ) → ( φ → ψ)
Let’s now look at the rightmost branch. Here it’s important to realize that
the definition of derivation allows assumptions to be discharged but does not re-
quire them to be. In other words, if we can derive ψ from one of the assump-
tions φ and ψ without using the other, that’s ok. And to derive ψ from ψ is
trivial: ψ by itself is such a derivation, and no inferences are needed. So we
can simply delete the assumption φ.
                 [¬φ]^2    [φ]^3
                 ---------------- ¬Elim
                        ⊥
                       ----- ⊥I
                        ψ                            [ψ]^2
                   ----------- 3 →Intro            ----------- →Intro
    [¬φ ∨ ψ]^1       φ → ψ                           φ → ψ
    --------------------------------------------------------- 2 ∨Elim
                              φ → ψ
                 -------------------------------- 1 →Intro
                      (¬φ ∨ ψ) → (φ → ψ)
Note that in the finished derivation, the rightmost →Intro inference does not
actually discharge any assumptions.
Example 20.6. So far we have not needed the ⊥C rule. It is special in that it al-
lows us to discharge an assumption that isn’t a sub-formula of the conclusion
of the rule. It is closely related to the ⊥ I rule. In fact, the ⊥ I rule is a special
case of the ⊥C rule—there is a logic called “intuitionistic logic” in which only
⊥ I is allowed. The ⊥C rule is a last resort when nothing else works. For in-
stance, suppose we want to derive φ ∨ ¬ φ. Our usual strategy would be to
attempt to derive φ ∨ ¬ φ using ∨Intro. But this would require us to derive
either φ or ¬ φ from no assumptions, and this can’t be done. ⊥C to the rescue!
[¬( φ ∨ ¬ φ)]1
1
⊥ ⊥C
φ ∨ ¬φ
¬φ φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
⊥
2
¬ φ ¬Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
[ φ ]2 [¬( φ ∨ ¬ φ)]1
[¬( φ ∨ ¬ φ)]1 φ ∨ ¬ φ ∨Intro
¬Elim
⊥
2
¬φ ¬ Intro φ
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
[ φ ]2 [¬ φ]3
[¬( φ ∨ ¬ φ)]1 φ ∨ ¬φ ∨ Intro [¬( φ ∨ ¬ φ)] 1 φ ∨ ¬ φ ∨Intro
¬Elim ¬Elim
⊥ ⊥ ⊥
2
¬φ ¬ Intro 3
φ C
¬Elim
1
⊥ ⊥C
φ ∨ ¬φ
∃ x ¬ φ( x ) → ¬∀ x φ( x )
We start by writing down what it would take to justify that last step using the
→Intro rule.
[∃ x ¬ φ( x )]1
¬∀ x φ( x )
1 →Intro
∃ x ¬ φ( x ) → ¬∀ x φ( x )
[¬ φ( a)]2
[∃ x ¬ φ( x )]1 ¬∀ x φ( x )
2 ∃Elim
¬∀ x φ( x )
1 →Intro
∃ x ¬ φ( x ) → ¬∀ x φ( x )
In order to derive ¬∀ x φ( x ), we will attempt to use the ¬Intro rule: this re-
quires that we derive a contradiction, possibly using ∀ x φ( x ) as an additional
assumption. Of course, this contradiction may involve the assumption ¬ φ( a)
which will be discharged by the ∃Elim inference. We can set it up as follows:
[¬ φ( a)]2 , [∀ x φ( x )]3
⊥
3 ¬Intro
[∃ x ¬ φ( x )]1 ¬∀ x φ( x )
2 ∃Elim
¬∀ x φ( x )
1 →Intro
∃ x ¬ φ( x ) → ¬∀ x φ( x )
It looks like we are close to getting a contradiction. The easiest rule to apply is
the ∀Elim, which has no eigenvariable conditions. Since we can use any term
we want to replace the universally quantified x, it makes the most sense to
continue using a so we can reach a contradiction.
[∀ x φ( x )]3
∀Elim
[¬ φ( a)]2 φ( a)
¬Elim
⊥
1
3 ¬Intro
[∃ x ¬ φ( x )] ¬∀ x φ( x )
2 ∃Elim
¬∀ x φ( x )
1 →Intro
∃ x ¬ φ( x ) → ¬∀ x φ( x )
∃ x χ( x, b)
We have two premises to work with. To use the first, i.e., try to find
a derivation of ∃ x χ( x, b) from ∃ x ( φ( x ) ∧ ψ( x )) we would use the ∃Elim rule.
Since it has an eigenvariable condition, we will apply that rule first. We get
the following:
[ φ( a) ∧ ψ( a)]1
∃ x ( φ( x ) ∧ ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)
The two assumptions we are working with share ψ. It may be useful at this
point to apply ∧Elim to separate out ψ( a).
[ φ( a) ∧ ψ( a)]1
∧Elim
ψ( a)
∃ x ( φ( x ) ∧ ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)
∃ x ( φ( x ) ∧ ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)
We are so close! One application of ∃Intro and we have reached our goal.
Since we ensured at each step that the eigenvariable conditions were not vio-
lated, we can be confident that this is a correct derivation.
¬∀ x φ( x )
The last line of the derivation is a negation, so let’s try using ¬Intro. This will
require that we figure out how to derive a contradiction.
[∀ x φ( x )]1
⊥
1 ¬Intro
¬∀ x φ( x )
So far so good. We can use ∀Elim but it’s not obvious if that will help us
get to our goal. Instead, let’s use one of our assumptions. ∀ x φ( x ) → ∃y ψ(y)
together with ∀ x φ( x ) will allow us to use the →Elim rule.
∀ x φ( x ) → ∃y ψ(y) [∀ x φ( x )]1
→Elim
∃y ψ(y)
⊥
1 ¬Intro
¬∀ x φ( x )
We now have one final assumption to work with, and it looks like this will
help us reach a contradiction by using ¬Elim.
∀ x φ( x ) → ∃y ψ(y) [∀ x φ( x )]1
→Elim
¬∃y ψ(y) ∃y ψ(y)
¬Elim
⊥
1 ¬Intro
¬∀ x φ( x )
This section collects the definitions of the provability relation and consis-
tency for natural deduction.
∆, [ φ]1
δ1 Γ
δ0
ψ
1 →Intro
φ→ψ φ
ψ
→Elim
1. Γ is inconsistent.
Proof. Exercise.
Γ, [¬ φ]1
δ1
1
⊥ ⊥
φ C
δ
¬φ φ
¬Elim
⊥
Since ¬φ ∈ Γ, all undischarged assumptions are in Γ, and this shows that Γ ⊢ ⊥.
Γ, [¬ φ]2 Γ, [ φ]1
δ2 δ1
⊥ ⊥
2
¬¬ φ ¬Intro 1
¬ φ ¬Intro
¬Elim
⊥
Since the assumptions φ and ¬ φ are discharged, this is a derivation of ⊥
from Γ alone. Hence Γ is inconsistent.
2. φ, ψ ⊢ φ ∧ ψ.
φ∧ψ φ∧ψ
φ ∧Elim ψ
∧Elim
2. We can derive:
φ ψ
∧Intro
φ∧ψ
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
¬φ [ φ ]1 ¬ψ [ ψ ]1
¬Elim ¬Elim
φ∨ψ ⊥ ⊥
1 ∨Elim
⊥
φ ψ
∨Intro ∨Intro
φ∨ψ φ∨ψ
Proposition 20.24. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
φ→ψ φ
ψ
→Elim
¬φ [ φ ]1
¬Elim
⊥ ⊥
I
ψ ψ
1 →Intro →Intro
φ→ψ φ→ψ
Note that →Intro may, but does not have to, discharge the assumption φ.
2. ∀ x φ( x ) ⊢ φ(t).
φ(t)
∃Intro
∃ x φ( x )
∀ x φ( x )
∀Elim
φ(t)
20.11 Soundness
A derivation system, such as natural deduction, is sound if it cannot derive
things that do not actually follow. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know for instance, that
inferences, and there are no inferences. So, any structure M that satisfies all of
the undischarged assumptions of the proof also satisfies φ.
Now for the inductive step. Suppose that δ contains n inferences. The
premise(s) of the lowermost inference are derived using sub-derivations, each
of which contains fewer than n inferences. We assume the induction hypothe-
sis: The premises of the lowermost inference follow from the undischarged as-
sumptions of the sub-derivations ending in those premises. We have to show
that the conclusion φ follows from the undischarged assumptions of the entire
proof.
We distinguish cases according to the type of the lowermost inference.
First, we consider the possible inferences with only one premise.
1. Suppose that the last inference is ¬Intro: The derivation has the form
Γ, [ φ]n
δ1
⊥
¬ φ ¬Intro
n
2. The last inference is ∧Elim: There are two variants: φ or ψ may be in-
ferred from the premise φ ∧ ψ. Consider the first case. The derivation δ
looks like this:
Γ
δ1
φ∧ψ
φ ∧Elim
3. The last inference is ∨Intro: There are two variants: φ ∨ ψ may be in-
ferred from the premise φ or the premise ψ. Consider the first case. The
derivation has the form
Γ
δ1
φ
∨Intro
φ∨ψ
Γ, [ φ]n
δ1
ψ
n →Intro
φ→ψ
Γ
δ1
⊥ ⊥
φ I
Γ
δ1
φ( a)
∀Intro
∀ x φ( x )
Now let’s consider the possible inferences with several premises: ∨Elim,
∧Intro, →Elim, and ∃Elim.
1. The last inference is ∧Intro. φ ∧ ψ is inferred from the premises φ and ψ
and δ has the form
Γ1 Γ2
δ1 δ2
φ ψ
∧Intro
φ∧ψ
Γ1 Γ2
δ1 δ2
φ→ψ φ
ψ
→Elim
    --------- =Intro
      t = t

    t1 = t2    φ(t1)                  t1 = t2    φ(t2)
    ------------------ =Elim          ------------------ =Elim
          φ(t2)                             φ(t1)
In the above rules, t, t1 , and t2 are closed terms. The =Intro rule allows us
to derive any identity statement of the form t = t outright, from no assump-
tions.
∀ x ∀y (( φ( x ) ∧ φ(y)) → x = y)
∃ x ∀y ( φ(y) → y = x )
∃ x ∀y ( φ(y) → y = x ) [ φ( a) ∧ φ(b)]1
a=b
1 →Intro
(( φ( a) ∧ φ(b)) → a = b)
∀Intro
∀y (( φ( a) ∧ φ(y)) → a = y)
∀Intro
∀ x ∀y (( φ( x ) ∧ φ(y)) → x = y)
We’ll now have to use the main assumption: since it is an existential formula,
we use ∃Elim to derive the intermediary conclusion a = b.
∃ x ∀y ( φ(y) → y = x ) a=b
2 ∃Elim
a=b
1 →Intro
(( φ( a) ∧ φ(b)) → a = b)
∀Intro
∀y (( φ( a) ∧ φ(y)) → a = y)
∀Intro
∀ x ∀y (( φ( x ) ∧ φ(y)) → x = y)
Proof. Any formula of the form t = t is valid, since for every structure M,
M ⊨ t = t. (Note that we assume the term t to be closed, i.e., it contains no
variables, so variable assignments are irrelevant).
Suppose the last inference in a derivation is =Elim, i.e., the derivation has
the following form:
Γ1 Γ2
δ1 δ2
t1 = t2 φ ( t1 )
=Elim
φ ( t2 )
Problems
Problem 20.1. Give derivations that show the following:
1. φ ∧ (ψ ∧ χ) ⊢ ( φ ∧ ψ) ∧ χ.
2. φ ∨ (ψ ∨ χ) ⊢ ( φ ∨ ψ) ∨ χ.
3. φ → (ψ → χ) ⊢ ψ → ( φ → χ).
4. φ ⊢ ¬¬ φ.
1. ( φ ∨ ψ) → χ ⊢ φ → χ.
2. ( φ → χ) ∧ (ψ → χ) ⊢ ( φ ∨ ψ) → χ.
3. ⊢ ¬( φ ∧ ¬ φ).
4. ψ → φ ⊢ ¬ φ → ¬ψ.
5. ⊢ ( φ → ¬ φ) → ¬ φ.
6. ⊢ ¬( φ → ψ) → ¬ψ.
7. φ → χ ⊢ ¬( φ ∧ ¬χ).
8. φ ∧ ¬χ ⊢ ¬( φ → χ).
9. φ ∨ ψ, ¬ψ ⊢ φ.
10. ¬ φ ∨ ¬ψ ⊢ ¬( φ ∧ ψ).
12. ⊢ ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. ¬( φ → ψ) ⊢ φ.
2. ¬( φ ∧ ψ) ⊢ ¬ φ ∨ ¬ψ.
3. φ → ψ ⊢ ¬ φ ∨ ψ.
4. ⊢ ¬¬ φ → φ.
5. φ → ψ, ¬ φ → ψ ⊢ ψ.
6. ( φ ∧ ψ) → χ ⊢ ( φ → χ) ∨ (ψ → χ).
7. ( φ → ψ) → φ ⊢ φ.
8. ⊢ ( φ → ψ) ∨ (ψ → χ).
3. ∀ x ( φ( x ) → ψ) ⊢ ∃y φ(y) → ψ.
4. ∀ x ¬ φ( x ) ⊢ ¬∃ x φ( x ).
5. ⊢ ¬∃ x φ( x ) → ∀ x ¬ φ( x ).
1. ⊢ ¬∀ x φ( x ) → ∃ x ¬ φ( x ).
2. (∀ x φ( x ) → ψ) ⊢ ∃y ( φ(y) → ψ).
3. ⊢ ∃ x ( φ( x ) → ∀y φ(y)).
Problem 20.9. Prove that = is both symmetric and transitive, i.e., give deriva-
tions of ∀ x ∀y ( x = y → y = x ) and ∀ x ∀y ∀z(( x = y ∧ y = z) → x = z)
1. ∀ x ∀y (( x = y ∧ φ( x )) → φ(y))
Tableaux
Definition 21.1. A signed formula is a pair consisting of a truth value and a sen-
tence, i.e., either:
T φ or F φ.
In other words, if a branch is closed, the possibility it describes has been ruled
out. In particular, that means that a closed tableau rules out all possibilities
of simultaneously making every assumption of the form T φ true and every
assumption of the form F φ false.
A closed tableau for φ is a closed tableau with root F φ. If such a closed
tableau exists, all possibilities for φ being false have been ruled out; i.e., φ
must be true in every structure.
Rules for ¬

    T ¬φ             F ¬φ
    ----- ¬T         ----- ¬F
    F φ              T φ

Rules for ∧

    T φ ∧ ψ               F φ ∧ ψ
    -------- ∧T           ----------- ∧F
    T φ                   F φ | F ψ
    T ψ

Rules for ∨

    T φ ∨ ψ               F φ ∨ ψ
    ----------- ∨T        -------- ∨F
    T φ | T ψ             F φ
                          F ψ

Rules for →

    T φ → ψ               F φ → ψ
    ----------- →T        -------- →F
    F φ | T ψ             T φ
                          F ψ

Cut

    ----------- Cut
    T φ | F φ
The Cut rule is not applied “to” a previous signed formula; rather, it allows
every branch in a tableau to be split in two, one branch containing T φ, the
other F φ. It is not necessary—any set of signed formulas with a closed tableau
has one not using Cut—but it allows us to combine tableaux in a convenient
way.
Rules for ∀

    T ∀x φ(x)              F ∀x φ(x)
    ---------- ∀T          ---------- ∀F
    T φ(t)                 F φ(a)

Rules for ∃

    T ∃x φ(x)              F ∃x φ(x)
    ---------- ∃T          ---------- ∃F
    T φ(a)                 F φ(t)
Again, t is a closed term, and a is a constant symbol which does not occur in
the branch above the ∃T rule. We call a the eigenvariable of the ∃T inference.
The condition that an eigenvariable not occur in the branch above the ∀F
or ∃T inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we
indicate this by writing φ( x ). In the same context, φ(t) then is short for φ[t/x ].
So we could also write the ∃F rule as:
    F ∃x φ
    ---------- ∃F
    F φ[t/x]

(We use the term “eigenvariable” even though a in the above rule is a constant
symbol. This has historical reasons.)
Note that t may already occur in φ, e.g., φ might be P (t, x ). Thus, inferring
F P (t, t) from F ∃ x P (t, x ) is a correct application of ∃F. However, the eigen-
variable conditions in ∀F and ∃T require that the constant symbol a does not
occur in φ. So, you cannot correctly infer F P ( a, a) from F ∀ x P ( a, x ) using ∀F.
In ∀T and ∃F there are no restrictions on the term t. On the other hand,
in the ∃T and ∀F rules, the eigenvariable condition requires that the constant
symbol a does not occur anywhere in the branches above the respective infer-
ence. It is necessary to ensure that the system is sound. Without this condition,
the following would be a closed tableau for ∃ x φ( x ) → ∀ x φ( x ):
1. F ∃ x φ( x ) → ∀ x φ( x ) Assumption
2. T ∃ x φ( x ) →F 1
3. F ∀ x φ( x ) →F 1
4. T φ( a) ∃T 2
5. F φ( a) ∀F 3
⊗
21.4 Tableaux
We’ve said what an assumption is, and we’ve given the rules of inference.
Tableaux are inductively generated from these: each tableau either is a single
branch consisting of one or more assumptions, or it results from a tableau by
applying one of the rules of inference on a branch.
1. The n topmost signed formulas of the tree are Si φi , one below the other.
2. Every signed formula in the tree that is not one of the assumptions re-
sults from a correct application of an inference rule to a signed formula
in the branch above it.
A branch of a tableau is closed iff it contains both T φ and F φ, and open other-
wise. A tableau in which every branch is closed is a closed tableau (for its set
of assumptions). If a tableau is not closed, i.e., if it contains at least one open
branch, it is open.
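The rules for the propositional connectives are so mechanical that closure can be decided by systematically applying them. The following is a minimal sketch of such a procedure (an illustration, not the book's presentation), restricted to propositional signed formulas represented as (sign, formula) pairs with formulas as nested tuples; it returns True exactly when every branch of the systematically developed tableau closes.

    def is_closed(branch):
        """Does the tableau developed from this branch close?"""
        # a branch closes if some formula occurs with both signs
        for sign, f in branch:
            if (not sign, f) in branch:
                return True
        # otherwise, expand some non-atomic signed formula and recurse
        for i, (sign, f) in enumerate(branch):
            if f[0] == 'atom':
                continue
            rest = branch[:i] + branch[i + 1:]
            op = f[0]
            if op == 'not':
                return is_closed(rest + [(not sign, f[1])])
            if op == 'and':
                if sign:                                       # ∧T: both conjuncts, same branch
                    return is_closed(rest + [(True, f[1]), (True, f[2])])
                return (is_closed(rest + [(False, f[1])]) and  # ∧F: split into two branches
                        is_closed(rest + [(False, f[2])]))
            if op == 'or':
                if sign:                                       # ∨T: split
                    return (is_closed(rest + [(True, f[1])]) and
                            is_closed(rest + [(True, f[2])]))
                return is_closed(rest + [(False, f[1]), (False, f[2])])
            if op == 'imp':
                if sign:                                       # →T: split
                    return (is_closed(rest + [(False, f[1])]) and
                            is_closed(rest + [(True, f[2])]))
                return is_closed(rest + [(True, f[1]), (False, f[2])])
        return False                                           # fully expanded and still open

    p, q = ('atom', 'p'), ('atom', 'q')
    # a closed tableau for (φ ∧ ψ) → φ starts with the assumption F (φ ∧ ψ) → φ:
    print(is_closed([(False, ('imp', ('and', p, q), p))]))     # True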
Example 21.3. Every set of assumptions on its own is a tableau, but it will
generally not be closed. (Obviously, it is closed only if the assumptions al-
ready contain a pair of signed formulas T φ and F φ.)
From a tableau (open or closed) we can obtain a new, larger one by ap-
plying one of the rules of inference to a signed formula φ in it. The rule will
append one or more signed formulas to the end of any branch containing the
occurrence of φ to which we apply the rule.
For instance, consider the assumption T φ ∧ ¬ φ. Here is the (open) tableau
consisting of just that assumption:
1. T φ ∧ ¬φ Assumption
1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T¬ φ ∧T 1
When we write down tableaux, we record the rules we’ve applied on the right
(e.g., ∧T1 means that the signed formula on that line is the result of applying
the ∧T rule to the signed formula on line 1). This new tableau now contains
additional signed formulas, but to only one (T ¬ φ) can we apply a rule (in this
case, the ¬T rule). This results in the closed tableau
1. T φ ∧ ¬φ Assumption
2. Tφ ∧T 1
3. T¬ φ ∧T 1
4. Fφ ¬T 3
⊗
1. F ( φ ∧ ψ) → φ Assumption
There is only one assumption, so only one signed formula to which we can
apply a rule. (For every signed formula, there is always at most one rule that
can be applied: it’s the rule for the corresponding sign and main operator of
the sentence.) In this case, this means, we must apply →F.
1. F ( φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ →F 1
3. Fφ →F 1
1. F ( φ ∧ ψ) → φ ✓ Assumption
2. Tφ ∧ ψ ✓ →F 1
3. Fφ →F 1
4. Tφ ∧T 2
5. Tψ ∧T 2
⊗
Since the branch now contains both T φ (on line 4) and F φ (on line 3), the
branch is closed. Since it is the only branch, the tableau is closed. We have
found a closed tableau for ( φ ∧ ψ) → φ.
1. F (¬ φ ∨ ψ) → ( φ → ψ) Assumption
The one signed formula in this tableau has main operator → and sign F, so
we apply the →F rule to it to obtain:
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ →F 1
3. F ( φ → ψ) →F 1
We now have a choice as to whether to apply ∨T to line 2 or →F to line 3. It
actually doesn’t matter which order we pick, as long as each signed formula
has its corresponding rule applied in every branch. So let’s pick the first one.
The ∨T rule allows the tableau to branch, and the two conclusions of the rule
will be the new signed formulas added to the two new branches. This results
in:
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) →F 1
4. T¬ φ Tψ ∨T 2
We have not applied the →F rule to line 3 yet: let’s do that now. To save
time, we apply it to both branches. Recall that we write a checkmark next
to a signed formula only if we have applied the corresponding rule in every
open branch. So it’s a good idea to apply a rule at the end of every branch that
contains the signed formula the rule applies to. That way we won’t have to
return to that signed formula lower down in the various branches.
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) ✓ →F 1
4. T¬ φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
⊗
The right branch is now closed. On the left branch, we can still apply the ¬T
rule to line 4. This results in F φ and closes the left branch:
1. F (¬ φ ∨ ψ) → ( φ → ψ) ✓ Assumption
2. T¬ φ ∨ ψ ✓ →F 1
3. F ( φ → ψ) ✓ →F 1
4. T¬ φ Tψ ∨T 2
5. Tφ Tφ →F 3
6. Fψ Fψ →F 3
7. Fφ ⊗ ¬T 4
⊗
Example 21.6. We can give tableaux for any number of signed formulas as
assumptions. Often it is also necessary to apply more than one rule that allows
branching; and in general a tableau can have any number of branches. For
instance, consider a tableau for {T φ ∨ (ψ ∧ χ), F ( φ ∨ ψ) ∧ ( φ ∨ χ)}. We start
by applying the ∨T rule to the first assumption:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) Assumption
3. Tφ Tψ ∧ χ ∨T 1
Now we can apply the ∧F rule to line 2. We do this on both branches simul-
taneously, and can therefore check off line 2:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ Fφ ∨ χ Fφ ∨ ψ Fφ ∨ χ ∧F 2
Next, we apply ∨F to the two occurrences of F φ ∨ ψ on line 4:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ Fφ ∨ ψ ✓ Fφ ∨ χ ∧F 2
5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
⊗
The leftmost branch is now closed, since it contains both T φ (line 3) and F φ (line 5). Next, we apply ∨F to the two occurrences of F φ ∨ χ on line 4 as well:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
5. Fφ Fφ ∨F 4
6. Fψ Fψ ∨F 4
7. ⊗ Fφ Fφ ∨F 4
8. Fχ Fχ ∨F 4
⊗
Note that we moved the result of applying ∨F a second time below for clarity.
In this instance it would not have been needed, since the justifications would
have been the same.
Two branches remain open, and Tψ ∧ χ on line 3 remains unchecked. We
apply ∧T to it to obtain a closed tableau:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Tφ Tψ ∧ χ ✓ ∨T 1
4. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
5. Fφ Fφ Fφ Fφ ∨F 4
6. Fψ Fχ Fψ Fχ ∨F 4
7. ⊗ ⊗ Tψ Tψ ∧T 3
8. Tχ Tχ ∧T 3
⊗ ⊗
For comparison, here’s a closed tableau for the same set of assumptions in
which the rules are applied in a different order:
1. T φ ∨ (ψ ∧ χ) ✓ Assumption
2. F ( φ ∨ ψ) ∧ ( φ ∨ χ) ✓ Assumption
3. Fφ ∨ ψ ✓ Fφ ∨ χ ✓ ∧F 2
4. Fφ Fφ ∨F 3
5. Fψ Fχ ∨F 3
6. Tφ Tψ ∧ χ ✓ Tφ Tψ ∧ χ ✓ ∨T 1
7. ⊗ Tψ ⊗ Tψ ∧T 6
8. Tχ Tχ ∧T 6
⊗ ⊗
Let's now give a closed tableau for ∃x ¬φ(x) → ¬∀x φ(x). We begin with the corresponding assumption:
1. F ∃ x ¬ φ( x ) → ¬∀ x φ( x ) Assumption
The assumption has sign F and main operator →, so we apply →F:
1. F ∃ x ¬ φ( x ) → ¬∀ x φ( x ) ✓ Assumption
2. T ∃ x ¬ φ( x ) →F 1
3. F ¬∀ x φ( x ) →F 1
The next line to deal with is 2. We use ∃T. This requires a new constant
symbol; since no constant symbols yet occur, we can pick any one, say, a.
1. F ∃ x ¬ φ( x ) → ¬∀ x φ( x ) ✓ Assumption
2. T ∃ x ¬ φ( x ) ✓ →F 1
3. F ¬∀ x φ( x ) →F 1
4. T ¬ φ( a) ∃T 2
Next, we apply ¬F to line 3:
1. F ∃ x ¬ φ( x ) → ¬∀ x φ( x ) ✓ Assumption
2. T ∃ x ¬ φ( x ) ✓ →F 1
3. F ¬∀ x φ( x ) ✓ →F 1
4. T ¬ φ( a) ∃T 2
5. T ∀ x φ( x ) ¬F 3
Finally, we apply ¬T to line 4 and ∀T to line 5. Since ∀T has no eigenvariable condition, we may instantiate it with a, and the branch closes:
1. F ∃ x ¬ φ( x ) → ¬∀ x φ( x ) ✓ Assumption
2. T ∃ x ¬ φ( x ) ✓ →F 1
3. F ¬∀ x φ( x ) ✓ →F 1
4. T ¬ φ( a) ∃T 2
5. T ∀ x φ( x ) ¬F 3
6. F φ( a) ¬T 4
7. T φ( a) ∀T 5
⊗
Example 21.8. Let’s see how we’d give a tableau for the set
1. F ∃ x χ( x, b) Assumption
2. T ∃ x ( φ( x ) ∧ ψ( x )) Assumption
3. T ∀ x (ψ( x ) → χ( x, b)) Assumption
We should always apply a rule with the eigenvariable condition first; in this
case that would be ∃T to line 2. Since the assumptions contain the constant
symbol b, we have to use a different one; let’s pick a again.
1. F ∃ x χ( x, b) Assumption
2. T ∃ x ( φ( x ) ∧ ψ( x )) ✓ Assumption
3. T ∀ x (ψ( x ) → χ( x, b)) Assumption
4. T φ( a) ∧ ψ( a) ∃T 2
Now we can apply ∃F to line 1 and ∀T to line 3, instantiating both with a (neither rule has an eigenvariable condition):
1. F ∃ x χ( x, b) Assumption
2. T ∃ x ( φ( x ) ∧ ψ( x )) ✓ Assumption
3. T ∀ x (ψ( x ) → χ( x, b)) Assumption
4. T φ( a) ∧ ψ( a) ∃T 2
5. F χ( a, b) ∃F 1
6. Tψ( a) → χ( a, b) ∀T 3
We don’t check the signed formulas in lines 1 and 3, since we may have to use
them again. Now apply ∧T to line 4:
1. F ∃ x χ( x, b) Assumption
2. T ∃ x ( φ( x ) ∧ ψ( x )) ✓ Assumption
3. T ∀ x (ψ( x ) → χ( x, b)) Assumption
4. T φ( a) ∧ ψ( a) ✓ ∃T 2
5. F χ( a, b) ∃F 1
6. Tψ( a) → χ( a, b) ∀T 3
7. T φ( a) ∧T 4
8. Tψ( a) ∧T 4
Finally, we apply →T to line 6; both resulting branches close:
1. F ∃ x χ( x, b) Assumption
2. T ∃ x ( φ( x ) ∧ ψ( x )) ✓ Assumption
3. T ∀ x (ψ( x ) → χ( x, b)) Assumption
4. T φ( a) ∧ ψ( a) ✓ ∃T 2
5. F χ( a, b) ∃F 1
6. Tψ( a) → χ( a, b) ✓ ∀T 3
7. T φ( a) ∧T 4
8. Tψ( a) ∧T 4
9. F ψ( a) Tχ( a, b) →T 6
⊗ ⊗
Finally, consider a tableau for the assumptions T ∀x φ(x), T ∀x φ(x) → ∃y ψ(y), and T ¬∃y ψ(y):
1. T ∀ x φ( x ) Assumption
2. T ∀ x φ( x ) → ∃y ψ(y) Assumption
3. T ¬∃y ψ(y) Assumption
We begin by applying ¬T to line 3:
1. T ∀ x φ( x ) Assumption
2. T ∀ x φ( x ) → ∃y ψ(y) Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3
The new line 4 requires ∃F, a quantifier rule without the eigenvariable condi-
tion. So we defer this in favor of using →T on line 2.
1. T ∀ x φ( x ) Assumption
2. T ∀ x φ( x ) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3
5. F ∀ x φ( x ) T ∃y ψ(y) →T 2
Both new signed formulas require rules with eigenvariable conditions, so these
should be next:
1. T ∀ x φ( x ) Assumption
2. T ∀ x φ( x ) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3
5. F ∀ x φ( x ) ✓ T ∃y ψ(y) ✓ →T 2
6. F φ(b) Tψ(c) ∀F 5; ∃T 5
To close the branches, we have to use the signed formulas on lines 1 and 3.
The corresponding rules (∀T and ∃F) don’t have eigenvariable conditions, so
we are free to pick whichever terms are suitable. In this case, that’s b and c,
respectively.
1. T ∀ x φ( x ) Assumption
2. T ∀ x φ( x ) → ∃y ψ(y) ✓ Assumption
3. T ¬∃y ψ(y) ✓ Assumption
4. F ∃y ψ(y) ¬T 3
5. F ∀ x φ( x ) ✓ T ∃y ψ(y) ✓ →T 2
6. F φ(b) Tψ(c) ∀F 5; ∃T 5
7. T φ(b) F ψ(c) ∀T 1; ∃F 4
⊗ ⊗
This section collects the definitions of the provability relation and con-
sistency for tableaux.
{Tψ1 , . . . , Tψn }.
1. Fφ Assumption
2. Tφ Assumption
⊗
is closed.
{F φ,Tθ1 , . . . , Tθm }
Apply the Cut rule on φ. This generates two branches, one has T φ in it, the
other F φ. Thus, on the one branch, all of
{F ψ, T φ, Tχ1 , . . . , Tχn }
are available. Since there is a closed tableau for these assumptions, we can
attach it to that branch; every branch through T φ closes. On the other branch,
all of
{F φ, Tθ1 , . . . , Tθm }
are available, so we can also complete the other side to obtain a closed tableau.
This shows Γ ∪ ∆ ⊢ ψ.
Proof. Exercise.
{F φ, Tψ1 , . . . , Tψn }
has a closed tableau. Replace the assumption F φ by T ¬ φ, and insert the conclusion F φ of ¬T applied to T ¬ φ after the assumptions. Any sentence in the
tableau justified by appeal to line 1 in the old tableau is now justified by appeal
to line n + 1. So if the old tableau was closed, the new one is. It shows that Γ
is inconsistent, since all assumptions are in Γ.
2. φ, ψ ⊢ φ ∧ ψ.
1. Fφ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2
⊗
1. Fψ Assumption
2. Tφ ∧ ψ Assumption
3. Tφ ∧T 2
4. Tψ ∧T 2
⊗
1. Fφ ∧ ψ Assumption
2. Tφ Assumption
3. Tψ Assumption
4. Fφ Fψ ∧F 1
⊗ ⊗
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
1. Tφ ∨ ψ Assumption
2. T¬ φ Assumption
3. T ¬ψ Assumption
4. Fφ ¬T 2
5. Fψ ¬T 3
6. Tφ Tψ ∨T 1
⊗ ⊗
1. Fφ ∨ ψ Assumption
2. Tφ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1
⊗
1. Fφ ∨ ψ Assumption
2. Tψ Assumption
3. Fφ ∨F 1
4. Fψ ∨F 1
⊗
Proposition 21.24. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
1. Fψ Assumption
2. Tφ → ψ Assumption
3. Tφ Assumption
4. Fφ Tψ →T 2
⊗ ⊗
1. Fφ → ψ Assumption
2. T¬ φ Assumption
3. Tφ →F 1
4. Fψ →F 1
5. Fφ ¬T 2
⊗
1. Fφ → ψ Assumption
2. Tψ Assumption
3. Tφ →F 1
4. Fψ →F 1
⊗
Proof. Suppose Γ ⊢ φ(c), i.e., there are ψ1 , . . . , ψn ∈ Γ and a closed tableau for
{F φ(c),Tψ1 , . . . , Tψn }.
We have to show that Γ ⊢ ∀ x φ( x ), i.e., that there is a closed tableau for
{F ∀ x φ( x ), Tψ1 , . . . , Tψn }.
Take the closed tableau and replace the first assumption with F ∀ x φ( x ), and
insert F φ(c) after the assumptions.
old tableau:        new tableau:
F φ(c)              F ∀ x φ( x )
Tψ1                 Tψ1
⋮                   ⋮
Tψn                 Tψn
                    F φ(c)
The tableau is still closed, since all sentences available as assumptions before
are still available at the top of the tableau. The inserted line is the result of
a correct application of ∀F, since the constant symbol c does not occur in ψ1 ,
. . . , ψn or ∀ x φ( x ), i.e., it does not occur above the inserted line in the new
tableau.
1. φ(t) ⊢ ∃ x φ( x ).
2. ∀ x φ( x ) ⊢ φ(t).
1. F ∃ x φ( x ) Assumption
2. T φ(t) Assumption
3. F φ(t) ∃F 1
⊗
1. F φ(t) Assumption
2. T ∀ x φ( x ) Assumption
3. T φ(t) ∀T 2
⊗
21.11 Soundness
A derivation system, such as tableaux, is sound if it cannot derive things that
do not actually hold. Soundness is thus a kind of guaranteed safety property
for derivation systems. Depending on which proof theoretic property is in
question, we would like to know, for instance, that derivable sentences are valid, that sentences derivable from a set Γ are consequences of Γ, and that consistent sets are satisfiable.
Proof. Let’s call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules
of inference always results in a satisfiable tableau. This will prove the theo-
rem: any closed tableau results by applying rules of inference to the tableau
consisting only of assumptions from Γ. So if Γ were satisfiable, any tableau
for it would be satisfiable. A closed tableau, however, is clearly not satisfiable:
every branch contains both T φ and F φ, and no structure can both satisfy and
not satisfy φ.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one sat-
isfiable branch. Applying a rule of inference either adds signed formulas to a
branch, or splits a branch in two. If the tableau has a satisfiable branch which
is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let S φ ∈ Γ be the
signed formula to which the rule is applied. If the rule does not result in a split
branch, we have to show that the extended branch, i.e., Γ together with the
conclusions of the rule, is still satisfiable. If the rule results in a split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences that do not result in a split branch.
1. The branch is expanded by applying ¬T to T ¬ψ ∈ Γ. Then the extended
branch contains the signed formulas Γ ∪ {F ψ}. Suppose M ⊨ Γ. In
particular, M ⊨ ¬ψ. Thus, M ⊭ ψ, i.e., M satisfies F ψ.
2. The branch is expanded by applying ¬F to F ¬ψ ∈ Γ: Exercise.
3. The branch is expanded by applying ∧T to Tψ ∧ χ ∈ Γ, which results in
two new signed formulas on the branch: Tψ and Tχ. Suppose M ⊨ Γ,
in particular M ⊨ ψ ∧ χ. Then M ⊨ ψ and M ⊨ χ. This means that M
satisfies both Tψ and Tχ.
4. The branch is expanded by applying ∨F to F ψ ∨ χ ∈ Γ: Exercise.
5. The branch is expanded by applying →F to F ψ → χ ∈ Γ: This results in
two new signed formulas on the branch: Tψ and F χ. Suppose M ⊨ Γ,
in particular M ⊭ ψ → χ. Then M ⊨ ψ and M ⊭ χ. This means that M
satisfies both Tψ and F χ.
Now let’s consider the possible inferences that result in a split branch.
4. The branch is expanded by Cut: This results in two branches, one con-
taining Tψ, the other containing F ψ. Since M ⊨ Γ and either M ⊨ ψ or
M ⊭ ψ, M satisfies either the left or the right branch.
=: we may add T t = t to any branch, for any term t.
=T: from T t1 = t2 and T φ(t1 ), we may conclude T φ(t2 ).
=F: from T t1 = t2 and F φ(t1 ), we may conclude F φ(t2 ).
Note that in contrast to all the other rules, =T and =F require that two
signed formulas already appear on the branch, namely both Tt1 = t2 and
S φ ( t1 ).
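One way to picture =T and =F computationally: given the prerequisite T t1 = t2, the rule rewrites occurrences of t1 as t2 inside the second prerequisite. The sketch below replaces every occurrence, which is one admissible instance of the rule (the rule also allows replacing only some occurrences); the encoding and names are ours, not the text's.

def replace_term(expr, old, new):
    """Replace every occurrence of the term `old` by `new` in a
    formula or term encoded as nested tuples."""
    if expr == old:
        return new
    if not isinstance(expr, tuple):
        return expr
    return tuple(replace_term(part, old, new) for part in expr)

def eq_rule(signed_formula, t1, t2):
    """=T / =F: from T t1 = t2 and S φ(t1), conclude S φ(t2)."""
    sign, phi = signed_formula
    return (sign, replace_term(phi, t1, t2))

s, t = ("const", "s"), ("const", "t")
# From T s = t and T φ(s) we may conclude T φ(t):
print(eq_rule(("T", ("pred", "phi", s)), s, t))
# ('T', ('pred', 'phi', ('const', 't')))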
For instance, the following tableau shows that s = t, φ(s) ⊢ φ(t):
1. F φ(t) Assumption
2. Ts = t Assumption
3. T φ(s) Assumption
4. T φ(t) =T 2, 3
⊗
The rules also let us prove that = is symmetric, i.e., that s1 = s2 ⊢ s2 = s1 :
1. F s2 = s1 Assumption
2. Ts1 = s2 Assumption
3. Ts1 = s1 =
4. Ts2 = s1 =T 2, 3
⊗
Here, line 2 is the first prerequisite formula Ts1 = s2 of =T. Line 3 is the second one, of the form T φ(s1 )—think of φ( x ) as x = s1 ; then φ(s1 ) is s1 = s1 and φ(s2 ) is s2 = s1 .
They also prove that = is transitive, i.e., that s1 = s2 , s2 = s3 ⊢ s1 = s3 :
1. F s1 = s3 Assumption
2. Ts1 = s2 Assumption
3. Ts2 = s3 Assumption
4. Ts1 = s3 =T 3, 2
⊗
In this tableau, the first prerequisite formula of =T is line 3, Ts2 = s3 (s2 plays
the role of t1 , and s3 the role of t2 ). The second prerequisite, of the form T φ(s2 )
is line 2. Here, think of φ( x ) as s1 = x; that makes φ(s2 ) into s1 = s2 (i.e., line 2)
and φ(s3 ) into the formula s1 = s3 in the conclusion.
Proof. We just have to show as before that if a tableau has a satisfiable branch,
the branch resulting from applying one of the rules for = to it is also satisfi-
able. Let Γ be the set of signed formulas on the branch, and let M be a struc-
ture satisfying Γ.
Suppose the branch is expanded using =, i.e., by adding the signed for-
mula Tt = t. Trivially, M ⊨ t = t, so M also satisfies Γ ∪ {Tt = t}.
If the branch is expanded using =T, we add a signed formula T φ(t2 ), but Γ contains both Tt1 = t2 and T φ(t1 ). Thus we have M ⊨ t1 = t2 and M ⊨ φ(t1 ). Let s be a variable assignment with s( x ) = Val^M (t1 ). By Proposition 16.17, M, s ⊨ φ(t1 ). Since s ∼x s, by Proposition 16.22, M, s ⊨ φ( x ). Since M ⊨ t1 = t2 , we have Val^M (t1 ) = Val^M (t2 ), and hence s( x ) = Val^M (t2 ). By applying Proposition 16.22 again, we also have M, s ⊨ φ(t2 ). By Proposition 16.17, M ⊨ φ(t2 ). The case of =F is treated similarly.
Problems
Problem 21.1. Give closed tableaux for the following:
1. T φ ∧ (ψ ∧ χ), F ( φ ∧ ψ) ∧ χ.
2. T φ ∨ (ψ ∨ χ), F ( φ ∨ ψ) ∨ χ.
3. T φ → (ψ → χ), F ψ → ( φ → χ).
4. T φ, F ¬¬ φ.
1. T ( φ ∨ ψ) → χ, F φ → χ.
2. T ( φ → χ) ∧ (ψ → χ), F ( φ ∨ ψ) → χ.
3. F ¬( φ ∧ ¬ φ).
4. Tψ → φ, F ¬ φ → ¬ψ.
5. F ( φ → ¬ φ) → ¬ φ.
6. F ¬( φ → ψ) → ¬ψ.
7. T φ → χ, F ¬( φ ∧ ¬χ).
8. T φ ∧ ¬χ, F ¬( φ → χ).
9. T φ ∨ ψ, ¬ψ, F φ.
12. F ¬( φ ∨ ψ) → (¬ φ ∧ ¬ψ).
1. T ¬( φ → ψ), F φ.
2. T ¬( φ ∧ ψ), F ¬ φ ∨ ¬ψ.
3. T φ → ψ, F ¬ φ ∨ ψ.
4. F ¬¬ φ → φ.
5. T φ → ψ, T ¬ φ → ψ, F ψ.
6. T ( φ ∧ ψ) → χ, F ( φ → χ) ∨ (ψ → χ).
7. T ( φ → ψ) → φ, F φ.
8. F ( φ → ψ) ∨ (ψ → χ).
3. T ∀ x ( φ( x ) → ψ), F ∃y φ(y) → ψ.
4. T ∀ x ¬ φ( x ), F ¬∃ x φ( x ).
5. F ¬∃ x φ( x ) → ∀ x ¬ φ( x ).
1. F ¬∀ x φ( x ) → ∃ x ¬ φ( x ).
3. F ∃ x ( φ( x ) → ∀y φ(y)).
1. F ∀ x ∀y (( x = y ∧ φ( x )) → φ(y))
2. F ∃ x ( φ( x ) ∧ ∀y ( φ(y) → y = x )),
T ∃ x φ( x ) ∧ ∀y ∀z (( φ(y) ∧ φ(z)) → y = z)
Axiomatic Derivations
No effort has been made yet to ensure that the material in this chap-
ter respects various tags indicating which connectives and quantifiers are
primitive or defined: all are assumed to be primitive, except ↔ which is
assumed to be defined. If the FOL tag is true, we produce a version with
quantifiers, otherwise without.
1. φi ∈ Γ; or
2. φi is an axiom; or
It gets more interesting if the rule of inference appeals to formulas that appear
before the step considered. The following rule is called modus ponens: if ψ and ψ → φ already occur in the derivation, then φ may be inferred from them.
If this is the only rule of inference, then our definition of derivation above
amounts to this: φ1 , . . . , φn is a derivation iff for each i ≤ n one of the follow-
ing holds:
1. φi ∈ Γ; or
2. φi is an axiom; or
3. there are j, k < i such that φk ≡ ( φ j → φi ).
The last clause says that φi follows from φ j (ψ) and φk (ψ → φi ) by modus
ponens. If we can go from 1 to n, and each time we find a formula φi that is
either in Γ, an axiom, or which a rule of inference tells us that it is a correct
inference step, then the entire sequence counts as a correct derivation.
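The definition just sketched is easy to turn into a checking procedure. Here is a minimal Python sketch for a system whose only rule is modus ponens; formulas are encoded as nested tuples with ("imp", ψ, φ) for ψ → φ, and the axiom test is passed in as a parameter. All of this is our illustration, not notation from the text.

def is_derivation(lines, gamma, is_axiom):
    """Check that the sequence `lines` is a derivation from gamma: every
    formula is in gamma, an axiom, or follows from two earlier lines by MP."""
    for i, phi in enumerate(lines):
        if phi in gamma or is_axiom(phi):
            continue
        earlier = lines[:i]
        if any(("imp", psi, phi) in earlier for psi in earlier):
            continue                      # modus ponens from lines j, k < i
        return False
    return True

p, q = ("atom", "p"), ("atom", "q")
gamma = {p, ("imp", p, q)}
assert is_derivation([p, ("imp", p, q), q], gamma, lambda f: False)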
( φ ∧ ψ) → φ (22.1)
( φ ∧ ψ) → ψ (22.2)
φ → (ψ → ( φ ∧ ψ)) (22.3)
φ → ( φ ∨ ψ) (22.4)
φ → (ψ ∨ φ) (22.5)
( φ → χ) → ((ψ → χ) → (( φ ∨ ψ) → χ)) (22.6)
φ → (ψ → φ) (22.7)
( φ → (ψ → χ)) → (( φ → ψ) → ( φ → χ)) (22.8)
( φ → ψ) → (( φ → ¬ψ) → ¬ φ) (22.9)
¬ φ → ( φ → ψ) (22.10)
⊤ (22.11)
⊥→φ (22.12)
( φ → ⊥) → ¬ φ (22.13)
¬¬ φ → φ (22.14)
∀ x ψ → ψ ( t ), (22.15)
ψ(t) → ∃ x ψ. (22.16)
Why? Two applications of MP yield the last part, which is what we want. And
we easily see that ¬θ → (θ → α) is an instance of eq. (22.10), and α → (θ → α)
is an instance of eq. (22.7). So our derivation is:
1. ¬θ → (θ → α) eq. (22.10)
2. (¬θ → (θ → α)) →
((α → (θ → α)) → ((¬θ ∨ α) → (θ → α))) eq. (22.6)
3. (α → (θ → α)) → ((¬θ ∨ α) → (θ → α)) 1, 2, MP
4. α → (θ → α) eq. (22.7)
5. (¬θ ∨ α) → (θ → α) 3, 4, MP
θ → (θ → θ )
In order to apply MP, we would also need to justify the corresponding second
premise, namely φ. But in our case, that would be θ, and we won’t be able to
derive θ by itself. So we need a different strategy.
The other axiom involving just → is eq. (22.8), i.e.,
( φ → (ψ → χ)) → (( φ → ψ) → ( φ → χ))
We could get to the last nested conditional by applying MP twice. Again, that
would mean that we want an instance of eq. (22.8) where φ → χ is θ → θ, the
formula we are aiming for. Then of course, φ and χ are both θ. How should
we pick ψ so that both φ → (ψ → χ) and φ → ψ, i.e., in our case θ → (ψ → θ )
and θ → ψ, are also derivable? Well, the first of these is already an instance of
eq. (22.7), whatever we decide ψ to be. And θ → ψ would be another instance
of eq. (22.7) if ψ were (θ → θ ). So, our derivation is:
1. θ → ((θ → θ ) → θ ) eq. (22.7)
2. (θ → ((θ → θ ) → θ )) →
((θ → (θ → θ )) → (θ → θ )) eq. (22.8)
3. (θ → (θ → θ )) → (θ → θ ) 1, 2, MP
4. θ → (θ → θ ) eq. (22.7)
5. θ → θ 3, 4, MP
For another example, suppose Γ = { φ → ψ, ψ → χ}. Then Γ ⊢ φ → χ:
1. φ→ψ H YP
2. ψ→χ H YP
3. (ψ → χ) → ( φ → (ψ → χ)) eq. (22.7)
4. φ → (ψ → χ) 2, 3, MP
5. ( φ → (ψ → χ)) →
(( φ → ψ) → ( φ → χ)) eq. (22.8)
6. (( φ → ψ) → ( φ → χ)) 4, 5, MP
7. φ→χ 1, 6, MP
The lines labelled “H YP” (for “hypothesis”) indicate that the formula on that
line is an element of Γ.
(∀ x φ( x ) ∧ ∀y ψ(y)) → ∀ x φ( x )
∀ x φ( x ) → φ( a)
(∀ x φ( x ) ∧ ∀y ψ(y)) → φ( a)
(∀ x φ( x ) ∧ ∀y ψ(y)) → ψ( a)
(∀ x φ( x ) ∧ ∀y ψ(y)) → ( φ( a) ∧ ψ( a))
(∀ x φ( x ) ∧ ∀y ψ(y)) → ∀ x ( φ( x ) ∧ ψ( x )).
φ1 , . . . , φk = φ, ψ1 , . . . , ψl = ψ.
Proof. Exercise.
1. φ Hyp.
2. φ→ψ Hyp.
3. ψ 1, 2, MP
By Proposition 22.19, Γ ⊢ ψ.
The most important result we’ll use in this context is the deduction theo-
rem:
Γ ⊢ φ → ( χ → ψ );
Γ ⊢ φ → χ.
But also
Γ ⊢ ( φ → (χ → ψ)) → (( φ → χ) → ( φ → ψ)),
by eq. (22.8), and two applications of Proposition 22.22 give Γ ⊢ φ → ψ, as
required.
Notice how eq. (22.7) and eq. (22.8) were chosen precisely so that the De-
duction Theorem would hold.
The following are some useful facts about derivability, which we leave as
exercises.
Γ ∪ { φ } ⊢ χ → θ ( a ),
Γ ⊢ φ → (χ → θ ( a)).
By
⊢ ( φ → (χ → θ ( a))) → (( φ ∧ χ) → θ ( a))
and modus ponens,
Γ ⊢ ( φ ∧ χ ) → θ ( a ).
Since the eigenvariable condition still applies, we can add a step to this deriva-
tion justified by QR, and get
Γ ⊢ ( φ ∧ χ ) → ∀ x θ ( x ).
We also have
⊢ (( φ ∧ χ) → ∀ x θ ( x )) → ( φ → (χ → ∀ x θ ( x ))),
so by modus ponens,
Γ ⊢ φ → (χ → ∀ x θ ( x )),
i.e., Γ ⊢ ψ.
We leave the case where ψ is justified by the rule QR, but is of the form
∃ x θ ( x ) → χ, as an exercise.
Proof. Exercise.
2. φ, ψ ⊢ φ ∧ ψ.
2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.
Proposition 22.32. 1. φ, φ → ψ ⊢ ψ.
2. Both ¬ φ ⊢ φ → ψ and ψ ⊢ φ → ψ.
1. φ H YP
2. φ→ψ H YP
3. ψ 1, 2, MP
2. By eq. (22.10) and eq. (22.7) and the deduction theorem, respectively.
2. ∀ x φ( x ) ⊢ φ(t).
22.12 Soundness
A derivation system, such as axiomatic deduction, is sound if it cannot de-
rive things that do not actually hold. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof theoretic
property is in question, we would like to know, for instance, that derivable sentences are valid, that sentences derivable from a set Γ are consequences of Γ, and that consistent sets are satisfiable.
Proof. We have to verify that all the axioms are valid. For instance, here is the
case for eq. (22.15): suppose t is free for x in φ, and assume M, s ⊨ ∀ x φ. Then
by definition of satisfaction, for each s′ ∼ x s, also M, s′ ⊨ φ, and in particular
this holds when s′ ( x ) = Val^M_s (t). By Proposition 16.22, M, s ⊨ φ[t/x ]. This
shows that M, s ⊨ (∀ x φ → φ[t/x ]).
t = t, (22.17)
t1 = t2 → (ψ(t1 ) → ψ(t2 )), (22.18)
Proposition 22.40. The axioms eq. (22.17) and eq. (22.18) are valid.
Proof. Exercise.
Problems
Problem 22.1. Show that the following hold by exhibiting derivations from
the axioms:
1. ( φ ∧ ψ) → (ψ ∧ φ)
2. (( φ ∧ ψ) → χ) → ( φ → (ψ → χ))
3. ¬( φ ∨ ψ) → ¬ φ
23.1 Introduction
The completeness theorem is one of the most fundamental results about logic.
It comes in two formulations, the equivalence of which we’ll prove. In its first
formulation it says something fundamental about the relationship between
semantic consequence and our derivation system: if a sentence φ follows from
some sentences Γ, then there is also a derivation that establishes Γ ⊢ φ. Thus,
the derivation system is as strong as it can possibly be without proving things
that don’t actually follow.
In its second formulation, it can be stated as a model existence result: ev-
ery consistent set of sentences is satisfiable. Consistency is a proof-theoretic
notion: it says that our derivation system is unable to produce certain deriva-
tions. But who’s to say that just because there are no derivations of a certain
sort from Γ, it’s guaranteed that there is a structure M? Before the complete-
ness theorem was first proved—in fact before we had the derivation systems
we now do—the great German mathematician David Hilbert held the view
that consistency of mathematical theories guarantees the existence of the ob-
jects they are about. He put it as follows in a letter to Gottlob Frege:
φ ∨ ψ ∈ Γ, then we will have to make at least one of them true, i.e., proceed
as if one of them was in Γ.
This suggests the following idea: we add additional formulas to Γ so as to
(a) keep the resulting set consistent and (b) make sure that for every possible
atomic sentence φ, either φ is in the resulting set, or ¬ φ is, and (c) such that,
whenever φ ∧ ψ is in the set, so are both φ and ψ, if φ ∨ ψ is in the set, at least
one of φ or ψ is also, etc. We keep doing this (potentially forever). Call the
set of all formulas so added Γ ∗ . Then our construction above would provide
us with a structure M for which we could prove, by induction, that it satisfies
all sentences in Γ ∗ , and hence also all sentences in Γ since Γ ⊆ Γ ∗ . It turns
out that guaranteeing (a) and (b) is enough. A set of sentences for which (b)
holds is called complete. So our task will be to extend the consistent set Γ to a
consistent and complete set Γ ∗ .
There is one wrinkle in this plan: if ∃ x φ( x ) ∈ Γ we would hope to be able
to pick some constant symbol c and add φ(c) in this process. But how do we
know we can always do that? Perhaps we only have a few constant symbols
in our language, and for each one of them we have ¬ φ(c) ∈ Γ. We can’t also
add φ(c), since this would make the set inconsistent, and we wouldn’t know
whether M has to make φ(c) or ¬ φ(c) true. Moreover, it might happen that Γ
contains only sentences in a language that has no constant symbols at all (e.g.,
the language of set theory).
The solution to this problem is to simply add infinitely many constants at
the beginning, plus sentences that connect them with the quantifiers in the
right way. (Of course, we have to verify that this cannot introduce an incon-
sistency.)
Our original construction works well if we only have constant symbols in
the atomic sentences. But the language might also contain function symbols.
In that case, it might be tricky to find the right functions on N to assign to
these function symbols to make everything work. So here’s another trick: in-
stead of using i to interpret ci , just take the set of constant symbols itself as
the domain. Then M can assign every constant symbol to itself: c_i^M = c_i . But
why not go all the way: let |M| be all terms of the language! If we do this,
there is an obvious assignment of functions (that take terms as arguments and
have terms as values) to function symbols: we assign to the function sym-
bol f_i^n the function which, given n terms t1 , . . . , tn as input, produces the term f_i^n (t1 , . . . , tn ) as value.
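The idea that function symbols are interpreted by "term building" can be put very directly in code. The following sketch, with an illustrative tuple encoding of our own, interprets an n-place function symbol f as the map sending terms t1, . . . , tn to the term f(t1, . . . , tn).

def interpret_function(f):
    """The term-model interpretation of the function symbol f: it maps the
    terms t1, ..., tn to the term f(t1, ..., tn) itself."""
    def f_M(*terms):
        return ("func", f, terms)
    return f_M

plus = interpret_function("+")
zero = ("const", "0")
# The value of (0 + 0) in the term model is just the term (0 + 0):
assert plus(zero, zero) == ("func", "+", (zero, zero))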
The last piece of the puzzle is what to do with =. The predicate symbol =
has a fixed interpretation: M ⊨ t = t′ iff Val^M (t) = Val^M (t′ ). Now if we set
things up so that the value of a term t is t itself, then this structure will make
no sentence of the form t = t′ true unless t and t′ are one and the same term.
And of course this is a problem, since basically every interesting theory in a
language with function symbols will have as theorems sentences t = t′ where
t and t′ are not the same term (e.g., in theories of arithmetic: (0 + 0) = 0). To
solve this problem, we change the domain of M: instead of using terms as the
objects in |M|, we use sets of terms, and each set is so that it contains all those
terms which the sentences in Γ require to be equal. So, e.g., if Γ is a theory of
arithmetic, one of these sets will contain: 0, (0 + 0), (0 × 0), etc. This will be
the set we assign to 0, and it will turn out that this set is also the value of all
the terms in it, e.g., also of (0 + 0). Therefore, the sentence (0 + 0) = 0 will be
true in this revised structure.
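The "factoring" step can be pictured as partitioning the set of terms into classes of provably equal terms. The helper below and the toy provably_equal predicate are our illustration only; grouping greedily in this way is legitimate only because the relation of provable equality is an equivalence relation, which is established below.

def equivalence_classes(terms, provably_equal):
    """Partition `terms` into classes, putting t and u together whenever
    the sentence t = u belongs to Γ* (as reported by provably_equal)."""
    classes = []
    for t in terms:
        for cls in classes:
            if provably_equal(t, cls[0]):
                cls.append(t)
                break
        else:
            classes.append([t])
    return classes

# In a theory of arithmetic, 0, (0 + 0) and (0 × 0) end up in one class:
zero_like = {"0", "(0 + 0)", "(0 × 0)"}
provably_equal = lambda t, u: (t in zero_like and u in zero_like) or t == u
print(equivalence_classes(["0", "(0 + 0)", "(0 × 0)", "0'"], provably_equal))
# [['0', '(0 + 0)', '(0 × 0)'], ["0'"]]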
So here’s what we’ll do. First we investigate the properties of complete
consistent sets, in particular we prove that a complete consistent set contains
φ ∧ ψ iff it contains both φ and ψ, φ ∨ ψ iff it contains at least one of them,
etc. (Proposition 23.2). Then we define and investigate “saturated” sets of
sentences. A saturated set is one which contains conditionals that link each
quantified sentence to instances of it (Definition 23.5). We show that any con-
sistent set Γ can always be extended to a saturated set Γ ′ (Lemma 23.6). If a set
is consistent, saturated, and complete it also has the property that it contains
∃ x φ( x ) iff it contains φ(t) for some closed term t and ∀ x φ( x ) iff it contains
φ(t) for all closed terms t (Proposition 23.7). We’ll then take the saturated con-
sistent set Γ ′ and show that it can be extended to a saturated, consistent, and
complete set Γ ∗ (Lemma 23.8). This set Γ ∗ is what we’ll use to define our term
model M( Γ ∗ ). The term model has the set of closed terms as its domain, and
the interpretation of its predicate symbols is given by the atomic sentences
in Γ ∗ (Definition 23.9). We’ll use the properties of saturated, complete con-
sistent sets to show that indeed M( Γ ∗ ) ⊨ φ iff φ ∈ Γ ∗ (Lemma 23.12), and
thus in particular, M( Γ ∗ ) ⊨ Γ. Finally, we’ll consider how to define a term
model if Γ contains = as well (Definition 23.16) and show that it satisfies Γ ∗
(Lemma 23.19).
all those in Γ) true. The proof of this latter fact requires that ¬ φ ∈ Γ ∗ iff
φ ∉ Γ ∗ , ( φ ∨ ψ) ∈ Γ ∗ iff φ ∈ Γ ∗ or ψ ∈ Γ ∗ , etc.
In what follows, we will often tacitly use the properties of reflexivity, mono-
tonicity, and transitivity of ⊢ (see sections 19.8, 20.7, 21.7 and 22.6).
1. If Γ ⊢ φ, then φ ∈ Γ.
3. φ ∨ ψ ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.
4. φ → ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.
Proof. Let us suppose for all of the following that Γ is complete and consistent.
1. If Γ ⊢ φ, then φ ∈ Γ.
Suppose that Γ ⊢ φ. Suppose to the contrary that φ ∉ Γ. Since Γ is
complete, ¬ φ ∈ Γ. By Propositions 19.20, 20.20, 21.20 and 22.28, Γ is in-
consistent. This contradicts the assumption that Γ is consistent. Hence,
it cannot be the case that φ ∉ Γ, so φ ∈ Γ.
2. Exercise.
4. Exercise.
The following definition will be used in the proof of the next theorem.
Definition 23.5. Let L′ be as in Proposition 23.3. Fix an enumeration φ0 ( x0 ),
φ1 ( x1 ), . . . of all formulas φi ( xi ) of L′ in which one variable (xi ) occurs free.
We define the sentences θn by induction on n.
Let c0 be the first constant symbol among the di we added to L which does
not occur in φ0 ( x0 ). Assuming that θ0 , . . . , θn−1 have already been defined,
let cn be the first among the new constant symbols di that occurs neither in θ0 ,
. . . , θn−1 nor in φn ( xn ).
Now let θn be the formula ∃ xn φn ( xn ) → φn (cn ).
Lemma 23.6. Every consistent set Γ can be extended to a saturated consistent set Γ ′ .
We’ll now show that complete, consistent sets which are saturated have the
property that it contains a universally quantified sentence iff it contains all its
instances and it contains an existentially quantified sentence iff it contains at
least one instance. We’ll use this to show that the structure we’ll generate from
a complete, consistent, saturated set makes all its quantified sentences true.
2. Exercise.
Let Γ ∗ = ⋃_{n≥0} Γn .
Propositions 19.21, 20.21, 21.21 and 22.29, contrary to the induction hypothe-
sis.
For every n and every i < n, Γi ⊆ Γn . This follows by a simple induction
on n. For n = 0, there are no i < 0, so the claim holds automatically. For
the inductive step, suppose it is true for n. We have Γn+1 = Γn ∪ { φn } or Γn+1 = Γn ∪ {¬ φn } by construction. So Γn ⊆ Γn+1 . If i < n, then Γi ⊆ Γn by inductive hypothesis, and so Γi ⊆ Γn+1 by transitivity of ⊆.
From this it follows that every finite subset of Γ ∗ is a subset of Γn for some n, since each ψ ∈ Γ ∗ not already in Γ0 is added at some stage i. If n is the largest of these stages, then all ψ in the finite subset are in Γn . So, every finite
subset of Γ ∗ is consistent. By Propositions 19.17, 20.17, 21.17 and 22.21, Γ ∗ is
consistent.
Every sentence of Frm(L) appears on the list used to define Γ ∗ . If φn ∉ Γ ∗ , then that is because Γn ∪ { φn } was inconsistent. But then ¬ φn ∈ Γ ∗ , so Γ ∗ is complete.
We will now check that we indeed have Val^{M(Γ ∗)} (t) = t.
Lemma 23.10. Let M( Γ ∗ ) be the term model of Definition 23.9. Then Val^{M(Γ ∗)} (t) = t for every closed term t.
Proof. The proof is by induction on t, where the base case, when t is a constant symbol, follows directly from the definition of the term model. For the induction step assume t1 , . . . , tn are closed terms such that Val^{M(Γ ∗)} (ti ) = ti and that f is an n-ary function symbol. Then
Val^{M(Γ ∗)} ( f (t1 , . . . , tn )) = f^{M(Γ ∗)} (Val^{M(Γ ∗)} (t1 ), . . . , Val^{M(Γ ∗)} (tn ))
= f^{M(Γ ∗)} (t1 , . . . , tn )
= f (t1 , . . . , tn ),
2. Exercise.
4. φ ≡ ψ ∧ χ: exercise.
6. φ ≡ ψ → χ: exercise.
7. φ ≡ ∀ x ψ( x ): exercise.
23.7 Identity
The construction of the term model given in the preceding section is enough
to establish completeness for first-order logic for sets Γ that do not contain =.
The term model satisfies every φ ∈ Γ ∗ which does not contain = (and hence
all φ ∈ Γ). It does not work, however, if = is present. The reason is that Γ ∗
then may contain a sentence t = t′ , but in the term model the value of any
term is that term itself. Hence, if t and t′ are different terms, their values in
the term model—i.e., t and t′ , respectively—are different, and so t = t′ is false.
We can fix this, however, using a construction known as “factoring.”
t ≈ t′ iff t = t′ ∈ Γ ∗
1. ≈ is reflexive.
2. ≈ is symmetric.
3. ≈ is transitive.
2. If Γ ∗ ⊢ t = t′ then Γ ∗ ⊢ t′ = t.
4. If Γ ∗ ⊢ t = t′ , then
1. |M/≈ | = Trm(L)/≈ .
2. c^{M/≈} = [c]≈
Note that we have defined f^{M/≈} and R^{M/≈} for elements of Trm(L)/≈ by referring to them as [t]≈ , i.e., via representatives t ∈ [t]≈ . We have to make sure that these definitions do not depend on the choice of these representatives, i.e., that for some other choices t′ which determine the same equivalence classes ([t]≈ = [t′ ]≈ ), the definitions yield the same result. For instance, if R is a one-place predicate symbol, the last clause of the definition says that [t]≈ ∈ R^{M/≈} iff M ⊨ R(t). If for some other term t′ with t ≈ t′ we had M ⊭ R(t′ ), then the definition would require [t′ ]≈ ∉ R^{M/≈} . If t ≈ t′ , then [t]≈ = [t′ ]≈ , but we can't have both [t]≈ ∈ R^{M/≈} and [t]≈ ∉ R^{M/≈} . However, Proposition 23.14 guarantees that this cannot happen.
Proposition 23.17. M/≈ is well defined, i.e., if t1 , . . . , tn , t1′ , . . . , tn′ are terms, and ti ≈ ti′ , then
1. [ f (t1 , . . . , tn )]≈ = [ f (t1′ , . . . , tn′ )]≈ , i.e., f^{M/≈} ([t1 ]≈ , . . . , [tn ]≈ ) = f^{M/≈} ([t1′ ]≈ , . . . , [tn′ ]≈ ), and
2. M ⊨ R(t1 , . . . , tn ) iff M ⊨ R(t1′ , . . . , tn′ ), i.e., ⟨[t1 ]≈ , . . . , [tn ]≈ ⟩ ∈ R^{M/≈} iff ⟨[t1′ ]≈ , . . . , [tn′ ]≈ ⟩ ∈ R^{M/≈} .
As in the case of the term model, before proving the truth lemma we need
the following lemma.
Proof. By induction on φ, just as in the proof of Lemma 23.12. The only case
that needs additional attention is when φ ≡ t = t′ .
Corollary 23.21 (Completeness Theorem, Second Version). For all Γ and sen-
tences φ: if Γ ⊨ φ then Γ ⊢ φ.
Proof. Note that the Γ’s in Corollary 23.21 and Theorem 23.20 are universally
quantified. To make sure we do not confuse ourselves, let us restate Theo-
rem 23.20 using a different variable: for any set of sentences ∆, if ∆ is consis-
tent, it is satisfiable. By contraposition, if ∆ is not satisfiable, then ∆ is incon-
sistent. We will use this to prove the corollary.
Suppose that Γ ⊨ φ. Then Γ ∪ {¬ φ} is unsatisfiable by Proposition 16.27.
Taking Γ ∪ {¬ φ} as our ∆, the previous version of Theorem 23.20 gives us
that Γ ∪ {¬ φ} is inconsistent. By Propositions 19.19, 20.19, 21.19 and 22.27,
Γ ⊢ φ.
Theorem 23.23 (Compactness Theorem). The following hold for any set of sentences Γ and sentence φ:
1. Γ ⊨ φ iff there is a finite Γ0 ⊆ Γ such that Γ0 ⊨ φ.
2. Γ is satisfiable iff every finite Γ0 ⊆ Γ is satisfiable.
∆ = {c ≠ t : t ∈ Trm(L)}.
k 1’s). For any finite subset ∆ 0 of ∆ there is a K such that all the sentences
′
c < (1 ÷ k) in ∆ 0 have k < K. If we expand Q to Q′ with cQ = 1/K we have
that Q′ ⊨ Γ ∪ ∆ 0 , and so Γ ∪ ∆ is finitely satisfiable (Exercise: prove this in
detail). By compactness, Γ ∪ ∆ is satisfiable. Any model S of Γ ∪ ∆ contains
an infinitesimal, namely cS .
Example 23.26. We know that first-order logic with identity predicate can express that the domain must have at least some minimal size: the sentence φ≥n (which says "there are at least n distinct objects") is true only in structures where |M| has at least n objects. So if we take
∆ = { φ ≥ n : n ≥ 1}
then any model of ∆ must be infinite. Thus, we can guarantee that a theory
only has infinite models by adding ∆ to it: the models of Γ ∪ ∆ are all and only
the infinite models of Γ.
So first-order logic can express infinitude. The compactness theorem shows
that it cannot express finitude, however. For suppose some set of sentences Λ
were satisfied in all and only finite structures. Then ∆ ∪ Λ is finitely satisfiable.
Why? Suppose ∆′ ∪ Λ′ ⊆ ∆ ∪ Λ is finite with ∆′ ⊆ ∆ and Λ′ ⊆ Λ. Let n be the
largest number such that φ≥n ∈ ∆′ . Λ, being satisfied in all finite structures,
has a model M with finitely many but ≥ n elements. But then M ⊨ ∆′ ∪ Λ′ . By
compactness, ∆ ∪ Λ has an infinite model, contradicting the assumption that
Λ is satisfied only in finite structures.
2. ( φ ∨ ψ) ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.
3. ( φ → ψ) ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.
Lemma 23.28. Every finitely satisfiable set Γ can be extended to a saturated finitely
satisfiable set Γ ′ .
Lemma 23.30. Every finitely satisfiable set Γ can be extended to a complete and
finitely satisfiable set Γ ∗ .
Problems
Problem 23.1. Complete the proof of Proposition 23.2.
Problem 23.6. Use Corollary 23.21 to prove Theorem 23.20, thus showing that
the two formulations of the completeness theorem are equivalent.
Problem 23.7. In order for a derivation system to be complete, its rules must
be strong enough to prove every unsatisfiable set inconsistent. Which of the
rules of derivation were necessary to prove completeness? Are any of these
rules not used anywhere in the proof? In order to answer these questions,
make a list or diagram that shows which of the rules of derivation were used
in which results that lead up to the proof of Theorem 23.20. Be sure to note
any tacit uses of rules in these proofs.
arithmetic which are true in the standard model of arithmetic N are also true
in a structure N′ that contains an element which does satisfy every formula
n < x.
Problem 23.11. Prove Lemma 23.28. (Hint: The crucial step is to show that if
Γn is finitely satisfiable, so is Γn ∪ {θn }, without any appeal to derivations or
consistency.)
Problem 23.13. Prove Lemma 23.30. (Hint: the crucial step is to show that if
Γn is finitely satisfiable, then either Γn ∪ { φn } or Γn ∪ {¬ φn } is finitely satisfi-
able.)
Problem 23.14. Write out the complete proof of the Truth Lemma (Lemma 23.12)
in the version required for the proof of Theorem 23.31.
This chapter, adapted from Jeremy Avigad’s logic notes, gives the
briefest of glimpses into which other logical systems there are. It is in-
tended as a chapter suggesting further topics for study in a course that
does not cover them. Each one of the topics mentioned here will—
hopefully—eventually receive its own part-level treatment in the Open
Logic Project.
24.1 Overview
First-order logic is not the only system of logic of interest: there are many ex-
tensions and variations of first-order logic. A logic typically consists of the
formal specification of a language, usually, but not always, a deductive sys-
tem, and usually, but not always, an intended semantics. But the technical use
of the term raises an obvious question: what do logics that are not first-order
logic have to do with the word “logic,” used in the intuitive or philosophical
sense? All of the systems described below are designed to model reasoning of
some form or another; can we say what makes them logical?
No easy answers are forthcoming. The word “logic” is used in different
ways and in different contexts, and the notion, like that of “truth,” has been
analyzed from numerous philosophical stances. For example, one might take
the goal of logical reasoning to be the determination of which statements are
necessarily true, true a priori, true independent of the interpretation of the
nonlogical terms, true by virtue of their form, or true by linguistic convention;
and each of these conceptions requires a good deal of clarification. Even if one
restricts one’s attention to the kind of logic used in mathematics, there is little
agreement as to its scope. For example, in the Principia Mathematica, Russell
and Whitehead tried to develop mathematics on the basis of logic, in the logi-
cist tradition begun by Frege. Their system of logic was a form of higher-type
logic similar to the one described below. In the end they were forced to intro-
duce axioms which, by most standards, do not seem purely logical (notably,
the axiom of infinity, and the axiom of reducibility), but one might nonetheless
hold that some forms of higher-order reasoning should be accepted as logical.
In contrast, Quine, whose ontology does not admit “propositions” as legiti-
mate objects of discourse, argues that second-order and higher-order logic are
really manifestations of set theory in sheep’s clothing; in other words, systems
involving quantification over predicates are not purely logical.
For now, it is best to leave such philosophical issues for a rainy day, and
simply think of the systems below as formal idealizations of various kinds of
reasoning, logical or otherwise.
asserts that if any French person is married to a German, either the French
person drinks wine or the German doesn’t eat wurst.
Many-sorted logic can be embedded in first-order logic in a natural way,
by lumping all the objects of the many-sorted domains together into one first-
order domain, using unary predicate symbols to keep track of the sorts, and
relativizing quantifiers. For example, the first-order language corresponding
to the example above would have unary predicate symbols “German” and “French,” in addition to the other relations described, with the sort requirements erased. A sorted quantifier ∀ x φ, where x is a variable of the German sort, translates to
∀ x (German( x ) → φ).
We need to add axioms that insure that the sorts are separate—e.g., ∀ x ¬(German( x ) ∧ French( x ))—as well as axioms that guarantee that “drinks wine” only holds of objects satisfying the predicate French( x ), etc. With these conventions and
axioms, it is not difficult to show that many-sorted sentences translate to first-
order sentences, and many-sorted derivations translate to first-order deriva-
tions. Also, many-sorted structures “translate” to corresponding first-order
structures and vice-versa, so we also have a completeness theorem for many-
sorted logic.
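The translation of sorted quantifiers into relativized ones is mechanical. The sketch below is our own illustration (the tuple encoding and the sort_of parameter, which maps each sorted variable to the unary predicate symbol for its sort, are not from the text).

def relativize(formula, sort_of):
    """Relativize sorted quantifiers: ∀x φ becomes ∀x (S(x) → φ) and
    ∃x φ becomes ∃x (S(x) ∧ φ), where S is the sort predicate of x."""
    kind = formula[0]
    if kind in ("forall", "exists"):
        _, x, body = formula
        guard = (sort_of[x], x)
        connective = "imp" if kind == "forall" else "and"
        return (kind, x, (connective, guard, relativize(body, sort_of)))
    if kind == "not":
        return ("not", relativize(formula[1], sort_of))
    if kind in ("and", "or", "imp"):
        return (kind, relativize(formula[1], sort_of), relativize(formula[2], sort_of))
    return formula                        # atomic formulas are unchanged

# ∀x φ, with x of the German sort, becomes ∀x (German(x) → φ):
print(relativize(("forall", "x", ("P", "x")), {"x": "German"}))
# ('forall', 'x', ('imp', ('German', 'x'), ('P', 'x')))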
∀ x1 . . . ∀ xk ( R( x1 , . . . , xk ) ↔ S( x1 , . . . , xk )).
The rules for second-order logic simply extend the quantifier rules to the
new second order variables. Here, however, one has to be a little bit careful
to explain how these variables interact with the predicate symbols of L, and
with formulas of L more generally. At the bare minimum, relation variables
count as terms, so one has inferences of the form
φ( R) ⊢ ∃ R φ( R)
But if L is the language of arithmetic with a constant relation symbol <, one
would also expect the following inference to be valid:
x < y ⊢ ∃ R R( x, y)
φ( x1 , . . . , xk ) ⊢ ∃ R R( x1 , . . . , xk )
and, more generally, one would expect inferences of the form
φ[λ⃗x. ψ(⃗x )/R] ⊢ ∃ R φ( R),
where φ[λ⃗x. ψ(⃗x )/R] denotes the result of replacing every atomic formula of the form R(t1 , . . . , tk ) in φ by ψ(t1 , . . . , tk ). This last rule is equivalent to having
a comprehension schema, i.e., an axiom of the form
∃ R ∀ x1 , . . . , xk ( φ( x1 , . . . , xk ) ↔ R( x1 , . . . , xk )),
one for each formula φ in the second-order language, in which R is not a free
variable. (Exercise: show that if R is allowed to occur in φ, this schema is
inconsistent!)
When logicians refer to the “axioms of second-order logic” they usually
mean the minimal extension of first-order logic by second-order quantifier
rules together with the comprehension schema. But it is often interesting to
study weaker subsystems of these axioms and rules. For example, note that
in its full generality the axiom schema of comprehension is impredicative: it
allows one to assert the existence of a relation R( x1 , . . . , xk ) that is “defined”
by a formula with second-order quantifiers; and these quantifiers range over
the set of all such relations—a set which includes R itself! Around the turn of
the twentieth century, a common reaction to Russell’s paradox was to lay the
blame on such definitions, and to avoid them in developing the foundations
of mathematics. If one prohibits the use of second-order quantifiers in the
formula φ, one has a predicative form of comprehension, which is somewhat
weaker.
From the semantic point of view, one can think of a second-order structure
as consisting of a first-order structure for the language, coupled with a set of
relations on the domain over which the second-order quantifiers range (more
precisely, for each k there is a set of relations of arity k). Of course, if com-
prehension is included in the derivation system, then we have the added re-
quirement that there are enough relations in the “second-order part” to satisfy
the comprehension axioms—otherwise the derivation system is not sound!
One easy way to insure that there are enough relations around is to take the
second-order part to consist of all the relations on the first-order part. Such
a structure is called full, and, in a sense, is really the “intended structure” for
the language. If we restrict our attention to full structures we have what is
known as the full second-order semantics. In that case, specifying a structure
boils down to specifying the first-order part, since the contents of the second-
order part follow from that implicitly.
To summarize, there is some ambiguity when talking about second-order logic. In terms of the derivation system, one might have in mind either the bare second-order quantifier rules, or the quantifier rules together with the comprehension schema. In terms of the semantics, one might have in mind either the weaker semantics, in which the second-order quantifiers range over some collection of relations satisfying the comprehension axioms, or the full semantics. When logicians do not specify the derivation system or the semantics they have in mind, they are usually referring to the latter in each case. The
advantage to using this semantics is that, as we will see, it gives us categorical
descriptions of many natural mathematical structures; at the same time, the
derivation system is quite strong, and sound for this semantics. The drawback
is that the derivation system is not complete for the semantics; in fact, no effec-
tively given derivation system is complete for the full second-order semantics.
On the other hand, we will see that the derivation system is complete for the
weakened semantics; this implies that if a sentence is not provable, then there
is some structure, not necessarily the full one, in which it is false.
The language of second-order logic is quite rich. One can identify unary
relations with subsets of the domain, and so in particular you can quantify
over these sets; for example, one can express induction for the natural num-
bers with a single axiom
∀ P (( P(0) ∧ ∀ x ( P( x ) → P( x ′ ))) → ∀ x P( x )).
If one takes the language of arithmetic to have symbols 0, ′, +, × and <, one
can add the following axioms to describe their behavior:
1. ∀ x ¬ x ′ = 0
2. ∀ x ∀y ( x ′ = y′ → x = y)
3. ∀ x ( x + 0) = x
4. ∀ x ∀y ( x + y′ ) = ( x + y)′
5. ∀ x ( x × 0) = 0
6. ∀ x ∀y ( x × y′ ) = (( x × y) + x )
7. ∀ x ∀y ( x < y ↔ ∃z y = ( x + z′ ))
It is not difficult to show that these axioms, together with the axiom of induc-
tion above, provide a categorical description of the structure N, the standard
model of arithmetic, provided we are using the full second-order semantics.
Given any structure M in which these axioms are true, define a function f
from N to the domain of M using ordinary recursion on N, so that f (0) = 0^M and f ( x + 1) = ′^M ( f ( x )). Using ordinary induction on N and the fact that ax-
ioms (1) and (2) hold in M, we see that f is injective. To see that f is surjective,
let P be the set of elements of |M| that are in the range of f . Since M is full, P is
in the second-order domain. By the construction of f , we know that 0M is in P,
and that P is closed under ′M . The fact that the induction axiom holds in M
(in particular, for P) guarantees that P is equal to the entire first-order domain
of M. This shows that f is a bijection. Showing that f is a homomorphism is
no more difficult, using ordinary induction on N repeatedly.
In set-theoretic terms, a function is just a special kind of relation; for ex-
ample, a unary function f can be identified with a binary relation R satisfying
∀ x ∃!y R( x, y). As a result, one can quantify over functions too. Using the full
semantics, one can then define the class of infinite structures to be the class of
structures M for which there is an injective function from the domain of M to
a proper subset of itself:
∃ f (∀ x ∀y ( f ( x ) = f (y) → x = y) ∧ ∃y ∀ x f ( x ) ≠ y).
The negation of this sentence then defines the class of finite structures.
In addition, one can define the class of well-orderings, by adding the fol-
lowing to the definition of a linear ordering:
∀ P (∃ x P( x ) → ∃ x ( P( x ) ∧ ∀y (y < x → ¬ P(y)))).
This asserts that every non-empty set has a least element, modulo the iden-
tification of “set” with “one-place relation”. For another example, one can
express the notion of connectedness for graphs, by saying that there is no non-
trivial separation of the vertices into disconnected parts:
Think of types as syntactic “labels,” which classify the objects we want in our
domain; σ → τ describes those objects that are functions which take objects of
type σ to objects of type τ. For example, we might want to have a type Ω of
truth values, “true” and “false,” and a type N of natural numbers. In that case,
you can think of objects of type N → Ω as unary relations, or subsets of N;
objects of type N → N are functions from natural numbers to natural numbers;
Rst (0) = s
Rst ( x + 1) = t( x, Rst ( x )),
⟨s, t⟩ denotes the pair whose first component is s and whose second compo-
nent is t, and p1 (s) and p2 (s) denote the first and second elements (“projec-
tions”) of s. Finally, λx. s denotes the function f defined by
f (x) = s
Theorem 24.1. There are irrational numbers a and b such that a^b is rational.
Proof. Consider √2^√2 . If this is rational, we are done: we can let a = b = √2. Otherwise, it is irrational. Then we have
(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,
which is certainly rational. So, in this case, let a be √2^√2 , and let b be √2.
Does this constitute a valid proof? Most mathematicians feel that it does. But again, there is something a little bit unsatisfying here: we have proved the existence of a pair of real numbers with a certain property, without being able to say which pair of numbers it is. It is possible to prove the same result, but in such a way that the pair a, b is given in the proof: take a = √3 and b = log_3 4. Then
a^b = (√3)^(log_3 4) = 3^((1/2)·log_3 4) = (3^(log_3 4))^(1/2) = 4^(1/2) = 2,
since 3^(log_3 x) = x.
Intuitionistic logic is designed to model a kind of reasoning where moves
like the one in the first proof are disallowed. Proving the existence of an x
satisfying φ( x ) means that you have to give a specific x, and a proof that it
satisfies φ, like in the second proof. Proving that φ or ψ holds requires that
you can prove one or the other.
Formally speaking, intuitionistic first-order logic is what you get if you
restrict a derivation system for first-order logic in a certain way. Similarly,
there are intuitionistic versions of second-order or higher-order logic. From
the mathematical point of view, these are just formal deductive systems, but,
as already noted, they are intended to model a kind of mathematical reason-
ing. One can take this to be the kind of reasoning that is justified on a cer-
tain philosophical view of mathematics (such as Brouwer’s intuitionism); one
can take it to be a kind of mathematical reasoning which is more “concrete”
and satisfying (along the lines of Bishop’s constructivism); and one can argue
about whether or not the formal description captures the informal motiva-
tion. But whatever philosophical positions we may hold, we can study intu-
itionistic logic as a formally presented logic; and for whatever reasons, many
mathematical logicians find it interesting to do so.
There is an informal constructive interpretation of the intuitionist connec-
tives, usually known as the BHK interpretation (named after Brouwer, Heyt-
ing, and Kolmogorov). It runs as follows: a proof of φ ∧ ψ consists of a proof
of φ paired with a proof of ψ; a proof of φ ∨ ψ consists of either a proof of φ,
or a proof of ψ, where we have explicit information as to which is the case;
a proof of φ → ψ consists of a procedure, which transforms a proof of φ to a
proof of ψ; a proof of ∀ x φ( x ) consists of a procedure which returns a proof
of φ( x ) for any value of x; and a proof of ∃ x φ( x ) consists of a value of x,
together with a proof that this value satisfies φ. One can describe the interpre-
tation in computational terms known as the “Curry-Howard isomorphism”
or the “formulas-as-types paradigm”: think of a formula as specifying a cer-
tain kind of data type, and proofs as computational objects of these data types
that enable us to see that the corresponding formula is true.
Intuitionistic logic is often thought of as being classical logic “minus” the law of the excluded middle. The following theorem makes this more precise.
1. (¬ φ → ⊥) → φ
2. φ ∨ ¬ φ
3. ¬¬ φ → φ
Obtaining instances of one schema from either of the others is a good exercise
in intuitionistic logic.
The first deductive systems for intuitionistic propositional logic, put forth
as formalizations of Brouwer’s intuitionism, are due, independently, to Kol-
mogorov, Glivenko, and Heyting. The first formalization of intuitionistic first-
order logic (and parts of intuitionist mathematics) is due to Heyting. Though
a number of classically valid schemata are not intuitionistically valid, many
are.
The double-negation translation describes an important relationship between classical and intuitionist logic. It is defined inductively as follows (think of φ^N as the translation of φ):
φ^N ≡ ¬¬ φ if φ is atomic
(¬ φ)^N ≡ ¬ φ^N
( φ ∧ ψ)^N ≡ ( φ^N ∧ ψ^N )
( φ ∨ ψ)^N ≡ ¬¬( φ^N ∨ ψ^N )
( φ → ψ)^N ≡ ( φ^N → ψ^N )
(∀ x φ)^N ≡ ∀ x φ^N
(∃ x φ)^N ≡ ¬¬∃ x φ^N
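As a recursive definition, the translation is easy to implement. The sketch below follows the clauses above, using a tuple encoding of our own; it is an illustration of the standard Gödel–Gentzen definition, not code from the text.

def dn_translate(formula):
    """The double-negation translation φ^N: atomic formulas are double-
    negated, ¬, ∧, → and ∀ are translated componentwise, and ∨ and ∃ are
    translated and then double-negated."""
    nn = lambda f: ("not", ("not", f))
    kind = formula[0]
    if kind in ("and", "imp"):
        return (kind, dn_translate(formula[1]), dn_translate(formula[2]))
    if kind == "or":
        return nn(("or", dn_translate(formula[1]), dn_translate(formula[2])))
    if kind == "forall":
        return ("forall", formula[1], dn_translate(formula[2]))
    if kind == "exists":
        return nn(("exists", formula[1], dn_translate(formula[2])))
    if kind == "not":
        return ("not", dn_translate(formula[1]))
    return nn(formula)                    # atomic formulas

# (p ∨ q)^N is ¬¬(¬¬p ∨ ¬¬q):
print(dn_translate(("or", ("atom", "p"), ("atom", "q"))))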
2. M, w ⊮ ⊥.
3. M, w ⊩ ( φ ∧ ψ) iff M, w ⊩ φ and M, w ⊩ ψ.
4. M, w ⊩ ( φ ∨ ψ) iff M, w ⊩ φ or M, w ⊩ ψ.
□( φ → ψ) → (□φ → □ψ)
□φ → φ
□φ → □□φ
♢φ → □♢φ
Variations of these axioms may be suitable for different applications; for ex-
ample, S5 is usually taken to characterize the notion of logical necessity. And
the nice thing is that one can usually find a semantics for which the derivation
system is sound and complete by restricting the accessibility relation in the
Kripke structures in natural ways. For example, S4 corresponds to the class
of Kripke structures in which the accessibility relation is reflexive and transi-
tive. S5 corresponds to the class of Kripke structures in which the accessibility
relation is universal, which is to say that every world is accessible from every
other; so □φ holds if and only if φ holds in every world.
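The Kripke truth condition for □ is easy to state computationally: □φ holds at a world iff φ holds at every world accessible from it. The sketch below (names and encoding are ours) also illustrates the case of a universal accessibility relation just described.

def box_holds(worlds, accessible, world, phi_holds):
    """□φ holds at `world` iff φ holds at every accessible world."""
    return all(phi_holds(v) for v in worlds if accessible(world, v))

# With universal accessibility (the S5 case), □φ holds at a world iff
# φ holds at every world whatsoever:
worlds = {1, 2, 3}
universal = lambda w, v: True
print(box_holds(worlds, universal, 1, lambda v: v > 0))   # True
print(box_holds(worlds, universal, 1, lambda v: v > 1))   # False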
Model Theory
1. |M| = |M′ |
2. For every constant symbol c ∈ L, c^M = c^{M′} .
3. For every function symbol f ∈ L, f^M = f^{M′} .
4. For every predicate symbol P ∈ L, P^M = P^{M′} .
Proof. Exercise.
25.2 Substructures
The domain of a structure M may be a subset of the domain of another structure M′ . But we should obviously only consider M a “part” of M′ if not only |M| ⊆ |M′ |, but M and
M′ “agree” in how they interpret the symbols of the language at least on the
shared part |M|.
Definition 25.4. Given structures M and M′ for the same language L, we say
that M is a substructure of M′ , and M′ an extension of M, written M ⊆ M′ , iff
1. |M| ⊆ |M′ |,
2. For each constant c ∈ L, c^M = c^{M′} ;
3. For each n-place function symbol f ∈ L, f^M ( a1 , . . . , an ) = f^{M′} ( a1 , . . . , an ) for all a1 , . . . , an ∈ |M|.
25.3 Overspill
Theorem 25.5. If a set Γ of sentences has arbitrarily large finite models, then it has
an infinite model.
Proof. If there were such a φ, its negation ¬ φ would be true in all and only the
finite structures, and it would therefore have arbitrarily large finite models
but it would lack an infinite model, contradicting Theorem 25.5.
Definition 25.7. Given two structures M and M′ for the same language L, we
say that M is elementarily equivalent to M′ , written M ≡ M′ , if and only if for
every sentence φ of L, M ⊨ φ iff M′ ⊨ φ.
Definition 25.8. Given two structures M and M′ for the same language L,
we say that M is isomorphic to M′ , written M ≃ M′ , if and only if there is a
function h : |M| → |M′ | such that:
b. M, s ⊨ φ iff M′ , h ◦ s ⊨ φ.
1. If t ≡ c, then Val^M_s (c) = c^M and Val^{M′}_{h◦s} (c) = c^{M′} . Thus, h(Val^M_s (t)) = h(c^M ) = c^{M′} (by (3) of Definition 25.8) = Val^{M′}_{h◦s} (t).
2. If t ≡ x, then Val^M_s ( x ) = s( x ) and Val^{M′}_{h◦s} ( x ) = h(s( x )). Thus, h(Val^M_s ( x )) = h(s( x )) = Val^{M′}_{h◦s} ( x ).
3. If t ≡ f (t1 , . . . , tn ), then
Val^M_s (t) = f^M (Val^M_s (t1 ), . . . , Val^M_s (tn )) and
Val^{M′}_{h◦s} (t) = f^{M′} (Val^{M′}_{h◦s} (t1 ), . . . , Val^{M′}_{h◦s} (tn )).
The induction hypothesis is that for each i, h(Val^M_s (ti )) = Val^{M′}_{h◦s} (ti ). So,
h(Val^M_s (t)) = h( f^M (Val^M_s (t1 ), . . . , Val^M_s (tn )))
= f^{M′} (h(Val^M_s (t1 )), . . . , h(Val^M_s (tn )))      (25.1)
= f^{M′} (Val^{M′}_{h◦s} (t1 ), . . . , Val^{M′}_{h◦s} (tn ))      (25.2)
= Val^{M′}_{h◦s} (t)
Here, eq. (25.1) follows by (5) of Definition 25.8 and eq. (25.2) by the induction hypothesis.
We also use the term “theory” informally to refer to sets of sentences hav-
ing an intended interpretation, whether deductively closed or not.
Remark 2. Consider R = ⟨R, <⟩, the structure whose domain is the set R of
the real numbers, in the language comprising only a 2-place predicate sym-
bol interpreted as the < relation over the reals. Clearly R is non-enumerable;
however, since Th(R) is obviously consistent, by the Löwenheim-Skolem the-
orem it has an enumerable model, say S, and by Proposition 25.13, R ≡ S.
Moreover, since R and S are not isomorphic (S is enumerable while R is not), this shows that the converse of
Theorem 25.9 fails in general.
1. p is injective;
Proof. Since M and N are enumerable, let |M| = { a0 , a1 , . . .} and |N| = {b0 , b1 , . . .}.
Starting with an arbitrary p0 ∈ I, we define an increasing sequence of partial
isomorphisms p0 ⊆ p1 ⊆ p2 ⊆ · · · as follows:
1. if n + 1 is odd, say n = 2r, then using the Forth property find a pn+1 ∈ I
such that pn ⊆ pn+1 and ar is in the domain of pn+1 ;
If we now put:
p = ⋃_{n≥0} pn ,
Theorem 25.17. Suppose M and N are structures for a purely relational language
(a language containing only predicate symbols, and no function symbols or con-
stants). Then if M ≃ p N, also M ≡ N.
Remark 3. If function symbols are present, the previous result is still true, but
one needs to consider the isomorphism induced by p between the substruc-
ture of M generated by a1 , . . . , an and the substructure of N generated by b1 ,
. . . , bn .
Definition 25.18. For any formula φ, the quantifier rank of φ, denoted by qr( φ) ∈
N, is recursively defined as the highest number of nested quantifiers in φ.
Two structures M and N are n-equivalent, written M ≡n N, if they agree on all
sentences of quantifier rank less than or equal to n.
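The recursion behind qr(φ) is the obvious one: atomic formulas have rank 0, connectives take the maximum of the ranks of their parts, and each quantifier adds one. A Python sketch of ours (the tuple encoding is illustrative):

def quantifier_rank(formula):
    """The quantifier rank qr(φ): the maximal nesting depth of quantifiers."""
    kind = formula[0]
    if kind in ("forall", "exists"):
        return 1 + quantifier_rank(formula[2])
    if kind in ("and", "or", "imp"):
        return max(quantifier_rank(formula[1]), quantifier_rank(formula[2]))
    if kind == "not":
        return quantifier_rank(formula[1])
    return 0                              # atomic formulas

# ∀x ∃y R(x, y) ∧ ∃z P(z) has quantifier rank 2:
phi = ("and", ("forall", "x", ("exists", "y", ("R", "x", "y"))),
              ("exists", "z", ("P", "z")))
print(quantifier_rank(phi))   # 2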
Proposition 25.19. Let L be a finite purely relational language, i.e., a language con-
taining finitely many predicate symbols and constant symbols, and no function sym-
bols. Then for each n ∈ N there are only finitely many first-order sentences in the
language L that have quantifier rank no greater than n, up to logical equivalence.
Proof. By induction on n.
Definition 25.20. Given a structure M, let |M|<ω be the set of all finite se-
quences over |M|. We use a, b, c, . . . to range over finite sequences of elements.
If a ∈ |M|<ω and a ∈ |M|, then aa represents the concatenation of a with a.
Definition 25.21. Given structures M and N, we define relations In ⊆ |M|<ω ×
|N|<ω between sequences of equal length, by recursion on n as follows:
1. I0 (a, b) if and only if a and b satisfy the same atomic formulas in M and
N; i.e., if s1 ( xi ) = ai and s2 ( xi ) = bi and φ is atomic with all variables
among x1 , . . . , xn , then M, s1 ⊨ φ if and only if N, s2 ⊨ φ.
2. In+1(a, b) if and only if for every a ∈ |M| there is a b ∈ |N| such that In(aa, bb), and vice-versa.
Definition 25.22. Write M ≈n N if In (Λ, Λ) holds of M and N (where Λ is the
empty sequence).
Theorem 25.23. Let L be a purely relational language. Then In (a, b) implies that
for every φ such that qr( φ) ≤ n, we have M, a ⊨ φ if and only if N, b ⊨ φ (where
again a satisfies φ if any s such that s( xi ) = ai satisfies φ). Moreover, if L is finite,
the converse also holds.
Proof. The proof that In (a, b) implies that a and b satisfy the same formulas
of quantifier rank no greater than n is by an easy induction on φ. For the con-
verse we proceed by induction on n, using Proposition 25.19, which ensures
that for each n there are at most finitely many non-equivalent formulas of that
quantifier rank.
For n = 0 the hypothesis that a and b satisfy the same quantifier-free for-
mulas gives that they satisfy the same atomic ones, so that I0 (a, b).
For the n + 1 case, suppose that a and b satisfy the same formulas of quantifier rank no greater than n + 1; in order to show that In+1(a, b) holds, it suffices to show that for each a ∈ |M| there is a b ∈ |N| such that In(aa, bb), and by the inductive hypothesis it suffices in turn to show that for each a ∈ |M| there is a b ∈ |N| such that aa and bb satisfy the same formulas of quantifier rank no greater than n.

Given a ∈ |M|, let τ^a_n be the set of formulas ψ(x, y) of quantifier rank no greater than n satisfied by aa in M; up to logical equivalence τ^a_n is finite, so we can take it to be a single first-order formula (the conjunction of its members). It follows that a satisfies ∃x τ^a_n(x, y), which has quantifier rank no greater than n + 1. By hypothesis b satisfies the same formula in N, so there is a b ∈ |N| such that bb satisfies τ^a_n; in particular, bb satisfies the same formulas of quantifier rank no greater than n as aa. Similarly one shows that for every b ∈ |N| there is an a ∈ |M| such that aa and bb satisfy the same formulas of quantifier rank no greater than n, which completes the proof.
Corollary 25.24. If M and N are purely relational structures in a finite language,
then M ≈n N if and only if M ≡n N. In particular M ≡ N if and only if for each n,
M ≈n N .
1. ∀x ¬x < x;
2. ∀x ∀y ∀z ((x < y ∧ y < z) → x < z);
3. ∀x ∀y (x < y ∨ x = y ∨ y < x);
4. ∀ x ∃y x < y;
5. ∀ x ∃y y < x;
Theorem 25.26. Any two enumerable dense linear orderings without endpoints are
isomorphic.
3. if ai <1 a <1 ai+1 for some i, then let b ∈ |M2| be such that bi <2 b <2 bi+1.
Problems
Problem 25.1. Prove Proposition 25.2.
Problem 25.2. Carry out the proof of (b) of Theorem 25.9 in detail. Make sure
to note where each of the five properties characterizing isomorphisms of Def-
inition 25.8 is used.
Problem 25.5. Complete the proof of Theorem 25.26 by verifying that I satis-
fies the Back property.
Models of Arithmetic
26.1 Introduction
The standard model of arithmetic is the structure N with |N| = N in which 0,
′, +, ×, and < are interpreted as you would expect. That is, 0 is 0, ′ is the
successor function, + is interpreted as addition and × as multiplication of the
numbers in N. Specifically,
0N = 0
′N ( n ) = n + 1
+N (n, m) = n + m
×N (n, m) = nm
Of course, there are structures for L A that have domains other than N. For
instance, we can take M with domain |M| = { a}∗ (the finite sequences of the
single symbol a, i.e., ∅, a, aa, aaa, . . . ), and interpretations
0M = ∅
′M ( s ) = s ⌢ a
+M(n, m) = a^{n+m}
×M(n, m) = a^{n·m}
These two structures are “essentially the same” in the sense that the only dif-
ference is the elements of the domains but not how the elements of the do-
mains are related among each other by the interpretation functions. We say
that the two structures are isomorphic.
It is an easy consequence of the compactness theorem that any theory true
in N also has models that are not isomorphic to N. Such structures are called
non-standard. The interesting thing about them is that while the elements of a
standard model (i.e., N, but also all structures isomorphic to it) are exhausted
26.2 Standard Models of Arithmetic
Proposition 26.2. If a structure M is standard, then its domain is the set of values of the standard numerals, i.e., |M| = {ValM(n) : n ∈ N}.
Proof. Clearly, every ValM (n) ∈ |M|. We just have to show that every x ∈
|M| is equal to ValM (n) for some n. Since M is standard, it is isomorphic
to N. Suppose g : N → |M| is an isomorphism. Then g(n) = g(ValN (n)) =
ValM (n). But for every x ∈ |M|, there is an n ∈ N such that g(n) = x, since g
is surjective.
5. ⟨n, m⟩ ∈ <N iff n < m. If n < m, then Q ⊢ n < m, and also M ⊨ n < m. Thus ⟨ValM(n), ValM(m)⟩ ∈ <M, i.e., ⟨g(n), g(m)⟩ ∈ <M. If n ̸< m, then Q ⊢ ¬n < m, and consequently M ⊭ n < m. Thus, as before, ⟨g(n), g(m)⟩ ∉ <M. Together, we get: ⟨n, m⟩ ∈ <N iff ⟨g(n), g(m)⟩ ∈ <M.
Proposition 26.4. If M is standard, then g from the proof of Proposition 26.3 is the
only isomorphism from N to M.
Proof. Expand L A by a new constant symbol c and consider the set of sen-
tences
Γ = TA ∪ {c ̸= 0, c ̸= 1, c ̸= 2, . . . }
26.4 Models of Q
We know that there are non-standard structures that make the same sentences
true as N does, i.e., that are models of TA. Since N ⊨ Q, any model of TA is also
a model of Q. Q is much weaker than TA, e.g., Q ⊬ ∀ x ∀y ( x + y) = (y + x ).
Weaker theories are easier to satisfy: they have more models. E.g., Q has
models which make ∀ x ∀y ( x + y) = (y + x ) false, but those cannot also be
models of TA, or PA for that matter. Models of Q are also relatively simple:
we can specify them explicitly.
Example 26.8. Consider the structure K with domain |K| = N ∪ { a} and in-
terpretations
0K = 0
′K(x) = x + 1   if x ∈ N,
        a       if x = a
+K(x, y) = x + y   if x, y ∈ N,
           a       otherwise
×K(x, y) = xy   if x, y ∈ N,
           0    if x = 0 or y = 0,
           a    otherwise
since ⊕ and ∗ agree with + and ′ on standard numbers. Now suppose x ∈ |K|.
Then
( x ⊕ a∗ ) = ( x ⊕ a) = a = a∗ = ( x ⊕ a)∗
( a ⊕ n∗ ) = ( a ⊕ (n + 1)) = a = a∗ = ( a ⊕ n)∗
( a ⊕ a∗ ) = ( a ⊕ a) = a = a∗ = ( a ⊕ a)∗
This is of course a bit more detailed than needed. For instance, since a ⊕ z = a
whatever z is, we can immediately conclude a ⊕ a∗ = a. The remaining axioms
can be verified the same way.
K is thus a model of Q. Its “addition” ⊕ is also commutative. But there are other sentences true in N but false in K, and vice versa. For instance, a <K a, so K ⊨ ∃x x < x and K ⊭ ∀x ¬x < x. This shows that Q ⊬ ∀x ¬x < x.
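To see concretely how K behaves, here is a small Python sketch of its operations; representing a by the string "a" is our choice, and the checks below only spot-check finitely many instances of Q4 and Q5, plus the witness for a <K a via Q8's definition of <:

A = "a"   # the extra, non-standard element of K

def succ_K(x):
    return x + 1 if x != A else A

def add_K(x, y):
    return x + y if x != A and y != A else A

def mult_K(x, y):
    if x != A and y != A:
        return x * y
    if x == 0 or y == 0:
        return 0
    return A

# Q4: x + 0 = x, and Q5: x + y' = (x + y)', on a few sample elements.
for x in [0, 3, A]:
    assert add_K(x, 0) == x
    for y in [0, 2, A]:
        assert add_K(x, succ_K(y)) == succ_K(add_K(x, y))

# a <K a: by Q8, x < y iff ∃z (z' + x) = y; here z = a is a witness.
assert add_K(succ_K(A), A) == A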
The domain of L is |L| = N ∪ {a, b}, where a and b are two non-standard elements; 0L = 0, and the successor ∗ and addition ⊕ of L are given by the following table (n, m standard):

x      x∗        x ⊕ m    x ⊕ a    x ⊕ b
n      n + 1     n + m    b        a
a      a         a        b        a
b      b         b        b        a
Since ∗ is injective, 0 is not in its range, and every x ∈ |L| other than 0 is,
axioms Q1 –Q3 are true in L. For any x, x ⊕ 0 = x, so Q4 is true as well. For
Q5 , consider x ⊕ y∗ and ( x ⊕ y)∗ . They are equal if x and y are both standard,
since then ∗ and ⊕ agree with ′ and +. If x is non-standard, and y is standard,
we have x ⊕ y∗ = x = x ∗ = ( x ⊕ y)∗ . If x and y are both non-standard, we
have four cases:
a ⊕ a∗ = b = b∗ = ( a ⊕ a)∗
b ⊕ b∗ = a = a∗ = (b ⊕ b)∗
b ⊕ a∗ = b = b∗ = (b ⊕ a)∗
a ⊕ b∗ = a = a∗ = (a ⊕ b)∗

If x is standard and y is non-standard, we have two further cases:

n ⊕ a∗ = n ⊕ a = b = b∗ = (n ⊕ a)∗
n ⊕ b∗ = n ⊕ b = a = a∗ = (n ⊕ b)∗
So, L ⊨ Q5 . However, a ⊕ 0 ̸= 0 ⊕ a, so L ⊭ ∀ x ∀y ( x + y) = (y + x ).
26.5 Models of PA
Any non-standard model of TA is also one of PA. We know that non-standard
models of TA and hence of PA exist. We also know that such non-standard
2. If x ≺ y and y ≺ z then x ≺ z.
3. For any x ̸= y, x ≺ y or y ≺ x.
Proof. PA proves:
1. ∀ x ¬ x < x
Proposition 26.11. z is the least element of |M| in the ≺-ordering. For any x, x ≺ x∗, and x∗ is the ≺-least element with that property. For any x other than z, there is a unique y such that y∗ = x. (We call y the “predecessor” of x in M, and denote it by ∗x.)
Proof. Exercise.
Proposition 26.12. All standard elements of M are less than (according to ≺) all non-standard elements.
Proof. We’ll use n as short for ValM(n), a standard element of M. Already Q proves that, for any n ∈ N, ∀x (x < n′ → (x = 0 ∨ x = 1 ∨ · · · ∨ x = n)). There are no elements that are ≺ z. So if n is standard and x is non-standard, we cannot have x ≺ n. By definition, a non-standard element is one that isn’t ValM(n) for any n ∈ N, so x ̸= n as well. Since ≺ is a linear order, we must have n ≺ x.
We call this subset the block of x and write it as [ x ]. It has no least and no greatest
element. It can be characterized as the set of those y ∈ |M| such that, for some
standard n, x ⊕ n = y or y ⊕ n = x.
Proof. Clearly, such a set [ x ] always exists since every element y of |M| has
a unique successor y∗ and unique predecessor ∗ y. For successive elements y,
y∗ we have y ≺ y∗ and y∗ is the ≺-least element of |M| such that y is ≺-less than it. Since always ∗y ≺ y and y ≺ y∗, [x] has no least or greatest element. If
y ∈ [ x ] then x ∈ [y], for then either y∗...∗ = x or x ∗...∗ = y. If y∗...∗ = x (with n
∗’s), then y ⊕ n = x and conversely, since PA ⊢ ∀ x x ′...′ = ( x + n) (if n is the
number of ′’s).
Proposition 26.14. If [x] ̸= [y] and x ≺ y, then for any u ∈ [x] and any v ∈ [y], u ≺ v.

This means that the blocks themselves can be ordered in a way that respects ≺: [x] ≺ [y] iff x ≺ y, or, equivalently, if u ≺ v for any u ∈ [x] and v ∈ [y].
Clearly, the standard block [0] is the least block. It intersects with no non-
standard block, and no two non-standard blocks intersect either. Specifically,
you cannot “reach” a different block by taking repeated successors or prede-
cessors.
Proof. Exercise.
Proposition 26.19. The ordering of the blocks is dense. That is, if x ≺ y and [ x ] ̸=
[y], then there is a block [z] distinct from both that is between them.
The non-standard blocks are therefore ordered like the rationals: they form
a denumerable dense linear ordering without endpoints. One can show that
any two such denumerable orderings are isomorphic. It follows that for any
two enumerable non-standard models M1 and M2 of true arithmetic, their
reducts to the language containing < and = only are isomorphic. Indeed, an
isomorphism h can be defined as follows: the standard parts of M1 and M2
are isomorphic to the standard model N and hence to each other. The blocks
making up the non-standard part are themselves ordered like the rationals
and therefore isomorphic; an isomorphism of the blocks can be extended to
an isomorphism within the blocks by matching up arbitrary elements in each,
and then taking the image of the successor of x in M1 to be the successor of the
image of x in M2 . Note that it does not follow that M1 and M2 are isomorphic
in the full language of arithmetic (indeed, isomorphism is always relative to
a language), as there are non-isomorphic ways to define addition and multi-
plication over |M1 | and |M2 |. (This also follows from a famous theorem due
to Vaught that the number of countable models of a complete theory cannot
be 2.)
Example 26.21. Recall the structure K from Example 26.8. Its domain was
|K| = N ∪ { a} and interpretations
0K = 0
′K(x) = x + 1   if x ∈ N,
        a       if x = a
+K(x, y) = x + y   if x, y ∈ N,
           a       otherwise
×K(x, y) = xy   if x, y ∈ N,
           0    if x = 0 or y = 0,
           a    otherwise
returns n + 1. But 0 now plays the role of a, which is its own successor. So ′K′(0) = 0. For addition and multiplication we likewise have

+K′(x, y) = x + y − 1   if x, y > 0,
            0           otherwise
×K′(x, y) = 1                if x = 1 or y = 1,
            xy − x − y + 2   if x, y > 1,
            0                otherwise
And we have ⟨x, y⟩ ∈ <K′ iff x < y and x > 0 and y > 0, or if y = 0.

All of these functions are computable functions of natural numbers and <K′ is a decidable relation on N—but they are not the same functions as successor, addition, and multiplication on N, and <K′ is not the same relation as < on N.
Problems
Problem 26.1. Show that the converse of Proposition 26.2 is false, i.e., give
an example of a structure M with |M| = {ValM (n) : n ∈ N} that is not
isomorphic to N.
Problem 26.2. Consider the first three axioms of Q:

∀x ∀y (x′ = y′ → x = y) (Q1)
∀x 0 ̸= x′ (Q2)
∀x (x = 0 ∨ ∃y x = y′) (Q3)

Find structures M1, M2, and M3 such that:
1. M1 ⊨ Q1 , M1 ⊨ Q2 , M1 ⊭ Q3 ;
2. M2 ⊨ Q1 , M2 ⊭ Q2 , M2 ⊨ Q3 ; and
3. M3 ⊭ Q1 , M3 ⊨ Q2 , M3 ⊨ Q3 ;
Obviously, you just have to specify 0Mi and ′Mi for each.
Problem 26.3. Prove that K from Example 26.8 satisfies the remaining axioms of Q,
∀ x ( x × 0) = 0 (Q6 )
∀ x ∀y ( x × y′ ) = (( x × y) + x ) (Q7 )
∀ x ∀y ( x < y ↔ ∃z (z′ + x ) = y) (Q8 )
Problem 26.8. Write out a detailed proof of Proposition 26.19. Which sentence
must PA derive in order to guarantee the existence of z? Why is x ≺ z and z ≺ y,
and why is [ x ] ̸= [z] and [z] ̸= [y]?
The Interpolation Theorem

27.1 Introduction
The interpolation theorem is the following result: Suppose ⊨ φ → ψ. Then
there is a sentence χ such that ⊨ φ → χ and ⊨ χ → ψ. Moreover, every constant
symbol, function symbol, and predicate symbol (other than =) in χ occurs
both in φ and ψ. The sentence χ is called an interpolant of φ and ψ.
The interpolation theorem is interesting in its own right, but its main im-
portance lies in the fact that it can be used to prove results about definability in
a theory, and the conditions under which combining two consistent theories
results in a consistent theory. The first result is known as the Beth definability
theorem; the second, Robinson’s joint consistency theorem.
Lemma 27.2. Suppose L0 is the language containing every constant symbol, function symbol and predicate symbol (other than =) that occurs in both Γ and ∆, and let
L0′ be obtained by the addition of infinitely many new constant symbols cn for n ≥ 0.
Then if Γ and ∆ are inseparable in L0 , they are also inseparable in L0′ .
γ ⊨ χ[c/x ], δ ⊨ ¬χ[c/x ].
Γ ⊨ ∀ x χ, ∆ ⊨ ¬∀ x χ,
Lemma 27.3. Suppose that Γ ∪ {∃ x σ } and ∆ are inseparable, and c is a new con-
stant symbol not in Γ, ∆, or σ. Then Γ ∪ {∃ x σ, σ [c/x ]} and ∆ are also inseparable.
2. c does occur in χ so that χ has the form χ[c/x ]. Then we have that
Γ ∪ {∃ x σ, σ [c/x ]} ⊨ χ[c/x ],
Finally, define:

Γ∗ = ⋃_{n≥0} Γn,    ∆∗ = ⋃_{n≥0} ∆n.
The basis for (1) is given by Lemma 27.2. For part (2), we need to distinguish
three cases:
This completes the basis of the induction for (1) and (2) above. Now for the in-
ductive step. For (1), if ∆ n+1 = ∆ n ∪ {ψn } then Γn+1 and ∆ n+1 are inseparable
by construction (even when ψn is existential, by Lemma 27.3); if ∆ n+1 = ∆ n
(because Γn+1 and ∆ n ∪ {ψn } are separable), then we use the induction hy-
pothesis on (2). For the inductive step for (2), if Γn+2 = Γn+1 ∪ { φn+1 } then
Γn+2 and ∆ n+1 are inseparable by construction (even when φn+1 is existential,
by Lemma 27.3); and if Γn+2 = Γn+1 then we use the inductive case for (1) just
proved. This concludes the induction on (1) and (2).
It follows that Γ ∗ and ∆∗ are inseparable; if not, by compactness, there
is n ≥ 0 that separates Γn and ∆ n , against (1). In particular, Γ ∗ and ∆∗ are
consistent: for if the former or the latter is inconsistent, then they are separated
by ∃ x x ̸= x or ∀ x x = x, respectively.
We now show that Γ∗ is maximally consistent in L1′ and likewise ∆∗ in L2′. For the former, suppose that φn ∉ Γ∗ and ¬φn ∉ Γ∗, for some n ≥ 0. If φn ∉ Γ∗ then Γn ∪ {φn} is separable from ∆n, and so there is χ ∈ L0′ such that both:

Γ∗ ⊨ φn → χ,    ∆∗ ⊨ ¬χ.

Similarly, since ¬φn ∉ Γ∗, there is χ′ ∈ L0′ such that:

Γ∗ ⊨ ¬φn → χ′,    ∆∗ ⊨ ¬χ′.
∆∗ has a model M2′ whose domain |M2′| is given by the interpretations c^{M2′} of the constant symbols.
Let M1 be obtained from M1′ by dropping interpretations for constant symbols, function symbols, and predicate symbols in L1′ \ L0′, and similarly for M2. Then the map h : M1 → M2 defined by h(c^{M1′}) = c^{M2′} is an isomorphism in L0′, because Γ∗ ∩ ∆∗ is maximally consistent in L0′, as shown. This follows because any L0′-sentence either belongs to both Γ∗ and ∆∗, or to neither: so c^{M1′} ∈ P^{M1′} if and only if P(c) ∈ Γ∗ if and only if P(c) ∈ ∆∗ if and only if c^{M2′} ∈ P^{M2′}. The other conditions satisfied by isomorphisms can be established similarly.
Let us now define a model M for the language L1 ∪ L2 as follows:
1. The domain |M| is just |M2′|, i.e., the set of all elements c^{M2′};

2. If a predicate symbol P is in L2 \ L1 then P^M = P^{M2′};

3. If a predicate symbol P is in L1 \ L2 then P^M = h(P^{M1′}), i.e., ⟨c1^{M2′}, . . . , cn^{M2′}⟩ ∈ P^M if and only if ⟨c1^{M1′}, . . . , cn^{M1′}⟩ ∈ P^{M1′};

4. If a predicate symbol P is in L0 then P^M = P^{M2′} = h(P^{M1′}).
5. Function symbols of L1 ∪ L2 , including constant symbols, are handled
similarly.
Finally, one shows by induction on formulas that M agrees with M1′ on all
formulas of L1′ and with M2′ on all formulas of L2′ . In particular, M ⊨ Γ ∗ ∪ ∆∗ ,
whence M ⊨ φ and M ⊨ ¬ψ, and ̸⊨ φ → ψ. This concludes the proof of Craig’s
Interpolation Theorem.
Σ( P) ⊨ ∀ x1 . . . ∀ xn ( P( x1 , . . . , xn ) ↔ χ( x1 , . . . , xn )).
Σ( P) ∪ Σ( P′ ) ⊨ ∀ x1 . . . ∀ xn ( P( x1 , . . . , xn ) ↔ P′ ( x1 , . . . , xn )),
Σ(P) ⊨ ∀x1 . . . ∀xn (P(x1, . . . , xn) ↔ χ(x1, . . . , xn))
Σ(P′) ⊨ ∀x1 . . . ∀xn (P′(x1, . . . , xn) ↔ χ(x1, . . . , xn))
and the conclusion follows. For the converse: assume that Σ( P) implicitly
defines P. First, we add constant symbols c1 , . . . , cn to L. Then
Σ ( P ) ∪ Σ ( P ′ ) ⊨ P ( c1 , . . . , c n ) → P ′ ( c1 , . . . , c n ).
∆ 0 ∪ ∆ 1 ⊨ P ( c1 , . . . , c n ) → P ′ ( c1 , . . . , c n ).
θ ( P ) ∧ P ( c1 , . . . , c n ) ⊨ χ ( c1 , . . . , c n ); χ ( c1 , . . . , c n ) ⊨ θ ( P ′ ) → P ′ ( c1 , . . . , c n ).
θ ( P ) ⊨ χ ( c1 , . . . , c n ) → P ( c1 , . . . , c n ).
Σ( P) ⊨ ∀ x1 . . . ∀ xn ( P( x1 , . . . , xn ) ↔ χ( x1 , . . . , xn )).
Lindström’s Theorem
28.1 Introduction
In this chapter we aim to prove Lindström’s characterization of first-order
logic as the maximal logic for which (given certain further constraints) the
Compactness and the Downward Löwenheim-Skolem theorems hold (Theo-
rem 23.23 and Theorem 23.32). First, we need a more general characterization
of the general class of logics to which the theorem applies. We will restrict
ourselves to relational languages, i.e., languages which only contain predicate
symbols and individual constants, but no function symbols.
Notice that we are still employing the same notion of structure for a given
language as for first-order logic, but we do not presuppose that sentences are
built up from the basic symbols in L in the usual way, nor that the relation |=L is recursively defined in the same way as for first-order logic. So for instance the definition, being completely general, is intended to capture the case where sentences in ⟨L, |=L⟩ contain infinitely long conjunctions or disjunctions,
or quantifiers other than ∃ and ∀ (e.g., “there are infinitely many x such that
. . . ”), or perhaps infinitely long quantifier prefixes. To emphasize that “sen-
tences” in L(L) need not be ordinary sentences of first-order logic, in this
chapter we use variables α, β, . . . to range over them, and reserve φ, ψ, . . . for
ordinary first-order formulas.
Definition 28.2. Let Mod L (α) denote the class {M : M |= L α}. If the language
needs to be made explicit, we write ModL L ( α ). Two structures M and N for L
are elementarily equivalent in ⟨ L, |= L ⟩, written M ≡ L N, if the same sentences
from L(L) are true in each.
Remark 5. First-order logic, i.e., the abstract logic ⟨ F, |=⟩, is normal. In fact,
the above properties are mostly straightforward for first-order logic. We just
remark that the expansion property comes down to extensionality, and that
the relativization of a sentence α to R( x, c1 , . . . , cn ) is obtained by replacing
each subformula ∀ x β by ∀ x ( R( x, c1 , . . . , cn ) → β). Moreover, if ⟨ L, |= L ⟩ is
normal, then ⟨F, |=⟩ ≤ ⟨L, |=L⟩, as can be shown by induction on first-
order formulas. Accordingly, with no loss in generality, we can assume that
every first-order sentence belongs to every normal logic.
|M|<ω is the set of finite sequences of elements of |M|. Let S be the ternary
relation over |M|<ω representing concatenation, i.e., if a, b, c ∈ |M|<ω then
S(a, b, c) holds if and only if c is the concatenation of a and b; and let T be the
ternary relation such that T(a, b, c) holds for b ∈ |M| and a, c ∈ |M|<ω if and only if a = a1, . . . , an and c = a1, . . . , an, b. Pick new 3-place predicate symbols
P and Q and form the structure M∗ having the universe |M| ∪ |M|<ω , having
M as a substructure, and interpreting P and Q by the concatenation relations
S and T (so M∗ is in the language L ∪ { P, Q}).
Define |N|<ω , S′ , T ′ , P′ , Q′ and N∗ analogously. Since by hypothesis M ≃ p
N, there is a relation I between |M|<ω and |N|<ω such that I (a, b) holds if
and only if a and b are isomorphic and satisfy the back-and-forth condition of
Definition 25.15. Now, let M be the structure whose domain is the union of the
domains of M∗ and N∗ , having M∗ and N∗ as substructures, in the language
with one extra binary predicate symbol R interpreted by the relation I and
predicate symbols denoting the domains |M∗| and |N∗|.
[Figure: the combined structure, with M∗ and N∗ as substructures and the relation I linking |M|<ω and |N|<ω.]
Proof. Let n be such that any two n-equivalent structures M and N agree on
the value assigned to α. Recall Proposition 25.19: there are only finitely many
first-order sentences in a finite language that have quantifier rank no greater
than n, up to logical equivalence. Now, for each fixed structure M let θM be
the conjunction of all first-order sentences α true in M with qr(α) ≤ n (this
conjunction is finite), so that N |= θM if and only if N ≡n M. Then put θ = ⋁{θM : M |=L α}; this disjunction is also finite (up to logical equivalence).
Proof. By Lemma 28.8, it suffices to show that for any α ∈ L(L), with L finite,
there is n ∈ N such that for any two structures M and N: if M ≡n N then M
and N agree on α. For then α is equivalent to a first-order sentence, from which
⟨ L, |= L ⟩ ≤ ⟨ F, |=⟩ follows. Since we are working in a finite, purely relational
language, by Theorem 25.23 we can replace the statement that M ≡n N by the
corresponding algebraic statement that In (∅, ∅).
Given α, suppose towards a contradiction that for each n there are struc-
tures Mn and Nn such that In (∅, ∅), but (say) Mn |= L α whereas Nn ̸|= L α. By
the Isomorphism Property we can assume that all the Mn ’s interpret the con-
stants of the language by the same objects; furthermore, since there are only
finitely many atomic sentences in the language, we may also assume that they
satisfy the same atomic sentences (we can take a subsequence of the M’s oth-
erwise). Let M be the union of all the Mn ’s, i.e., the unique minimal structure
having each Mn as a substructure. As in the proof of Theorem 28.7, let M∗
be the extension of M with domain |M| ∪ |M|<ω , in the expanded language
comprising the concatenation predicates P and Q.
Similarly, define Nn , N and N∗ . Now let M be the structure whose domain
comprises the domains of M∗ and N∗ as well as the natural numbers N along
with their natural ordering ≤, in the language with extra predicates represent-
ing the domains |M|, |N|, |M|<ω and |N|<ω as well as predicates coding the
domains of Mn and Nn in the sense that:
such that Mn |= α, Nn ̸|= α, and for each n in the ordering, J (n, a, b) holds if
and only if In (a, b).
Using the Compactness Property, we can find a model M∗ of θ in which
the ordering contains a non-standard element n∗ . In particular then M∗ will
contain substructures Mn∗ and Nn∗ such that Mn∗ |= L α and Nn∗ ̸|= L α. But
now we can define a set I of pairs of k-tuples from |Mn∗ | and |Nn∗ | by putting
⟨a, b⟩ ∈ I if and only if J (n∗ − k, a, b), where k is the length of a and b. Since
n∗ is non-standard, for each standard k we have that n∗ − k > 0, and the set I
witnesses the fact that Mn∗ ≃ p Nn∗ . But by Theorem 28.7, Mn∗ is L-equivalent
to Nn∗ , a contradiction.
Computability
Recursive Functions
29.1 Introduction
In order to develop a mathematical theory of computability, one has to, first
of all, develop a model of computability. We now think of computability as the
kind of thing that computers do, and computers work with symbols. But at
the beginning of the development of theories of computability, the paradig-
matic example of computation was numerical computation. Mathematicians
were always interested in number-theoretic functions, i.e., functions f : Nn →
N that can be computed. So it is not surprising that at the beginning of the
theory of computability, it was such functions that were studied. The most
familiar examples of computable numerical functions, such as addition, mul-
tiplication, exponentiation (of natural numbers) share an interesting feature:
they can be defined recursively. It is thus quite natural to attempt a general
definition of computable function on the basis of recursive definitions. Among
the many possible ways to define number-theoretic functions recursively, one
particularly simple pattern of definition here becomes central: so-called prim-
itive recursion.
In addition to computable functions, we might be interested in computable
sets and relations. A set is computable if we can compute the answer to
whether or not a given number is an element of the set, and a relation is com-
putable iff we can compute whether or not a tuple ⟨n1 , . . . , nk ⟩ is an element
of the relation. By considering the characteristic function of a set or relation, discussion of computable sets and relations can be subsumed under that of computable functions.

29.2 Primitive Recursion
h (0) = 1
h ( x + 1) = 2 · h ( x )
If we already know how to multiply, then these equations give us the infor-
mation required for (a) and (b) above. By successively applying the second
equation, we get that
h(1) = 2 · h(0) = 2,
h(2) = 2 · h(1) = 2 · 2,
h(3) = 2 · h(2) = 2 · 2 · 2,
..
.
add( x, 0) = x
add( x, y + 1) = add( x, y) + 1
These equations specify the value of add for all x and y. To find add(2, 3), for
instance, we apply the defining equations for x = 2, using the first to find
add(2, 0) = 2, then using the second to successively find add(2, 1) = 2 + 1 =
3, add(2, 2) = 3 + 1 = 4, add(2, 3) = 4 + 1 = 5.
In the definition of add we used + on the right-hand-side of the second
equation, but only to add 1. In other words, we used the successor func-
tion succ(z) = z + 1 and applied it to the previous value add( x, y) to define
add( x, y + 1). So we can think of the recursive definition as given in terms of
a single function which we apply to the previous value. However, it doesn’t
hurt—and sometimes is necessary—to allow the function to depend not just
on the previous value but also on x and y. Consider:
mult( x, 0) = 0
mult( x, y + 1) = add(mult( x, y), x )
mult(2, 0) = 0
mult(2, 1) = mult(2, 0 + 1) = add(mult(2, 0), 2) = add(0, 2) = 2
mult(2, 2) = mult(2, 1 + 1) = add(mult(2, 1), 2) = add(2, 2) = 4
mult(2, 3) = mult(2, 2 + 1) = add(mult(2, 2), 2) = add(4, 2) = 6
h ( x 0 , . . . , x k −1 , 0 ) = f ( x 0 , . . . , x k −1 )
h( x0 , . . . , xk−1 , y + 1) = g( x0 , . . . , xk−1 , y, h( x0 , . . . , xk−1 , y))
add( x0 , 0) = f ( x0 ) = x0
add( x0 , y + 1) = g( x0 , y, add( x0 , y)) = succ(add( x0 , y))
mult( x0 , 0) = f ( x0 ) = 0
mult( x0 , y + 1) = g( x0 , y, mult( x0 , y)) = add(mult( x0 , y), x0 )
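Read computationally, the schema says that h(⃗x, y) is obtained by starting from f(⃗x) and applying g a total of y times. Here is a minimal Python sketch of that reading; the helper name primitive_recursion is ours:

def primitive_recursion(f, g):
    """Return h with h(xs, 0) = f(xs) and h(xs, y+1) = g(xs, y, h(xs, y))."""
    def h(*args):
        *xs, y = args
        acc = f(*xs)
        for i in range(y):          # unwind the recursion into a loop
            acc = g(*xs, i, acc)
        return acc
    return h

succ = lambda z: z + 1
add = primitive_recursion(lambda x: x, lambda x, y, r: succ(r))
mult = primitive_recursion(lambda x: 0, lambda x, y, r: add(x, r))

assert add(2, 3) == 5 and mult(2, 3) == 6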
29.3 Composition
If f and g are two one-place functions of natural numbers, we can compose
them: h( x ) = g( f ( x )). The new function h( x ) is then defined by composition
from the functions f and g. We’d like to generalize this to functions of more
than one argument.
Here’s one way of doing this: suppose f is a k-place function, and g0 , . . . ,
gk−1 are k functions which are all n-place. Then we can define a new n-place
function h as follows:
Pin ( x0 , . . . , xn−1 ) = xi
The functions Pin are called projection functions: Pin is an n-place function. Then
g can be defined by
g( x, y, z) = succ( P23 ( x, y, z)).
Here the role of f is played by the 1-place function succ, so k = 1. And we
have one 3-place function P23 which plays the role of g0 . The result is a 3-place
function that returns the successor of the third argument.
The projection functions also allow us to define new functions by reorder-
ing or identifying arguments. For instance, the function h( x ) = add( x, x ) can
be defined by
h( x0 ) = add( P01 ( x0 ), P01 ( x0 )).
Here k = 2, n = 1, the role of f (y0 , y1 ) is played by add, and the roles of g0 ( x0 )
and g1 ( x0 ) are both played by P01 ( x0 ), the one-place projection function (aka
the identity function).
If f (y0 , y1 ) is a function we already have, we can define the function h( x0 , x1 ) =
f ( x1 , x0 ) by
h( x0 , x1 ) = f ( P12 ( x0 , x1 ), P02 ( x0 , x1 )).
Here k = 2, n = 2, and the roles of g0 and g1 are played by P12 and P02 , respec-
tively.
You may also worry that g0 , . . . , gk−1 are all required to have the same
arity n. (Remember that the arity of a function is the number of arguments;
an n-place function has arity n.) But adding the projection functions provides
the desired flexibility. For example, suppose f and g are 3-place functions and
h is the 2-place function defined by
h( x, y) = f ( x, g( x, x, y), y).
The definition of h can be rewritten with the projection functions, as
h( x, y) = f ( P02 ( x, y), g( P02 ( x, y), P02 ( x, y), P12 ( x, y)), P12 ( x, y)).
Then h is the composition of f with P02 , l, and P12 , where
l ( x, y) = g( P02 ( x, y), P02 ( x, y), P12 ( x, y)),
i.e., l is the composition of g with P02 , P02 , and P12 .
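A short Python sketch may help make the role of the projection functions vivid. The helpers proj and comp and the sample 3-place functions f and g below are ours, chosen only to mirror the rewriting of h just given:

def proj(n, i):
    """The n-place projection P_i^n returning its i-th argument."""
    return lambda *xs: xs[i]

def comp(f, *gs):
    """Composition: h(xs) = f(g0(xs), ..., g_{k-1}(xs))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

f = lambda a, b, c: a + b + c        # any 3-place function
g = lambda a, b, c: a * b * c        # any 3-place function

l = comp(g, proj(2, 0), proj(2, 0), proj(2, 1))
h = comp(f, proj(2, 0), l, proj(2, 1))

assert h(2, 3) == f(2, g(2, 2, 3), 3)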
Pin ( x0 , . . . , xn−1 ) = xi ,
for each natural number n and i < n, we will include among the primitive
recursive functions the function zero( x ) = 0.
Definition 29.3. The set of primitive recursive functions is the set of functions
from Nn to N, defined inductively by the following clauses:
Put more concisely, the set of primitive recursive functions is the smallest
set containing zero, succ, and the projection functions Pjn , and which is closed
under composition and primitive recursion.
Another way of describing the set of primitive recursive functions is by
defining it in terms of “stages.” Let S0 denote the set of starting functions:
zero, succ, and the projections. These are the primitive recursive functions of
stage 0. Once a stage Si has been defined, let Si+1 be the set of all functions
you get by applying a single instance of composition or primitive recursion to
functions already in Si . Then
S = ⋃_{i∈N} Si
add( x0 , 0) = f ( x0 ) = x0
add( x0 , y + 1) = g( x0 , y, add( x0 , y)) = succ(add( x0 , y))
Since succ and P23 count as primitive recursive functions, g does as well, since
it can be defined by composition from primitive recursive functions.
Proof. Exercise.
Example 29.6. Here’s our very first example of a primitive recursive defini-
tion:
h (0) = 1
h ( y + 1) = 2 · h ( y ).
This function cannot fit into the form required by Definition 29.1, since k = 0.
The definition also involves the constants 1 and 2. To get around the first
problem, let’s introduce a dummy argument and define the function h′ :
h ′ ( x0 , 0) = f ( x0 ) = 1
h′ ( x0 , y + 1) = g( x0 , y, h′ ( x0 , y)) = 2 · h′ ( x0 , y).
g( x0 , y, z) = g′ ( P23 ( x0 , y, z))
and
add( x0 , 0) = P01 ( x0 ) = x0
add( x0 , y + 1) = succ( P23 ( x0 , y, add( x0 , y))) = add( x0 , y) + 1
Here the role of f is played by P01 , and the role of g is played by succ( P23 ( x0 , y, z)),
which is assigned the notation Comp1,3 [succ, P23 ] as it is the result of defining
a function by composition from the 1-ary function succ and the 3-ary func-
tion P23 . With this setup, we can denote the addition function by
Having these notations sometimes proves useful, e.g., when enumerating prim-
itive recursive functions.
h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y))
and suppose the functions f and g are computable. (We use ⃗x to abbreviate x0 ,
. . . , xk−1 .) Then h(⃗x, 0) can obviously be computed, since it is just f (⃗x ) which
we assume is computable. h(⃗x, 1) can then also be computed, since 1 = 0 + 1
and so h(⃗x, 1) is just g(⃗x, 0, h(⃗x, 0)).
Thus, to compute h(⃗x, y) in general, successively compute h(⃗x, 0), h(⃗x, 1), . . . ,
until we reach h(⃗x, y).
Thus, a primitive recursive definition yields a new computable function if
the functions f and g are computable. Composition of functions also results
in a computable function if the functions f and gi are computable.
Since the basic functions zero, succ, and Pin are computable, and compo-
sition and primitive recursion yield computable functions from computable
functions, this means that every primitive recursive function is computable.
exp( x, 0) = 1
exp( x, y + 1) = mult( x, exp( x, y)).
exp( x, 0) = f ( x )
exp( x, y + 1) = g( x, y, exp( x, y)).
where
f ( x ) = succ(zero( x )) = 1
g( x, y, z) = mult( P03 ( x, y, z), P23 ( x, y, z)) = x · z
is primitive recursive.
pred(0) = 0 and
pred(y + 1) = y.
This is almost a primitive recursive definition. It does not, strictly speaking, fit
into the pattern of definition by primitive recursion, since that pattern requires
at least one extra argument x. It is also odd in that it does not actually use
pred(y) in the definition of pred(y + 1). But we can first define pred′ ( x, y) by
pred′ ( x, 0) = zero( x ) = 0,
pred′ ( x, y + 1) = P13 ( x, y, pred′ ( x, y)) = y.
and then define pred from it by composition, e.g., as pred( x ) = pred′ (zero( x ), P01 ( x )).
fac(0) = 1
fac(y + 1) = fac(y) · (y + 1).
h( x, 0) = const1 ( x )
h( x, y + 1) = g( x, y, h( x, y))
where g( x, y, z) = mult( P23 ( x, y, z), succ( P13 ( x, y, z))) and then let
From now on we’ll be a bit more laissez-faire and not give the official defini-
tions by composition and primitive recursion.
is primitive recursive.
Proof. We have:
x −̇ 0 = x
x −̇ (y + 1) = pred( x −̇ y)
max( x, y) = x + (y −̇ x ).
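As a sanity check, here is a direct Python transcription of pred, truncated subtraction, and max as just defined, with the primitive recursion unwound into a loop (the function names monus and maximum are ours):

def pred(x):
    # pred(0) = 0, pred(y + 1) = y
    return 0 if x == 0 else x - 1

def monus(x, y):
    # x -. 0 = x,  x -. (y + 1) = pred(x -. y)
    acc = x
    for _ in range(y):
        acc = pred(acc)
    return acc

def maximum(x, y):
    return x + monus(y, x)

assert monus(3, 5) == 0 and monus(5, 3) == 2 and maximum(2, 7) == 7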
Proof. Exercise.
Proposition 29.14. The set of primitive recursive functions is closed under the fol-
lowing two operations:
Proof. For example, finite sums are defined recursively by the equations
g(⃗x, 0) = f (⃗x, 0)
g(⃗x, y + 1) = g(⃗x, y) + f (⃗x, y + 1).
χIsZero (0) = 1,
χIsZero ( x + 1) = 0.
It should be clear that one can compose relations with other primitive re-
cursive functions. So the following are also primitive recursive:
Proposition 29.16. The set of primitive recursive relations is closed under Boolean
operations, that is, if P(⃗x ) and Q(⃗x ) are primitive recursive, so are
1. ¬ P(⃗x )
2. P(⃗x ) ∧ Q(⃗x )
3. P(⃗x ) ∨ Q(⃗x )
4. P(⃗x ) → Q(⃗x )
Proof. Suppose P(⃗x ) and Q(⃗x ) are primitive recursive, i.e., their characteristic
functions χ P and χQ are. We have to show that the characteristic functions of
¬ P(⃗x ), etc., are also primitive recursive.
χ¬P(⃗x) = 0 if χP(⃗x) = 1;   1 otherwise.
We can define χ P∧Q (⃗x ) as χ P (⃗x ) · χQ (⃗x ) or as min(χ P (⃗x ), χQ (⃗x )). Similarly,
Proposition 29.17. The set of primitive recursive relations is closed under bounded
quantification, i.e., if R(⃗x, z) is a primitive recursive relation, then so are the relations
(∀z < y) R(⃗x, z) holds of ⃗x and y if and only if R(⃗x, z) holds for every z less than y,
and similarly for (∃z < y) R(⃗x, z).
Proof. By convention, we take (∀z < 0) R(⃗x, z) to be true (for the trivial reason
that there are no z less than 0) and (∃z < 0) R(⃗x, z) to be false. A bounded
universal quantifier functions just like a finite product or iterated minimum,
i.e., if P(⃗x, y) ⇔ (∀z < y) R(⃗x, z) then χ P (⃗x, y) can be defined by
χ P (⃗x, 0) = 1
χP(⃗x, y + 1) = min(χP(⃗x, y), χR(⃗x, y)).
cond(0, y, z) = y,
cond( x + 1, y, z) = z.
One can use this to justify definitions of primitive recursive functions by cases
from primitive recursive relations:
Proposition 29.18. If g0 (⃗x ), . . . , gm (⃗x ) are primitive recursive functions, and R0 (⃗x ),
. . . , Rm−1 (⃗x ) are primitive recursive relations, then the function f defined by
f(⃗x) = g0(⃗x)      if R0(⃗x),
        g1(⃗x)      if R1(⃗x) and not R0(⃗x),
        ...
        gm−1(⃗x)    if Rm−1(⃗x) and none of the previous hold,
        gm(⃗x)      otherwise

is also primitive recursive.
For m greater than 1, one can just compose definitions of this form.
Proof. Note than there can be no z < 0 such that R(⃗x, z) since there is no z < 0
at all. So m R (⃗x, 0) = 0.
In case the bound is of the form y + 1 we have three cases:
1. There is a z < y such that R(⃗x, z), in which case m R (⃗x, y + 1) = m R (⃗x, y).
2. There is no such z < y but R(⃗x, y) holds, then m R (⃗x, y + 1) = y.
3. There is no z < y + 1 such that R(⃗x, z), in which case mR(⃗x, y + 1) = y + 1.

So we can define mR(⃗x, y) by primitive recursion as follows:

mR(⃗x, 0) = 0
mR(⃗x, y + 1) = mR(⃗x, y)   if mR(⃗x, y) ̸= y,
               y            if mR(⃗x, y) = y and R(⃗x, y),
               y + 1        otherwise.
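The three-case recursion for mR translates directly into a loop. Here is a small Python sketch; bounded_min is our name for mR, and the sample relation is only an illustration:

def bounded_min(R, xs, y):
    """Least z < y with R(xs, z); returns y if there is no such z."""
    m = 0                       # m_R(xs, 0) = 0
    for i in range(y):          # compute m_R(xs, i + 1) from m_R(xs, i)
        if m != i:
            pass                # a witness below i was already found
        elif R(*xs, i):
            m = i               # i itself is the least witness
        else:
            m = i + 1           # no witness below i + 1
    return m

# least z < 10 with z * z >= 20:
assert bounded_min(lambda x, z: z * z >= x, (20,), 10) == 5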
29.10 Primes
Bounded quantification and bounded minimization provide us with a good
deal of machinery to show that natural functions and relations are primitive
recursive. For example, consider the relation “x divides y”, written x | y. The
relation x | y holds if division of y by x is possible without remainder, i.e.,
if y is an integer multiple of x. (If it doesn’t hold, i.e., the remainder when
dividing y by x is > 0, we write x ∤ y.) In other words, x | y iff for some z,
x · z = y. Obviously, any such z, if it exists, must be ≤ y. So, we have that
x | y iff for some z ≤ y, x · z = y. We can define the relation x | y by bounded
existential quantification from = and multiplication by
x | y ⇔ (∃z ≤ y) ( x · z) = y.
Prime( x ) ⇔ x ≥ 2 ∧ (∀y ≤ x ) (y | x → y = 1 ∨ y = x )
p (0) = 2
p( x + 1) = nextPrime( p( x ))
Since nextPrime( x ) is the least y such that y > x and y is prime, it can be
easily computed by unbounded search. But it can also be defined by bounded
minimization, thanks to a result due to Euclid: there is always a prime number
between x and x ! + 1.
This shows that nextPrime(x) and hence p(x) are (not just computable but)
primitive recursive.
(If you’re curious, here’s a quick proof of Euclid’s theorem. Suppose pn
is the largest prime ≤ x and consider the product p = p0 · p1 · · · · · pn of all
primes ≤ x. Either p + 1 is prime or there is a prime between x and p + 1.
Why? Suppose p + 1 is not prime. Then some prime number q | p + 1 where
q < p + 1. None of the primes ≤ x divide p + 1. (By definition of p, each
of the primes pi ≤ x divides p, i.e., with remainder 0. So, each of the primes
pi ≤ x divides p + 1 with remainder 1, and so pi ∤ p + 1.) Hence, q is a prime
> x and < p + 1. And p ≤ x !, so there is a prime > x and ≤ x ! + 1.)
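Putting divisibility, Prime, and the factorial bound from Euclid's theorem together gives a deliberately naive Python sketch of nextPrime and p; all names are ours, and efficiency is beside the point:

from math import factorial

def divides(x, y):
    # x | y iff for some z <= y, x * z = y
    return any(x * z == y for z in range(y + 1))

def is_prime(x):
    return x >= 2 and all(y in (1, x) for y in range(1, x + 1) if divides(y, x))

def next_prime(x):
    # bounded search: some prime q satisfies x < q <= x! + 1
    bound = factorial(x) + 1
    return next(y for y in range(x + 1, bound + 1) if is_prime(y))

def p(n):
    # p(0) = 2, p(n + 1) = nextPrime(p(n))
    q = 2
    for _ in range(n):
        q = next_prime(q)
    return q

assert [p(i) for i in range(5)] == [2, 3, 5, 7, 11]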
29.11 Sequences
The set of primitive recursive functions is remarkably robust. But we will be
able to do even more once we have developed an adequate means of handling
sequences. We will identify finite sequences of natural numbers with natural
numbers in the following way: the sequence ⟨ a0 , a1 , a2 , . . . , ak ⟩ corresponds to
the number
p_0^{a_0+1} · p_1^{a_1+1} · p_2^{a_2+1} · · · · · p_k^{a_k+1}.
We add one to the exponents to guarantee that, for example, the sequences
⟨2, 7, 3⟩ and ⟨2, 7, 3, 0, 0⟩ have distinct numeric codes. We can take both 0 and 1
to code the empty sequence; for concreteness, let Λ denote 0.
The reason that this coding of sequences works is the so-called Fundamen-
tal Theorem of Arithmetic: every natural number n ≥ 2 can be written in one
and only one way in the form
n = p_0^{a_0} · p_1^{a_1} · · · · · p_k^{a_k}
Proposition 29.20. The function len(s), which returns the length of the sequence s,
is primitive recursive.
We can use bounded minimization, since there is only one i that satisfies R(s, i )
when s is a code of a sequence, and if i exists it is less than s itself.
Proposition 29.21. The function append(s, a), which returns the result of append-
ing a to the sequence s, is primitive recursive.
Proposition 29.22. The function element(s, i ), which returns the ith element of s
(where the initial element is called the 0th), or 0 if i is greater than or equal to the
length of s, is primitive recursive.
Proof. Note that a is the ith element of s iff p_i^{a+1} is the largest power of p_i that divides s, i.e., p_i^{a+1} | s but p_i^{a+2} ∤ s. So:

element(s, i) = 0 if i ≥ len(s);   (min a < s) (p_i^{a+2} ∤ s) otherwise.
Instead of using the official names for the functions defined above, we
introduce a more compact notation. We will use (s)i instead of element(s, i ),
and ⟨s0 , . . . , sk ⟩ to abbreviate
append(append(. . . append(Λ, s0 ) . . . ), sk ).
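Here is a small Python sketch of the prime-power coding and of len, element, and append; we use ordinary unbounded loops where the text uses bounded minimization, and all names are ours:

def nth_prime(i):
    """p_i with p_0 = 2, p_1 = 3, ... (naive trial division)."""
    count, n = -1, 1
    while count < i:
        n += 1
        if all(n % d for d in range(2, n)):
            count += 1
    return n

EMPTY = 0    # Lambda, the code of the empty sequence

def code(seq):
    if not seq:
        return EMPTY
    s = 1
    for i, a in enumerate(seq):
        s *= nth_prime(i) ** (a + 1)
    return s

def length(s):
    if s in (0, 1):
        return 0
    i = 0
    while s % nth_prime(i) == 0:
        i += 1
    return i

def element(s, i):
    if i >= length(s):
        return 0
    a = 0
    while s % nth_prime(i) ** (a + 2) == 0:   # p_i^(a+1) | s but p_i^(a+2) does not
        a += 1
    return a

def append(s, a):
    return code([element(s, i) for i in range(length(s))] + [a])

assert length(code([2, 7, 3])) == 3
assert element(code([2, 7, 3]), 1) == 7
assert append(code([2, 7]), 3) == code([2, 7, 3])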
Proposition 29.23. The function concat(s, t), which concatenates two sequences, is
primitive recursive.
concat(⟨ a0 , . . . , ak ⟩, ⟨b0 , . . . , bl ⟩) = ⟨ a0 , . . . , ak , b0 , . . . , bl ⟩.
hconcat(s, t, 0) = s
hconcat(s, t, n + 1) = append(hconcat(s, t, n), (t)n )
then the numeric code of the sequence s described above is at most sequenceBound( x, k).
Having such a bound on sequences gives us a way of defining new func-
tions using bounded search. For example, we can define concat using bounded
search. All we need to do is write down a primitive recursive specification of
the object (number of the concatenated sequence) we are looking for, and a
bound on how far to look. The following works:
Proof. Exercise.
29.12 Trees
Sometimes it is useful to represent trees as natural numbers, just like we can
represent sequences by numbers and properties of and operations on them by
primitive recursive relations and functions on their codes. We’ll use sequences
and their codes to do this. A tree can be either a single node (possibly with a
label) or else a node (possibly with a label) connected to a number of subtrees.
The node is called the root of the tree, and the subtrees it is connected to its
immediate subtrees.
We code trees recursively as a sequence ⟨k, d1 , . . . , dk ⟩, where k is the num-
ber of immediate subtrees and d1 , . . . , dk the codes of the immediate subtrees.
If the nodes have labels, they can be included after the immediate subtrees. So
a tree consisting just of a single node with label l would be coded by ⟨0, l ⟩, and
a tree consisting of a root (labelled l1 ) connected to two single nodes (labelled
l2 , l3 ) would be coded by ⟨2, ⟨0, l2 ⟩, ⟨0, l3 ⟩, l1 ⟩.
Proposition 29.25. The function SubtreeSeq(t), which returns the code of a se-
quence the elements of which are the codes of all subtrees of the tree with code t, is
primitive recursive.
g(s, 0) = f ((s)0 )
g(s, k + 1) = g(s, k) ⌢ f ((s)k+1 )
For instance, if s is a sequence of trees, then h(s) = gISubtrees (s, len(s)) gives
the sequence of the immediate subtrees of the elements of s. We can use it to
define hSubtreeSeq by
hSubtreeSeq(t, 0) = ⟨t⟩
hSubtreeSeq(t, n + 1) = hSubtreeSeq(t, n) ⌢ h(hSubtreeSeq(t, n)).
The maximum level of subtrees in a tree coded by t, i.e., the maximum dis-
tance between the root and a leaf node, is bounded by the code t. So a se-
quence of codes of all subtrees of the tree coded by t is given by hSubtreeSeq(t, t).
h0 (⃗x, 0) = f 0 (⃗x )
h1 (⃗x, 0) = f 1 (⃗x )
h0 (⃗x, y + 1) = g0 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))
h1 (⃗x, y + 1) = g1 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))
h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, ⟨h(⃗x, 0), . . . , h(⃗x, y)⟩).
with the understanding that the last argument to g is just the empty sequence
when y is 0. In either formulation, the idea is that in computing the “successor
step,” the function h can make use of the entire sequence of values computed
so far. This is known as a course-of-values recursion. For a particular example,
it can be used to justify the following type of definition:
h(⃗x, y) = g(⃗x, y, h(⃗x, k(⃗x, y)))   if k(⃗x, y) < y,
           f(⃗x)                      otherwise
h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(k(⃗x ), y))
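Course-of-values recursion is easy to mimic directly: keep the whole list of values computed so far and let the step function look at it. A minimal Python sketch, using the Fibonacci function as our own illustration (it is not an example from the text):

def fib(y):
    values = []                  # plays the role of <h(0), ..., h(y-1)>
    for i in range(y + 1):
        if i < 2:
            values.append(1)
        else:
            values.append(values[i - 1] + values[i - 2])
    return values[y]

assert [fib(i) for i in range(7)] == [1, 1, 2, 3, 5, 8, 13]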
h(x) = g(x, x) + 1 = f_x(x) + 1.
g0(x) = x + 1
gn+1(x) = gn^x(x)   (i.e., gn iterated x times, applied to x)
You can confirm that each function gn is primitive recursive. Each successive
function grows much faster than the one before; g1 ( x ) is equal to 2x, g2 ( x ) is
equal to 2^x · x, and g3(x) grows roughly like an exponential stack of x 2's. The
Ackermann–Péter function is essentially the function G ( x ) = gx ( x ), and one
can show that this grows faster than any primitive recursive function.
Let us return to the issue of enumerating the primitive recursive functions.
Remember that we have assigned symbolic notations to each primitive recur-
sive function; so it suffices to enumerate notations. We can assign a natural
number #( F ) to each notation F, recursively, as follows:
#(0) = ⟨0⟩
#( S ) = ⟨1⟩
#( Pin ) = ⟨2, n, i ⟩
#(Compk,l [ H, G0 , . . . , Gk−1 ]) = ⟨3, k, l, #( H ), #( G0 ), . . . , #( Gk−1 )⟩
#(Recl [ G, H ]) = ⟨4, l, #( G ), #( H )⟩
Here we are using the fact that every sequence of numbers can be viewed as
a natural number, using the codes from the last section. The upshot is that
every code is assigned a natural number. Of course, some sequences (and
hence some numbers) do not correspond to notations; but we can let f i be the
unary primitive recursive function with notation coded as i, if i codes such a
notation; and the constant 0 function otherwise. The net result is that we have
an explicit way of enumerating the unary primitive recursive functions.
(In fact, some functions, like the constant zero function, will appear more
than once on the list. This is not just an artifact of our coding, but also a result
of the fact that the constant zero function has more than one notation. We will
later see that one can not computably avoid these repetitions; for example,
there is no computable function that decides whether or not a given notation
represents the constant zero function.)
We can now take the function g( x, y) to be given by f x (y), where f x refers
to the enumeration we have just described. How do we know that g( x, y) is
computable? Intuitively, this is clear: to compute g( x, y), first “unpack” x, and
see if it is a notation for a unary function. If it is, compute the value of that
function on input y.
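To make the "unpack and evaluate" idea concrete, here is a Python sketch of an evaluator for notations. For readability it uses nested tuples in place of the numeric sequence codes (translating between the two is exactly what the coding #(F) above does), and the notation written down for add is our reconstruction in the spirit of the earlier example:

def evaluate(notation, args):
    """Evaluate a notation ('zero',), ('succ',), ('proj', n, i),
    ('comp', k, l, H, G0, ..., Gk-1), or ('rec', l, F, G) on args."""
    tag = notation[0]
    if tag == 'zero':
        return 0
    if tag == 'succ':
        return args[0] + 1
    if tag == 'proj':
        _, n, i = notation
        return args[i]
    if tag == 'comp':
        h, gs = notation[3], notation[4:]
        return evaluate(h, [evaluate(g, args) for g in gs])
    if tag == 'rec':
        f, g = notation[2], notation[3]
        *xs, y = args
        acc = evaluate(f, xs)
        for i in range(y):
            acc = evaluate(g, xs + [i, acc])
        return acc
    raise ValueError("not a notation")

ADD = ('rec', 1, ('proj', 1, 0),
       ('comp', 1, 3, ('succ',), ('proj', 3, 2)))
assert evaluate(ADD, [2, 3]) == 5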
You may already be convinced that (with some work!) one can write
a program (say, in Java or C++) that does this; and now we can appeal to
the Church-Turing thesis, which says that anything that, intuitively, is com-
putable can be computed by a Turing machine.
Of course, a more direct way to show that g( x, y) is computable is to de-
scribe a Turing machine that computes it, explicitly. This would, in particular,
avoid the Church-Turing thesis and appeals to intuition. Soon we will have
built up enough machinery to show that g( x, y) is computable, appealing to a
model of computation that can be simulated on a Turing machine: namely, the
recursive functions.
2. Add something to the definition, so that some new partial functions are
included.
The first is easy. As before, we will start with zero, successor, and projec-
tions, and close under composition and primitive recursion. The only differ-
ence is that we have to modify the definitions of composition and primitive
recursion to allow for the possibility that some of the terms in the definition
are not defined. If f and g are partial functions, we will write f ( x ) ↓ to mean
that f is defined at x, i.e., x is in the domain of f ; and f ( x ) ↑ to mean the
opposite, i.e., that f is not defined at x. We will use f ( x ) ≃ g( x ) to mean that
either f ( x ) and g( x ) are both undefined, or they are both defined and equal.
We will use these notations for more complicated terms as well. We will adopt
the convention that if h and g0 , . . . , gk all are partial functions, then
h( g0 (⃗x ), . . . , gk (⃗x ))
the least x such that f (0, ⃗z), f (1, ⃗z), . . . , f ( x, ⃗z) are all defined, and
f ( x, ⃗z) = 0, if such an x exists
Definition 29.26. The set of partial recursive functions is the smallest set of par-
tial functions from the natural numbers to the natural numbers (of various
arities) containing zero, successor, and projections, and closed under compo-
sition, primitive recursion, and unbounded search.
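A Python rendering of unbounded search may also help; the helper name mu is ours, and partiality shows up as possible non-termination:

def mu(f):
    """mu(f)(zs): the least x with f(x, *zs) == 0; diverges if there is none."""
    def search(*zs):
        x = 0
        while f(x, *zs) != 0:
            x += 1
        return x
    return search

# least x with x * x >= z, obtained by unbounded search:
ceil_sqrt = mu(lambda x, z: 0 if x * x >= z else 1)
assert ceil_sqrt(10) == 4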
Definition 29.27. The set of recursive functions is the set of partial recursive
functions that are total.
for every x.
The proof of the normal form theorem is involved, but the basic idea is
simple. Every partial recursive function has an index e, intuitively, a number
coding its program or definition. If f ( x ) ↓, the computation can be recorded
systematically and coded by some number s, and the fact that s codes the
computation of f on input x can be checked primitive recursively using only
x and the definition e. Consequently, the relation T, “the function with index e
has a computation for input x, and s codes this computation,” is primitive
recursive. Given the full record of the computation s, the “upshot” of s is the
value of f ( x ), and it can be obtained from s primitive recursively as well.
The normal form theorem shows that only a single unbounded search is
required for the definition of any partial recursive function. Basically, we can
search through all numbers until we find one that codes a computation of
the function with index e for input x. We can use the numbers e as “names”
of partial recursive functions, and write φe for the function f defined by the
equation in the theorem. Note that any partial recursive function can have
more than one index—in fact, every partial recursive function has infinitely
many indices.
is not computable.
In the context of partial recursive functions, the role of the specification
of a program may be played by the index e given in Kleene’s normal form
theorem. If f is a partial recursive function, any e for which the equation in
the normal form theorem holds, is an index of f . Given a number e, the normal
form theorem states that
Note that h(e, x ) = 0 if φe ( x ) ↑, but also when e is not the index of a partial
recursive function at all.
1. If h(ed , ed ) = 1 then φed (ed ) ↓. But φed ≃ d, and d(ed ) is defined iff
h(ed , ed ) = 0. So h(ed , ed ) ̸= 1.
The upshot is that ed cannot, after all, be the index of a partial recursive func-
tion. But if h were partial recursive, d would be too, and so our definition of
ed as an index of it would be admissible. We must conclude that h cannot be
partial recursive.
Definition 29.30. The set of general recursive functions is the smallest set of
functions from the natural numbers to the natural numbers (of various ari-
ties) containing zero, successor, and projections, and closed under composi-
tion, primitive recursion, and unbounded search applied to regular functions.
Problems
Problem 29.1. Prove Proposition 29.5 by showing that the primitive recursive
definition of mult can be put into the form required by Definition 29.1 and
showing that the corresponding functions f and g are primitive recursive.
Problem 29.2. Give the complete primitive recursive notation for mult.
is primitive recursive.
Problem 29.5. Show that integer division d( x, y) = ⌊ x/y⌋ (i.e., division, where
you disregard everything after the decimal point) is primitive recursive. When
y = 0, we stipulate d( x, y) = 0. Give an explicit definition of d using primitive
recursion and composition.
Problem 29.6. Show that the three place relation x ≡ y mod n (congruence
modulo n) is primitive recursive.
Problem 29.7. Suppose R(⃗x, z) is primitive recursive. Define the function m′R (⃗x, y)
which returns the least z less than y such that R(⃗x, z) holds, if there is one, and
0 otherwise, by primitive recursion from χ R .
sconcat(⟨s0 , . . . , sk ⟩) = s0 ⌢ . . . ⌢ sk .
Problem 29.10. Show that there is a primitive recursive function tail(s) with
the property that
tail(Λ) = 0 and
tail(⟨s0 , . . . , sk ⟩) = ⟨s1 , . . . , sk ⟩.
Computability Theory
30.1 Introduction
The branch of logic known as Computability Theory deals with issues having to
do with the computability, or relative computability, of functions and sets. It is
evidence of Kleene's influence that the subject used to be known as Recursion
Theory, and today, both names are commonly used.
Let us call a function f : N ⇀ N partial computable if it can be computed
in some model of computation. If f is total we will simply say that f is com-
putable. A relation R with computable characteristic function χ R is also called
computable. If f and g are partial functions, we will write f ( x ) ↓ to mean that
f is defined at x, i.e., x is in the domain of f ; and f ( x ) ↑ to mean the opposite,
i.e., that f is not defined at x. We will use f ( x ) ≃ g( x ) to mean that either f ( x )
and g( x ) are both undefined, or they are both defined and equal.
One can explore the subject without having to refer to a specific model
of computation. To do this, one shows that there is a universal partial com-
putable function, Un(k, x ). This allows us to enumerate the partial computable
functions. We will adopt the notation φk to denote the k-th unary partial com-
putable function, defined by φk ( x ) ≃ Un(k, x ). (Kleene used {k} for this pur-
pose, but this notation has not been used as much recently.) Slightly more
generally, we can uniformly enumerate the partial computable functions of
arbitrary arities, and we will use φnk to denote the k-th n-ary partial recursive
function.
Recall that if f (⃗x, y) is a total or partial function, then µy f (⃗x, y) is the
function of ⃗x that returns the least y such that f (⃗x, y) = 0, assuming that all of
f (⃗x, 0), . . . , f (⃗x, y − 1) are defined; if there is no such y, µy f (⃗x, y) is undefined.
relation “s codes the record of computation of the function with index e for
input x” and the function “output of computation sequence with code s” are
then computable; in fact, they are primitive recursive.
This fundamental fact is very powerful, and allows us to prove a number
of striking and important results about computability, independently of the
model of computation chosen.
for every x.
Proof Sketch. For any model of computation one can rigorously define a de-
scription of the computable function f and code such description using a nat-
ural number k. One can also rigorously define a notion of “computation se-
quence” which records the process of computing the function with index k for
input x. These computation sequences can likewise be coded as numbers s.
This can be done in such a way that (a) it is decidable whether a number s
codes the computation sequence of the function with index k on input x and
(b) what the end result of the computation sequence coded by s is. In fact, the
relation in (a) and the function in (b) are primitive recursive.
Theorem 30.2. Every partial computable function has infinitely many indices.
It is helpful to think of s^m_n as acting on programs. That is, s^m_n takes a program x for an (m + n)-ary function, as well as fixed inputs a0, . . . , am−1; and it returns a program s^m_n(x, a0, . . . , am−1) for the n-ary function of the remaining arguments. If you think of x as the description of a Turing machine, then s^m_n(x, a0, . . . , am−1) is the Turing machine that, on input y0, . . . , yn−1, prepends a0, . . . , am−1 to the input string, and runs x. Each s^m_n is then just a primitive recursive function that finds a code for the appropriate Turing machine.
Proof. Let Un(k, x ) ≃ U (µs T (k, x, s)) in Kleene’s normal form theorem.
Proof. This theorem says that there is no total computable function that is uni-
versal for the total computable functions. The proof is a simple diagonaliza-
tion: if Un′ (k, x ) were total and computable, then
d( x ) = Un′ ( x, x ) + 1
would also be total and computable. However, for every k, d(k ) is not equal
to Un′ (k, k).
Theorem 30.4 above shows that we can get around this diagonal-
ization argument, but only at the expense of allowing partial functions. It is
worth trying to understand what goes wrong with the diagonalization argu-
ment, when we try to apply it in the partial case. In particular, the function
h( x ) = Un( x, x ) + 1 is partial recursive. Suppose h is the k-th function in the
enumeration; what can we say about h(k)?
But now Un′ (k, x ) is a total function, and is computable if h is. For instance,
we could define g using primitive recursion, by
g(0, k, x ) ≃ 0
g(y + 1, k, x ) ≃ Un(k, x );
then
Un′ (k, x ) ≃ g(h(k, x ), k, x ).
And since Un′ (k, x ) agrees with Un(k, x ) wherever the latter is defined, Un′ is
universal for those partial computable functions that happen to be total. But
this contradicts Theorem 30.5.
1. computable functions
To sort this out, it might help to draw a big square representing all the partial
functions from N to N, and then mark off two overlapping regions, corre-
sponding to the total functions and the computable partial functions, respec-
tively. It is a good exercise to see if you can describe an object in each of the
resulting regions in the diagram.
Theorem 30.9. Let S be a set of natural numbers. Then the following are equivalent:
1. S is computably enumerable.
The first three clauses say that we can equivalently take any non-empty
computably enumerable set to be enumerated by either a computable func-
tion, a partial computable function, or a primitive recursive function. The
fourth clause tells us that if S is computably enumerable, then for some index
e,
S = { x : φe ( x ) ↓}.
In other words, S is the set of inputs for which the computation of φe
halts. For that reason, computably enumerable sets are sometimes called semi-
decidable: if a number is in the set, you eventually get a “yes,” but if it isn’t,
you never get a “no”!
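A hedged sketch of such a semi-decision procedure in Python; halts_within(x, s) is a hypothetical stand-in for the primitive recursive relation T(e, x, s), "the computation of φe on x halts within s steps".

def semi_decide(halts_within, x):
    s = 0
    while True:
        if halts_within(x, s):
            return "yes"     # if x is in the set, we eventually say "yes"
        s += 1               # if not, the search never ends: no "no"

# Toy example: pretend the computation on x halts (after x steps) iff x is even.
print(semi_decide(lambda x, s: x % 2 == 0 and s >= x, 4))  # yes
# semi_decide(lambda x, s: x % 2 == 0 and s >= x, 3) would never return.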
Proof. Since every primitive recursive function is computable and every com-
putable function is partial computable, (3) implies (1) and (1) implies (2).
(Note that if S is empty, S is the range of the partial computable function that
is nowhere defined.) If we show that (2) implies (3), we will have shown the
first three clauses equivalent.
So, suppose S is the range of the partial computable function φe . If S is
empty, we are done. Otherwise, let a be any element of S. By Kleene’s normal
form theorem, we can write
φe(x) ≃ U(µs T(e, x, s)).
Define f(z) so that if T(e, (z)0, (z)1) holds, f(z) returns U((z)1); otherwise, it returns a. Since T and U are primitive recursive and f is defined from them by cases, f is primitive recursive. We need to show that S is the range of f, i.e., for any
natural number y, y ∈ S if and only if it is in the range of f . In the forwards
direction, suppose y ∈ S. Then y is in the range of φe , so for some x and s,
T (e, x, s) and U (s) = y; but then y = f (⟨ x, s⟩). Conversely, suppose y is in the
range of f . Then either y = a, or for some z, T (e, (z)0 , (z)1 ) and U ((z)1 ) = y.
Since, in the latter case, φe((z)0) ↓= y, either way, y is in S.
(The notation φe ( x ) ↓= y means “φe ( x ) is defined and equal to y.” We
could just as well use φe ( x ) = y, but the extra arrow is sometimes helpful in
reminding us that we are dealing with a partial function.)
To finish up the proof of Theorem 30.9, it suffices to show that (1) and (4)
are equivalent. First, let us show that (1) implies (4). Suppose S is the range of
a computable function f , i.e.,
Let
g(y) = µx f ( x ) = y.
Then g is a partial computable function, and g(y) is defined if and only if for
some x, f ( x ) = y. In other words, the domain of g is the range of f . Expressed
in terms of Turing machines: given a Turing machine F that enumerates the
elements of S, let G be the Turing machine that semi-decides S by searching
through the outputs of F to see if a given element is in the set.
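In code, the same search looks like this (a minimal sketch; f is assumed to be a computable function enumerating S, as in the text):

def g(f, y):
    """g(y) = µx f(x) = y: halts with a witness x iff y is in the range of f."""
    x = 0
    while f(x) != y:
        x += 1
    return x

print(g(lambda x: 2 * x, 10))   # 5
# g(lambda x: 2 * x, 7) would search forever: 7 is not in the range.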
Finally, to show (4) implies (1), suppose that S is the domain of the partial
computable function φe , i.e.,
S = { x : φe ( x ) ↓}.
S = { x : ∃y R( x, y)}.
S = { x : ∃y T (e, x, y)}.
f ( x ) ≃ µy AtomRx, y.
Then k enumerates A ∪ B; the idea is that k just alternates between the enumerations offered by f and g. Enumerating A ∩ B is trickier. If A ∩ B is empty, it is computably enumerable by definition.
Theorem 30.12. Let A be any set of natural numbers. Then A is computable if and
only if both A and its complement are computably enumerable.
function. But now we have that for every x, x ∈ A if and only if T (e, x, h( x )),
i.e., if φe is the one that is defined. Since T (e, x, h( x )) is a computable relation,
A is computable.
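A hedged sketch of the idea behind Theorem 30.12: if both A and its complement are computably enumerable, run the two semi-decision procedures in parallel; exactly one of them must answer. The two arguments below are hypothetical stand-ins for "the respective computation on x halts within s steps".

def decide(in_A_within, in_complement_within, x):
    s = 0
    while True:
        if in_A_within(x, s):
            return True
        if in_complement_within(x, s):
            return False
        s += 1

# Toy example: A = the even numbers, with made-up step counts.
print(decide(lambda x, s: x % 2 == 0 and s >= x,
             lambda x, s: x % 2 == 1 and s >= x, 7))   # False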
30.14 Reducibility
We now know that there is at least one set, K0 , that is computably enumerable
but not computable. It should be clear that there are others. The method of
reducibility provides a powerful method of showing that other sets have these
properties, without constantly having to return to first principles.
Generally speaking, a “reduction” of a set A to a set B is a method of
transforming answers to whether or not elements are in B into answers as
to whether or not elements are in A. We will focus on a notion called “many-
one reducibility,” but there are many other notions of reducibility available,
with varying properties. Notions of reducibility are also central to the study
of computational complexity, where efficiency issues have to be considered as
well. For example, a set is said to be “NP-complete” if it is in NP and every
NP problem can be reduced to it, using a notion of reduction that is similar to
the one described below, only with the added requirement that the reduction
can be computed in polynomial time.
We have already used this notion implicitly. Define the set K by
K = { x : φ x ( x ) ↓},
Proposition 30.16. Let A and B be any sets, and suppose A is many-one reducible
to B.
1. If B is computably enumerable, so is A.
2. If B is computable, so is A.
Proof. Let f be a many-one reduction from A to B. For the first claim, just
check that if B is the domain of a partial function g, then A is the domain
of g ◦ f :
x ∈ A iff f(x) ∈ B iff g(f(x)) ↓.
For the second claim, remember that if B is computable then both B and its
complement are computably enumerable. It is not hard to check that f is also
a many-one reduction of the complement of A to the complement of B, so, by
the first part of this proof, both A and its complement are computably
enumerable. So A is computable as well. (Alternatively, you can check that
χA = χB ◦ f; so if χB is computable, then so is χA.)
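A toy illustration of the first claim in Python (everything here is made up for illustration): B is the domain of the partial function g, and f many-one reduces A to B, so composing the two gives a semi-decision procedure for A.

def g(y):
    while y % 3 != 0:    # g halts exactly when y is a multiple of 3
        pass
    return y

def f(x):
    return x + 1         # so A = {x : x + 1 is a multiple of 3}

def semi_decide_A(x):
    g(f(x))              # halts iff f(x) is in the domain of g, i.e., iff x ∈ A
    return "yes"

print(semi_decide_A(2))  # yes, since f(2) = 3 is in the domain of g
# semi_decide_A(3) would loop forever, since f(3) = 4 is not.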
B = We = { x : φe ( x ) ↓}.
So, it turns out that all the examples of computably enumerable sets that
we have considered so far are either computable, or complete. This should
seem strange! Are there any examples of computably enumerable sets that
are neither computable nor complete? The answer is yes, but it wasn’t until
the middle of the 1950s that this was established by Friedberg and Muchnik,
independently.
Note that f ignores its third input entirely. Pick an index e such that f = φ^3_e;
so we have
φ^3_e(x, y, z) ≃ φx(y).
By the s-m-n theorem, there is a function s(e, x, y) such that, for every z,
φs(e,x,y)(z) ≃ φ^3_e(x, y, z) ≃ φx(y).
In terms of the informal argument above, s(e, x, y) is an index for the ma-
chine that, for any input z, ignores that input and computes φ x (y).
In particular, we have
φs(e,x,y) (0) ↓ if and only if φ x (y) ↓ .
In other words, ⟨ x, y⟩ ∈ K0 if and only if s(e, x, y) ∈ K1 . So the function g
defined by
g(w) = s(e, (w)0 , (w)1 )
is a reduction of K0 to K1 .
Proof. To see that Tot is not computable, it suffices to show that K is reducible
to it. Let h(x, y) be defined by
h(x, y) ≃ 0 if x ∈ K, and undefined otherwise.
Note that h(x, y) does not depend on y at all. It should not be hard to see that
h is partial computable: on input x, y, we compute h by first simulating the
function φx on input x; if this computation halts, h(x, y) outputs 0 and halts.
So h(x, y) is just Z(µs T(x, x, s)), where Z is the constant zero function.
Using the s-m-n theorem, there is a primitive recursive function k( x ) such
that for every x and y,
φk(x)(y) = 0 if x ∈ K, and undefined otherwise.
So φk( x) is total if x ∈ K, and undefined otherwise. Thus, k is a reduction of K
to Tot.
If you think about it, you will see that the specifics of Tot do not play into
the proof of Proposition 30.20. We designed h( x, y) to act like the constant
function j(y) = 0 exactly when x is in K; but we could just as well have made
it act like any other partial computable function under those circumstances.
This observation lets us state a more general theorem, which says, roughly,
that no nontrivial property of computable functions is decidable.
Keep in mind that φ0 , φ1 , φ2 , . . . is our standard enumeration of the partial
computable functions.
Theorem 30.21 (Rice’s Theorem). Let C be any set of partial computable func-
tions, and let A = {n : φn ∈ C }. If A is computable, then either C is ∅ or C is
the set of all the partial computable functions.
An index set is a set A with the property that if n and m are indices which
“compute” the same function, then either both n and m are in A, or neither is.
It is not hard to see that the set A in the theorem has this property. Conversely,
if A is an index set and C is the set of functions computed by these indices,
then A = {n : φn ∈ C }.
With this terminology, Rice’s theorem is equivalent to saying that no non-
trivial index set is decidable. To understand what the theorem says, it is
helpful to emphasize the distinction between programs (say, in your favorite
programming language) and the functions they compute. There are certainly
questions about programs (indices), which are syntactic objects, that are com-
putable: does this program have more than 150 symbols? Does it have more
than 22 lines? Does it have a “while” statement? Does the string “hello world”
ever appear in the argument to a “print” statement? Rice's theorem says that
no nontrivial question about the program’s behavior is computable. This in-
cludes questions like these: does the program halt on input 0? Does it ever
halt? Does it ever output an even number?
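The syntactic questions just mentioned really are computable when programs are given as strings; a few toy checks in Python (the function names are ours):

def more_than_150_symbols(program):
    return len(program) > 150

def more_than_22_lines(program):
    return len(program.splitlines()) > 22

def has_while_statement(program):
    return "while" in program   # a crude but purely syntactic test

print(has_while_statement("while True: pass"))  # True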
Proof of Rice’s theorem. Suppose C is neither ∅ nor the set of all the partial com-
putable functions, and let A be the set of indices of functions in C. We will
show that if A were computable, we could solve the halting problem; so A is
not computable.
Without loss of generality, we can assume that the function f which is
nowhere defined is not in C (otherwise, switch C and its complement in the
argument below). Let g be any function in C. The idea is that if we could
decide A, we could tell the difference between indices computing f , and in-
dices computing g; and then we could use that capability to solve the halting
problem.
Formally, we can define
h(x, y) ≃ P^2_0(g(y), Un(x, x)),
where P^2_0(z0, z1) = z0 is the 2-place projection function returning the 0-th argument, which is computable.
Then h is a composition of partial computable functions, and the right side
is defined and equal to g(y) just when Un( x, x ) and g(y) are both defined.
Notice that for a fixed x, if φ x ( x ) is undefined, then h( x, y) is undefined for
every y; and if φ x ( x ) is defined, then h( x, y) ≃ g(y). So, for any fixed value
of x, either h( x, y) acts just like f or it acts just like g, and deciding whether or
not φ x ( x ) is defined amounts to deciding which of these two cases holds. But
this amounts to deciding whether the function hx(y) ≃ h(x, y) is in C, and if
A were computable, we could do just that.
More formally, since h is partial computable, it is equal to the function φk
for some index k. By the s-m-n theorem there is a primitive recursive function
s such that for each x, φs(k,x) (y) = h x (y). Now we have that for each x, if
φ x ( x ) ↓, then φs(k,x) is the same function as g, and so s(k, x ) is in A. On the
other hand, if φ x ( x ) ↑, then φs(k,x) is the same function as f , and so s(k, x )
is not in A. In other words we have that for every x, x ∈ K if and only if
s(k, x ) ∈ A. If A were computable, K would be also, which is a contradiction.
So A is not computable.
1. { x : 17 is in the range of φ x }
2. { x : φ x is constant}
3. { x : φ x is total}
Lemma 30.23. The following statements are equivalent:
1. For every partial computable function g(x, y), there is an index e such that for every y,
φe(y) ≃ g(e, y).
2. For every computable function f(x), there is an index e such that for every y,
φe(y) ≃ φf(e)(y).
Proof. (1) ⇒ (2): Given f , define g by g( x, y) ≃ Un( f ( x ), y). Use (1) to get an
index e such that for every y,
φe (y) = Un( f (e), y)
= φ f ( e ) ( y ).
(2) ⇒ (1): Given g, use the s-m-n theorem to get f such that for every x
and y, φ f ( x) (y) ≃ g( x, y). Use (2) to get an index e such that
φe ( y ) = φ f (e) ( y )
= g(e, y).
This concludes the proof.
Before showing that statement (1) is true (and hence (2) as well), consider
how bizarre it is. Think of e as being a computer program; statement (1) says
that given any partial computable g( x, y), you can find a computer program
e that computes ge (y) ≃ g(e, y). In other words, you can find a computer
program that computes a function that references the program itself.
Theorem 30.24. The two statements in Lemma 30.23 are true. Specifically, for every
partial computable function g( x, y), there is an index e such that for every y,
φe (y) ≃ g(e, y).
Proof. The ingredients are already implicit in the discussion of the halting
problem above. Let diag( x ) be a computable function which for each x re-
turns an index for the function f x (y) ≃ φ x ( x, y), i.e.
φdiag( x) (y) ≃ φ x ( x, y).
Think of diag as a function that transforms a program for a 2-ary function into
a program for a 1-ary function, obtained by fixing the original program as its
first argument. The function diag can be defined formally as follows: first
define s by
s( x, y) ≃ Un2 ( x, x, y),
where Un2 is a 3-ary function that is universal for partial computable 2-ary
functions. Then, by the s-m-n theorem, we can find a primitive recursive func-
tion diag satisfying
φdiag( x) (y) ≃ s( x, y).
Now, define the function l by
l ( x, y) ≃ g(diag( x ), y).
and let ⌜l⌝ be an index for l. Finally, let e = diag(⌜l⌝). Then for every y, we
have
φe (y) ≃ φdiag(⌜l⌝) (y)
≃ φ⌜l⌝ (⌜l⌝, y)
≃ l (⌜l⌝, y)
≃ g(diag(⌜l⌝), y)
≃ g(e, y),
as required.
What’s going on? Suppose you are given the task of writing a computer
program that prints itself out. Suppose further, however, that you are working
with a programming language with a rich and bizarre library of string func-
tions. In particular, suppose your programming language has a function diag
which works as follows: given an input string s, diag locates each instance of
the symbol ‘x’ occurring in s, and replaces it by a quoted version of the original
string. For example, given the string
hello x world
as input, diag returns
hello ’hello x world’ world
as output. In that case, it is easy to write the desired program; you can check
that
print(diag(’print(diag(x))’))
does the trick. For more common programming languages like C++ and Java,
the same idea (with a more involved implementation) still works.
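Here is the same trick as a runnable Python sketch. We supply diag ourselves (the text imagines it as a library function); relative to diag being given, the final line prints out its own text.

def diag(s):
    # replace the symbol 'x' in s by a quoted version of s itself
    return s.replace('x', repr(s))

print(diag('print(diag(x))'))
# prints: print(diag('print(diag(x))'))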
We are only a couple of steps away from the proof of the fixed-point theo-
rem. Suppose a variant of the print function print( x, y) accepts a string x and
another numeric argument y, and prints the string x repeatedly, y times. Then
the “program”
g(diag(’g(diag(x), y)’), y)
is a program that, on input y, runs g on the program itself and y. Replacing
“quoting” with “using an index for,” we have the proof above.
For now, it is o.k. if you want to think of the proof as formal trickery, or
black magic. But you should be able to reconstruct the details of the argument
given above. When we prove the incompleteness theorems (and the related
“fixed-point theorem”) we will discuss other ways of understanding why it
works.
The same idea can be used to get a “fixed point” combinator. Suppose you
have a lambda term g, and you want another term k with the property that k
is β-equivalent to gk. Define terms
diag( x ) = xx
and
l ( x ) = g(diag( x ))
using our notational conventions; in other words, l is the term λx. g( xx ). Let
k be the term ll. Then we have
k = (λx. g(xx))(λx. g(xx))
  ↠ g((λx. g(xx))(λx. g(xx)))
  = gk.
If one takes
Y = λg. ((λx. g( xx ))(λx. g( xx )))
then Yg and g(Yg) reduce to a common term; so Yg ≡ β g(Yg). This is known
as “Curry’s combinator.” If instead one takes
Y = (λxg. g( xxg))(λxg. g( xxg))
then in fact Yg reduces to g(Yg), which is a stronger statement. This latter
version of Y is known as “Turing’s combinator.”
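The same self-application trick can be run in Python, which evaluates eagerly: η-expanding the inner application gives the call-by-value variant of Y (often called the Z combinator). This is a sketch of the idea, not a transcription of the λ-terms above.

Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# Example: factorial as a fixed point of a non-recursive functional.
fact = Z(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
print(fact(5))  # 120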
and then using the fixed-point lemma to find an index e such that φe (y) =
g(e, y).
For a concrete example, the “greatest common divisor” function gcd(u, v)
can be defined by
gcd(u, v) ≃ v if 0 = u, and gcd(mod(v, u), u) otherwise.
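Reading the condition as 0 = u, the recursion can be checked directly in Python (an illustration only; the text's point is that such definitions can be justified via the fixed-point theorem):

def gcd(u, v):
    return v if u == 0 else gcd(v % u, u)

print(gcd(12, 18))  # 6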
g( x ) ≃ µy f ( x, y) = 0.
Proof. The idea is roughly as follows. Given x, we will use the fixed-point
lambda term Y to define a function h x (n) which searches for a y starting at n;
then g( x ) is just h x (0). The function h x can be expressed as the solution of a
fixed-point equation:
hx(n) ≃ n if f(x, n) = 0, and hx(n + 1) otherwise.
We can do this using the fixed-point term Y. First, let U be the term
and then let H be the term YU. Notice that the only free variable in H is x. Let
us show that H satisfies the equation above.
By the definition of Y, we have
H = YU ≡ U (YU ) = U ( H ).
H(n) ≡ U(H, n) ↠ D(n, H(S(n)), F(x, n)),
as required. Notice that if you substitute a numeral m for x in the last line, the
expression reduces to n if F (m, n) reduces to 0, and it reduces to H (S(n)) if
F (m, n) reduces to any other numeral.
To finish off the proof, let G be λx. H(0). Then G represents g; in other
words, for every m, G(m) reduces to g(m) if g(m) is defined, and has no
normal form otherwise.
Problems
Problem 30.1. Give a reduction of K to K0 .
Turing Machines
31.1 Introduction
What does it mean for a function, say, from N to N to be computable? Among
the first answers, and the most well known one, is that a function is com-
putable if it can be computed by a Turing machine. This notion was set out
by Alan Turing in 1936. Turing machines are an example of a model of compu-
tation—they are a mathematically precise way of defining the idea of a “com-
putational procedure.” What exactly that means is debated, but it is widely
agreed that Turing machines are one way of specifying computational proce-
dures. Even though the term “Turing machine” evokes the image of a physi-
cal machine with moving parts, strictly speaking a Turing machine is a purely
mathematical construct, and as such it idealizes the idea of a computational
procedure. For instance, we place no restriction on either the time or memory
requirements of a Turing machine: Turing machines can compute something
even if the computation would require more storage space or more steps than
there are atoms in the universe.
It is perhaps best to think of a Turing machine as a program for a spe-
cial kind of imaginary mechanism. This mechanism consists of a tape and a
read-write head. In our version of Turing machines, the tape is infinite in one
direction (to the right), and it is divided into squares, each of which may con-
tain a symbol from a finite alphabet. Such alphabets can contain any number of
different symbols, but we will mainly make do with three: ▷, 0, and 1. When
the mechanism is started, the tape is empty (i.e., each square contains the sym-
bol 0) except for the leftmost square, which contains ▷, and a finite number of
squares which contain the input. At any time, the mechanism is in one of a
finite number of states. At the outset, the head scans the leftmost square and
the mechanism is in a specified initial state. At each step of the mechanism's run, the content
of the square currently scanned together with the state the mechanism is in
and the Turing machine program determine what happens next. The Turing
machine program is given by a partial function which takes as input a state q
2. A proof of the equivalence of two definitions (in case the new definition
has a greater intuitive appeal).
Our goal is to try to define the notion of computability “in principle,” i.e.,
without taking into account practical limitations of time and space. Of course,
with the broadest definition of computability in place, one can then go on
to consider computation with bounded resources; this forms the heart of the
subject known as “computational complexity.”
[Diagram: two states, q0 (start) and q1, with an arrow from q0 to q1 labelled 0, 1, R, i.e., the single instruction δ(q0, 0) = ⟨q1, 1, R⟩.]
Recall that the Turing machine has a read/write head and a tape with the
input written on it. The instruction can be read as if reading a 0 in state q0 , write
a 1, move right, and move to state q1 . This is equivalent to the transition function
mapping ⟨q0 , 0⟩ to ⟨q1 , 1, R⟩.
Example 31.1. Even Machine: The following Turing machine halts if, and only
if, there are an even number of 1's on the tape (under the assumption that all the 1's on the tape come before the first 0).
[Diagram: states q0 (start) and q1, with transitions δ(q0, 1) = ⟨q1, 1, R⟩, δ(q1, 1) = ⟨q0, 1, R⟩, and δ(q1, 0) = ⟨q1, 0, R⟩.]
The above machine halts only when the input is an even number of strokes.
Otherwise, the machine (theoretically) continues to operate indefinitely. For
any machine and input, it is possible to trace through the configurations of the
machine in order to determine the output. We will give a formal definition
of configurations later. For now, we can intuitively think of configurations
as a series of diagrams showing the state of the machine at any point in time
during operation. Configurations show the content of the tape, the state of the
machine and the location of the read/write head.
Let us trace through the configurations of the even machine if it is started
with an input of four 1’s. In this case, we expect that the machine will halt.
We will then run the machine on an input of three 1’s, where the machine will
run forever.
The machine starts in state q0 , scanning the leftmost 1. We can represent
the initial state of the machine as follows:
▷1₀1110 . . .
The above configuration is straightforward. As can be seen, the machine starts
in state one, scanning the leftmost 1. This is represented by a subscript of the
state name on the first 1. The applicable instruction at this point is δ(q0 , 1) =
⟨q1 , 1, R⟩, and so the machine moves right on the tape and changes to state q1 .
▷11₁110 . . .
Since the machine is now in state q1 scanning a 1, we have to “follow” the
instruction δ(q1 , 1) = ⟨q0 , 1, R⟩. This results in the configuration
▷111₀10 . . .
As the machine continues, the rules are applied again in the same order, re-
sulting in the following two configurations:
▷1111₁0 . . .
▷11110₀ . . .
The machine is now in state q0 scanning a 0. Based on the transition diagram,
we can easily see that there is no instruction to be carried out, and thus the
machine has halted. This means that the input has been accepted.
Suppose next we start the machine with an input of three 1’s. The first few
configurations are similar, as the same instructions are carried out, with only
a small difference of the tape input:
▷1₀110 . . .
▷11₁10 . . .
▷111₀0 . . .
▷1110₁ . . .
The machine has now traversed past all the 1’s, and is reading a 0 in state q1 .
As shown in the diagram, there is an instruction of the form δ(q1 , 0) = ⟨q1 , 0, R⟩.
Since the tape is filled with 0 indefinitely to the right, the machine will con-
tinue to execute this instruction forever, staying in state q1 and moving ever
further to the right. The machine will never halt, and does not accept the
input.
It is important to note that not all machines will halt. If halting means that
the machine runs out of instructions to execute, then we can create a machine
that never halts simply by ensuring that there is an outgoing arrow for each
symbol at each state. The even machine can be modified to run indefinitely
by adding an instruction for scanning a 0 at q0 .
Example 31.2.
[Diagram: the even machine with an additional instruction for scanning a 0 in state q0: δ(q0, 0) = ⟨q0, 0, R⟩, δ(q0, 1) = ⟨q1, 1, R⟩, δ(q1, 1) = ⟨q0, 1, R⟩, and δ(q1, 0) = ⟨q1, 0, R⟩.]
[State diagram of a six-state machine with states q0, . . . , q5; not reproduced here.]
Example 31.3. The machine table for the even machine is:
        0           1           ▷
q0                  1, q1, R
q1      0, q1, R    1, q0, R
So far we have only considered machines that read and accept input. How-
ever, Turing machines have the capacity to both read and write. An example
of such a machine (although there are many, many examples) is a doubler. A
doubler, when started with a block of n 1’s on the tape, outputs a block of 2n
1’s.
3. an initial state q0 ∈ Q,
We assume that the tape is infinite in one direction only. For this reason
it is useful to designate a special symbol ▷ as a marker for the left end of the
tape. This makes it easier for Turing machine programs to tell when they’re
“in danger” of running off the tape. We could assume that this symbol is never
overwritten, i.e., that δ(q, ▷) = ⟨q′ , ▷, x ⟩ if δ(q, ▷) is defined. Some textbooks
do this; we do not. You can simply be careful when constructing your Turing
machine that it never overwrites ▷. Moreover, there are cases where allowing
such overwriting provides some convenient flexibility.
Example 31.6. Even Machine: The even machine is formally the quadruple
⟨ Q, Σ, q0 , δ⟩ where
Q = { q0 , q1 }
Σ = {▷, 0, 1},
δ(q0 , 1) = ⟨q1 , 1, R⟩,
δ(q1 , 1) = ⟨q0 , 1, R⟩,
δ(q1 , 0) = ⟨q1 , 0, R⟩.
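Since the machine is just this quadruple, it is easy to simulate. A minimal sketch in Python (our own code, not from the text): the transition function is a dictionary mapping (state, symbol) to (new state, symbol written, direction), and pairs not in the dictionary mean the machine halts.

delta = {
    ("q0", "1"): ("q1", "1", "R"),
    ("q1", "1"): ("q0", "1", "R"),
    ("q1", "0"): ("q1", "0", "R"),
}

def run(tape, state="q0", head=1, max_steps=100):
    # As in the traces above, the head starts on square 1, the first input square.
    tape = list(tape)
    for _ in range(max_steps):
        symbol = tape[head] if head < len(tape) else "0"   # blank squares contain 0
        if (state, symbol) not in delta:
            return "halted in state " + state + ": " + "".join(tape)
        state, written, direction = delta[(state, symbol)]
        if head >= len(tape):
            tape.append("0")
        tape[head] = written
        if direction == "R":
            head += 1
        elif direction == "L" and head > 0:
            head -= 1
    return "no halt within " + str(max_steps) + " steps"

print(run("▷1111"))   # halts: even number of 1's
print(run("▷111"))    # keeps moving right forever (cut off by max_steps)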
3. q ∈ Q
Intuitively, the sequence C is the content of the tape (symbols of all squares
from the leftmost square to the last non-blank or previously visited square),
m is the number of the square the read/write head is scanning (beginning
with 0 being the number of the leftmost square), and q is the current state of
the machine.
Example 31.12. Addition: Let’s build a machine that computes the function
f (n, m) = n + m. This requires a machine that starts with two blocks of 1’s of
length n and m on the tape, and halts with one block consisting of n + m 1’s.
The two input blocks of 1’s are separated by a 0, so one method would be to
write a stroke on the square containing the 0, and erase the last 1.
[Diagram of the addition machine: states q0 (start), q1, q2, with transitions δ(q0, 1) = ⟨q0, 1, R⟩, δ(q0, 0) = ⟨q1, 1, N⟩, δ(q1, 1) = ⟨q1, 1, R⟩, δ(q1, 0) = ⟨q2, 0, L⟩, and δ(q2, 1) = ⟨q2, 0, N⟩.]
[Figure 31.4: state diagram of a machine with states q0, . . . , q8 computing f(x) = 2x; not reproduced here.]
Example 31.13. The machine in Figure 31.4 computes the function f ( x ) = 2x.
Instead of erasing the input and writing two 1’s at the far right for every 1 in
the input as the machine from Example 31.4 does, this machine adds a single 1
to the right for every 1 in the input. It has to keep track of where the input
ends, so it leaves a 0 between the input and the added strokes, which it fills
with a 1 at the very end. And we have to “remember” where we are in the
input, so we temporarily replace a 1 in the input block by a 0.
[Figure 31.5: state diagram of a machine with states q6, . . . , q14 that moves a block of 1's to the beginning of the tape; not reproduced here.]
move the doubled block of strokes to the far left of the tape. The machine
in Figure 31.5 does just this last part: started on a tape consisting of a block
of 0’s followed by a block of 1’s (and the head positioned anywhere in the
block of 0’s), it erases the 1’s one at a time and writes them at the beginning
of the tape. In order to be able to tell when it is done, it first marks the end
of the block of 1’s with a ▷ symbol, which gets deleted at the end. We’ve
started numbering the states at q6 , so they can be added to the doubler ma-
chine. All you’ll need is an additional instruction δ(q5 , 0) = ⟨q6 , 0, N ⟩, i.e., an
arrow from q5 to q6 labelled 0, 0, N. (There is one subtle problem: the resulting
machine does not work for input x = 0. We’ll leave this as an exercise.)
2. M does not halt at all, or with an output that is not a single block of 1’s
if f (n1 , . . . , nk ) is undefined.
Example 31.16. Halting States. To elucidate this concept, let us begin with an
alteration of the even machine. Instead of having the machine halt in state q0 if
the input is even, we can add an instruction to send the machine into a halting
state.
[Diagram: the even machine with an added halting state h: δ(q0, 1) = ⟨q1, 1, R⟩, δ(q1, 1) = ⟨q0, 1, R⟩, δ(q1, 0) = ⟨q1, 0, R⟩, and δ(q0, 0) = ⟨h, 0, N⟩.]
Let us further expand the example. When the machine determines that the
input is odd, it never halts. We can alter the machine to include a reject state
by replacing the looping instruction with an instruction to go to a reject state r.
[Diagram: the even machine with halting state h and reject state r: δ(q0, 1) = ⟨q1, 1, R⟩, δ(q1, 1) = ⟨q0, 1, R⟩, δ(q0, 0) = ⟨h, 0, N⟩, and δ(q1, 0) = ⟨r, 0, N⟩.]
advantages. The definition of halting used so far in this chapter makes the
proof of the Halting Problem intuitive and easy to demonstrate. For this rea-
son, we continue with our original definition.
We have already discussed that any Turing machine can be changed into
one with the same behavior but with a designated halting state. This is done
simply by adding a new state h, and adding an instruction δ(q, σ ) = ⟨h, σ, N ⟩
for any pair ⟨q, σ⟩ where the original δ is undefined. It is true, although te-
dious to prove, that any Turing machine M can be turned into a disciplined
Turing machine M′ which halts on the same inputs and produces the same
output. For instance, if the Turing machine halts and is not on square 1, we
can add some instructions to make the head move left until it finds the tape-
end marker, then move one square to the right, then halt. We’ll leave you to
think about how the other conditions can be dealt with.
Example 31.18. In Figure 31.6, we turn the addition machine from Example 31.12
into a disciplined machine.
Proposition 31.19. For every Turing machine M, there is a disciplined Turing ma-
chine M′ which halts with output O if M halts with output O, and does not halt if
M does not halt. In particular, any function f : Nn → N computable by a Turing
machine is also computable by a disciplined Turing machine.
[Figure 31.6: the addition machine of Example 31.12 turned into a disciplined machine, with states q0, q1, q2, q3 and halting state h; diagram not reproduced.]
The examples of Turing machines we have seen so far have been fairly simple
in nature. But in fact, any problem that can be solved with any modern pro-
gramming language can also be solved with Turing machines. To build more
complex Turing machines, it is important to convince ourselves that we can
combine them, so we can build machines to solve more complex problems by
breaking the procedure into simpler parts. If we can find a natural way to
break a complex problem down into constituent parts, we can tackle the prob-
lem in several stages, creating several simple Turing machines and combining
them into one machine that can solve the problem. This point is especially
important when tackling the Halting Problem in the next section.
How do we combine Turing machines M = ⟨ Q, Σ, q0 , δ⟩ and M′ = ⟨ Q′ , Σ′ , q0′ , δ′ ⟩?
We now use the configuration of the tape after M has halted as the input con-
figuration of a run of machine M′ . To get a single Turing machine M ⌢ M′
that does this, do the following:
The transition function δ′′ of M ⌢ M′ is given by:
δ′′(q, σ) = δ(q, σ) if q ∈ Q,
δ′′(q, σ) = δ′(q, σ) if q ∈ Q′,
δ′′(q, σ) = ⟨q0′, σ, N⟩ if q ∈ Q and δ(q, σ) is undefined.
Note that unless the machine M is disciplined, we don’t know where the
tape head is when M halts, so the halting configuration of M need not have
the head scanning square 1. When combining machines, it’s important to keep
this in mind.
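A small sketch of this construction in Python, reusing the dictionary representation of transition functions from the simulator sketch above. The helper and its argument names are ours; we assume the two machines have disjoint state names and both use the alphabet ▷, 0, 1.

def combine(delta1, states1, delta2, start2):
    """Transition function of M ⌢ M′: behave like M, but wherever M has no
    instruction, pass control to the start state of M′ without changing the tape."""
    combined = dict(delta1)
    combined.update(delta2)
    for q in states1:
        for symbol in ("▷", "0", "1"):
            if (q, symbol) not in delta1:
                combined[(q, symbol)] = (start2, symbol, "N")
    return combined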
[State diagrams illustrating the combination of Turing machines, including the addition machine of Example 31.12; figures not reproduced here.]
and is supposed to move left. According to our definition, it just stays put
instead of “falling off”, but we could have defined it so that it halts when that
happens. This definition is also equivalent: we could simulate the behavior
of a Turing machine that halts when it attempts to move left from square 0
by deleting every transition δ(q, ▷) = ⟨q′ , σ, L⟩—then instead of attempting to
move left on ▷ the machine halts.1
There are also different ways of representing numbers (and hence the input-
output function computed by a Turing machine): we use unary representa-
tion, but you can also use binary representation. This requires two symbols in
addition to 0 and ▷.
Now here is an interesting fact: none of these variations matters as to
which functions are Turing computable. If a function is Turing computable ac-
cording to one definition, it is Turing computable according to all of them.
We won’t go into the details of verifying this. Here’s just one example:
we gain no additional computing power by allowing a tape that is infinite
in both directions, or multiple tapes. The reason is, roughly, that a Turing
machine with a single one-way infinite tape can simulate multiple or two-way
infinite tapes. E.g., using additional states and instructions, we can “translate”
a program for a machine with multiple tapes or two-way infinite tape into
one with a single one-way infinite tape. The translated machine can use the
even squares for the squares of tape 1 (or the “positive” squares of a two-way
infinite tape) and the odd squares for the squares of tape 2 (or the “negative”
squares).
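One way to make this "translation" concrete is a simple renumbering of squares. A minimal sketch of one possible convention in Python (our own choice of indexing): position z on a two-way infinite tape is sent to an even square if z ≥ 0 and to an odd square otherwise.

def one_way_square(z):
    return 2 * z if z >= 0 else -2 * z - 1

print([one_way_square(z) for z in (-3, -2, -1, 0, 1, 2, 3)])
# [5, 3, 1, 0, 2, 4, 6]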
other than square 0 (see Example 31.14). We can get around that by adding a second ▷′ symbol to
use instead for such a purpose.
Problems
Problem 31.1. Choose an arbitrary input and trace through the configurations
of the doubler machine in Example 31.4.
Problem 31.6. Give a definition for when a Turing machine M computes the
function f : Nk → Nm .
Problem 31.7. Trace through the configurations of the machine from Exam-
ple 31.12 for input ⟨3, 2⟩. What happens if the machine computes 0 + 0?
Problem 31.9. Subtraction: Design a Turing machine that when given an input
of two non-empty strings of strokes of length n and m, where n > m, computes
the function f (n, m) = n − m.
Undecidability
32.1 Introduction
It might seem obvious that not every function, even every arithmetical func-
tion, can be computable. There are just too many, whose behavior is too
complicated. Functions defined from the decay of radioactive particles, for
instance, or other chaotic or random behavior. Suppose we start counting 1-
second intervals from a given time, and define the function f (n) as the num-
ber of particles in the universe that decay in the n-th 1-second interval after
that initial moment. This seems like a candidate for a function we cannot ever
hope to compute.
But it is one thing to not be able to imagine how one would compute such
functions, and quite another to actually prove that they are uncomputable.
In fact, even functions that seem hopelessly complicated may, in an abstract
sense, be computable. For instance, suppose the universe is finite in time—
some day, in the very distant future the universe will contract into a single
point, as some cosmological theories predict. Then there is only a finite (but
incredibly large) number of seconds from that initial moment for which f (n)
is defined. And any function which is defined for only finitely many inputs is
computable: we could list the outputs in one big table, or code it in one very
big Turing machine state transition diagram.
We are often interested in special cases of functions whose values give the
answers to yes/no questions. For instance, the question “is n a prime num-
ber?” is associated with the function
isprime(n) = 1 if n is prime, and 0 otherwise.
We say that a yes/no question can be effectively decided, if the associated 1/0-
valued function is effectively computable.
To prove mathematically that there are functions which cannot be effec-
tively computed, or problems that cannot be effectively decided, it is essential to
fix a specific model of computation, and show that there are functions it can-
not compute or problems it cannot decide. We can show, for instance, that not
every function can be computed by Turing machines, and not every problem
can be decided by Turing machines. We can then appeal to the Church-Turing
thesis to conclude that not only are Turing machines not powerful enough to
compute every function, but no effective procedure can.
The key to proving such negative results is the fact that we can assign
numbers to Turing machines themselves. The easiest way to do this is to enu-
merate them, perhaps by fixing a specific way to write down Turing machines
and their programs, and then listing them in a systematic fashion. Once we
see that this can be done, then the existence of Turing-uncomputable functions
follows by simple cardinality considerations: the set of functions from N to N
(in fact, even just from N to {0, 1}) is non-enumerable, but since we can enu-
merate all the Turing machines, the set of Turing-computable functions is only
denumerable.
We can also define specific functions and problems which we can prove
to be uncomputable and undecidable, respectively. One such problem is the
so-called Halting Problem. Turing machines can be finitely described by list-
ing their instructions. Such a description of a Turing machine, i.e., a Turing
machine program, can of course be used as input to another Turing machine.
So we can consider Turing machines that decide questions about other Tur-
ing machines. One particularly interesting question is this: “Does the given
Turing machine eventually halt when started on input n?” It would be nice if
there were a Turing machine that could decide this question: think of it as a
quality-control Turing machine which ensures that Turing machines don’t get
caught in infinite loops and such. The interesting fact, which Turing proved,
is that there cannot be such a Turing machine. There cannot be a single Turing
machine which, when started on input consisting of a description of a Turing
machine M and some number n, will always halt with either output 1 or 0
according to whether the machine M would have halted when started on input n
or not.
[Diagrams: the even machine of Example 31.1; a similar two-state machine with states s and h over an alphabet containing a symbol A; and the even machine with its states renamed 1, 2 and its symbols ▷, 0, 1 renamed 1, 2, 3, i.e., with instructions δ(1, 3) = ⟨2, 3, R⟩, δ(2, 3) = ⟨1, 3, R⟩, and δ(2, 2) = ⟨2, 2, R⟩.]
2, 1, 2, 3, 1, 2, 3, 1, 1, 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2,
where the initial part of the sequence codes the states Q and the alphabet Σ, and the three five-element blocks 1, 3, 2, 3, 2 and 2, 2, 2, 2, 2 and 2, 3, 1, 3, 2 code the instructions δ(1, 3) = ⟨2, 3, R⟩, δ(2, 2) = ⟨2, 2, R⟩, and δ(2, 3) = ⟨1, 3, R⟩, respectively.
Theorem 32.1. There are functions from N to N which are not Turing computable.
Proof. We know that the set of finite sequences of positive integers (Z+ )∗ is
enumerable (problem 4.7). This gives us that the set of descriptions of stan-
dard Turing machines, as a subset of (Z+ )∗ , is itself enumerable. Every Turing
computable function N to N is computed by some (in fact, many) Turing ma-
chines. By renaming its states and symbols to positive integers (in particular,
▷ as 1, 0 as 2, and 1 as 3) we can see that every Turing computable function is
computed by a standard Turing machine. This means that the set of all Turing
computable functions from N to N is also enumerable.
On the other hand, the set of all functions from N to N is not enumerable
(problem 4.35). If all functions were computable by some Turing machine,
we could enumerate the set of all functions by listing all the descriptions of
Turing machines that compute them. So there are some functions that are not
Turing computable.
Definition 32.2. If M is the eth Turing machine (in our fixed enumeration), we
say that e is an index of M. We write Me for the eth Turing machine.
A machine may have more than one index, e.g., two descriptions of M
may differ in the order in which we list its instructions, and these different
descriptions will have different indices.
Importantly, it is possible to give the enumeration of Turing machine de-
scriptions in such a way that we can effectively compute the description of M
from its index, and to effectively compute an index of a machine M from its
description. By the Church-Turing thesis, it is then possible to find a Turing
machine which recovers the description of the Turing machine with index e
and writes the corresponding description on its tape as output. The descrip-
tion would be a sequence of blocks of 1’s (representing the positive integers in
the sequence describing Me ).
Given this, it now becomes natural to ask: what functions of Turing ma-
chine indices are themselves computable by Turing machines? What proper-
ties of Turing machine indices can be decided by Turing machines? An ex-
ample: the function that maps an index e to the number of states the Turing
machine with index e has, is computable by a Turing machine. Here’s what
such a Turing machine would do: started on a tape containing a single block
of e 1’s, it would first decode e into its description. The description is now
represented by a sequence of blocks of 1's on the tape. Since the first element
in this sequence is the number of states, all that has to be done now is to
erase everything but the first block of 1's and then halt.
A remarkable result is the following:
1. Find the number k of the “current head position” (at the beginning,
that’s 1),
2. Move to the kth block in the “tape” to see what the “symbol” there is,
3. Look up, in the description of Me, the instruction (if any) for the current
“state” and the “symbol” found in step (2),
4. Move back to the kth block on the “tape” and replace the “symbol” there
with the code number of the symbol Me would write,
5. Move the head to where it records the current “state” and replace the
number there with the number of the new state,
6. Move to the place where it records the “tape position” and erase a 1 or
add a 1 (if the instruction says to move left or right, respectively).
7. Repeat.2
2 We’re glossing over some subtle difficulties here. E.g., U may need some extra space when
it increases the counter where it keeps track of the “current head position”—in that case it will
have to move the entire “tape” to the right.
If Me started on input n never halts, then U also never halts, so its output is
undefined.
If in step (3) it turns out that the description of Me contains no instruction
for the current “state”/“symbol” pair, then Me would halt. If this happens, U
erases the part of its tape to the left of the “tape.” For each block of three 1’s
(representing a 1 on Me ’s tape), it writes a 1 on the left end of its own tape, and
successively erases the “tape.” When this is done, U’s tape contains a single
block of 1’s of length m.
If U encounters something other than a block of three 1’s on the “tape,” it
immediately halts. Since U’s tape in this case does not contain a single block
of 1’s, its output is not a natural number, i.e., f (e, n) is undefined in this case.
Definition 32.5 (Halting problem). The Halting Problem is the problem of de-
termining (for any e, n) whether the Turing machine Me halts for an input of n
strokes.
2. Now suppose Me does not halt for an input of e 1s. Then s(e) = 0, and
S, when started on input e, halts with a blank tape. J, when started on
a blank tape, immediately halts. Again, Me does what S followed by J
would do, so Me must halt for an input of e 1’s.
In order to establish this important negative result, we prove that the de-
cision problem cannot be solved by a Turing machine. That is, we show that
there is no Turing machine which, whenever it is started on a tape that con-
tains a first-order sentence, eventually halts and outputs either 1 or 0 depend-
ing on whether the sentence is valid or not. By the Church-Turing thesis, every
function which is computable is Turing computable. So if this “validity func-
tion” were effectively computable at all, it would be Turing computable. If it
isn’t Turing computable, then, it also cannot be effectively computable.
Our strategy for proving that the decision problem is unsolvable is to re-
duce the halting problem to it. This means the following: We have proved that
the function h(e, w) that halts with output 1 if the Turing machine described
by e halts on input w and outputs 0 otherwise, is not Turing computable. We
will show that if there were a Turing machine that decides validity of first-
order sentences, then there would also be a Turing machine that computes h. Since h
cannot be computed by a Turing machine, there cannot be a Turing machine
that decides validity either.
The first step in this strategy is to show that for every input w and a Turing
machine M, we can effectively describe a sentence τ ( M, w) representing the
instruction set of M and the input w and a sentence α( M, w) expressing “M
eventually halts” such that:
The bulk of our proof will consist in describing these sentences τ ( M, w) and α( M, w)
and in verifying that τ ( M, w) → α( M, w) is valid iff M halts on input w.
a predicate symbol < to express both the ordering of tape positions (when it
means “to the left of”) and execution steps (then it means “before”).
Once we have the language in place, we list the “axioms” of τ ( M, w), i.e.,
the sentences which, taken together, describe the behavior of M when run on
input w. There will be sentences which lay down conditions on 0, ′, and <,
sentences that describes the input configuration, and sentences that describe
what the configuration of M is after it executes a particular instruction.
Definition 32.9. Given a Turing machine M = ⟨ Q, Σ, q0 , δ⟩, the language L M
consists of:
1. A two-place predicate symbol Qq ( x, y) for every state q ∈ Q. Intu-
itively, Qq (m, n) expresses “after n steps, M is in state q scanning the
mth square.”
3. A constant symbol 0
For each number n there is a canonical term n, the numeral for n, which
represents it in L M . 0 is 0, 1 is 0′ , 2 is 0′′ , and so on. More formally:
0=0
n + 1 = n′
a) A sentence that says that every number is less than its successor:
∀x x < x′
For every instruction δ(qi, σ) = ⟨qj, σ′, R⟩, the sentence:
∀x ∀y ((Qqi(x, y) ∧ Sσ(x, y)) →
(Qqj(x′, y′) ∧ Sσ′(x, y′) ∧ φ(x, y)))
This says that if, after y steps, the machine is in state qi scanning
square x which contains symbol σ, then after y + 1 steps it is scan-
ning square x + 1, is in state q j , square x now contains σ′ , and every
square other than x contains the same symbol as it did after y steps.
b) For every instruction δ(qi , σ ) = ⟨q j , σ′ , L⟩, the sentence:
∀ x ∀y ((Qqi ( x ′ , y) ∧ Sσ ( x ′ , y)) →
(Qq j ( x, y′ ) ∧ Sσ′ ( x ′ , y′ ) ∧ φ( x, y))) ∧
∀y ((Qqi (0, y) ∧ Sσ (0, y)) →
(Qq j (0, y′ ) ∧ Sσ′ (0, y′ ) ∧ φ(0, y)))
Take a moment to think about how this works: now we don’t start
with “if scanning square x . . . ” but: “if scanning square x + 1 . . . ” A
move to the left means that in the next step the machine is scanning
square x. But the square that is written on is x + 1. We do it this
way since we don’t have subtraction or a predecessor function.
Note that numbers of the form x + 1 are 1, 2, . . . , i.e., this doesn’t
cover the case where the machine is scanning square 0 and is sup-
posed to move left (which of course it can’t—it just stays put). That
special case is covered by the second conjunction: it says that if, af-
ter y steps, the machine is scanning square 0 in state qi and square 0
contains symbol σ, then after y + 1 steps it’s still scanning square 0,
is now in state q j , the symbol on square 0 is σ′ , and the squares
other than square 0 contain the same symbols they contained after
y steps.
For every instruction δ(qi, σ) = ⟨qj, σ′, N⟩, the sentence:
∀x ∀y ((Qqi(x, y) ∧ Sσ(x, y)) →
(Qqj(x, y′) ∧ Sσ′(x, y′) ∧ φ(x, y)))
Let τ ( M, w) be the conjunction of all the above sentences for Turing machine M
and input w.
In order to express that M eventually halts, we have to find a sentence that
says “after some number of steps, the transition function will be undefined.”
Let X be the set of all pairs ⟨q, σ ⟩ such that δ(q, σ ) is undefined. Let α( M, w)
then be the sentence
∃x ∃y ⋁⟨q,σ⟩∈X (Qq(x, y) ∧ Sσ(x, y))
∃ x ∃y Qh ( x, y)
Proof. Exercise.
Proof. Suppose that M halts for input w after n steps. There is some state q,
square m, and symbol σ such that:
Lemma 32.13. For each n, if M has not halted after n steps, τ ( M, w) ⊨ χ( M, w, n).
1. δ(q, σ ) = ⟨q′ , σ′ , R⟩
2. δ(q, σ ) = ⟨q′ , σ′ , L⟩
3. δ(q, σ ) = ⟨q′ , σ′ , N ⟩
Qq (m, n) ∧ Sσ (m, n)
We now get
as follows: The first line comes directly from the consequent of the pre-
ceding conditional, by modus ponens. Each conjunct in the middle
line—which excludes Sσm (m, n′ )—follows from the corresponding con-
junct in χ( M, w, n) together with φ(m, n).
If m < k, τ ( M, w) ⊢ m < k (Proposition 32.10) and by transitivity of <,
we have ∀ x (k < x → m < x ). If m = k, then ∀ x (k < x → m < x ) by
logic alone. The last line then follows from the corresponding conjunct
in χ( M, w, n), ∀ x (k < x → m < x ), and φ(m, n). If m < k, this already is
χ( M, w, n + 1).
Now suppose m = k. In that case, after n + 1 steps, the tape head has
also visited square k + 1, which now is the right-most square visited.
So χ(M, w, n + 1) has a new conjunct, S0(k′, n′), and the last conjunct is
∀x (k′ < x → S0(x, n′)). We have to verify that these two sentences are
also implied.
We already have ∀x (k < x → S0(x, n′)). In particular, this gives us
k < k′ → S0(k′, n′). From the axiom ∀x x < x′ we get k < k′. By modus
ponens, S0(k′, n′) follows.
Also, since τ(M, w) ⊢ k < k′, the axiom for transitivity of < gives us
∀x (k′ < x → S0(x, n′)). (We leave the verification of this as an exercise.)
∀ x ∀y ((Qq ( x ′ , y) ∧ Sσ ( x ′ , y)) →
(Qq′ ( x, y′ ) ∧ Sσ′ ( x ′ , y′ ) ∧ φ( x, y))) ∧
∀y ((Qqi (0, y) ∧ Sσ (0, y)) →
(Qq j (0, y′ ) ∧ Sσ′ (0, y′ ) ∧ φ(0, y)))
(Qq(l′, n) ∧ Sσ(l′, n)) →
(Qq′(l, n′) ∧ Sσ′(l′, n′) ∧ φ(l, n))
Proof. Suppose the decision problem were solvable, i.e., suppose there were
a Turing machine D that decides whether any given first-order sentence is valid. Then we could solve the halting problem as follows.
We construct a Turing machine E that, given as input the number e of Turing
machine Me and input w, computes the corresponding sentence τ ( Me , w) →
α( Me , w) and halts, scanning the leftmost square on the tape. The machine
E ⌢ D would then, given input e and w, first compute τ ( Me , w) → α( Me , w)
and then run the decision problem machine D on that input. D halts with out-
put 1 iff τ ( Me , w) → α( Me , w) is valid and outputs 0 otherwise. By Lemma 32.15
and Lemma 32.14, τ ( Me , w) → α( Me , w) is valid iff Me halts on input w. Thus,
E ⌢ D, given input e and w halts with output 1 iff Me halts on input w and
halts with output 0 otherwise. In other words, E ⌢ D would solve the halting
problem. But we know, by Theorem 32.8, that no such Turing machine can
exist.
Proof. All possible derivations of first-order logic can be generated, one after
another, by an effective algorithm. The machine E does this, and when it finds
a derivation that shows that ⊢ ψ, it halts with output 1. By the soundness
theorem, if E halts with output 1, it’s because ⊨ ψ. By the completeness the-
orem, if ⊨ ψ there is a derivation that shows that ⊢ ψ. Since E systematically
generates all possible derivations, it will eventually find one that shows ⊢ ψ,
so will eventually halt with output 1.
ψ(y) ≡ ∀x (x < y → x ≠ y).
∀ x ∀y ((Qqi ( x, y) ∧ Sσ ( x, y)) →
(Qq j ( x ′ , y′ ) ∧ Sσ′ ( x, y′ ) ∧ φ( x, y) ∧ ψ(y′ )))
∀ x ∀y ((Qqi ( x ′ , y) ∧ Sσ ( x ′ , y)) →
(Qq j ( x, y′ ) ∧ Sσ′ ( x ′ , y′ ) ∧ φ( x, y))) ∧
∀y ((Qqi (0, y) ∧ Sσ (0, y)) →
(Qq j (0, y′ ) ∧ Sσ′ (0, y′ ) ∧ φ(0, y) ∧ ψ(y′ )))
∀ x ∀y ((Qqi ( x, y) ∧ Sσ ( x, y)) →
(Qq j ( x, y′ ) ∧ Sσ′ ( x, y′ ) ∧ φ( x, y) ∧ ψ(y′ )))
As you can see, the sentences describing the transitions of M are the
same as the corresponding sentence in τ ( M, w), except we add ψ(y′ ) at
the end. ψ(y′ ) ensures that the number y′ of the “next” configuration is
different from all previous numbers 0, 0′ , . . . .
Let τ ′ ( M, w) be the conjunction of all the above sentences for Turing ma-
chine M and input w.
where n = max(k, len(w)) and k is the least number such that M started on
input w has halted after k steps. We leave the verification that M′ ⊨ τ ′ ( M, w) ∧
E( M, w) as an exercise.
Proof. Suppose there were a Turing machine F that decides the finite satisfi-
ability problem. Then given any Turing machine M and input w, we could
compute the sentence τ ′ ( M, w) ∧ α( M, w), and use F to decide if it has a finite
model. By Lemmata 32.19 and 32.20, it does iff M started on input w halts. So
we could use F to solve the halting problem, which we know is unsolvable.
Corollary 32.22. There can be no derivation system that is sound and complete for
finite validity, i.e., a derivation system which has ⊢ ψ iff M ⊨ ψ for every finite
structure M.
Proof. Exercise.
Problems
Problem 32.1. Can you think of a way to describe Turing machines that does
not require that the states and alphabet symbols are explicitly listed? You may
define your own notion of “standard” machine, but say something about why
every Turing machine can be computed by a “standard” machine in your new
sense.
Problem 32.2. The Three Halting (3-Halt) problem is the problem of giving a
decision procedure to determine whether or not an arbitrarily chosen Turing
Machine halts for an input of three 1’s on an otherwise blank tape. Prove that
the 3-Halt problem is unsolvable.
Problem 32.3. Show that if the halting problem is solvable for Turing machine
and input pairs Me and n where e ̸= n, then it is also solvable for the cases
where e = n.
Problem 32.4. We proved that the halting problem is unsolvable if the input
is a number e, which identifies a Turing machine Me via an enumeration of all
Turing machines. What if we allow the description of Turing machines from
section 32.2 directly as input? Can there be a Turing machine which decides
the halting problem but takes as input descriptions of Turing machines rather
than indices? Explain why or why not.
Problem 32.8. Give a derivation of Sσi(i, n′) from Sσi(i, n) and φ(m, n) (assuming i ≠ m, i.e., either i < m or m < i).
Problem 32.9. Give a derivation of ∀x (k′ < x → S0(x, n′)) from ∀x (k < x → S0(x, n′)), ∀x x < x′, and ∀x ∀y ∀z ((x < y ∧ y < z) → x < z).
Incompleteness
Introduction to Incompleteness
Basic Laws of Arithmetic, Frege set out to show that all of arithmetic could be
derived in his Begriffsschrift from purely logical assumptions. Unfortunately,
these assumptions turned out to be inconsistent, as Russell showed in 1902.
But setting aside the inconsistent axiom, Frege more or less invented mod-
ern logic singlehandedly, a startling achievement. Quantificational logic was
also developed independently by algebraically-minded thinkers after Boole,
including Peirce and Schröder.
Let us now turn to developments in the foundations of mathematics. Of
course, since logic plays an important role in mathematics, there is a good deal
of interaction with the developments just described. For example, Frege de-
veloped his logic with the explicit purpose of showing that all of mathematics
could be based solely on his logical framework; in particular, he wished to
show that mathematics consists of a priori analytic truths instead of, as Kant
had maintained, a priori synthetic ones.
Many take the birth of mathematics proper to have occurred with the
Greeks. Euclid’s Elements, written around 300 B.C., is already a mature rep-
resentative of Greek mathematics, with its emphasis on rigor and precision.
The definitions and proofs in Euclid's Elements survive more or less intact
in high school geometry textbooks today (to the extent that geometry is still
taught in high schools). This model of mathematical reasoning has been held
to be a paradigm for rigorous argumentation not only in mathematics but in
branches of philosophy as well. (Spinoza even presented moral and religious
arguments in the Euclidean style, which is strange to see!)
Calculus was invented by Newton and Leibniz in the seventeenth century.
(A fierce priority dispute raged for centuries, but most scholars today hold
that the two developments were for the most part independent.) Calculus in-
volves reasoning about, for example, infinite sums of infinitely small quanti-
ties; these features fueled criticism by Bishop Berkeley, who argued that belief
in God was no less rational than the mathematics of his time. The methods of
calculus were widely used in the eighteenth century, for example by Leonhard
Euler, who used calculations involving infinite sums with dramatic results.
In the nineteenth century, mathematicians tried to address Berkeley’s crit-
icisms by putting calculus on a firmer foundation. Efforts by Cauchy, Weier-
strass, Bolzano, and others led to our contemporary definitions of limits, con-
tinuity, differentiation, and integration in terms of “epsilons and deltas,” in
other words, devoid of any reference to infinitesimals. Later in the century,
mathematicians tried to push further, and explain all aspects of calculus, in-
cluding the real numbers themselves, in terms of the natural numbers. (Kro-
necker: “God created the whole numbers, all else is the work of man.”) In
1872, Dedekind wrote “Continuity and the irrational numbers,” where he
showed how to “construct” the real numbers as sets of rational numbers (which,
as you know, can be viewed as pairs of natural numbers); in 1888 he wrote
“Was sind und was sollen die Zahlen” (roughly, “What are the natural num-
bers, and what should they be?”) which aimed to explain the natural numbers
in purely “logical” terms. In 1887 Kronecker wrote “Über den Zahlbegriff”
(“On the concept of number”) where he spoke of representing all mathemati-
cal objects in terms of the integers; in 1889 Giuseppe Peano gave formal, sym-
bolic axioms for the natural numbers.
The end of the nineteenth century also brought a new boldness in dealing
with the infinite. Before then, infinitary objects and structures (like the set of
natural numbers) were treated gingerly; “infinitely many” was understood
as “as many as you want,” and “approaches in the limit” was understood as
“gets as close as you want.” But Georg Cantor showed that it was possible to
take the infinite at face value. Work by Cantor, Dedekind, and others helped to
introduce the general set-theoretic understanding of mathematics that is now
widely accepted.
This brings us to twentieth century developments in logic and founda-
tions. In 1902 Russell discovered the paradox in Frege’s logical system. In 1904
Zermelo proved Cantor’s well-ordering principle, using the so-called “axiom
of choice”; the legitimacy of this axiom prompted a good deal of debate. Be-
tween 1910 and 1913 the three volumes of Russell and Whitehead’s Principia
Mathematica appeared, extending the Fregean program of establishing mathe-
matics on logical grounds. Unfortunately, Russell and Whitehead were forced
to adopt two principles that seemed hard to justify as purely logical: an axiom
of infinity and an axiom of “reducibility.” In the 1900’s Poincaré criticized the
use of “impredicative definitions” in mathematics, and in the 1910’s Brouwer
began proposing to refound all of mathematics on an “intuitionistic” basis,
which avoided the use of the law of the excluded middle (φ ∨ ¬ φ).
Strange days indeed! The program of reducing all of mathematics to logic
is now referred to as “logicism,” and is commonly viewed as having failed,
due to the difficulties mentioned above. The program of developing mathe-
matics in terms of intuitionistic mental constructions is called “intuitionism,”
and is viewed as posing overly severe restrictions on everyday mathemat-
ics. Around the turn of the century, David Hilbert, one of the most influen-
tial mathematicians of all time, was a strong supporter of the new, abstract
methods introduced by Cantor and Dedekind: “no one will drive us from the
paradise that Cantor has created for us.” At the same time, he was sensitive
to foundational criticisms of these new methods (oddly enough, now called
“classical”). He proposed a way of having one’s cake and eating it too:
1. Axiomatize classical mathematics in precisely described formal deductive systems.
2. Use safe, “finitary” methods to prove that these formal deductive sys-
tems are consistent.
Hilbert’s work went a long way toward accomplishing the first goal. In
1899, he had done this for geometry in his celebrated book Foundations of geometry.
can be proved in them. It also makes sense to develop less restricted methods
of proof for establishing the consistency of these systems, and to find ways to
measure how hard it is to prove their consistency. Since Gödel showed that
(almost) every formal system has questions it cannot settle, it makes sense to
look for “interesting” questions a given formal system cannot settle, and to
figure out how strong a formal system has to be to settle them. To the present
day, logicians have been pursuing these questions in a new mathematical dis-
cipline, the theory of proofs.
33.2 Definitions
In order to carry out Hilbert’s project of formalizing mathematics and show-
ing that such a formalization is consistent and complete, the first order of busi-
ness would be that of picking a language, logical framework, and a system of
axioms. For our purposes, let us suppose that mathematics can be formalized
in a first-order language, i.e., that there is some set of constant symbols, func-
tion symbols, and predicate symbols which, together with the connectives and
quantifiers of first-order logic, allow us to express the claims of mathematics.
Most people agree that such a language exists: the language of set theory, in
which ∈ is the only non-logical symbol. That such a simple language is so
expressive is of course a very implausible claim at first sight, and it took a
lot of work to establish that practically all of mathematics can be expressed
in this very austere vocabulary. To keep things simple, for now, let’s restrict
our discussion to arithmetic, i.e., the part of mathematics that deals just with
the natural numbers N. The natural language in which to express facts of
arithmetic is L A . L A contains a single two-place predicate symbol <, a sin-
gle constant symbol 0, one one-place function symbol ′, and two two-place
function symbols + and ×.
There are two easy ways to specify theories. One is as the set of sentences
true in some structure. For instance, consider the structure for L A in which
the domain is N and all non-logical symbols are interpreted as you would
expect.
1. |N| = N
2. 0^N = 0
Q = { φ : { Q1 , . . . , Q8 } ⊨ φ }.
Definition 33.7. A theory Γ is complete iff for every sentence φ in its language,
either Γ ⊨ φ or Γ ⊨ ¬ φ.
φ(0, y1 , . . . , yn ) ∧ ∀ x ( φ( x, y1 , . . . , yn ) → φ( x ′ , y1 , . . . , yn ))
Definition 33.11. A set X is called computably enumerable (c.e. for short) iff it
is empty or it has a computable enumeration.
tary,” means, which would defend classical mathematics against the chal-
lenges of intuitionism. Gödel’s incompleteness theorems showed that these
goals cannot be achieved.
Gödel’s first incompleteness theorem showed that a version of Russell and
Whitehead’s Principia Mathematica is not complete. But the proof was actu-
ally very general and applies to a wide variety of theories. This means that it
wasn’t just that Principia Mathematica did not manage to completely capture
mathematics, but that no acceptable theory does. It took a while to isolate
the features of theories that suffice for the incompleteness theorems to apply,
and to generalize Gödel’s proof so that it depends only on these fea-
tures. But we are now in a position to state a very general version of the first
incompleteness theorem for theories in the language L A of arithmetic.
To say that Γ is not complete is to say that for at least one sentence φ,
Γ ⊬ φ and Γ ⊬ ¬ φ. Such a sentence is called independent (of Γ ). We can in
fact relatively quickly prove that there must be independent sentences. But
the power of Gödel’s proof of the theorem lies in the fact that it exhibits a
specific example of such an independent sentence. The intriguing construction
produces a sentence γΓ , called a Gödel sentence for Γ, which is unprovable
because in Γ, γΓ is equivalent to the claim that γΓ is unprovable in Γ. It does
so constructively, i.e., given an axiomatization of Γ and a description of the
derivation system, the proof gives a method for actually writing down γΓ .
The construction in Gödel’s proof requires that we find a way to express
in L A the properties of and operations on terms and formulas of L A itself.
These include properties such as “φ is a sentence,” “δ is a derivation of φ,”
and operations such as φ[t/x ]. This way must (a) express these properties
and relations via a “coding” of symbols and sequences thereof (which is what
terms, formulas, derivations, etc. are) as natural numbers (which is what L A
can talk about). It must (b) do this in such a way that Γ will prove the relevant
facts, so we must show that these properties are coded by decidable properties
of natural numbers and the operations correspond to computable functions on
natural numbers. This is called “arithmetization of syntax.”
Before we investigate how syntax can be arithmetized, however, we will
consider the condition that Γ is “strong enough,” i.e., represents all com-
putable functions and decidable relations. This requires that we give a precise
definition of “computable.” This can be done in a number of ways, e.g., via
the model of Turing machines, or as those functions computable by programs
in some general-purpose programming language. Since our aim is to repre-
sent these functions and relations in a theory in the language L A , however, it
is best to pick a simple definition of computability of just numerical functions.
This is the notion of recursive function. So we will first discuss the recursive
functions. We will then show that Q already represents all recursive functions
and relations. This will allow us to apply the incompleteness theorem to spe-
cific theories such as Q and PA, since we will have established that these are
examples of theories that are “strong enough.”
The end result of the arithmetization of syntax is a formula Prov Γ ( x ) which,
via the coding of formulas as numbers, expresses provability from the axioms
of Γ. Specifically, if φ is coded by the number n, and Γ ⊢ φ, then Γ ⊢ ProvΓ (n).
This “provability predicate” for Γ allows us also to express, in a certain sense,
the consistency of Γ as a sentence of L A : let the “consistency statement” for Γ
be the sentence ¬ProvΓ (n), where we take n to be the code of a contradiction,
e.g., of ⊥. The second incompleteness theorem states that consistent axioma-
tizable theories also do not prove their own consistency statements. The con-
ditions required for this theorem to apply are a bit more stringent than just
that the theory represents all computable functions and decidable relations,
but we will show that PA satisfies them.
D = {n : Γ ⊢ ¬ φn (n)}
The preceding theorem shows that no consistent theory that represents all
decidable relations can be decidable. We will show that Q does represent all
decidable relations; this means that all theories that include Q, such as PA and
TA, also do, and hence also are not decidable. (Since all these theories are true
in the standard model, they are all consistent.)
We can also use this result to obtain a weak version of the first incomplete-
ness theorem. Any theory that is axiomatizable and complete is decidable.
Consistent theories that are axiomatizable and represent all decidable proper-
ties then cannot be complete.
Problems
Problem 33.1. Show that TA = { φ : N ⊨ φ} is not axiomatizable. You may
assume that TA represents all decidable properties.
Arithmetization of Syntax
34.1 Introduction
In order to connect computability and logic, we need a way to talk about the
objects of logic (symbols, terms, formulas, derivations), operations on them,
and their properties and relations, in a way amenable to computational treat-
ment. We can do this directly, by considering computable functions and re-
lations on symbols, sequences of symbols, and other objects built from them.
Since the objects of logical syntax are all finite and built from an enumerable
set of symbols, this is possible for some models of computation. But other
models of computation—such as the recursive functions—are restricted to
numbers, their relations and functions. Moreover, ultimately we also want
to be able to deal with syntax within certain theories, specifically, in theo-
ries formulated in the language of arithmetic. In these cases it is necessary to
arithmetize syntax, i.e., to represent syntactic objects, operations on them, and
their relations, as numbers, arithmetical functions, and arithmetical relations,
respectively. The idea, which goes back to Leibniz, is to assign numbers to
syntactic objects.
It is relatively straightforward to assign numbers to symbols as their “codes.”
Some symbols pose a bit of a challenge, since, e.g., there are infinitely many
variables, and even infinitely many function symbols of each arity n. But of
course it’s possible to assign numbers to symbols systematically in such a way
that, say, v2 and v3 are assigned different codes. Sequences of symbols (such
as terms and formulas) are a bigger challenge. But if we can deal with se-
quences of numbers purely arithmetically (e.g., by the powers-of-primes cod-
ing of sequences), we can extend the coding of individual symbols to coding
of sequences of symbols, and then further to sequences or other arrangements of formulas, such as derivations.
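To make the sequence coding concrete, here is a short Python sketch of the powers-of-primes coding, ⟨n0, . . . , nk⟩ = 2^{n0+1} · 3^{n1+1} · · · pk^{nk+1}. It is only an illustration: the function names are ad hoc, and the trial-division prime generator is meant for small examples only.

```python
def primes():
    """Yield the primes 2, 3, 5, ... in order (simple trial division)."""
    found, n = [], 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def seq_code(ns):
    """Code <n0, ..., nk> as 2**(n0+1) * 3**(n1+1) * ... * pk**(nk+1)."""
    code = 1
    for p, n in zip(primes(), ns):
        code *= p ** (n + 1)
    return code

def seq_decode(code):
    """Recover the coded sequence by reading off the exponents of 2, 3, 5, ..."""
    ns = []
    for p in primes():
        if code % p != 0:
            break
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        ns.append(e - 1)
    return ns

assert seq_code([1, 5]) == 2**2 * 3**6          # the code <1, 5> of the variable v5
assert seq_decode(seq_code([0, 7])) == [0, 7]   # the coding is reversible
```

Adding 1 to each exponent is what makes the coding reversible: every position of the sequence, including entries equal to 0, leaves a visible trace in the prime factorization.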
⊥ ¬ ∨ ∧ → ∀ ∃ = ( ) ,
together with enumerable sets of variables and constant symbols, and enu-
merable sets of function symbols and predicate symbols of arbitrary arity. We
can assign codes to each of these symbols in such a way that every symbol is
assigned a unique number as its code, and no two different symbols are as-
signed the same number. We know that this is possible since the set of all
symbols is enumerable and so there is a bijection between it and the set of nat-
ural numbers. But we want to make sure that we can recover the symbol (as
well as some information about it, e.g., the arity of a function symbol) from
its code in a computable way. There are many possible ways of doing this,
of course. Here is one such way, which uses primitive recursive functions.
(Recall that ⟨n0 , . . . , nk ⟩ is the number coding the sequence of numbers n0 , . . . ,
nk .)
⊥: ⟨0, 0⟩   ¬: ⟨0, 1⟩   ∨: ⟨0, 2⟩   ∧: ⟨0, 3⟩   →: ⟨0, 4⟩   ∀: ⟨0, 5⟩
∃: ⟨0, 6⟩   =: ⟨0, 7⟩   (: ⟨0, 8⟩   ): ⟨0, 9⟩   ,: ⟨0, 10⟩
1. Fn(x, n) iff x is the code of f_i^n for some i, i.e., x is the code of an n-ary function
symbol.
2. Pred(x, n) iff x is the code of P_i^n for some i or x is the code of = and n = 2,
i.e., x is the code of an n-ary predicate symbol.
Note that codes and Gödel numbers are different things. For instance, the
variable v5 has a code cv5 = ⟨1, 5⟩ = 2^2 · 3^6. But the variable v5 considered as
a term is also a sequence of symbols (of length 1). The Gödel number #v5# of the
term v5 is ⟨cv5⟩ = 2^{cv5+1} = 2^{2^2·3^6+1}.
where pi is the i-th prime (starting with p0 = 2). So for instance, the formula
v0 = 0, or, more explicitly, =(v0, c0), has the Gödel number
⟨c= , c( , cv0 , c, , cc0 , c) ⟩.
Here, c= is ⟨0, 7⟩ = 2^{0+1} · 3^{7+1}, cv0 is ⟨1, 0⟩ = 2^{1+1} · 3^{0+1}, etc. So #=(v0, c0)# is
2^{c= +1} · 3^{c( +1} · 5^{cv0 +1} · 7^{c, +1} · 11^{cc0 +1} · 13^{c) +1}.
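The same computation can be done mechanically. The following Python sketch is only an illustration: it assigns the codes ⟨0, i⟩ to the logical symbols in the order of the table above and ⟨1, i⟩ to the variable vi, and, purely as an assumed convention for this example, ⟨2, i⟩ to the constant symbol ci.

```python
def nth_prime(k):
    """Return the k-th prime, counting from p0 = 2 (trial division, small inputs only)."""
    count, n = -1, 1
    while count < k:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

def seq_code(ns):
    """<n0, ..., nk> = 2**(n0+1) * 3**(n1+1) * ... * pk**(nk+1)."""
    code = 1
    for i, n in enumerate(ns):
        code *= nth_prime(i) ** (n + 1)
    return code

LOGICAL = ['⊥', '¬', '∨', '∧', '→', '∀', '∃', '=', '(', ')', ',']

def symbol_code(sym):
    """Code of a symbol: <0,i> for the i-th logical symbol, <1,i> for the
    variable v_i, and (an assumption for this example) <2,i> for the constant c_i."""
    if sym in LOGICAL:
        return seq_code([0, LOGICAL.index(sym)])
    if sym[0] == 'v':
        return seq_code([1, int(sym[1:])])
    if sym[0] == 'c':
        return seq_code([2, int(sym[1:])])
    raise ValueError(sym)

def goedel_number(symbols):
    """Gödel number of a string of symbols: the code of the sequence of their codes."""
    return seq_code([symbol_code(s) for s in symbols])

print(symbol_code('='))    # 13122, i.e., <0,7> = 2**1 * 3**8
print(symbol_code('v0'))   # 12,    i.e., <1,0> = 2**2 * 3**1
print(goedel_number(['=', '(', 'v0', ',', 'c0', ')']).bit_length())  # about 1.5 million bits
```

Even for this tiny formula the Gödel number is astronomically large; that is harmless, since we only ever reason about such numbers and never have to write them down.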
Proposition 34.5. The relations Term( x ) and ClTerm( x ) which hold iff x is the
Gödel number of a term or a closed term, respectively, are primitive recursive.
1. si is a variable v j , or
2. si is a constant symbol c j , or
1. Var((y)i ), or
2. Const((y)i ), or
(y)i = #f_j^n(# ⌢ flatten(z) ⌢ #)#,
num(0) = #0#
num(n + 1) = #′(# ⌢ num(n) ⌢ #)#.
Proof. The number x is the Gödel number of an atomic formula iff one of the
following holds:
1. There are n, j < x, and z < x such that for each i < n, Term((z)i ) and
x = #P_j^n(# ⌢ flatten(z) ⌢ #)#.
3. x = #⊥#.
Proposition 34.8. The relation Frm( x ) which holds iff x is the Gödel number of
a formula is primitive recursive.
Proposition 34.9. The relation FreeOcc( x, z, i ), which holds iff the i-th symbol of
the formula with Gödel number x is a free occurrence of the variable with Gödel num-
ber z, is primitive recursive.
Proof. Exercise.
Proposition 34.10. The property Sent( x ) which holds iff x is the Gödel number of
a sentence is primitive recursive.
34.5 Substitution
Recall that substitution is the operation of replacing all free occurrences of
a variable u in a formula φ by a term t, written φ[t/u]. This operation, when
carried out on Gödel numbers of variables, formulas, and terms, is primitive
recursive.
hSubst(x, y, z, 0) = Λ
hSubst(x, y, z, i + 1) = hSubst(x, y, z, i) ⌢ y, if FreeOcc(x, z, i), and
hSubst(x, y, z, i + 1) = append(hSubst(x, y, z, i), (x)i ) otherwise.
Proposition 34.12. The relation FreeFor( x, y, z), which holds iff the term with Gödel
number y is free for the variable with Gödel number z in the formula with Gödel num-
ber x, is primitive recursive.
Proof. Exercise.
34.6 Derivations in LK
In order to arithmetize derivations, we must represent derivations as num-
bers. Since derivations are trees of sequents where each inference carries also
a label, a recursive representation is the most obvious approach: we represent
a derivation as a tuple, the components of which are the end-sequent, the la-
bel, and the representations of the sub-derivations leading to the premises of
the last inference.
⟨0, #Γ ⇒ ∆#⟩.
⟨1, #π1 #, #Γ ⇒ ∆#, k⟩ or
⟨2, #π1 #, #π2 #, #Γ ⇒ ∆#, k⟩,
respectively, where k is given by the following table according to which
rule was used in the last inference:
Rule: WL WR CL CR XL XR
k: 1 2 3 4 5 6
Rule: ¬L ¬R ∧L ∧R ∨L ∨R
k: 7 8 9 10 11 12
Rule: →L →R ∀L ∀R ∃L ∃R
k: 13 14 15 16 17 18
Rule: Cut =
k: 19 20
Proposition 34.15. The property Correct( p) which holds iff the last inference in the
derivation π with Gödel number p is correct, is primitive recursive.
We also have to show that for each rule of inference R the relation FollowsByR ( p)
is primitive recursive, where FollowsByR ( p) holds iff p is the Gödel number
of derivation π, and the end-sequent of π follows by a correct application of R
from the immediate sub-derivations of π.
A simple case is that of the ∧R rule. If π ends in a correct ∧R inference, it
looks like this:
    π1              π2
Γ ⇒ ∆, φ      Γ ⇒ ∆, ψ
―――――――――――――――――――――― ∧R
     Γ ⇒ ∆, φ ∧ ψ
So, the last inference in the derivation π is a correct application of ∧R iff there
are sequences of sentences Γ and ∆ as well as two sentences φ and ψ such that
the end-sequent of π1 is Γ ⇒ ∆, φ, the end-sequent of π2 is Γ ⇒ ∆, ψ, and the
end-sequent of π is Γ ⇒ ∆, φ ∧ ψ. We just have to translate this into Gödel
numbers. If s = # Γ ⇒ ∆# then (s)0 = # Γ# and (s)1 = # ∆# . So, FollowsBy∧R ( p)
holds iff
The individual lines express, respectively, “there is a sequence (Γ) with Gödel
number g, there is a sequence (∆) with Gödel number d, a formula (φ) with
Gödel number a, and a formula (ψ) with Gödel number b,” such that “the
end-sequent of π is Γ ⇒ ∆, φ ∧ ψ,” “the end-sequent of π1 is Γ ⇒ ∆, φ,” “the
end-sequent of π2 is Γ ⇒ ∆, ψ,” and “π has two immediate subderivations
and the last inference rule is ∧R (with number 10).”
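The same check can be phrased for derivations represented directly as nested tuples, without going through Gödel numbers. The following Python sketch is only an illustration of the shape of the FollowsBy∧R condition: it assumes formulas are strings, that a conjunction is written "(φ ∧ ψ)", and it reuses the tuple format and the rule number 10 for ∧R given above.

```python
# A derivation is (0, endsequent) for an initial sequent, or
# (2, pi1, pi2, endsequent, k) for a two-premise inference with rule number k.
# A sequent is a pair (Gamma, Delta) of tuples of formulas (strings).

def end_sequent(pi):
    return pi[1] if pi[0] == 0 else pi[-2]

def follows_by_and_right(pi):
    if pi[0] != 2 or pi[-1] != 10:        # needs two premises and rule number 10 (∧R)
        return False
    _, pi1, pi2, (gamma, delta), _ = pi
    if not delta:
        return False
    *rest, conj = delta                    # the last formula of ∆ should be φ ∧ ψ
    g1, d1 = end_sequent(pi1)
    g2, d2 = end_sequent(pi2)
    if not (d1 and d2):
        return False
    return (g1 == g2 == gamma and
            tuple(d1[:-1]) == tuple(d2[:-1]) == tuple(rest) and
            conj == f"({d1[-1]} ∧ {d2[-1]})")

# Example: from A ⇒ B, φ and A ⇒ B, ψ infer A ⇒ B, (φ ∧ ψ)
p1 = (0, (("A",), ("B", "φ")))
p2 = (0, (("A",), ("B", "ψ")))
pi = (2, p1, p2, (("A",), ("B", "(φ ∧ ψ)")), 10)
print(follows_by_and_right(pi))   # True
```

The official FollowsBy∧R of course operates on Gödel numbers using the primitive recursive sequence operations; the tuple version above only mirrors its logical structure.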
The last inference in π is a correct application of ∃R iff there are sequences
Γ and ∆, a formula φ, a variable x, and a term t, such that the end-sequent of
Sequent(EndSequent( p)) ∧
[(LastRule( p) = 1 ∧ FollowsByWL ( p)) ∨ · · · ∨
(LastRule( p) = 20 ∧ FollowsBy= ( p)) ∨
( p)0 = 0 ∧ InitialSeq(EndSequent( p))]
The first line ensures that the end-sequent of p is actually a sequent consisting
of sentences. The last line covers the case where p is just an initial sequent.
Proposition 34.16. The relation Deriv( p) which holds if p is the Gödel number of a
correct derivation π, is primitive recursive.
[φ ∧ ψ]^1
――――――――― ∧Elim
    φ
――――――――――――― →Intro, discharging 1
(φ ∧ ψ) → φ
2. All assumptions in δ with label n are of the form φ (i.e., we can discharge the
assumption φ using label n in δ).
Proof. We have to show that the corresponding relations between Gödel num-
bers of formulas and Gödel numbers of derivations are primitive recursive.
1. We want to show that Assum( x, d, n), which holds if x is the Gödel num-
ber of an assumption of the derivation with Gödel number d labelled n,
is primitive recursive. This is the case if the derivation with Gödel num-
ber ⟨0, x, n⟩ is a sub-derivation of d. Note that the way we code deriva-
tions is a special case of the coding of trees introduced in section 29.12,
so the primitive recursive function SubtreeSeq(d) gives a sequence of
Gödel numbers of all sub-derivations of d (of length at most d). So we
can define
Proposition 34.21. The property Correct(d) which holds iff the last inference in the
derivation δ with Gödel number d is correct, is primitive recursive.
Proof. Here we have to show that for each rule of inference R the relation
FollowsByR (d) is primitive recursive, where FollowsByR (d) holds iff d is the
Gödel number of derivation δ, and the end-formula of δ follows by a correct
application of R from the immediate sub-derivations of δ.
A simple case is that of the ∧Intro rule. If δ ends in a correct ∧Intro infer-
ence, it looks like this:
δ1         δ2
φ           ψ
――――――――――――― ∧Intro
   φ ∧ ψ
Another simple example is the =Intro rule. Here the premise is an empty
derivation, i.e., (d)1 = 0, and there is no discharge label, i.e., n = 0. However, φ must
be of the form t = t, for a closed term t. Here, a primitive recursive definition
is
For a more complicated example, FollowsBy→Intro (d) holds iff the end-
formula of δ is of the form ( φ → ψ), where the end-formula of δ1 is ψ, and
any assumption in δ labelled n is of the form φ. We can express this primitive
recursively by
(d)0 = 1 ∧
(∃a < d) (Discharge(a, (d)1 , DischargeLabel(d)) ∧
EndFmla(d) = #(# ⌢ a ⌢ #→# ⌢ EndFmla((d)1 ) ⌢ #)#)
(d)0 = 1 ∧ DischargeLabel(d) = 0 ∧
(∃a < d) (∃x < d) (∃t < d) (ClTerm(t) ∧ Var(x) ∧
Subst(a, t, x) = EndFmla((d)1 ) ∧ EndFmla(d) = #∃# ⌢ x ⌢ a).
Sent(EndFmla(d)) ∧
[(LastRule(d) = 1 ∧ FollowsBy∧Intro (d)) ∨ · · · ∨
(LastRule(d) = 16 ∧ FollowsBy=Elim (d)) ∨
(∃n < d) (∃x < d) (d = ⟨0, x, n⟩)].
The first line ensures that the end-formula of d is a sentence. The last line
covers the case where d is just an assumption.
Proposition 34.22. The relation Deriv(d) which holds if d is the Gödel number of a
correct derivation δ, is primitive recursive.
Proposition 34.23. The relation OpenAssum(z, d) that holds if z is the Gödel num-
ber of an undischarged assumption φ of the derivation δ with Gödel number d, is
primitive recursive.
1. ψ → (ψ ∨ φ)
2. (ψ → (ψ ∨ φ)) → ( φ → (ψ → (ψ ∨ φ)))
3. φ → (ψ → (ψ ∨ φ))
⟨#ψ → (ψ ∨ φ)#,
#(ψ → (ψ ∨ φ)) → (φ → (ψ → (ψ ∨ φ)))#,
#φ → (ψ → (ψ ∨ φ))#⟩.
1. φ is an axiom.
4. δ is a correct derivation.
Proof. We have to show that the corresponding relations between Gödel num-
bers of formulas and Gödel numbers of derivations are primitive recursive.
ψ → ( χ → ψ ).
2. The i-th line in δ is justified by modus ponens iff there are lines j and
k < i where the sentence on line j is some formula φ, the sentence on
line k is φ → ψ, and the sentence on line i is ψ.
All of these can be tested primitive recursively, since the Gödel numbers
of ψ, φ(x), and x are less than the Gödel number of the formula on line i,
and that of c is less than the Gödel number of the formula on line j:
QR1 (d, i) ⇔ (∃b < (d)i ) (∃x < (d)i ) (∃a < (d)i ) (∃c < (d)j ) (
Var(x) ∧ Const(c) ∧
(d)i = #(# ⌢ b ⌢ #→# ⌢ #∀# ⌢ x ⌢ a ⌢ #)# ∧
(d)j = #(# ⌢ b ⌢ #→# ⌢ Subst(a, c, x) ⌢ #)# ∧
Sent(b) ∧ Sent(Subst(a, c, x)) ∧ (∀k < len(b)) (b)k ̸= (c)0 )
Here we assume that c and x are the Gödel numbers of the variable and
constant considered as terms (i.e., not their symbol codes). We test that x
is the only free variable of φ( x ) by testing if φ( x )[c/x ] is a sentence, and
ensure that c does not occur in ψ by requiring that every symbol of ψ is
different from c.
We leave the other version of QR as an exercise.
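As an illustration of how simple these checks are when derivations are taken as plain sequences of formulas rather than Gödel numbers, here is a toy Python version of the modus ponens condition from item (2) above. It assumes formulas are strings and that conditionals are written "(φ → ψ)"; it is a sketch, not the primitive recursive definition itself.

```python
def justified_by_mp(derivation, i):
    """Does line i follow by modus ponens from two earlier lines?"""
    psi = derivation[i]
    for j in range(i):            # candidate line for the antecedent φ
        for k in range(i):        # candidate line for the conditional (φ → ψ)
            if derivation[k] == f"({derivation[j]} → {psi})":
                return True
    return False

deriv = ["A", "(A → B)", "B"]
print(justified_by_mp(deriv, 2))   # True: line 2 follows from lines 0 and 1
print(justified_by_mp(deriv, 1))   # False
```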
hCond(s, y, 0) = y
hCond(s, y, n + 1) = #(# ⌢ (s)n ⌢ #→# ⌢ hCond(s, y, n) ⌢ #)#
Cond(s, y) = hCond(s, y, len(s))
The bound on s is given by considering that each (s)i is the Gödel number of
a sub-formula of the last line of the derivation, i.e., is less than ( x )len( x)−1 . The
number of antecedents ψ ∈ Γ, i.e., the length of s, is less than the length of the
last line of x.
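Read at the level of formulas-as-strings rather than Gödel numbers, the definition of Cond above amounts to the following recursion (a Python illustration only):

```python
def h_cond(s, y, n):
    """Nest the first n formulas of s as antecedents in front of y."""
    if n == 0:
        return y
    return f"({s[n - 1]} → {h_cond(s, y, n - 1)})"

def cond(s, y):
    return h_cond(s, y, len(s))

print(cond(["ψ0", "ψ1"], "φ"))   # (ψ1 → (ψ0 → φ))
```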
Problems
Problem 34.1. Show that the function flatten(z), which turns the sequence
⟨#t1 #, . . . , #tn #⟩ into #t1 , . . . , tn #, is primitive recursive.
Problem 34.2. Give a detailed proof of Proposition 34.8 along the lines of the
first proof of Proposition 34.5.
Problem 34.3. Prove Proposition 34.9. You may make use of the fact that any
substring of a formula which is a formula is a sub-formula of it.
Representability in Q
35.1 Introduction
The incompleteness theorems apply to theories in which basic facts about
computable functions can be expressed and proved. We will describe a very
minimal such theory called “Q” (or, sometimes, “Robinson’s Q,” after Raphael
Robinson). We will say what it means for a function to be representable in Q,
and then we will prove the following:
A function is representable in Q if and only if it is computable.
For one thing, this provides us with another model of computability. But we
will also use it to show that the set { φ : Q ⊢ φ} is not decidable, by reducing
the halting problem to it. By the time we are done, we will have proved much
stronger things than this.
The language of Q is the language of arithmetic; Q consists of the fol-
lowing axioms (to be used in conjunction with the other axioms and rules of
first-order logic with identity predicate):
∀ x ∀y ( x ′ = y′ → x = y) (Q1 )
∀ x 0 ̸= x′ (Q2 )
∀ x ( x = 0 ∨ ∃y x = y′ ) (Q3 )
∀ x ( x + 0) = x (Q4 )
∀ x ∀y ( x + y′ ) = ( x + y)′ (Q5 )
∀ x ( x × 0) = 0 (Q6 )
∀ x ∀y ( x × y′ ) = (( x × y) + x ) (Q7 )
∀ x ∀y ( x < y ↔ ∃z (z′ + x ) = y) (Q8 )
For each natural number n, define the numeral n to be the term 0′′...′ where
there are n tick marks in all. So, 0 is the constant symbol 0 by itself, 1 is 0′ , 2 is
0′′ , etc.
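As a tiny illustration, the numeral of n can be generated mechanically; the following Python one-liner (ours, purely for illustration) simply writes the constant symbol 0 followed by n tick marks.

```python
def numeral(n):
    """The numeral of n: the constant symbol 0 followed by n tick marks."""
    return "0" + "′" * n

print(numeral(0))   # 0
print(numeral(3))   # 0′′′
```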
( φ(0) ∧ ∀ x ( φ( x ) → φ( x ′ ))) → ∀ x φ( x )
∀y (( φ(0) ∧ ∀ x ( φ( x ) → φ( x ′ ))) → ∀ x φ( x ))
Using instances of the induction schema, one can prove much more from the
axioms of PA than from those of Q. In fact, it takes a good deal of work to
find “natural” statements about the natural numbers that can’t be proved in
Peano arithmetic!
1. φ f (n0 , . . . , nk , m)
2. ∀y ( φ f (n0 , . . . , nk , y) → m = y).
There are other ways of stating the definition; for example, we could equiv-
alently require that Q proves ∀y ( φ f (n0 , . . . , nk , y) ↔ y = m).
There are two directions to proving the theorem. The left-to-right direction
is fairly straightforward once arithmetization of syntax is in place. The other
direction requires more work. Here is the basic idea: we pick “general recur-
sive” as a way of making “computable” precise, and show that every general
recursive function is representable in Q. Recall that a function is general re-
cursive if it can be defined from zero, the successor function succ, and the
projection functions Pin , using composition, primitive recursion, and regular
minimization. So one way of showing that every general recursive function is
representable in Q is to show that the basic functions are representable, and
whenever some functions are representable, then so are the functions defined
from them using composition, primitive recursion, and regular minimization.
In other words, we might show that the basic functions are representable, and
that the representable functions are “closed under” composition, primitive
recursion, and regular minimization. This guarantees that every general re-
cursive function is representable.
It turns out that the step where we would show that representable func-
tions are closed under primitive recursion is hard. In order to avoid this step,
we show first that in fact we can do without primitive recursion. That is, we
show that every general recursive function can be defined from basic func-
tions using composition and regular minimization alone. To do this, we show
that primitive recursion can actually be done by a specific regular minimiza-
tion. However, for this to work, we have to add some additional basic func-
tions: addition, multiplication, and the characteristic function of the identity
relation χ= . Then, we can prove the theorem by showing that all of these basic
functions are representable in Q, and the representable functions are closed
under composition and regular minimization.
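The three closure operations can be made concrete in a few lines of Python. The sketch below is only an illustration: it treats functions as ordinary Python functions on integers, leaves out the projection functions, and simply assumes that every minimization search terminates (i.e., that the relation searched is regular).

```python
def compose(f, *gs):
    """h(x, ...) = f(g1(x, ...), ..., gk(x, ...))"""
    return lambda *xs: f(*(g(*xs) for g in gs))

def prim_rec(f, g):
    """h(x..., 0) = f(x...);  h(x..., y + 1) = g(x..., y, h(x..., y))"""
    def h(*args):
        *xs, y = args
        val = f(*xs)
        for i in range(y):
            val = g(*xs, i, val)
        return val
    return h

def minimize(r):
    """µy r(x..., y): the least y for which r holds (assumed to exist)."""
    def h(*xs):
        y = 0
        while not r(*xs, y):
            y += 1
        return y
    return h

zero = lambda *xs: 0
succ = lambda x: x + 1

# addition by primitive recursion: add(x, 0) = x, add(x, y + 1) = succ(add(x, y))
add = prim_rec(lambda x: x, lambda x, y, prev: succ(prev))
print(add(3, 4))                                   # 7
double = compose(add, lambda x: x, lambda x: x)    # double(x) = add(x, x)
print(double(5))                                   # 10
print(minimize(lambda x, y: y + 3 >= x)(5))        # 2, the least y with y + 3 >= 5
```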
Proof. The “if” part is Definition 35.1(1). The “only if” part is seen as follows:
Suppose Q ⊢ φ f (n0 , . . . , nk , m) but m ̸= f (n0 , . . . , nk ). Let l = f (n0 , . . . , nk ).
By Definition 35.1(1), Q ⊢ φ f (n0 , . . . , nk , l ). By Definition 35.1(2), Q ⊢ ∀y ( φ f (n0 , . . . , nk , y) →
l = y). Using logic and the assumption that Q ⊢ φ f (n0 , . . . , nk , m), we get that
Q ⊢ l = m. On the other hand, by Lemma 35.14, Q ⊢ l ̸= m. So Q is incon-
sistent. But that is impossible, since Q is satisfied by the standard model (see
Definition 33.2), N ⊨ Q, and satisfiable theories are always consistent by the
Soundness Theorem (Corollaries 20.29, 19.31, 21.31 and 22.38).
Proof. Let’s first give the intuitive idea for why this is true. To compute f , we
do the following. List all the possible derivations δ in the language of arith-
metic. This is possible to do mechanically. For each one, check if it is a deriva-
tion of a formula of the form φ f (n0 , . . . , nk , m) (the formula representing f in Q
from Lemma 35.3). If it is, m = f (n0 , . . . , nk ) by Lemma 35.3, and we’ve found
the value of f . The search terminates because Q ⊢ φ f (n0 , . . . , nk , f (n0 , . . . , nk )),
so eventually we find a δ of the right sort.
A ( n0 , . . . , n k , m ) =
Subst(Subst(. . . Subst(# φ f # , num(n0 ), # x0 # ),
. . . ), num(nk ), # xk # ), num(m), # y# )
This looks complicated, but it’s just the function A(n0 , . . . , nk , m) = # φ f (n0 , . . . , nk , m)# .
Now, consider the relation R(n0 , . . . , nk , s) which holds if (s)0 is the Gödel
number of a derivation from Q of φ f (n0 , . . . , nk , (s)1 ):
R ( n0 , . . . , n k , s ) iff PrfQ ((s)0 , A(n0 , . . . , nk , (s)1 ))
If we can find an s such that R(n0 , . . . , nk , s) holds, we have found a pair of
numbers—(s)0 and (s)1 —such that (s)0 is the Gödel number of a derivation
of φ f (n0 , . . . , nk , (s)1 ). So looking for s is like looking for the pair d and m
in the informal proof. And a computable function that “looks for” such an
s can be defined by regular minimization. Note that R is regular: for ev-
ery n0 , . . . , nk , there is a derivation δ of Q ⊢ φ f (n0 , . . . , nk , f (n0 , . . . , nk )), so
R(n0 , . . . , nk , s) holds for s = ⟨# δ# , f (n0 , . . . , nk )⟩. So, we can write f as
f (n0 , . . . , nk ) = (µs R(n0 , . . . , nk , s))1 .
Definition 35.6. Two natural numbers a and b are relatively prime iff their great-
est common divisor is 1; in other words, they have no common divisor other
than 1.
z ≡ y0 mod x0
z ≡ y1 mod x1
..
.
z ≡ yn mod xn .
j = max(n, y0 , . . . , yn ) + 1,
and let
x0 = 1 + j !
x1 = 1 + 2 · j !
x2 = 1 + 3 · j !
..
.
x n = 1 + ( n + 1) · j !
To see that (1) is true, note that if p is a prime number and p | xi and p | xk ,
then p | 1 + (i + 1) j ! and p | 1 + (k + 1) j !. But then p divides their difference,
(1 + (i + 1) j !) − (1 + (k + 1) j !) = (i − k) j !.
not( x ) = χ= ( x, 0)
(min x ≤ z) R( x, y) = µx ( R( x, y) ∨ x = z)
(∃ x ≤ z) R( x, y) ⇔ R((min x ≤ z) R( x, y), y)
We can then show that all of the following are also definable without primitive
recursion:
Now define
j = max(n, a0 , . . . , an ) + 1,
d0 ≡ a i mod (1 + (i + 1)d1 )
ai = rem(1 + (i + 1)d1 , d0 ).
β(d, i ) = β∗ (d0 , d1 , i )
= rem(1 + (i + 1)d1 , d0 )
= ai
which is what we need. This completes the proof of the β-function lemma.
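The construction can be checked numerically. The following Python sketch codes a sequence by a pair (d0, d1) with d1 = j! exactly as in the proof, except that, for this small illustration, d0 is found by brute-force search rather than by the Chinese Remainder construction.

```python
from math import factorial

def beta_star(d0, d1, i):
    """The remainder of d0 on division by 1 + (i + 1) * d1,
    written rem(1 + (i + 1)d1, d0) in the proof above."""
    return d0 % (1 + (i + 1) * d1)

def encode(seq):
    """Find (d0, d1) with beta_star(d0, d1, i) = seq[i] for all i:
    j = max(n, a0, ..., an) + 1, d1 = j!, and d0 a simultaneous solution
    of the congruences d0 = a_i mod (1 + (i + 1) * d1)."""
    n = len(seq) - 1
    j = max([n] + list(seq)) + 1
    d1 = factorial(j)
    moduli = [1 + (i + 1) * d1 for i in range(len(seq))]
    d0 = 0   # brute-force search; a solution exists below the product of the moduli
    while not all(d0 % m == a for m, a in zip(moduli, seq)):
        d0 += 1
    return d0, d1

seq = [2, 1, 3]
d0, d1 = encode(seq)
print([beta_star(d0, d1, i) for i in range(len(seq))])   # [2, 1, 3]
```

That the brute-force search succeeds is exactly what the Chinese Remainder Theorem guarantees: the moduli 1 + (i + 1) · j! are pairwise relatively prime, and each a_i is smaller than the corresponding modulus.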
h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y)).
We need to show that h can be defined from f and g using just composition
and regular minimization, using the basic functions and functions defined
from them using composition and regular minimization (such as β).
Lemma 35.9. If h can be defined from f and g using primitive recursion, it can be
defined from f , g, the functions zero, succ, Pin , add, mult, χ= , using composition
and regular minimization.
Proof. First, define an auxiliary function ĥ(⃗x, y) which returns the least num-
ber d such that d codes a sequence which satisfies
where now (d)i is short for β(d, i ). In other words, ĥ returns the sequence
⟨h(⃗x, 0), h(⃗x, 1), . . . , h(⃗x, y)⟩. We can write ĥ as
n + m = n + m and
∀y ((n + m) = y → y = n + m).
is represented in Q by
φ χ = ( x0 , x1 , y ) ≡ ( x0 = x1 ∧ y = 1) ∨ ( x0 ̸ = x1 ∧ y = 0).
Note that the lemma does not say much: in essence it says that Q can
prove that different numerals denote different objects. For example, Q proves
0′′ ̸= 0′′′ . But showing that this holds in general requires some care. Note also
that although we are using induction, it is induction outside of Q.
Proof of Proposition 35.13. If n = m, then n and m are the same term, and
χ= (n, m) = 1. But Q ⊢ (n = m ∧ 1 = 1), so it proves φ= (n, m, 1). If n ̸= m,
then χ= (n, m) = 0. By Lemma 35.14, Q ⊢ n ̸= m and so also (n ̸= m ∧ 0 = 0).
Thus Q ⊢ φ= (n, m, 0).
For the second part, we also have two cases. If n = m, we have to show
that Q ⊢ ∀y ( φ= (n, m, y) → y = 1). Arguing informally, suppose φ= (n, m, y),
i.e.,
( n = n ∧ y = 1) ∨ ( n ̸ = n ∧ y = 0)
The left disjunct implies y = 1 by logic; the right contradicts n = n which is
provable by logic.
( n = m ∧ y = 1) ∨ ( n ̸ = m ∧ y = 0)
Lemma 35.16. Q ⊢ (n + m) = n + m
Q ⊢ (n + m) = n + m,
we can replace the left side with n + m and get n + m = y, for arbitrary y.
Proof. Exercise.
Lemma 35.18. Q ⊢ (n × m) = n · m
Proof. Exercise.
Recall that we use × for the function symbol of the language of arith-
metic, and · for the ordinary multiplication operation on numbers. So · can
appear between expressions for numbers (such as in m · n) while × appears
only between terms of the language of arithmetic (such as in (m × n)). Even
more confusingly, + is used for both the function symbol and the addition
operation. When it appears between terms—e.g., in (n + m)—it is the 2-place
function symbol of the language of arithmetic, and when it appears between
numbers—e.g., in n + m—it is the addition operation. This includes the case
n + m: this is the standard numeral corresponding to the number n + m.
h( x0 , . . . , xl −1 ) = f ( g0 ( x0 , . . . , xl −1 ), . . . , gk−1 ( x0 , . . . , xl −1 )).
Q ⊢ φ g (n, k)
Q ⊢ φ f (k, m)
Q ⊢ φ g (n, k) ∧ φ f (k, m)
Q ⊢ ∀y ( φ g (n, y) → y = k )
Q ⊢ ∀z ( φ f (k, z) → z = m)
since φ f represents f . Using just a little bit of logic, we can show that also
The same idea works in the more complex case where f and gi have arity
greater than 1.
∃ y 0 . . . ∃ y k − 1 ( φ g0 ( x 0 , . . . , x l − 1 , y 0 ) ∧ · · · ∧
φ gk−1 ( x0 , . . . , xl −1 , yk−1 ) ∧ φ f (y0 , . . . , yk−1 , z))
represents
h( x0 , . . . , xl −1 ) = f ( g0 ( x0 , . . . , xl −1 ), . . . , gk−1 ( x0 , . . . , xl −1 )).
Proof. Exercise.
Lemma 35.22. For every constant symbol a and every natural number n,
Q ⊢ ( a′ + n) = ( a + n)′ .
Q ⊢ (a′ + 0) = a′            by axiom Q4                      (35.1)
Q ⊢ (a + 0) = a              by axiom Q4                      (35.2)
Q ⊢ (a + 0)′ = a′            by eq. (35.2)                    (35.3)
Q ⊢ (a′ + 0) = (a + 0)′      by eq. (35.1) and eq. (35.3)
It is again worth mentioning that this is weaker than saying that Q proves
∀ x ∀y ( x ′ + y) = ( x + y)′ . Although this sentence is true in N, Q does not
prove it.
Proof. We give the proof informally (i.e., only giving hints as to how to con-
struct the formal derivation).
We have to prove ¬ a < 0 for an arbitrary a. By the definition of <, we
need to prove ¬∃y (y′ + a) = 0 in Q. We’ll assume ∃y (y′ + a) = 0 and prove a
contradiction. Suppose (b′ + a) = 0. Using Q3 , we have that a = 0 ∨ ∃y a = y′ .
We distinguish cases.
Case 1: a = 0 holds. From (b′ + a) = 0, we have (b′ + 0) = 0. By axiom Q4
of Q, we have (b′ + 0) = b′ , and hence b′ = 0. But by axiom Q2 we also have
b′ ̸= 0, a contradiction.
Case 2: For some c, a = c′ . But then we have (b′ + c′ ) = 0. By axiom Q5 ,
we have (b′ + c)′ = 0, again contradicting axiom Q2 .
Q ⊢ ∀ x ( x < n + 1 → ( x = 0 ∨ · · · ∨ x = n)).
m′ ≡ m + 1, (c′ + m + 1) = a. By Q8 , m + 1 < a.
Q ⊢ φ g (m, n, 0).
Q ⊢ ¬ φ g (k, n, 0).
We get that
Proof. For definiteness, and using the Church-Turing Thesis, let’s say that a
function is computable iff it is general recursive. The general recursive func-
tions are those which can be defined from the zero function zero, the successor
function succ, and the projection function Pin using composition, primitive re-
cursion, and regular minimization. By Lemma 35.9, any function h that can
be defined from f and g can also be defined using composition and regular
minimization from f , g, and zero, succ, Pin , add, mult, χ= . Consequently, a
function is general recursive iff it can be defined from zero, succ, Pin , add,
mult, χ= using composition and regular minimization.
We’ve furthermore shown that the basic functions in question are rep-
resentable in Q (Propositions 35.10 to 35.13, 35.15 and 35.17), and that any
function defined from representable functions by composition or regular min-
imization (Proposition 35.21, Proposition 35.26) is also representable. Thus
every general recursive function is representable in Q.
∀y (φχR (n0 , . . . , nk , y) → y = 0).
Since Q proves 0 ̸= 1, Q proves ¬φχR (n0 , . . . , nk , 1), and so it proves ¬φ R (n0 , . . . , nk ).
35.10 Undecidability
We call a theory T undecidable if there is no computational procedure which, af-
ter finitely many steps and unfailingly, provides a correct answer to the ques-
tion “does T prove φ?” for any sentence φ in the language of T. So Q would
be decidable iff there were a computational procedure which decides, given a
sentence φ in the language of arithmetic, whether Q ⊢ φ or not. We can make
this more precise by asking: Is the relation ProvQ (y), which holds of y iff y is
the Gödel number of a sentence provable in Q, recursive? The answer is: no.
Theorem 35.30. Q is undecidable, i.e., the relation
is not recursive.
Proof. Suppose it were. Then we could solve the halting problem as follows:
Given e and n, we know that φe (n) ↓ iff there is an s such that T (e, n, s), where
T is Kleene’s predicate from Theorem 29.28. Since T is primitive recursive it
is representable in Q by a formula ψT , that is, Q ⊢ ψT (e, n, s) iff T (e, n, s). If
Q ⊢ ψT (e, n, s) then also Q ⊢ ∃y ψT (e, n, y). If no such s exists, then Q ⊢
¬ψT (e, n, s) for every s. But Q is ω-consistent, i.e., if Q ⊢ ¬ φ(n) for every n ∈
N, then Q ⊬ ∃y φ(y). We know this because the axioms of Q are true in the
standard model N. So, Q ⊬ ∃y ψT (e, n, y). In other words, Q ⊢ ∃y ψT (e, n, y)
iff there is an s such that T (e, n, s), i.e., iff φe (n) ↓. From e and n we can
compute # ∃y ψT (e, n, y)# , let g(e, n) be the primitive recursive function which
does that. So define
h(e, n) = 1   if ProvQ (g(e, n)),
h(e, n) = 0   otherwise.
This would show that h is recursive if ProvQ is. But h is not recursive, by
Theorem 29.29, so ProvQ cannot be either.
Problems
Problem 35.1. Show that the relations x < y, x | y, and the function rem( x, y)
can be defined without primitive recursion. You may use 0, successor, plus,
times, χ= , projections, and bounded minimization and quantification.
Problem 35.5. Using the proof of Proposition 35.20 as
a guide, carry out the proof of Proposition 35.21 in detail.
36.1 Introduction
A theory T is a set of sentences that is deductively closed, that is, with the
property that whenever T proves φ then φ is in T. It is probably best to think
of a theory as being a collection of sentences, together with all the things that
these sentences imply. From now on, we will use Q to refer to the theory con-
sisting of the set of sentences derivable from the eight axioms in section 35.1.
Remember that we can code formulas of Q as numbers; if φ is such a formula,
let # φ# denote the number coding φ. Modulo this coding, we can now ask
whether various sets of formulas are computable or not.
36.2 Q is C.e.-Complete
Theorem 36.1. Q is c.e. but not decidable. In fact, it is a complete c.e. set.
Proof. It is not hard to see that Q is c.e., since it is the set of (codes for) sen-
tences y such that there is a proof x of y in Q:
Q = {y : ∃ x PrfQ ( x, y)}.
But we know that PrfQ ( x, y) is computable (in fact, primitive recursive), and
any set that can be written in the above form is c.e.
Saying that it is a complete c.e. set is equivalent to saying that K ≤m Q,
where K = { x : φ x ( x ) ↓}. So let us show that K is reducible to Q. Since
Kleene’s predicate T (e, x, s) is primitive recursive, it is representable in Q, say,
by φ T . Then for every x, we have
x ∈ K → ∃s T ( x, x, s)
→ ∃s (Q ⊢ φ T ( x, x, s))
→ Q ⊢ ∃s φ T ( x, x, s).
Theorem 36.3. Let T be any ω-consistent theory that includes Q. Then T is not
decidable.
Theorem 36.5. Let T be any consistent theory that includes Q. Then T is not decid-
able.
S(n) → T ⊢ θS (n)
→ R (# θ S ( u )# , n )
and
Let “true arithmetic” be the theory { φ : N ⊨ φ}, that is, the set of sentences
in the language of arithmetic that are true in the standard interpretation.
This theorem is not that far from Gödel’s original 1931 formulation of the
First Incompleteness Theorem. Aside from the more modern terminology, the
key differences are these: Gödel has “ω-consistent” instead of “consistent”; and
he could not say “axiomatizable” in full generality, since the formal notion of
computability was not in place yet. (The formal models of computability were
developed over the following decade, including by Gödel, and in large part to
be able to characterize the kinds of theories that are susceptible to the Gödel
phenomenon.)
The theorem says you can’t have it all, namely, completeness, consistency,
and axiomatizability. If you give up any one of these, though, you can have
the other two: Q is consistent and computably axiomatized, but not com-
plete; the inconsistent theory is complete, and computably axiomatized (say,
by {0 ̸= 0}), but not consistent; and the set of true sentences of arithmetic is
complete and consistent, but it is not computably axiomatized.
S(n) → Q ⊢ θS (n)
→ θS (n) ∈ C
and
Theorem 36.11. Let T be any theory in the language of arithmetic that is consistent
with Q (i.e., T ∪ Q is consistent). Then T is undecidable.
C = { φ : T ⊢ α → φ }.
Corollary 36.12. First-order logic for the language of arithmetic (that is, the set { φ :
φ is provable in first-order logic}) is undecidable.
Theorem 36.13. Suppose T is a theory in a language in which one can interpret the
language of arithmetic, in such a way that T is consistent with the interpretation of
Q. Then T is undecidable. If T proves the interpretation of the axioms of Q, then no
consistent extension of T is decidable.
The proof is just a small modification of the proof of the last theorem; one
could use a counterexample to get a separation of Q and Q̄. One can take ZFC,
Zermelo-Fraenkel set theory with the axiom of choice, to be an axiomatic foun-
dation that is powerful enough to carry out a good deal of ordinary mathemat-
ics. In ZFC one can define the natural numbers, and via this interpretation,
the axioms of Q are true. So we have
The language of ZFC has only a single binary relation, ∈. (In fact, you
don’t even need equality.) So we have
Corollary 36.16. First-order logic for any language with a binary relation symbol is
undecidable.
This result extends to any language with two unary function symbols,
since one can use these to simulate a binary relation symbol. The results just
cited are tight: it turns out that first-order logic for a language with only unary
relation symbols and at most one unary function symbol is decidable.
One more bit of trivia. We know that the set of sentences in the language
0, ′ , +, ×, < true in the standard model is undecidable. In fact, one can de-
fine < in terms of the other symbols, and then one can define + in terms of
× and ′ . So the set of true sentences in the language 0, ′ , × is undecidable.
On the other hand, Presburger has shown that the set of sentences in the lan-
guage 0, ′ , + true in the standard model is decidable. The procedure is
computationally infeasible, however.
37.1 Introduction
Lemma 37.1. Let T be any theory extending Q, and let ψ( x ) be any formula with
only the variable x free. Then there is a sentence φ such that T ⊢ φ ↔ ψ(⌜φ⌝).
The lemma asserts that given any property ψ( x ), there is a sentence φ that
asserts “ψ( x ) is true of me,” and T “knows” this.
How can we construct such a sentence? Consider the following version of
the Epimenides paradox, due to Quine:
“Yields falsehood when preceded by its quotation” yields falsehood when
preceded by its quotation.
But what happens when one takes the phrase “yields falsehood when pre-
ceded by its quotation,” and precedes it with a quoted version of itself? Then
one has the original sentence! In short, the sentence asserts that it is false.
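The mechanism can be mimicked in a couple of lines of Python, purely as an illustration: "diagonalizing" a phrase here just means preceding it by its own quotation.

```python
def diagonalize(phrase):
    """Precede a phrase by its own quotation."""
    return f'"{phrase}" {phrase}'

print(diagonalize("yields falsehood when preceded by its quotation"))
# "yields falsehood when preceded by its quotation" yields falsehood when preceded by its quotation
```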
it can derive ψ(diag(⌜ψ(diag(x))⌝)) ↔ ψ(⌜φ⌝). But the left hand side is, by
definition, φ.
Of course, diag will in general not be a function symbol of T, and cer-
tainly is not one of Q. But, since diag is computable, it is representable in Q
by some formula θdiag (x, y). So instead of writing ψ(diag(x)) we can write
Lemma 37.2. Let ψ( x ) be any formula with one free variable x. Then there is a
sentence φ such that Q ⊢ φ ↔ ψ(⌜φ⌝).
Consider such a y. Since θdiag (⌜α( x )⌝, y), by eq. (37.2), y = ⌜φ⌝. So, from ψ(y)
we have ψ(⌜φ⌝).
Now suppose ψ(⌜φ⌝). By eq. (37.1), we have
It follows that
You should compare this to the proof of the fixed-point lemma in com-
putability theory. The difference is that here we want to define a statement in
terms of itself, whereas there we wanted to define a function in terms of itself;
this difference aside, it is really the same idea.
and only if x is the Gödel number of a derivation of the formula with Gödel
number y in T. In fact, for the particular theory that Gödel had in mind, Gödel
was able to show that this relation is primitive recursive, using the list of 45
functions and relations in his paper. The 45th relation, xBy, is just PrfT ( x, y)
for his particular choice of T. Remember that where Gödel uses the word
“recursive” in his paper, we would now use the phrase “primitive recursive.”
Since PrfT ( x, y) is computable, it is representable in T. We will use Prf T ( x, y)
to refer to the formula that represents it. Let Prov T (y) be the formula ∃ x Prf T ( x, y).
This describes the 46th relation, Bew(y), on Gödel’s list. As Gödel notes, this
is the only relation that “cannot be asserted to be recursive.” What he proba-
bly meant is this: from the definition, it is not clear that it is computable; and
later developments, in fact, show that it isn’t.
Let T be an axiomatizable theory containing Q. Then PrfT ( x, y) is decid-
able, hence representable in Q by a formula Prf T ( x, y). Let Prov T (y) be the
formula we described above. By the fixed-point lemma, there is a formula γT
such that Q (and hence T) derives
Proof. Suppose T derives γT . Then there is a derivation, and so, for some
number m, the relation PrfT (m, # γT # ) holds. But then Q derives the sentence
Prf T (m, ⌜γT ⌝). So Q derives ∃ x Prf T ( x, ⌜γT ⌝), which is, by definition, Prov T (⌜γT ⌝).
By eq. (37.3), Q derives ¬γT , and since T extends Q, so does T. We have
shown that if T derives γT , then it also derives ¬γT , and hence it would be
inconsistent.
Note that every ω-consistent theory is also consistent. This follows simply
from the fact that if T is inconsistent, then T ⊢ φ for every φ. In particular, if T
is inconsistent, it derives both ¬ φ(n) for every n and also derives ∃ x φ( x ). So,
if T is inconsistent, it is ω-inconsistent. By contraposition, if T is ω-consistent,
it must be consistent.
no derivation of γT in T, Q derives
¬Prf T (0, ⌜γT ⌝), ¬Prf T (1, ⌜γT ⌝), ¬Prf T (2, ⌜γT ⌝), . . .
and so does T. On the other hand, by eq. (37.3), ¬γT is equivalent to ∃ x Prf T ( x, ⌜γT ⌝).
So T is ω-inconsistent.
Proof. Recall that Prov T (y) is defined as ∃ x Prf T ( x, y), where Prf T ( x, y) repre-
sents the decidable relation which holds iff x is the Gödel number of a deriva-
tion of the sentence with Gödel number y. The relation that holds between x
and y if x is the Gödel number of a refutation of the sentence with Gödel num-
ber y is also decidable. Let not( x ) be the primitive recursive function which
does the following: if x is the code of a formula φ, not( x ) is a code of ¬ φ.
Then RefT ( x, y) holds iff PrfT ( x, not(y)). Let Ref T ( x, y) represent it. Then, if
T ⊢ ¬ φ and δ is a corresponding derivation, Q ⊢ Ref T (⌜δ⌝, ⌜φ⌝). We define
RProv T (y) as
but that’s just RProv T (⌜ρ T ⌝). By eq. (37.4), Q ⊢ ¬ρ T . Since T extends Q, also
T ⊢ ¬ρ T . We’ve assumed that T ⊢ ρ T , so T would be inconsistent, contrary to
the assumption of the theorem.
Now, let’s show that T ⊬ ¬ρ T . Again, suppose it did, and suppose n is
the Gödel number of a derivation of ¬ρ T . Then RefT (n, # ρ T # ) holds, and since
Ref T represents RefT in Q, Q ⊢ Ref T (n, ⌜ρ T ⌝). We’ll again show that T would
then be inconsistent because it would also derive ρ T . Since
is logically equivalent to
We argue informally using logic, making use of facts about what Q derives.
Suppose x is arbitrary and Prf T ( x, ⌜ρ T ⌝). We already know that T ⊬ ρ T , and
so for every k, Q ⊢ ¬Prf T (k, ⌜ρ T ⌝). Thus, for every k it follows that x ̸= k. In
particular, we have (a) that x ̸= n. We also have ¬( x = 0 ∨ x = 1 ∨ · · · ∨ x =
n − 1) and so by Lemma 35.24, (b) ¬( x < n). By Lemma 35.25, n < x. Since
Q ⊢ Ref T (n, ⌜ρ T ⌝), we have n < x ∧ Ref T (n, ⌜ρ T ⌝), and from that ∃z (z <
x ∧ Ref T (z, ⌜ρ T ⌝)). Since x was arbitrary we get, as required, that
( φ(0) ∧ ∀ x ( φ( x ) → φ( x ′ ))) → ∀ x φ( x )
for every formula φ. Notice that this is really a schema, which is to say, in-
finitely many axioms (and it turns out that PA is not finitely axiomatizable).
But since one can effectively determine whether or not a string of symbols is
an instance of an induction axiom, the set of axioms for PA is computable. PA
is a much more robust theory than Q. For example, one can easily prove that
addition and multiplication are commutative, using induction in the usual
way. In fact, most finitary number-theoretic and combinatorial arguments can
be carried out in PA.
Since PA is computably axiomatized, the derivability predicate PrfPA ( x, y)
is computable and hence represented in Q (and so, in PA). As before, we will
take Prf PA ( x, y) to denote the formula representing the relation. Let ProvPA (y)
be the formula ∃x PrfPA (x, y), which, intuitively, says, “y is derivable from the
axioms of PA.” The reason we need a little bit more than the axioms of Q is
that we need to know that the theory we are using is strong enough to derive
a few basic facts about this derivability predicate. In fact, what we need are
the following facts:
P1. If PA ⊢ φ, then PA ⊢ ProvPA (⌜φ⌝).
P2. PA ⊢ ProvPA (⌜φ → ψ⌝) → (ProvPA (⌜φ⌝) → ProvPA (⌜ψ⌝)).
P3. PA ⊢ ProvPA (⌜φ⌝) → ProvPA (⌜ProvPA (⌜φ⌝)⌝).
The only way to verify that these three properties hold is to describe the for-
mula ProvPA (y) carefully and use the axioms of PA to describe the relevant
formal derivations. Conditions (1) and (2) are easy; it is really condition (3)
that requires work. (Think about what kind of work it entails . . . ) Carrying
out the details would be tedious and uninteresting, so here we will ask you
to take it on faith that PA has the three properties listed above. A reasonable
choice of ProvPA (y) will also satisfy
To make the argument more precise, we will let γPA be the Gödel sentence
for PA and use the derivability conditions (P1)–(P3) to show that PA derives
ConPA → γPA . This will show that PA doesn’t derive ConPA . Here is a sketch
of the proof, in PA. (For simplicity, we drop the PA subscripts.)
γ ↔ ¬Prov(⌜γ⌝) (37.5)
γ is a Gödel sentence
γ → ¬Prov(⌜γ⌝) (37.6)
from eq. (37.5)
γ → (Prov(⌜γ⌝) → ⊥) (37.7)
from eq. (37.6) by logic
Prov(⌜γ → (Prov(⌜γ⌝) → ⊥)⌝) (37.8)
from eq. (37.7) by condition P1
Prov(⌜γ⌝) → Prov(⌜(Prov(⌜γ⌝) → ⊥)⌝) (37.9)
from eq. (37.8) by condition P2
Prov(⌜γ⌝) → (Prov(⌜Prov(⌜γ⌝)⌝) → Prov(⌜⊥⌝)) (37.10)
from eq. (37.9) by condition P2 and logic
Prov(⌜γ⌝) → Prov(⌜Prov(⌜γ⌝)⌝) (37.11)
by P3
Prov(⌜γ⌝) → Prov(⌜⊥⌝) (37.12)
from eq. (37.10) and eq. (37.11) by logic
Con → ¬Prov(⌜γ⌝) (37.13)
contraposition of eq. (37.12) and Con ≡ ¬Prov(⌜⊥⌝)
Con → γ
from eq. (37.5) and eq. (37.13) by logic
The use of logic in the above involves just elementary facts from propositional logic,
e.g., eq. (37.7) uses ⊢ ¬ φ ↔ ( φ → ⊥) and eq. (37.12) uses φ → (ψ → χ), φ → ψ ⊢
φ → χ. The use of condition P2 in eq. (37.9) and eq. (37.10) relies on instances
of P2, Prov(⌜φ → ψ⌝) → (Prov(⌜φ⌝) → Prov(⌜ψ⌝)). In the first one, φ ≡ γ and
ψ ≡ Prov(⌜γ⌝) → ⊥; in the second, φ ≡ Prov(⌜γ⌝) and ψ ≡ ⊥.
The more abstract version of the second incompleteness theorem is as fol-
lows:
Theorem 37.9. Let T be any consistent, axiomatized theory extending Q and let
Prov T (y) be any formula satisfying derivability conditions P1–P3 for T. Then T
does not derive ConT .
The moral of the story is that no “reasonable” consistent theory for math-
ematics can derive its own consistency statement. Suppose T is a theory of
T ⊢ Prov T (⌜δ⌝) ↔ δ.
If it were derivable, T ⊢ Prov T (⌜δ⌝) by condition (1), but the same conclusion
follows if we apply modus ponens to the equivalence above. Hence, we don’t
get that T is inconsistent, at least not by the same argument as in the case of
the Gödel sentence. This of course does not show that T does derive δ.
We can make headway on this question if we generalize it a bit. The left-to-
right direction of the fixed point equivalence, Prov T (⌜δ⌝) → δ, is an instance
of a general schema called a reflection principle: Prov T (⌜φ⌝) → φ. It is called
that because it expresses, in a sense, that T can “reflect” about what it can
derive; basically it says, “If T can derive φ, then φ is true,” for any φ. This is
true for sound theories only, of course, and this suggests that theories will in
general not derive every instance of it. So which instances can a theory (strong
enough, and satisfying the derivability conditions) derive? Certainly all those
where φ itself is derivable. And that’s it, as the next result shows.
Theorem 37.10. Let T be an axiomatizable theory extending Q, and suppose Prov T (y)
is a formula satisfying conditions P1–P3 from section 37.7. If T derives Prov T (⌜φ⌝) →
φ, then in fact T derives φ.
The heuristic for the proof of Löb’s theorem is a clever proof that Santa
Claus exists. (If you don’t like that conclusion, you are free to substitute any
other conclusion you would like.) Here it is:
1. Let X be the sentence, “If X is true, then Santa Claus exists.”
2. Suppose X is true.
3. Then what it says holds; i.e., we have: if X is true, then Santa Claus
exists.
4. Since we are assuming X is true, we can conclude that Santa Claus exists,
by modus ponens from (2) and (3).
5. We have succeeded in deriving (4), “Santa Claus exists,” from the as-
sumption (2), “X is true.” By conditional proof, we have shown: “If X is
true, then Santa Claus exists.”
6. But the conditional established in (5) is just the sentence X itself, so X is true.
7. Hence, by modus ponens from (5) and (6), Santa Claus exists.
A formalization of this idea, replacing “is true” with “is derivable,” and “Santa
Claus exists” with φ, yields the proof of Löb’s theorem. The trick is to apply
the fixed-point lemma to the formula Prov T (y) → φ. The fixed point of that
corresponds to the sentence X in the preceding sketch.
Proof of Theorem 37.10. Suppose φ is a sentence such that T derives Prov T (⌜φ⌝) →
φ. Let ψ(y) be the formula Prov T (y) → φ, and use the fixed-point lemma to
find a sentence θ such that T derives θ ↔ ψ(⌜θ⌝). Then each of the following
is derivable in T:
θ ↔ (Prov T (⌜θ⌝) → φ) (37.14)
θ is a fixed point of ψ(y)
θ → (Prov T (⌜θ⌝) → φ) (37.15)
from eq. (37.14)
Prov T (⌜θ → (Prov T (⌜θ⌝) → φ)⌝) (37.16)
from eq. (37.15) by condition P1
Prov T (⌜θ⌝) → Prov T (⌜Prov T (⌜θ⌝) → φ⌝) (37.17)
from eq. (37.16) using condition P2
Prov T (⌜θ⌝) → (Prov T (⌜Prov T (⌜θ⌝)⌝) → Prov T (⌜φ⌝)) (37.18)
from eq. (37.17) using P2 again
Prov T (⌜θ⌝) → Prov T (⌜Prov T (⌜θ⌝)⌝) (37.19)
by derivability condition P3
Prov T (⌜θ⌝) → Prov T (⌜φ⌝) (37.20)
from eq. (37.18) and eq. (37.19)
Prov T (⌜φ⌝) → φ (37.21)
by assumption of the theorem
Prov T (⌜θ⌝) → φ (37.22)
from eq. (37.20) and eq. (37.21)
(Prov T (⌜θ⌝) → φ) → θ (37.23)
from eq. (37.14)
θ (37.24)
from eq. (37.22) and eq. (37.23)
Prov T (⌜θ⌝) (37.25)
from eq. (37.24) by condition P1
φ from eq. (37.21) and eq. (37.25)
With Löb’s theorem in hand, there is a short proof of the second incom-
pleteness theorem (for theories having a derivability predicate satisfying con-
ditions P1–P3): if T ⊢ Prov T (⌜⊥⌝) → ⊥, then T ⊢ ⊥. If T is consistent, T ⊬ ⊥.
So, T ⊬ Prov T (⌜⊥⌝) → ⊥, i.e., T ⊬ ConT . We can also apply it to show that δ,
the fixed point of Prov T ( x ), is derivable. For since
T ⊢ Prov T (⌜δ⌝) ↔ δ
in particular
T ⊢ Prov T (⌜δ⌝) → δ
and so by Löb’s theorem, T ⊢ δ.
Now one can ask, is the converse also true? That is, is every relation defin-
able in N computable? The answer is no. For example:
Lemma 37.13. The halting relation is definable in N.
so ∃s θ T (z, x, s) defines H in N.
negative.
Theorem 37.14. The set of true sentences of arithmetic is not definable in arithmetic.
However, for any language strong enough to represent the diagonal function,
and any linguistic predicate T ( x ), we can construct a sentence X satisfying
“X if and only if not T (‘X’).” Given that we do not want a truth predicate
to declare some sentences to be both true and false, Tarski concluded that
one cannot specify a truth predicate for all sentences in a language without,
somehow, stepping outside the bounds of the language. In other words, a
truth predicate for a language cannot be defined in the language itself.
Problems
Problem 37.1. A formula φ( x ) is a truth definition if Q ⊢ ψ ↔ φ(⌜ψ⌝) for all
sentences ψ. Show that no formula is a truth definition by using the fixed-
point lemma.
Problem 37.2. Every ω-consistent theory is consistent. Show that the con-
verse does not hold, i.e., that there are consistent but ω-inconsistent theories.
Do this by showing that Q ∪ {¬γQ } is consistent but ω-inconsistent.
Problem 37.3. Two sets A and B of natural numbers are said to be computably
inseparable if there is no decidable set X such that A ⊆ X and B ⊆ X̄ (where X̄ is
the complement, N \ X, of X). Let T be a consistent axiomatizable extension of
Q. Suppose A is the set of Gödel numbers of sentences provable in T and B
the set of Gödel numbers of sentences refutable in T. Prove that A and B are
computably inseparable.
2. T ⊢ φ → Prov T (⌜φ⌝).
4. T ⊢ Prov T (⌜φ⌝) → φ
Second-order Logic
Basic syntax and semantics for SOL covered so far. As a chapter it’s
too short. Substitution for second-order variables has to be covered to
be able to talk about derivation systems for SOL, and there’s some subtle
issues there.
38.1 Introduction
In first-order logic, we combine the non-logical symbols of a given language,
i.e., its constant symbols, function symbols, and predicate symbols, with the
logical symbols to express things about first-order structures. This is done
using the notion of satisfaction, which relates a structure M, together with a
variable assignment s, and a formula φ: M, s ⊨ φ holds iff what φ expresses
when its constant symbols, function symbols, and predicate symbols are in-
terpreted as M says, and its free variables are interpreted as s says, is true.
The interpretation of the identity predicate = is built into the definition of
M, s ⊨ φ, as is the interpretation of ∀ and ∃. The former is always interpreted
as the identity relation on the domain |M| of the structure, and the quanti-
fiers are always interpreted as ranging over the entire domain. But, crucially,
quantification is only allowed over elements of the domain, and so only object
variables are allowed to follow a quantifier.
In second-order logic, both the language and the definition of satisfaction
are extended to include free and bound function and predicate variables, and
quantification over them. These variables are related to function symbols and
predicate symbols the same way that object variables are related to constant
symbols. They play the same role in the formation of terms and formulas
of second-order logic, and quantification over them is handled in a similar
way. In the standard semantics, the second-order quantifiers range over all
possible objects of the right type (n-place functions from |M| to |M| for func-
tion variables, n-place relations for predicate variables). For instance, while
∀v_0 (P_0^1(v_0) ∨ ¬P_0^1(v_0)) is a formula in both first- and second-order logic, in
the latter we can also consider ∀V_0^1 ∀v_0 (V_0^1(v_0) ∨ ¬V_0^1(v_0)) and ∃V_0^1 ∀v_0 (V_0^1(v_0) ∨
¬V_0^1(v_0)). Since these contain no free variables, they are sentences of second-
order logic. Here, V_0^1 is a second-order 1-place predicate variable. The allowable
interpretations of V_0^1 are the same as those we can assign to a 1-place predicate
symbol like P_0^1, i.e., subsets of |M|. Quantification over them then amounts
to saying that ∀v_0 (V_0^1(v_0) ∨ ¬V_0^1(v_0)) holds for all ways of assigning a subset
of |M| as the value of V_0^1, or for at least one. Since every set either contains or
fails to contain a given object, both are true in any structure.
Definition 38.1 (Second-order Terms). The set of second-order terms of L, Trm2 (L),
is defined by adding to Definition 15.4 the clause
So, a second-order term looks just like a first-order term, except that where
a first-order term contains a function symbol f_i^n, a second-order term may
contain a function variable u_i^n in its place.
Definition 38.2 (Second-order formula). The set of second-order formulas Frm2 (L)
of the language L is defined by adding to Definition 15.4 the clauses
38.3 Satisfaction
To define the satisfaction relation M, s ⊨ φ for second-order formulas, we
have to extend the definitions to cover second-order variables. The notion
of a structure is the same for second-order logic as it is for first-order logic.
There is only a difference for variable assignments s: these now must not just
provide values for the first-order variables, but also for the second-order vari-
ables.
3. n-place function variable u_i^n to an n-place function from |M| to |M|, i.e.,
s(u_i^n): |M|^n → |M|;
If t ≡ u(t_1, . . . , t_n):
Val_s^M(t) = s(u)(Val_s^M(t_1), . . . , Val_s^M(t_n)).
The first-order formula ∀z (P(z) ↔ ¬R(z)) says that whatever falls under the
interpretation of P does not fall under the interpretation of R, and vice versa. In
a structure, the interpretation of a predicate symbol P is given by the interpretation P^M. But for second-order
variables like X and Y, the interpretation is provided, not by the structure
itself, but by a variable assignment. Since the second-order formula is not
a sentence (it includes free variables X and Y), it is only satisfied relative to
a structure M together with a variable assignment s.
M, s ⊨ ∀z ( X (z) ↔ ¬Y (z)) whenever the elements of s( X ) are not elements
of s(Y ), and vice versa, i.e., iff s(Y ) = |M| \ s( X ). For instance, take |M| =
{1, 2, 3}. Since no predicate symbols, function symbols, or constant symbols
are involved, the domain of M is all that is relevant. Now for s1 ( X ) = {1, 2}
and s1 (Y ) = {3}, we have M, s1 ⊨ ∀z ( X (z) ↔ ¬Y (z)).
By contrast, if we have s2 ( X ) = {1, 2} and s2 (Y ) = {2, 3}, M, s2 ⊭ ∀z ( X (z) ↔
¬Y (z)). That’s because M, s2 [2/z] ⊨ X (z) (since 2 ∈ s2 [2/z]( X )) but M, s2 [2/z] ⊭
¬Y (z) (since also 2 ∈ s2 [2/z](Y )).
Example 38.10. The second-order sentence ∀ X ∀y X (y) says that every 1-place
relation, i.e., every property, holds of every object. That is clearly never true,
since in every M, for a variable assignment s with s( X ) = ∅, and s(y) = a ∈
|M| we have M, s ⊭ X (y). This means that φ → ∀ X ∀y X (y) is equivalent in
second-order logic to ¬ φ, that is: M ⊨ φ → ∀ X ∀y X (y) iff M ⊨ ¬ φ. In other
words, in second-order logic we can define ¬ using ∀ and →.
The definitions of the semantic notions such as validity and entailment carry
over from first-order logic, except that the underlying satisfaction relation is now
that for second-order formulas. A second-order sentence, of course, is a formula
in which all variables, including predicate and function variables, are bound.
Definition 38.11 (Validity). A sentence φ is valid, ⊨ φ, iff M ⊨ φ for every
structure M.
Example 38.15. In first-order logic we can define the identity relation Id|M|
(i.e., {⟨ a, a⟩ : a ∈ |M|}) by the formula x = y. In second-order logic, we can
define this relation without =. For if a and b are the same element of |M|, then
they are elements of the same subsets of |M| (since sets are determined by
their elements). Conversely, if a and b are different, then they are not elements
of the same subsets: e.g., a ∈ {a} but b ∉ {a} if a ≠ b. So "being elements
of the same subsets of |M|" is a relation that holds of a and b iff a = b. It is
a relation that can be expressed in second-order logic, since we can quantify
over all subsets of |M|. Hence, the following formula defines Id|M| :
∀ X ( X ( x ) ↔ X (y))
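To make the idea concrete, here is a minimal sketch in Python (the helper names such as same_subsets are hypothetical, not from the text) that checks, over a small finite domain, that "a and b are elements of exactly the same subsets" holds iff a = b:

from itertools import combinations

def subsets(domain):
    # all subsets of a finite domain, as frozensets
    return [frozenset(c) for r in range(len(domain) + 1)
            for c in combinations(sorted(domain), r)]

def same_subsets(a, b, domain):
    # the second-order condition: a belongs to a subset X iff b does, for every X
    return all((a in X) == (b in X) for X in subsets(domain))

domain = {1, 2, 3}
for a in domain:
    for b in domain:
        assert same_subsets(a, b, domain) == (a == b)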
ψR ( X ) ≡ ∀ x ∀y ( R( x, y) → X ( x, y)) ∧
∀ x ∀y ∀z (( X ( x, y) ∧ X (y, z)) → X ( x, z)).
The first conjunct says that R ⊆ X and the second that X is transitive.
To say that X is the smallest such relation is to say that it is itself included in
every relation that includes R and is transitive. So we can define the transitive
closure of R by the formula
R∗ ( X ) ≡ ψR ( X ) ∧ ∀Y (ψR (Y ) → ∀ x ∀y ( X ( x, y) → Y ( x, y))).
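As a sanity check, the following Python sketch (names such as is_transitive_closure are hypothetical) brute-forces all binary relations on a three-element domain and confirms that the relation singled out by ψR together with the minimality condition is exactly the usual transitive closure of R:

from itertools import product

def transitive(X):
    return all((x, z) in X for (x, y1) in X for (y2, z) in X if y1 == y2)

def psi(R, X):
    # psi_R(X): R is a subset of X and X is transitive
    return R <= X and transitive(X)

def is_transitive_closure(R, X, domain):
    # R*(X): psi_R(X) holds and X is contained in every Y with psi_R(Y)
    if not psi(R, X):
        return False
    pairs = list(product(domain, repeat=2))
    for bits in product([0, 1], repeat=len(pairs)):
        Y = {p for p, b in zip(pairs, bits) if b}
        if psi(R, Y) and not X <= Y:
            return False
    return True

domain = [0, 1, 2]
R = {(0, 1), (1, 2)}
assert is_transitive_closure(R, {(0, 1), (1, 2), (0, 2)}, domain)
assert not is_transitive_closure(R, R, domain)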
∀x ∀y (f(x) = f(y) → x = y) ∧ ∃y ∀x y ≠ f(x).
M ⊨ Inf iff |M| is infinite. We can then define Fin ≡ ¬Inf; M ⊨ Fin iff |M| is
finite. No single sentence of pure first-order logic can express that the domain
is infinite although an infinite set of them can. There is no set of sentences of
pure first-order logic that is satisfied in a structure iff its domain is finite.
m0 , m1 , m2 , . . .
for some s. Let m = s(z) and f = s(u) and consider M = {m, f (m), f ( f (m)), . . . }.
M so defined is clearly enumerable. Then
Problems
Problem 38.1. Show that in second-order logic ∀ and → can define the other
connectives:
Problem 38.2. Show that ∀ X ( X ( x ) → X (y)) (note: → not ↔!) defines Id|M| .
Problem 38.3. The sentence Inf ∧ Count is true in all and only denumerable
domains. Adjust the definition of Count so that it becomes a different sentence
that directly expresses that the domain is denumerable, and prove that it does.
39.1 Introduction
First-order logic also has two more properties: it is compact (if every fi-
nite subset of a set Γ of sentences is satisfiable, Γ itself is satisfiable) and the
Löwenheim-Skolem Theorem holds for it (if Γ has an infinite model it has a de-
numerable model). Both of these results fail for second-order logic. Again, the
reason is that second-order logic can express facts about the size of domains
that first-order logic cannot.
39.2 Second-order Arithmetic
∀x x′ ≠ 0
∀x ∀y (x′ = y′ → x = y)
∀x (x = 0 ∨ ∃y x = y′)
∀x (x + 0) = x
∀x ∀y (x + y′) = (x + y)′
∀x (x × 0) = 0
∀x ∀y (x × y′) = ((x × y) + x)
∀x ∀y (x < y ↔ ∃z (z′ + x) = y)
(φ(0) ∧ ∀x (φ(x) → φ(x′))) → ∀x φ(x).
The latter is a “schema,” i.e., a pattern that generates infinitely many sen-
tences of the language of arithmetic, one for each formula φ( x ). We call this
schema the (first-order) axiom schema of induction. In second-order Peano arith-
metic PA2 , induction can be stated as a single sentence. PA2 consists of the
first eight axioms above plus the (second-order) induction axiom:
∀X ((X(0) ∧ ∀x (X(x) → X(x′))) → ∀x X(x)).
It says that if a subset X of the domain contains 0M and with any x ∈ |M| also
contains ′M ( x ) (i.e., it is “closed under successor”) it contains everything in
the domain (i.e., X = |M|).
The induction axiom guarantees that any structure satisfying it contains
only those elements of |M| the axioms require to be there, i.e., the values of n
for n ∈ N. A model of PA2 contains no non-standard numbers.
Proof. Let N = {ValM (n) : n ∈ N}, and suppose M ⊨ PA2 . Of course, for any
n ∈ N, ValM (n) ∈ |M|, so N ⊆ |M|.
Now for inclusion in the other direction. Consider a variable assignment s
with s( X ) = N. By assumption,
Above we defined PA2 as the theory that contains the first eight arith-
metical axioms plus the second-order induction axiom. In fact, thanks to the
expressive power of second-order logic, only the first two of the arithmetical
axioms plus induction are needed for second-order Peano arithmetic.
Proposition 39.3. Let PA2† be the second-order theory containing the first two arith-
metical axioms (the successor axioms) and the second-order induction axiom. Then
≤, +, and × are definable in PA2† .
ψ( x, Y ) ≡ Y ( x ) ∧ ∀y (Y (y) → Y (y′ ))
φ+(x, y, z) ≡ ∃u (u(0) = x ∧ ∀w (u(w′) = u(w)′) ∧ u(y) = z)
Theorem 39.5. There is no sound and complete derivation system for second-order
logic.
Recall that the sentence Inf is satisfied in a structure iff its domain is infinite. Let
φ≥n be a sentence that asserts that the domain has at least n elements, e.g.,
φ≥n ≡ ∃x1 . . . ∃xn (x1 ≠ x2 ∧ x1 ≠ x3 ∧ · · · ∧ xn−1 ≠ xn),
and consider Γ = {¬Inf, φ≥1, φ≥2, . . . }. Γ is finitely satisfiable, since for any finite
subset Γ0 ⊆ Γ there is some k so that φ≥k ∈ Γ0 but no φ≥n ∈ Γ0 for n > k. If |M|
has k elements, M ⊨ Γ0. But Γ is not satisfiable: if M ⊨ ¬Inf, |M| must be finite,
say, of size k. Then M ⊭ φ≥k+1. So compactness fails for second-order logic.
Theorem 39.7. The Löwenheim-Skolem Theorem fails for second-order logic: There
are sentences with infinite models but no enumerable models.
Theorem 39.8. There are sentences with denumerable but no non-enumerable mod-
els.
Proof. Count ∧ Inf is true in N but not in any structure M with |M| non-
enumerable.
Problems
Problem 39.1. Complete the proof of Proposition 39.3.
40.1 Introduction
Since second-order logic can quantify over subsets of the domain as well as
functions, it is to be expected that some amount, at least, of set theory can be
carried out in second-order logic. By “carry out,” we mean that it is possible
to express set-theoretic properties and statements in second-order logic, and to do
so without any special, non-logical vocabulary for sets (e.g., the mem-
bership predicate symbol of set theory). For instance, we can define unions
and intersections of sets and the subset relationship, but also compare the
sizes of sets, and state results such as Cantor’s Theorem.
Two sets are the same size, or “equinumerous,” X ≈ Y, iff there is a bijec-
tive function f : X → Y.
∃u (∀ x ( X ( x ) → Y (u( x ))) ∧
∀ x ∀y (u( x ) = u(y) → x = y) ∧
∀y (Y (y) → ∃ x ( X ( x ) ∧ y = u( x ))))
Proof. The sentence is satisfied in a structure M if, for any subsets X ⊆ |M|
and Y ⊆ |M|, if X ⪯ Y and Y ⪯ X then X ≈ Y. But this holds for any sets X
and Y—it is the Schröder-Bernstein Theorem.
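For finite sets, the Schröder-Bernstein property can be checked directly by brute force. The Python sketch below (helper names are illustrative) searches over all functions between two small sets:

from itertools import product

def injection_exists(X, Y):
    # is there an injective u: X -> Y?
    X, Y = list(X), list(Y)
    return any(len(set(img)) == len(X) for img in product(Y, repeat=len(X)))

def bijection_exists(X, Y):
    return len(X) == len(Y) and injection_exists(X, Y)

A, B = {1, 2, 3}, {"a", "b", "c"}
if injection_exists(A, B) and injection_exists(B, A):
    assert bijection_exists(A, B)   # Schroeder-Bernstein, checked on finite sets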
∃u (∀x ∀y (u(x) = u(y) → x = y) ∧
∃y (X(y) ∧ ∀x (X(x) → y ≠ u(x))))
We know from Cantor’s Theorem that there are non-enumerable sets, and
in fact, that there are infinitely many different levels of infinite sizes. Set the-
ory develops an entire arithmetic of sizes of sets, and assigns infinite cardinal
numbers to sets. The natural numbers serve as the cardinal numbers measur-
ing the sizes of finite sets. The cardinality of denumerable sets is the first infi-
nite cardinality, called ℵ0 (“aleph-nought” or “aleph-zero”). The next infinite
size is ℵ1. It is the smallest size a set can be without being countable (i.e., of
size ℵ0). We can define "X has size ℵ0" as Aleph0(X) ≡ Inf(X) ∧ Count(X).
X has size ℵ1 iff it is infinite but not of size ℵ0, and all its subsets are finite, of
size ℵ0, or equinumerous with X itself. Hence we can express this by the formula
Aleph1(X) ≡ Inf(X) ∧ ¬Aleph0(X) ∧ ∀Y (Y ⊆ X → (¬Inf(Y) ∨ Aleph0(Y) ∨ Y ≈ X)).
Being of size ℵ2 is defined similarly, etc.
There is one size of special interest, the so-called cardinality of the contin-
uum. It is the size of ℘(N), or, equivalently, the size of R. That a set is the size
of the continuum can also be expressed in second-order logic, but requires a
bit more work.
Pow(Y, R, X) ≡
∀Z (Z ⊆ X → ∃x (Y(x) ∧ Codes(x, R, Z))) ∧
∀x (Y(x) → ∀Z (Codes(x, R, Z) → Z ⊆ X))
expresses that s(Y ) s( R)-codes the power set of s( X ), i.e., the elements of s(Y ) s( R)-
code exactly the subsets of s( X ).
With this trick, we can express statements about the power set by quantify-
ing over the codes of subsets rather than the subsets themselves. For instance,
Cantor’s Theorem can now be expressed by saying that there is no injective
function from the domain of any relation that codes the power set of X to X
itself.
Proposition 40.11. The sentence
∀ X ∀Y ∀ R (Pow(Y, R, X )→
¬∃u (∀ x ∀y (u( x ) = u(y) → x = y) ∧
∀ x (Y ( x ) → X (u( x )))))
is valid.
Proof. Pow(Y, R, X ) expresses that s(Y ) s( R)-codes the power set of s( X ), which
Aleph0 ( X ) says is countable. So s(Y ) is at least as large as the power of the
continuum, although it may be larger (if multiple elements of s(Y ) code the
same subset of X). This is ruled out by the last conjunct, which requires the
association between elements of s(Y) and subsets of s(X) via s(R) to be injec-
tive.
M ⊨ ∃ X ∃Y ∃ R (Aleph0 ( X ) ∧ Pow(Y, R, X )∧
∃u (∀ x ∀y (u( x ) = u(y) → x = y) ∧
∀y (Y (y) → ∃ x y = u( x )))).
The Continuum Hypothesis is the statement that the size of the continuum
is the first non-enumerable cardinality, i.e., that ℘(N) has size ℵ1. It holds iff the
sentence
CH ≡ ∀X (Aleph1(X) ↔ Cont(X))
is valid.
Note that it isn’t true that ¬CH is valid iff the Continuum Hypothesis is
false. In an enumerable domain, there are no subsets of size ℵ1 and also no
subsets of the size of the continuum, so CH is always true in an enumerable
domain. However, one can give a different, more complicated sentence that is
valid iff the Continuum Hypothesis is false.
This part deals with the lambda calculus. The introduction chapter
is based on Jeremy Avigad’s notes; part of it is now redundant and cov-
ered in later chapters. The chapters on syntax, Church-Rosser property,
and lambda definability were produced by Zesen Qian during his Mitacs
summer internship. They still have to be reviewed and revised.
Introduction
41.1 Overview
The lambda calculus was originally designed by Alonzo Church in the early
1930s as a basis for constructive logic, and not as a model of the computable
functions. But it was soon shown to be equivalent to other definitions of com-
putability, such as the Turing computable functions and the partial recursive
functions. The fact that this initially came as a small surprise makes the char-
acterization all the more interesting.
Lambda notation is a convenient way of referring to a function directly
by a symbolic expression which defines it, instead of defining a name for it.
Instead of saying “let f be the function defined by f ( x ) = x + 3,” one can
say, “let f be the function λx. ( x + 3).” In other words, λx. ( x + 3) is just a
name for the function that adds three to its argument. In this expression, x
is a dummy variable, or a placeholder: the same function can just as well
be denoted by λy. (y + 3). The notation works even with other parameters
around. For example, suppose g( x, y) is a function of two variables, and k is a
natural number. Then λx. g( x, k) is the function which maps any x to g( x, k).
This way of defining a function from a symbolic expression is known as
lambda abstraction. The flip side of lambda abstraction is application: assuming
one has a function f (say, defined on the natural numbers), one can apply it to
any value, like 2. In conventional notation, of course, we write f (2) for the
result.
(λx. ( x + 3))(2)
can be simplified to 2 + 3.
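Python's own lambda notation mirrors this directly; the following lines (illustrative only, not part of the text) restate the examples above:

# "let f be the function lambda x. (x + 3)"
f = lambda x: x + 3
assert f(2) == 5                  # (lambda x. (x + 3))(2) simplifies to 2 + 3

g = lambda y: y + 3               # the bound variable is a mere placeholder
assert f(10) == g(10)

def h(x, k):                      # a two-place function, with a parameter k around
    return x * 10 + k
k = 7
fix_k = lambda x: h(x, k)         # the function mapping any x to h(x, k)
assert fix_k(4) == 47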
Up to this point, we have done nothing but introduce new notations for
conventional notions. The lambda calculus, however, represents a more radi-
cal departure from the set-theoretic viewpoint. In this framework:
1. Everything denotes a function.
Convention 1. 1. When parentheses are left out, application takes place from
left to right. For example, if M, N, P, and Q are terms, then MNPQ ab-
breviates ((( MN ) P) Q).
For example,
λxy. xxyxλz. xz
abbreviates
λx. λy. (((( xx )y) x )(λz. ( xz))).
You should memorize these conventions. They will drive you crazy at first,
but you will get used to them, and after a while they will drive you less crazy
than having to deal with a morass of parentheses.
Two terms that differ only in the names of the bound variables are called α-
equivalent; for example, λx. x and λy. y. It will be convenient to think of these
as being the “same” term; in other words, when we say that M and N are the
same, we also mean “up to renamings of the bound variables.” Variables that
are in the scope of a λ are called “bound”, while others are called “free.” There
are no free variables in the previous example; but in
(λz. yz) x, both y and x are free, while z is bound.
1. We have
(λx. xxy) λz. z −→ (λz. z)(λz. z)y −→ (λz. z)y −→ y.

(λx. xx)(λx. xx) −→ (λx. xx)(λx. xx).
4. Also, some terms can be reduced in more than one way; for example,
by contracting the innermost one. Note, in this case, however, that both
terms further reduce to the same term, zv.
The final outcome in the last example is not a coincidence, but rather il-
lustrates a deep and important property of the lambda calculus, known as the
“Church-Rosser property.”
Corollary 41.2. Suppose M can be reduced to normal form. Then this normal form
is unique.
Proof. If M −→→ N1 and M −→→ N2, by the previous theorem there is a term P
such that N1 and N2 both reduce to P. If N1 and N2 are both in normal form,
this can only happen if N1 ≡ P ≡ N2.
Finally, we will say that two terms M and N are β-equivalent, or just equiv-
alent, if they reduce to a common term; in other words, if there is some P such
that M −→→ P and N −→→ P. This is written M =β N. Using Theorem 41.1, you
can check that =β is an equivalence relation, with the additional property that
for every M and N, if M −→→ N or N −→→ M, then M =β N. (In fact, one can
show that =β is the smallest equivalence relation having this property.)
41.5 Currying
A λ-abstract λx. M represents a function of one argument, which is quite a
limitation when we want to define functions accepting multiple arguments.
One way to do this would be by extending the λ-calculus to allow the for-
mation of pairs, triples, etc., in which case, say, a three-place function λx. M
would expect its argument to be a triple. However, it is more convenient to
do this by Currying.
Let’s consider an example. We’ll pretend for a moment that we have a
+ operation in the λ-calculus. The addition function is 2-place, i.e., it takes
two arguments. But a λ-abstract only gives us functions of one argument: the
syntax does not allow expressions like λ( x, y). ( x + y). However, we can con-
sider the one-place function f x (y) given by λy. ( x + y), which adds x to its
single argument y. Actually, this is not a single function, but a family of dif-
ferent functions “add x,” one for each number x. Now we can define another
one-place function g as λx. f x . Applied to argument x, g( x ) returns the func-
tion f x —so its values are other functions. Now if we apply g to x, and then
the result to y we get: ( g( x ))y = f x (y) = x + y. In this way, the one-place
function g can do the same job as the two-place addition function. “Curry-
ing” simply refers to this trick for turning two-place functions into one place
functions (whose values are one-place functions).
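Currying is easy to reproduce in Python (a sketch; the names are not from the text):

def add(x, y):                              # the uncurried, two-place addition
    return x + y

curried_add = lambda x: (lambda y: x + y)   # one-place function whose values are functions

add_three = curried_add(3)                  # the function "add 3"
assert add_three(4) == 7
assert curried_add(3)(4) == add(3, 4)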
Here is an example properly in the syntax of the λ-calculus. How do we
represent the function f ( x, y) = x? If we want to define a function that accepts
two arguments and returns the first, we can write λx. λy. x, which literally is
a function that accepts an argument x and returns the function λy. x. The
function λy. x accepts another argument y, but drops it, and always returns x.
Let’s see what happens when we apply λx. λy. x to two arguments:
(λx. λy. x) M N −→β (λy. M) N −→β M
More generally, we get:
(λx1. λx2. . . . λxn. N) M1 . . . Mn −→β ((λx2. . . . λxn. N)[M1/x1]) M2 . . . Mn
≡ (λx2. . . . λxn. N[M1/x1]) M2 . . . Mn
...
−→β N[M1/x1] . . . [Mn/xn]
The last line literally means substituting Mi for xi in the body of the function
definition, which is exactly what we want when applying multiple arguments
to a function.
Definition 41.3. For each natural number n, define the Church numeral n to be
the lambda term λx. λy. ( x ( x ( x (. . . x (y))))), where there are n x’s in all.
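Church numerals can be mimicked with ordinary Python functions (a sketch, with the bound variables of Definition 41.3 renamed to f and x):

def church(n):
    # the Church numeral for n: apply the first argument n times to the second
    def numeral(f):
        def apply_n(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n
    return numeral

def unchurch(c):
    # read a Church numeral back off as an ordinary integer
    return c(lambda k: k + 1)(0)

assert unchurch(church(0)) == 0
assert unchurch(church(3)) == 3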
Proof. We need to show that every partial computable function f is λ-defined
by a lambda term F. By Kleene’s normal form theorem, it suffices to show that
every primitive recursive function is λ-defined by a lambda term, and then
that the functions λ-definable are closed under suitable compositions and un-
bounded search. To show that every primitive recursive function is λ-defined
by a lambda term, it suffices to show that the initial functions are λ-definable,
and that the partial functions that are λ-definable are closed under composi-
tion, primitive recursion, and unbounded search.
We will use a more conventional notation to make the rest of the proof
more readable. For example, we will write M ( x, y, z) instead of Mxyz. While
this is suggestive, you should remember that terms in the untyped lambda
calculus do not have associated arities; so, for the same term M, it makes just
as much sense to write M ( x, y) and M ( x, y, z, w). But using this notation indi-
cates that we are treating M as a function of three variables, and helps make
the intentions behind the definitions clearer. In a similar way, we will say
“define M by M ( x, y, z) = . . . ” instead of “define M by M = λx. λy. λz. . . ..”
F (0) ≡ G
F (n + 1) ≡ H (n, F (n))
In other words, with lambda trickery, we can avoid having to worry about the
extra parameters ⃗z—they just get absorbed in the lambda notation.
Before we define the term F, we need a mechanism for handling ordered
pairs. This is provided by the next lemma.
Lemma 41.10. There is a lambda term D such that for each pair of lambda terms M
and N, D(M, N)(0) −→→ M and D(M, N)(1) −→→ N.
K (y) = λx. y.
In other words, K is the term λy. λx. y. Looking at it differently, for every M,
K ( M ) is a constant function that returns M on any input.
Now define D(x, y, z) by D(x, y, z) = z(K(y))x. Then we have
D(M, N, 0) −→→ 0(K(N))M −→→ M and
D(M, N, 1) −→→ 1(K(N))M −→→ K(N)M −→→ N,
as required.
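In Python, the same pairing trick can be played out with Church numerals standing in for 0 and 1 (a sketch; the Python names are illustrative):

zero = lambda f: lambda x: x                  # Church numeral 0
one  = lambda f: lambda x: f(x)               # Church numeral 1

K = lambda y: lambda x: y                     # K(y): the constant function returning y
D = lambda m: lambda n: lambda z: z(K(n))(m)  # D(x, y, z) = z (K(y)) x, written curried

assert D("M")("N")(zero) == "M"               # D(M, N, 0) reduces to M
assert D("M")("N")(one) == "N"                # D(M, N, 1) reduces to N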
Proof. We need to show that given any terms, G and H, we can find a term F
such that
F (0) ≡ G
F (n + 1) ≡ H (n, F (n))
for every natural number n. The idea is roughly to compute sequences of pairs
diag( x ) = xx
and
l ( x ) = g(diag( x ))
using our notational conventions; in other words, l is the term λx. g( xx ). Let
k be the term ll. Then we have
k = (λx. g( xx ))(λx. g( xx ))
−→→ g((λx. g(xx))(λx. g(xx)))
= gk.
If one takes
Y = λg. ((λx. g(xx))(λx. g(xx)))
then Yg and g(Yg) reduce to a common term; so Yg =β g(Yg). This is known
as "Curry's combinator." If instead one takes
g( x ) ≃ µy f ( x, y).
Then g is λ-definable.
Proof. The idea is roughly as follows. Given x, we will use the fixed-point
lambda term Y to define a function h x (n) which searches for a y starting at n;
then g( x ) is just h x (0). The function h x can be expressed as the solution of a
fixed-point equation:
h_x(n) ≃ n if f(x, n) = 0, and h_x(n) ≃ h_x(n + 1) otherwise.
We can do this using the fixed-point term Y. First, let U be the term
and then let H be the term YU. Notice that the only free variable in H is x. Let
us show that H satisfies the equation above.
By the definition of Y, we have
H = YU ≡ U (YU ) = U ( H ).
H (n) ≡ U ( H, n)
−→→ D(n, H(S(n)), F(x, n)),
as required. Notice that if you substitute a numeral m for x in the last line, the
expression reduces to n if F (m, n) reduces to 0, and it reduces to H (S(n)) if
F (m, n) reduces to any other numeral.
To finish off the proof, let G be λx. H(0). Then G λ-defines g; in other
words, for every m, G(m) reduces to g(m), if g(m) is defined, and has no
normal form otherwise.
Syntax
42.1 Terms
The terms of the lambda calculus are built up inductively from an infinite
supply of variables v0, v1, . . . , the symbol "λ", and parentheses. We will use
x, y, z, . . . to designate variables, and M, N, P, . . . to designate terms.
Definition 42.1 (Terms). The set of terms of the lambda calculus is defined
inductively by:
42.2 Unique Readability
Lemma 42.3. The result of an application starts with either two parentheses or a
parenthesis and a variable.
Proposition 42.5 (Unique Readability). There is a unique formation for each term.
In other words, if a term M is formed by a formation, then it is the only formation
that can form this term.
3. M is of the form (PQ), where P and Q are terms. Since it starts with
a parenthesis, it cannot also be constructed by Definition 42.1(1). By
Lemma 42.2, P cannot begin with λ, so (PQ) cannot be the result of an
abstraction. Now suppose there were another way of constructing M by
application, e.g., it is also of the form (P′Q′). Then P is a proper initial
segment of P′ (or vice versa), and this is impossible by Lemma 42.4. So P
and Q are uniquely determined, and by inductive hypothesis we know
that the formations of P and Q are unique.
1. When parentheses are left out, application takes place from left to right.
For example, if M, N, P, and Q are terms, then MNPQ abbreviates
((( MN ) P) Q).
2. Again, when parentheses are left out, lambda abstraction is given the
widest scope possible. For example, λx. MNP is read as λx. (MNP).
3. A lambda can be used to abstract multiple variables. For example, λxyz. M
is short for λx. λy. λz. M.
For example,
λxy. xxyxλz. xz
abbreviates
(λx. (λy. (((( xx )y) x )(λz. ( xz))))).
Example 42.9. In λx. xy, both x and y are in the scope of λx, so x is bound by
λx. Since y is not in the scope of any λy, it is free. In λx. xx, both occurrences of
x are bound by λx, since both are free in xx. In ((λx. xx ) x ), the last occurrence
of x is free, since it is not in the scope of a λx. In λx. (λx. x ) x, the scope of
the first λx is (λx. x ) x and the scope of the second λx is the second-to-last
occurrence of x. In (λx. x ) x, the last occurrence of x is free, and the second-to-
last is bound. Thus, the second-to-last occurrence of x in λx. (λx. x ) x is bound
by the second λx, and the last occurrence by the first λx.
For a term P, we can check all variable occurrences in it and get a set of free
variables. This set is denoted by FV( P) with a natural definition as follows:
Definition 42.10 (Free variables of a term). The set of free variables of a term
is defined inductively by:
1. FV(x) = {x};
2. FV(λx. N) = FV(N) \ {x};
3. FV(PQ) = FV(P) ∪ FV(Q).
Proof. Exercise.
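A small Python sketch of terms and free variables (the representation is illustrative, not the book's):

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Abs:
    var: str
    body: object

def fv(term):
    # the three clauses of Definition 42.10
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, Abs):
        return fv(term.body) - {term.var}
    if isinstance(term, App):
        return fv(term.fun) | fv(term.arg)
    raise TypeError(term)

# (lambda z. yz) x has free variables x and y
example = App(Abs("z", App(Var("y"), Var("z"))), Var("x"))
assert fv(example) == {"x", "y"}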
42.5 Substitution
Free variables are references to variables in the environment, so it makes sense
to actually use a specific value in the place of a free variable. For example, we
may want to replace f in λx. f x with a specific term, like the identity function
λy. y. This results in λx. (λy. y) x. The process of replacing free variables with
lambda terms is called substitution.
1. x [ N/x ] = N.
2. y[ N/x ] = y if x ̸= y.
Theorem 42.14. If x ∉ FV(M), then FV(M[N/x]) = FV(M), if the left-hand side
is defined.
1. M is a variable: exercise.
Then:
FV((λy. P)[N/x]) = FV(λy. P[N/x]) by (4)
= FV(P[N/x]) \ {y} by Definition 42.10(2)
= FV(P) \ {y} by inductive hypothesis
= FV(λy. P) by Definition 42.10(2)
1. M is a variable: exercise.
2. M is of the form PQ: Since (PQ)[N/x] is defined, it has to be (P[N/x])(Q[N/x]),
with both substitutions defined. Also, since x ∈ FV(PQ), either x ∈
FV(P) or x ∈ FV(Q) or both. The rest is left as an exercise.
3. M is of the form λy. P. Since (λy. P)[N/x] is defined, it has to be λy. P[N/x],
with P[N/x] defined, x ≠ y and y ∉ FV(N); also, since x ∈ FV(λy. P),
we have x ∈ FV(P) too. Now:
Theorem 42.16. x ∉ FV(M[N/x]), if M[N/x] is defined and x ∉ FV(N).
Proof. Exercise.
1. M is a variable z: Exercise.
42.6 α-Conversion
What is the relation between λx. x and λy. y? They both represent the identity
function. They are, of course, syntactically different terms. They differ only in
the name of the bound variable, and one is the result of “renaming” the bound
variable in the other. This is called α-conversion.
Definition 42.18 (Change of bound variable, −→α). If a term M contains an oc-
currence of λx. N, y ∉ FV(N), and N[y/x] is defined, then replacing this oc-
currence by
λy. N[y/x]
“Smallest” here means the relation contains only pairs that are required
by compatibility and the additional condition, and nothing else. Thus this
relation can also be defined as follows:
Definition 42.21 (Change of bound variable, −→α). Change of bound variable (−→α)
is inductively defined as follows:
1. If N −→α N′ then λx. N −→α λx. N′.
2. If P −→α P′ then (PQ) −→α (P′Q).
3. If Q −→α Q′ then (PQ) −→α (PQ′).
4. If x ≠ y, y ∉ FV(N) and N[y/x] is defined, then λx. N −→α λy. N[y/x].
The definitions are equivalent, but we leave the proof as an exercise. From
now on we will use the inductive definition.
Definition 42.22 (α-conversion, −→→α). α-conversion (−→→α) is the smallest reflex-
ive and transitive relation on terms containing −→α.
a) If x ∈ FV ( N ), then:
b) If x ∈
/ FV ( N ), then:
Lemma 42.26. If P −→α Q then Q −→α P.
Proof. Induction on the derivation of P −→α Q.
1. If the last rule is (4), then P is of the form λx. N and Q of the form
λy. N[y/x], where x ≠ y, y ∉ FV(N) and N[y/x] is defined. First, we
have y ∉ FV(N[y/x]) by Theorem 42.16. By Theorem 42.17 we have
that N[y/x][x/y] is not only defined, but also equal to N. Then by (4),
we have λy. N[y/x] −→α λx. N[y/x][x/y] = λx. N.
α-equivalent to M [ R/y].
Proof. Exercise.
Theorem 42.30. For any M, R, and y, there exists M′ such that M′ =α M and
M′[R/y] is defined. Moreover, if there is another pair M′′ =α M and R′′ =α R where
M′′[R′′/y] is defined, then M′[R/y] =α M′′[R′′/y].
1. M is a variable z: Exercise.
Corollary 42.31. For any M, R, and y, there exists a pair M′ and R′ such that
M′ =α M, R′ =α R and M′[R′/y] is defined. Moreover, if there is another pair
M′′ =α M and R′′ =α R with M′′[R′′/y] defined, then M′[R′/y] =α M′′[R′′/y].
Definition 42.33.
FΓ ( x ) = Γ ( x )
FΓ ( PQ) = FΓ ( P) FΓ ( Q)
FΓ (λx. N ) = λ. Fx,Γ ( N )
where Γ is a list of variables indexed from zero, and Γ ( x ) denotes the position
of the variable x in Γ. For example, if Γ is x, y, z, then Γ ( x ) is 0 and Γ (z) is 2.
x, Γ denotes the list resulting from pushing x to the head of Γ; for instance,
continuing the last example, w, Γ is w, x, y, z.
Definition 42.34.
GΓ ( n ) = Γ [ n ]
GΓ ( PQ) = GΓ ( P) GΓ ( Q)
GΓ (λ. N ) = λx. Gx,Γ ( N )
where Γ is again a list of variables indexed from zero, and Γ [n] denotes the
variable in position n. For example, if Γ is x, y, z, then Γ [1] is y.
The variable x in the last equation is chosen to be any variable that is not in Γ.
It is not hard to see that they are well defined, because α-conversion is
compatible.
Note how this definition significantly simplifies our reasoning. For exam-
ple:
42.9 β-reduction
When we see (λm. (λy. y)m), it is natural to conjecture that it has some connec-
tion with λm. m, namely the second term should be the result of “simplifying”
the first. The notion of β-reduction captures this intuition formally.
Definition 42.39 (β-contraction, −→β). β-contraction (−→β) is the smallest com-
patible relation on terms satisfying the following condition:
(λx. N) Q −→β N[Q/x]
We say P is β-contracted to Q if P −→β Q. A term of the form (λx. N) Q is called
a redex.
Definition 42.40 (β-reduction, −→→β). β-reduction (−→→β) is the smallest reflexive,
transitive relation on terms containing −→β. We say P is β-reduced to Q if P −→→β Q.
We will write −→ instead of −→β, and −→→ instead of −→→β when context is clear.
Informally speaking, M −→→β N if and only if M can be changed to N by
zero or several steps of β-contraction.
If M −→→β N and N is β-normal, then we say N is a normal form of M. One
may ask if the normal form of a term is unique, and the answer is yes, as we
will see later.
Let us consider some examples.
1. We have
(λx. xxy) λz. z −→ (λz. z)(λz. z)y −→ (λz. z)y −→ y

(λx. xx)(λx. xx) −→ (λx. xx)(λx. xx)
4. Also, some terms can be reduced in more than one way; for example,
by contracting the innermost one. Note, in this case, however, that both
terms further reduce to the same term, zv.
The final outcome in the last example is not a coincidence, but rather il-
lustrates a deep and important property of the lambda calculus, known as the
Church-Rosser property.
In general, there is more than one way to β-reduce a term, thus many
reduction strategies have been invented, among which the most common is
the natural strategy. The natural strategy always contracts the left-most redex,
where the position of a redex is defined as its starting point in the term. The
natural strategy has the useful property that a term can be reduced to a normal
form by some strategy iff it can be reduced to normal form using the natural
strategy. In what follows we will use the natural strategy unless otherwise
specified.
1. M = M.
2. If M = N, then N = M.
3. If M = N, N = O, then M = O.
4. If M = N, then PM = PN.
5. If M = N, then MQ = NQ.
6. If M = N, then λx. M = λx. N.
7. (λx. N) Q = N[Q/x].
The first three rules make the relation an equivalence relation; the next
three make it compatible; the last ensures that it contains β-contraction.
Informally speaking, two terms are β-equivalent if and only if one of them
can be changed to the other in zero or more steps of β-contraction, or “inverse”
of β-contraction. The inverse of β-contraction is defined so that M inverse-β-
contracts to N iff N β-contracts to M.
Besides the above rules, we will extend the relation with more rules, and
denote the extended equivalence relation as =X, where X is the extending rule.
42.10 η-conversion
There is another relation on λ terms. In section 42.4 we used the example
λx. ( f x ), which accepts an argument and applies f to it. In other words, it
is the same function as f : λx. ( f x ) N and f N both reduce to f N. We use η-
reduction (and η-extension) to capture this idea.
Definition 42.43 (η-contraction, −→η). η-contraction (−→η) is the smallest compat-
ible relation on terms satisfying the following condition:
λx. Mx −→η M, provided x ∉ FV(M).
Definition 42.44 (βη-reduction, −→→βη). βη-reduction (−→→βη) is the smallest reflex-
ive, transitive relation on terms containing −→β and −→η, i.e., the rules of reflex-
ivity and transitivity plus the following two rules:
1. If M −→β N then M −→→βη N.
2. If M −→η N then M −→→βη N.
Roughly speaking, the rule states that two terms, viewed as functions,
should be considered equal if they behave the same for the same argument.
We now prove that the η rule provides exactly the extensionality, and noth-
ing else.
Theorem 42.47. M =ext N if and only if M =η N.

Proof. First we prove that =η is closed under the extensionality rule. That is, the
ext rule doesn't add anything to =η. We then have that =η contains =ext, and if
M =ext N, then M =η N.

To prove that =η is closed under ext, note that for any M =η N derived by the
ext rule, we have Mx =η Nx as premise. Then we have λx. Mx =η λx. Nx by a
rule of =η; applying η on both sides gives us M =η N.

Similarly we prove that the η rule is contained in =ext. For any λx. Mx and
M with x ∉ FV(M), we have that (λx. Mx)x =ext Mx, giving us λx. Mx =ext M
by the ext rule.
Problems
Problem 42.1. Describe the formation of (λg. (λx. ( g( xx )))(λx. ( g( xx )))).
Problem 42.4. 1. Identify the scopes of λg and the two λx in this term:
λg. (λx. g( xx ))λx. g( xx ).
2. In λg. (λx. g( xx ))λx. g( xx ), are all occurrences of variables bound? By
which abstractions are they bound respectively?
3. Give FV(λx. (λy. (λz. xy)z)y)
The Church-Rosser Property
Theorem 43.2. If a relation −→X satisfies the Church-Rosser property, and −→→X is the
smallest transitive relation containing −→X, then −→→X satisfies the Church-Rosser prop-
erty too.
Proof. Suppose
M −→X P1 −→X . . . −→X Pm and
M −→X Q1 −→X . . . −→X Qn.
N0,0 = M
Ni,0 = Pi if 1 ≤ i ≤ m
N0,j = Q j if 1 ≤ j ≤ n
and otherwise:
Ni,j = R
where R is a term such that Ni−1,j −→X R and Ni,j−1 −→X R. By the Church-Rosser
property of −→X, such a term always exists.
Now we have Nm,0 −→X . . . −→X Nm,n and N0,n −→X . . . −→X Nm,n. Note Nm,0 is
P and N0,n is Q. By definition of −→→X the theorem follows.
Proof. Exercise.
x∗β = x (43.1)
(λx. N)∗β = λx. N∗β (43.2)
(PQ)∗β = P∗β Q∗β if P is not a λ-abstract (43.3)
((λx. N) Q)∗β = N∗β[Q∗β/x] (43.4)
Lemma 43.6. If M =⇒β M′ and R =⇒β R′, then M[R/y] =⇒β M′[R′/y].
Proof. By induction on the derivation of M =⇒β M′.
Lemma 43.7. If M =⇒β M′ then M′ =⇒β M∗β.
Proof. By induction on the derivation of M =⇒β M′.
Theorem 43.8. =⇒β has the Church-Rosser property.
43.3 β-reduction
Lemma 43.9. If M −→β M′, then M =⇒β M′.
Proof. If M −→β M′, then M is (λx. N) Q, M′ is N[Q/x], for some x, N, and Q.
Since N =⇒β N and Q =⇒β Q by Theorem 43.4, we immediately have (λx. N) Q =⇒β
N[Q/x] by Definition 43.3(4).
Lemma 43.10. If M =⇒β M′, then M −→→β M′.
Proof. By induction on the derivation of M =⇒β M′.
1. The last rule is (1): Then M and M′ are just x, and x −→→β x.
2. The last rule is (2): M is λx. N and M′ is λx. N′ for some x, N, N′, where
N =⇒β N′. By induction hypothesis we have N −→→β N′. Then λx. N −→→β
λx. N′ (by the same series of −→β contractions as N −→→β N′).
Lemma 43.11. −→→β is the smallest transitive relation containing =⇒β.
Proof. Let −→→X be the smallest transitive relation containing =⇒β.
−→→β ⊆ −→→X: Suppose M −→→β M′, i.e., M ≡ M1 −→β . . . −→β Mk ≡ M′. By
Lemma 43.9, M ≡ M1 =⇒β . . . =⇒β Mk ≡ M′. Since −→→X contains =⇒β and is
transitive, M −→→X M′.
−→→X ⊆ −→→β: Suppose M −→→X M′, i.e., M ≡ M1 =⇒β . . . =⇒β Mk ≡ M′. By
Lemma 43.10, M ≡ M1 −→→β . . . −→→β Mk ≡ M′. Since −→→β is transitive, M −→→β
M′.
Theorem 43.12. −→→β satisfies the Church-Rosser property.
Proof. Immediate from Theorem 43.2, Theorem 43.8, and Lemma 43.11.
Theorem 43.14. M =⇒βη M.
Proof. Exercise.
x∗βη = x (43.5)
(λx. N)∗βη = λx. N∗βη (43.6)
(PQ)∗βη = P∗βη Q∗βη if P is not a λ-abstract (43.7)
((λx. N) Q)∗βη = N∗βη[Q∗βη/x] (43.8)
(λx. Nx)∗βη = N∗βη if x ∉ FV(N) (43.9)
Lemma 43.16. If M =⇒βη M′ and R =⇒βη R′, then M[R/y] =⇒βη M′[R′/y].
Proof. By induction on the derivation of M =⇒βη M′.
The first four cases are exactly like those in Lemma 43.6. If the last rule
is (5), then M is λx. Nx, M′ is N′ for some x and N′ where x ∉ FV(N),
and N =⇒βη N′. We want to show that (λx. Nx)[R/y] =⇒βη N′[R′/y], i.e.,
λx. N[R/y]x =⇒βη N′[R′/y]. It follows by Definition 43.13(5) and the induc-
tion hypothesis.
Lemma 43.17. If M =⇒βη M′ then M′ =⇒βη M∗βη.
Proof. By induction on the derivation of M =⇒βη M′.
The first four cases are like those in Lemma 43.7. If the last rule is (5),
then M is λx. Nx and M′ is N′ for some x, N, N′ where x ∉ FV(N) and
N =⇒βη N′. We want to show that N′ =⇒βη (λx. Nx)∗βη, i.e., N′ =⇒βη N∗βη,
43.5 βη-reduction
The Church-Rosser property holds for βη-reduction (−→→βη).
Lemma 43.19. If M −→βη M′, then M =⇒βη M′.
Proof. By induction on the derivation of M −→βη M′. If M −→ M′ by η-conversion
(i.e., Definition 42.43), we use Theorem 43.14. The other cases are as in Lemma 43.9.
Lemma 43.20. If M =⇒βη M′, then M −→→βη M′.
Proof. Induction on the derivation of M =⇒βη M′.
If the last rule is (5), then M is λx. Nx and M′ is N′ for some x, N, N′
where x ∉ FV(N) and N =⇒βη N′. Thus we can first reduce λx. Nx to N by
η-conversion, followed by the series of −→βη steps that show that N −→→βη N′,
which holds by induction hypothesis.
Lemma 43.21. −→→βη is the smallest transitive relation containing =⇒βη.
Proof. As in Lemma 43.11.
Theorem 43.22. −→→βη satisfies the Church-Rosser property.
Proof. By Theorem 43.2, Theorem 43.18 and Lemma 43.21.
Problems
Problem 43.1. Prove Theorem 43.4.
Lambda Definability
44.1 Introduction
At first glance, the lambda calculus is just a very abstract calculus of expres-
sions that represent functions and applications of them to others. Nothing in
the syntax of the lambda calculus suggests that these are functions of partic-
ular kinds of objects, in particular, the syntax includes no mention of natural
numbers. Its basic operations—application and lambda abstractions—are op-
erations that apply to any function, not just functions on natural numbers.
Nevertheless, with some ingenuity, it is possible to define arithmetical
functions, i.e., functions on the natural numbers, in the lambda calculus. To
do this, we define, for each natural number n ∈ N, a special λ-term n, the
Church numeral for n. (Church numerals are named for Alonzo Church.)
n ≡ λ f x. f n ( x )
44.2 λ-Definable Arithmetical Functions
F n0 n1 . . . nk−1 −→→ f(n0, n1, . . . , nk−1)
A very simple example is given by the constant functions. The term Ck ≡ λx. k λ-
defines the function ck : N → N such that ck(n) = k, since Ck n ≡ (λx. k) n −→ k
for any n. The identity function is λ-defined by λx. x. More complex functions
are of course harder to define, and often require a lot of ingenuity. So it is per-
haps surprising that every computable function is λ-definable. The converse
is also true: if a function is λ-definable, it is computable.
Succ ≡ λa. λf x. f(a f x).
(λa. λf x. f(a f x)) n −→ λf x. f(n f x).
Succ n −→→ λf x. f(f^n(x)),
i.e., n + 1.
Example 44.4. Let’s look at what happens when we apply Succ to 0, i.e., λ f x. x.
We’ll spell the terms out in full:
Add ≡ λab. λ f x. a f (b f x )
or, alternatively,
Add′ ≡ λab. a Succ b.
The first addition works as follows: Add first accepts two numbers a and b.
The result is a function that accepts f and x and returns a f (b f x). If a and b
are Church numerals n and m, this reduces to f^(n+m)(x), which is identical to
f^n(f^m(x)). Or, slowly:
(λab. λf x. a f (b f x)) n m −→→ λf x. n f (m f x)
−→→ λf x. n f (f^m x)
−→→ λf x. f^n(f^m x) ≡ n + m.
Add′ n m −→ n Succ m.
n Succ m −→→ Succ^n(m).
And since Succ λ-defines the successor function, and the successor function
applied n times to m gives n + m, this in turn reduces to n + m.
Proof. To see how this works, suppose we apply Mult to Church numerals
n and m: Mult n m reduces to λ f x. n(m f ) x. The term m f defines a function
which applies f to its argument m times. Consequently, n(m f ) x applies the
function “apply f m times” itself n times to x. In other words, we apply f to
x, n · m times. But the resulting normal term is just the Church numeral nm.
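Here is a Python sketch of this arithmetic on Church numerals (Add and Succ follow the terms above; Mult follows the term described in the proof; the Python names themselves are illustrative):

zero = lambda f: lambda x: x
succ = lambda a: lambda f: lambda x: f(a(f)(x))               # Succ = λa. λf x. f (a f x)
add  = lambda a: lambda b: lambda f: lambda x: a(f)(b(f)(x))  # Add = λab. λf x. a f (b f x)
mult = lambda a: lambda b: lambda f: lambda x: a(b(f))(x)     # Mult n m reduces to λf x. n (m f) x

def church(n):
    c = zero
    for _ in range(n):
        c = succ(c)
    return c

def unchurch(c):
    return c(lambda k: k + 1)(0)

assert unchurch(add(church(2))(church(3))) == 5
assert unchurch(mult(church(2))(church(3))) == 6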
Truth values are represented as selectors, i.e., functions that accept two ar-
guments and return one of them. The truth value true selects its first argu-
ment, and false its second. For example, true MN always reduces to M, while
false MN always reduces to N.
R n1 . . . nk −→→ false
otherwise.
For instance, the relation IsZero = {0}, which holds of 0 and only of 0, is
λ-definable by
IsZero ≡ λn. n(λx. false) true.
How does it work? Since Church numerals are defined as iterators (functions
which apply their first argument n times to the second), we set the initial value
to be true, and for every step of iteration, we return false regardless of the
result of the last iteration. This step will be applied to the initial value n times,
and the result will be true if and only if the step is not applied at all, i.e., when
n = 0.
The function “Not” accepts one argument, and returns true if the argument is
false, and false if the argument is true. The function “And” accepts two truth
values as arguments, and should return true iff both arguments are true. Truth
values are represented as selectors (described above), so when x is a truth
value and is applied to two arguments, the result will be the first argument if x
is true and the second argument otherwise. Now And takes its two arguments
x and y, and in return passes y and false to its first argument x. Assuming x is
a truth value, the result will evaluate to y if x is true, and to false if x is false,
which is just what is desired.
Note that we assume here that only truth values are used as arguments to
And. If it is passed other terms, the result (i.e., the normal form, if it exists)
may well not be a truth value.
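The selector encoding is easy to play with in Python. The terms for Not and And below are one common choice consistent with the description above (the text's own terms are not reproduced here), and is_zero follows IsZero:

true  = lambda x: lambda y: x           # selects its first argument
false = lambda x: lambda y: y           # selects its second argument

not_ = lambda v: v(false)(true)         # returns true iff the argument is false
and_ = lambda x: lambda y: x(y)(false)  # passes y and false to its first argument x

is_zero = lambda n: n(lambda _: false)(true)   # IsZero = λn. n (λx. false) true

zero = lambda f: lambda x: x
two  = lambda f: lambda x: f(f(x))

assert not_(false) is true
assert and_(true)(false) is false
assert is_zero(zero) is true
assert is_zero(two) is false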
Lemma 44.9. The basic primitive recursive functions zero, succ, and projections P^n_i
are λ-definable.
Zero ≡ λa. λf x. x
Succ ≡ λa. λf x. f(a f x)
Proj^n_i ≡ λx0 . . . xn−1. xi
Lemma 44.10. Suppose the k-ary function f, and n-ary functions g0, . . . , gk−1, are
λ-definable by terms F, G0, . . . , Gk−1, and h is defined from them by composition. Then
h is λ-definable.
Note that Lemma 44.10 did not require that f and g0 , . . . , gk−1 are primitive
recursive; it is only required that they are total and λ-definable.
Lemma 44.11. Suppose f is an n-ary function and g is an n + 2-ary function, they
are λ-definable by terms F and G, and the function h is defined from f and g by
primitive recursion. Then h is also λ-definable.
Proof. Recall that h is defined by
h(x1, . . . , xn, 0) = f(x1, . . . , xn)
h(x1, . . . , xn, y + 1) = g(x1, . . . , xn, y, h(x1, . . . , xn, y)).
Informally speaking, the primitive recursive definition iterates the application
of the function g y times and applies it to f(x1, . . . , xn). This is reminiscent of
the definition of Church numerals, which are also defined as iterators.
For simplicity, we give the definition and proof for a single additional ar-
gument x. The function h is λ-defined by:
H ≡λx. λy. Snd(yD ⟨0, Fx ⟩)
where
Proof. By Lemma 44.9, all basic functions are λ-definable, and by Lemma 44.10
and Lemma 44.11, the λ-definable functions are closed under composition and
primitive recursion.
44.6 Fixpoints
Suppose we wanted to define the factorial function by recursion as a term Fac
with the following property:
only involves previously defined terms in the right-hand side, such as Add.
We can always remove Add by replacing it with its defining term. This would
give the term Mult as a pure lambda term; if Add itself involved defined terms
(as, e.g., Add′ does), we could continue this process and finally arrive at a pure
lambda term.
However this is not true in the case of recursive definitions like the one of
Fac above. If we replace the occurrence of Fac on the right-hand side with the
definition of Fac itself, we get:
and we still haven’t gotten rid of Fac on the right-hand side. Clearly, if we
repeat this process, the definition keeps growing longer and the process never
results in a pure lambda term. Thus this way of defining factorial (or more
generally recursive functions) is not feasible.
The recursive definition does tell us something, though: If f were a term
representing the factorial function, then the term
applied to the term f , i.e., Fac′ f , also represents the factorial function. That is,
if we regard Fac′ as a function accepting a function and returning a function,
the value of Fac′ f is just f , provided f is the factorial. A function f with the
β
property that Fac′ f = f is called a fixpoint of Fac′ . So, the factorial is a fixpoint
of Fac′ .
There are terms in the lambda calculus that compute the fixpoints of a
given term, and these terms can then be used to turn a term like Fac′ into the
definition of the factorial.
Yg −→→ g(Yg)
−→→ g(g(Yg))
−→→ g(g(g(Yg)))
. . .
Note that the above sequence of β-reduction steps starting with Yg is infi-
nite. So if we apply Yg to some term, i.e., consider (Yg) N, that term will also
reduce to infinitely many different terms, namely ( g(Yg)) N, ( g( g(Yg))) N,
. . . . It is nevertheless possible that some other sequence of reduction steps
does terminate in a normal form.
Take the factorial for instance. Define Fac as Y Fac′ (i.e., a fixpoint of Fac′).
Then:
Fac 3 −→→ Y Fac′ 3
−→→ Fac′ (Y Fac′) 3
≡ (λx. λn. IsZero n 1 (Mult n (x (Pred n)))) Fac 3
−→→ IsZero 3 1 (Mult 3 (Fac (Pred 3)))
−→→ Mult 3 (Fac 2).
Similarly,
Fac 2 −→→ Mult 2 (Fac 1)
Fac 1 −→→ Mult 1 (Fac 0)
but
Fac 0 −→→ Fac′ (Y Fac′) 0
≡ (λx. λn. IsZero n 1 (Mult n (x (Pred n)))) Fac 0
−→→ IsZero 0 1 (Mult 0 (Fac (Pred 0)))
−→→ 1.
So together
Fac 3 −→→ Mult 3 (Mult 2 (Mult 1 1)).
What goes for Fac′ goes for any recursive definition. Suppose we have a
recursive equation
g x1 . . . xn =β N
G ≡ (Y λg. λx1 . . . xn. N) −→→ (λg. λx1 . . . xn. N)(Y λg. λx1 . . . xn. N)
≡ (λg. λx1 . . . xn. N) G
and consequently
G x1 . . . xn −→→ (λg. λx1 . . . xn. N) G x1 . . . xn
−→→ (λx1 . . . xn. N[G/g]) x1 . . . xn
−→→ N[G/g].
Church's combinator is a bit weaker than Turing's in that Yg =β g(Yg) but not
Yg −→→ g(Yg). Let V be the term λx. g(xx), so that YC ≡ λg. VV. Then
VV ≡ (λx. g(xx))V −→→ g(VV) and thus
YC g ≡ (λg. VV) g −→→ VV −→→ g(VV), but also
g(YC g) ≡ g((λg. VV) g) −→→ g(VV).
In other words, YC g and g(YC g) reduce to a common term g(VV); so YC g =β
g(YC g). This is often enough for applications.
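The fixpoint trick can be reproduced in Python, with one caveat: under Python's eager evaluation, Curry's Y loops forever, so the sketch below uses the η-expanded variant often called the Z combinator, and ordinary integers in place of Church numerals (IsZero, Mult and Pred are replaced by built-in operations; the names are illustrative):

Z = lambda g: (lambda x: g(lambda v: x(x)(v)))(lambda x: g(lambda v: x(x)(v)))

# an analogue of Fac': applied to a candidate f, it returns a factorial-like function
fac_step = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

fac = Z(fac_step)        # a fixpoint of fac_step
assert fac(3) == 6
assert fac(5) == 120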
44.7 Minimization
The general recursive functions are those that can be obtained from the ba-
sic functions zero, succ, Pin by composition, primitive recursion, and regular
minimization. To show that all general recursive functions are λ-definable we
have to show that any function defined by regular minimization from a λ-
definable function is itself λ-definable.
g( x1 , . . . , xk ) = µy f ( x1 , . . . , xk , y) = 0
is also λ-definable.
Proof. Suppose the lambda term F λ-defines the regular function f (⃗x, y). To
λ-define h we use a search function and a fixpoint combinator:
otherwise call itself with Succ y. Thus (Y Search) Fn1 . . . nk 0 returns the least m
for which f (n1 , . . . , nk , m) = 0.
Specifically, observe that
(Y Search) F n1 . . . nk m −→→ m
if f(n1, . . . , nk, m) = 0, or
−→→ (Y Search) F n1 . . . nk m + 1
otherwise. Hence
(Y Search) F n1 . . . nk 0 −→→ h(n1, . . . , nk).
Proof. By Lemma 44.9, all basic functions are λ-definable, and by Lemma 44.10,
Lemma 44.11, and Lemma 44.15, the λ-definable functions are closed under
composition, primitive recursion, and regular minimization.
Proof. We only sketch the proof. First, we arithmetize λ-terms, i.e., systematically
assign Gödel numbers to λ-terms, using the usual power-of-primes cod-
ing of sequences. Then we define a partial recursive function normalize(t)
operating on the Gödel number t of a lambda term as argument, and which
returns the Gödel number of the normal form if it has one, or is undefined oth-
erwise. Then define two partial recursive functions toChurch and fromChurch
that map natural numbers to and from the Gödel numbers of the correspond-
ing Church numerals.
Now suppose the λ-term F λ-defines the function f. Using these recursive
functions, we can define f as a partial recursive function: to compute
f(n1, . . . , nk), first obtain the Gödel numbers of the corresponding Church nu-
merals using toChurch(ni ), append these to # F# to obtain the Gödel number of
the term Fn1 . . . nk . Now use normalize on this Gödel number. If f (n1 , . . . , nk )
is defined, Fn1 . . . nk has a normal form (which must be a Church numeral),
and otherwise it has no normal form (and so normalize(#F n1 . . . nk#) is undefined).
Problems
Problem 44.1. The term
Succ′ ≡ λn. λ f x. n f ( f x )
Problem 44.3. Explain why the access functions Fst and Snd work.
Problem 44.4. Define the functions Or and Xor representing the truth func-
tions of inclusive and exclusive disjunction using the encoding of truth values
as λ-terms.
Problem 44.5. Complete the proof of Lemma 44.10 by showing that
H n0 . . . nn−1 −→→ h(n0, . . . , nn−1).
Many-valued Logic
45.1 Introduction
In classical logic, we deal with formulas that are built from propositional vari-
ables using the propositional connectives ¬, ∧, ∨, →, and ↔. When we define
a semantics for classical logic, we do so using the two truth values T and F.
We interpret propositional variables in a valuation v, which assigns these truth
values T, F to the propositional variables. Any valuation then determines a
truth value v(φ) for any formula φ. A formula is satisfied in a valuation v,
v ⊨ φ, iff v(φ) = T.
Many-valued logics are generalizations of classical two-valued logic by
allowing more truth values than just T and F. So in many-valued logic, a val-
uation v is a function assigning to every propositional variable p one of a
range of possible truth values. We’ll generally call the set of allowed truth
values V. Classical logic is a many-valued logic where V = {T, F}, and the
truth value v( φ) is computed using the familiar characteristic truth tables for
the connectives.
Once we add additional truth values, we have more than one natural op-
tion for how to compute v( φ) for the connectives we read as “and,” “or,”
“not,” and “if—then.” So a many-valued logic is determined not just by the
set of truth values, but also by the truth functions we decide to use for each
connective. Once these are selected for a many-valued logic L, however, the
truth value vL ( φ) is uniquely determined by the valuation, just like in classical
logic. Many-valued logics, like classical logic, are truth functional.
With these semantic building blocks in hand, we can go on to define the
analogs of the semantic concepts of tautology, entailment, and satisfiability.
In classical logic, a formula is a tautology if its truth value v( φ) = T for any v.
In many-valued logic, we have to generalize this a bit as well. First of all,
there is no requirement that the set of truth values V contains T. For instance,
some many-valued logics use numbers, such as all rational numbers between
0 and 1 as their set of truth values. In such a case, 1 usually plays the role of T.
In other logics, not just one but several truth values do. So, we require that
every many-valued logic have a set V + of designated values. We can then say
that a formula is satisfied in a valuation v, v ⊨L φ, iff vL ( φ) ∈ V + . A formula φ
is a tautology of the logic, ⊨L φ, iff v( φ) ∈ V + for any v. And, finally, we say
that φ is entailed by a set of formulas, Γ ⊨L φ, if every valuation that satisfies
all the formulas in Γ also satisfies φ.
45.3 Formulas
Definition 45.3 (Formula). The set Frm(L) of formulas of a propositional lan-
guage L is defined inductively as follows:
45.4 Matrices
A many-valued logic is defined by its language, its set of truth values V, a sub-
set of designated truth values, and truth functions for its connective. Together,
these elements are called a matrix.
4. for each n-place connective ⋆ in L, a truth function ⋆̃: V^n → V. If n = 0,
then ⋆̃ is just an element of V.
4. For ⊥, we have ⊥̃ = F. The other truth functions are given by the usual
truth tables (see Figure 45.1).
¬̃        ∧̃  T  F      ∨̃  T  F      →̃  T  F
T  F     T  T  F      T  T  T      T  T  F
F  T     F  F  F      F  T  F      F  T  T
Definition 45.8. Given a valuation v into the set of truth values V of a many-
valued logic L, define the evaluation function v̄: Frm(L) → V inductively by:
1. v̄(pn) = v(pn);
2. v̄(⋆(φ1, . . . , φn)) = ⋆̃L(v̄(φ1), . . . , v̄(φn)).
We have some of the same facts for these notions as we do for the case of
classical logic:
Proof. Exercise.
In classical logic we can connect entailment and the conditional. For in-
stance, we have the validity of modus ponens: If Γ ⊨ φ and Γ ⊨ φ → ψ then
Γ ⊨ ψ. Another important relationship between ⊨ and → in classical logic is
the semantic deduction theorem: Γ ⊨ φ → ψ if and only if Γ ∪ { φ} ⊨ ψ. These
results do not always hold in many-valued logics. Whether they do depends
on the truth function →̃.
1. ¬̃L(x) = ¬̃C(x) if x = T or x = F;
2. ∧̃L(x, y) = ∧̃C(x, y),
3. ∨̃L(x, y) = ∨̃C(x, y),
Then, for any valuation v into V such that v(p) ∈ {T, F}, vL(φ) = vC(φ).
Proof. By induction on φ.
2. If φ ≡ ¬ψ, we have
vL(φ) = ¬̃L(vL(ψ)) by Definition 45.8
= ¬̃L(vC(ψ)) by inductive hypothesis
= ¬̃C(vC(ψ)) by assumption (1), since vC(ψ) ∈ {T, F},
= vC(φ) by Definition 45.8.
3. If φ ≡ (ψ ∧ χ), we have
vL(φ) = ∧̃L(vL(ψ), vL(χ)) by Definition 45.8
= ∧̃L(vC(ψ), vC(χ)) by inductive hypothesis
= ∧̃C(vC(ψ), vC(χ)) by assumption (2), since vC(ψ), vC(χ) ∈ {T, F},
= vC(φ) by Definition 45.8.
Problems
Problem 45.1. Prove Proposition 45.11
Three-valued Logics
46.1 Introduction
If we just add one more value U to T and F, we get a three-valued logic. Even
though there is only one more truth value, the possibilities for defining the
truth-functions for ¬, ∧, ∨, and → are quite numerous. Then a logic might
use any combination of these truth functions, and you also have a choice of
making only T designated, or both T and U.
We present here a selection of the most well-known three-valued logics,
their motivations, and some of their properties.
46.2 Łukasiewicz Logic
The other values (if the arguments are settled truth values, T or F) are as in classical logic.
For the conditional, the situation is a little trickier. Suppose q is a future
contingent statement. If p is false, then p → q will be true, regardless of how
q turns out, so we should set →̃(F, U) = T. And if p is true, then q → p will
be true, regardless of what q turns out to be, so →̃(U, T) = T. If p is true,
then p → q might turn out to be true or false, so →̃(T, U) = U. Similarly, if p
is false, then q → p might turn out to be true or false, so →̃(U, F) = U. This
leaves the case where p and q are both future contingents. On the basis of the
motivation, we should really assign U in this case. However, this would make
φ → φ not a tautology. Łukasiewicz had no trouble giving up φ ∨ ¬φ and
¬(φ ∧ ¬φ), but balked at giving up φ → φ. So he stipulated →̃(U, U) = T.
Definition 46.1. Three-valued Łukasiewicz logic is defined using the matrix:
1. The standard propositional language L0 with ¬, ∧, ∨, →.
2. The set of truth values V = {T, U, F}.
3. T is the only designated value, i.e., V + = {T}.
4. Truth functions are given by the following tables:
¬̃           ∧̃Ł3   T  U  F
T  F         T     T  U  F
U  U         U     U  U  F
F  T         F     F  F  F
∨̃Ł3   T  U  F        →̃Ł3   T  U  F
T     T  T  T        T      T  U  F
U     T  U  U        U      T  T  U
F     T  U  F        F      T  T  T
p  q    ¬  p  →  (p  →  q)
T  T    F  T  T   T  T  T
T  U    F  T  T   T  U  U
T  F    F  T  T   T  F  F
U  T    U  U  T   U  T  T
U  U    U  U  T   U  T  U
U  F    U  U  T   U  U  F
F  T    T  F  T   F  T  T
F  U    T  F  T   F  T  U
F  F    T  F  T   F  T  F
One might therefore perhaps think that although not all classical tautolo-
gies are tautologies in Ł3 , they should at least take either the value T or the
value U on every valuation. This is not the case. A counterexample is given
by
¬( p → ¬ p) ∨ ¬(¬ p → p)
which is F if p is U.
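The claim is easy to verify by computing with the Ł3 tables. The following sketch, with assumed dictionary encodings of the tables above, confirms that the formula takes the value F when v(p) = U.

```python
# Checking that ¬(p → ¬p) ∨ ¬(¬p → p) has value F in Ł3 when v(p) = U
# (dictionary encodings of the Ł3 tables above; an illustrative sketch).
NEG = {"T": "F", "U": "U", "F": "T"}
IMP = {("T", "T"): "T", ("T", "U"): "U", ("T", "F"): "F",
       ("U", "T"): "T", ("U", "U"): "T", ("U", "F"): "U",
       ("F", "T"): "T", ("F", "U"): "T", ("F", "F"): "T"}
OR  = {("T", "T"): "T", ("T", "U"): "T", ("T", "F"): "T",
       ("U", "T"): "T", ("U", "U"): "U", ("U", "F"): "U",
       ("F", "T"): "T", ("F", "U"): "U", ("F", "F"): "F"}

p = "U"
left  = NEG[IMP[(p, NEG[p])]]        # ¬(p → ¬p)
right = NEG[IMP[(NEG[p], p)]]        # ¬(¬p → p)
print(OR[(left, right)])             # prints F
```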
Łukasiewicz hoped to build a logic of possibility on the basis of his three-
valued system, by introducing a one-place connective ♢φ (for “φ is possible”)
and a corresponding □φ (for “φ is necessary”):
♢̃          □̃
T  T        T  T
U  T        U  F
F  F        F  F
In other words, p is possible iff it is not already settled as false; and p is nec-
essary iff it is already settled as true.
However, the shortcomings of this proposed modal logic soon became ev-
ident: However things turn out, p ∧ ¬ p can never turn out to be true. So even
if it is not now settled (and therefore undetermined), it should count as im-
possible, i.e., ¬♢( p ∧ ¬ p) should be a tautology. However, if v( p) = U, then
v(¬♢( p ∧ ¬ p)) = U. Although Łukasiewicz was correct that two truth values
will not be enough to accommodate modal distinctions such as possibility and
necessity, introducing a third truth value is also not enough.
¬̃           ∧̃Ks   T  U  F
T  F         T     T  U  F
U  U         U     U  U  F
F  T         F     F  F  F

∨̃Ks   T  U  F        →̃Ks   T  U  F
T     T  T  T        T      T  U  F
U     T  U  U        U      T  U  U
F     T  U  F        F      T  T  T
¬̃           ∧̃Kw   T  U  F
T  F         T     T  U  F
U  U         U     U  U  U
F  T         F     F  U  F

∨̃Kw   T  U  F        →̃Kw   T  U  F
T     T  U  T        T      T  U  F
U     U  U  U        U      U  U  U
F     T  U  F        F      T  U  T
¬̃(U) = ∨̃(U, U) = ∧̃(U, U) = →̃(U, U) = U

in both logics. As U ∉ V + for either Ks or Kw, on this valuation, φ will not
be designated.
Although both weak and strong Kleene logic have no tautologies, they
have non-trivial consequence relations.
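One way to see why there are no tautologies is to compute with the valuation that assigns U to every variable. The sketch below, using assumed encodings of the strong Kleene tables, shows that, for instance, p ∨ ¬p is not designated under that valuation.

```python
# A sketch (assumed encodings of the strong Kleene tables) of the observation
# that Ks has no tautologies: the all-U valuation gives every formula value U.
NEG = {"T": "F", "U": "U", "F": "T"}
AND = {(x, y): "F" if "F" in (x, y) else ("U" if "U" in (x, y) else "T")
       for x in "TUF" for y in "TUF"}
OR  = {(x, y): "T" if "T" in (x, y) else ("U" if "U" in (x, y) else "F")
       for x in "TUF" for y in "TUF"}

def value(phi, v):
    """Evaluate a formula built from variables (strings) with 'not', 'and', 'or'."""
    if isinstance(phi, str):
        return v[phi]
    op, *args = phi
    vals = tuple(value(a, v) for a in args)
    table = {"not": NEG, "and": AND, "or": OR}[op]
    return table[vals[0]] if len(vals) == 1 else table[vals]

excluded_middle = ("or", "p", ("not", "p"))
print(value(excluded_middle, {"p": "U"}))   # prints U, which is not designated
```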
Dmitry Bochvar interpreted U as “meaningless” and attempted to use it
to solve paradoxes such as the Liar paradox by stipulating that paradoxical
sentences take the value U. He introduced a logic which is essentially weak
Kleene logic extended by additional connectives, two of which are “external
negation” and the “is undefined” operator:
∼̃          +̃
T  F        T  F
U  T        U  T
F  T        F  F
¬̃G          ∧̃G    T  U  F
T  F         T     T  U  F
U  F         U     U  U  F
F  T         F     F  F  F

∨̃G    T  U  F        →̃G    T  U  F
T     T  T  T        T      T  U  F
U     T  U  U        U      T  T  F
F     T  U  F        F      T  T  T
You’ll notice that the truth tables for ∧ and ∨ are the same as in Łukasiewicz
and strong Kleene logic, but the truth tables for ¬ and → differ for each.
In Gödel logic, ¬̃(U) = F. In contrast to Łukasiewicz logic and Kleene
logic, →̃(U, F) = F; in contrast to Kleene logic (but as in Łukasiewicz logic),
→̃(U, U) = T.
As the connection to intuitionistic logic alluded to above suggests, G3 is
close to intuitionistic logic. All intuitionistic truths are tautologies in G3 , and
many classical tautologies that are not valid intuitionistically also fail to be
tautologies in G3 . For instance, the following are not tautologies:
p ∨ ¬p ( p → q) → (¬ p ∨ q)
¬¬ p → p ¬( p ∧ q) → (¬ p ∨ ¬q)
(( p → q) → p) → p
However, not every tautology of G3 is also intuitionistically valid, e.g., ( p →
q ) ∨ ( q → p ).
Definition 46.8. Halldén’s logic of nonsense Hal is defined using the matrix:
4. Truth functions are the same as weak Kleene logic, plus the “is mean-
ingless” operator:
+̃
T  F
U  T
F  F
By contrast to the Kleene logics with which they share truth tables, these
do have tautologies.
Proposition 46.9. The tautologies of LP are the same as the tautologies of classical
propositional logic.
2. φ ≡ ¬ψ.
3. φ ≡ (ψ ∧ χ).
The other two cases are similar, and left as exercises. Alternatively, the
proof above establishes the result for all formulas only containing ¬ and ∧.
One may now appeal to the facts that in both Ks and C, for any v, v(ψ ∨ χ) =
v(¬(¬ψ ∧ ¬χ)) and v(ψ → χ) = v(¬(ψ ∧ ¬χ)).
Although they have the same tautologies as classical logic, their conse-
quence relations are different. LP, for instance, is paraconsistent in that ¬ p, p ⊭
q, and so the principle of explosion ¬ φ, φ ⊨ ψ does not hold in general. (It
holds for some cases of φ and ψ, e.g., if ψ is a tautology.)
What if you make U designated in Ł3?
Definition 46.10. The logic 3-valued R-Mingle RM3 is defined using the ma-
trix:
Different truth tables can sometimes generate the same logic (entailment
relation) just by changing the designated values. E.g., this happens if in Gödel
logic we take V + = {T, U} instead of {T}.
Proposition 46.11. The matrix with V = {F, U, T}, V + = {T, U}, and the truth
functions of 3-valued Gödel logic defines classical logic.
Proof. Exercise.
Problems
Problem 46.1. Suppose we define v( φ ↔ ψ) = v(( φ → ψ) ∧ (ψ → φ)) in Ł3 .
What truth table would ↔ have?
1. p → (q → p)
2. ¬( p ∧ q) ↔ (¬ p ∨ ¬q)
3. ¬( p ∨ q) ↔ (¬ p ∧ ¬q)
Problem 46.3. Show that the following classical tautologies are not tautolo-
gies in Ł3 :
1. (¬ p ∧ p) → q
2. (( p → q) → p) → p
3. ( p → ( p → q)) → ( p → q)
1. p, p → q ⊨ q
2. ¬¬ p ⊨ p
3. p ∧ q ⊨ p
4. p ⊨ p ∧ p
5. p ⊨ p ∨ q
Problem 46.6. Which of the following relations hold in (a) strong and (b) weak
Kleene logic? Give a truth table for each.
1. p, p → q ⊨ q
2. p ∨ q, ¬ p ⊨ q
3. p ∧ q ⊨ p
4. p ⊨ p ∧ p
5. p ⊨ p ∨ q
Problem 46.7. Can you define ∼ in Bochvar’s logic in terms of ¬ and +, i.e.,
find a formula with only the propositional variable p and not involving ∼
which always takes the same truth value as ∼ p? Give a truth table to show
you’re right.
Problem 46.9. Give truth tables that show that the following are not tautolo-
gies of G3 :
( p → q) → (¬ p ∨ q)
¬( p ∧ q) → (¬ p ∨ ¬q)
(( p → q) → p) → p
Problem 46.10. Which of the following relations hold in Gödel logic? Give a
truth table for each.
1. p, p → q ⊨ q
2. p ∨ q, ¬ p ⊨ q
3. p ∧ q ⊨ p
4. p ⊨ p ∧ p
5. p ⊨ p ∨ q
Problem 46.11. Complete the proof of Proposition 46.9, i.e., establish (a) and (b)
for the cases where φ ≡ (ψ ∨ χ) and φ ≡ (ψ → χ).
Problem 46.13. Which of the following relations hold in (a) LP and in (b) Hal?
Give a truth table for each.
1. p, p → q ⊨ q
2. ¬q, p → q ⊨ ¬ p
3. p ∨ q, ¬ p ⊨ q
4. ¬ p, p ⊨ q
5. p ⊨ p ∨ q
6. p → q, q → r ⊨ p → r
1. p, p → q ⊨ q
2. p ∨ q, ¬ p ⊨ q
3. ¬ p, p ⊨ q
4. p ⊨ p ∨ q
Problem 46.15. Prove Proposition 46.11 by showing that for the logic L de-
fined just like Gödel logic but with V + = {T, U}, if Γ ⊭L ψ then Γ ⊭C ψ. Use
the ideas of Proposition 46.9, except instead of proving properties (a) and (b),
show that vG ( φ) = F iff v′ C ( φ) = F (and hence that vG ( φ) ∈ {T, U} iff
v′ C ( φ) = T). Explain why this establishes the proposition.
Infinite-valued Logics
47.1 Introduction
The number of truth values of a matrix need not be finite. An obvious choice
for a set of infinitely many truth values is the set of rational numbers between
0 and 1, V∞ = [0, 1] ∩ Q, i.e.,
V∞ = {n/m : n, m ∈ N and n ≤ m}.
When considering this infinite truth value set, it is often useful to also consider
the subsets
Vm = {n/(m − 1) : n ∈ N and n ≤ m − 1},

e.g.,

V5 = {0, 1/4, 1/2, 3/4, 1}.
In logics based on these truth value sets, usually only 1 is designated, i.e.,
V + = {1}. In other words, we let 1 play the role of (absolute) truth and 0 that of
absolute falsity, but formulas may take any intermediate value in V.
One can also consider the set V[0,1] = [0, 1] of all real numbers between 0
and 1, or other infinite subsets of [0, 1], however. Logics with this truth value
set are often called fuzzy.
Proposition 47.2. The logic Ł3 defined by Definition 46.1 is the same as Ł3 defined
by Definition 47.1.
Proof. This can be seen by comparing the truth tables for the connectives given
in Definition 46.1 with the truth tables determined by the equations in Defini-
tion 47.1:
¬̃             ∧̃Ł3    1    1/2  0
1    0         1      1    1/2  0
1/2  1/2       1/2    1/2  1/2  0
0    1         0      0    0    0

∨̃Ł3    1  1    1/2  0        →̃Ł3    1  1/2  0
1      1  1    1             1      1  1/2  0
1/2    1  1/2  1/2           1/2    1  1    1/2
0      1  1/2  0             0      1  1    1
Proof. Exercise.
⊥̃ = 0

¬̃G (x) = 1 if x = 0, and 0 otherwise

∧̃G (x, y) = min(x, y)
∨̃G (x, y) = max(x, y)

→̃G (x, y) = 1 if x ≤ y, and y otherwise.
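The defining equations translate directly into code. Here is a small sketch, with illustrative function names rather than the book's notation, of the Gödel truth functions on rational truth values.

```python
# A small sketch (illustrative function names) of the Gödel truth functions
# on rational truth values in [0, 1].
from fractions import Fraction

def g_neg(x):
    return Fraction(1) if x == 0 else Fraction(0)

def g_and(x, y):
    return min(x, y)

def g_or(x, y):
    return max(x, y)

def g_imp(x, y):
    return Fraction(1) if x <= y else y

# Restricted to {0, 1/2, 1} these agree with the three-valued Gödel tables,
# e.g., the value of U → F:
print(g_imp(Fraction(1, 2), Fraction(0)))   # prints 0
```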
Proposition 47.5. The logic G3 defined by Definition 46.6 is the same as G3 defined
by Definition 47.4.
Proof. This can be seen by comparing the truth tables for the connectives given
in Definition 46.6 with the truth tables determined by the equations in Defini-
tion 47.4:
¬̃G3           ∧̃G     1    1/2  0
1    0         1      1    1/2  0
1/2  0         1/2    1/2  1/2  0
0    1         0      0    0    0

∨̃G     1  1/2  0             →̃G     1  1/2  0
1      1  1    1             1      1  1/2  0
1/2    1  1/2  1/2           1/2    1  1    0
0      1  1/2  0             0      1  1    1
Proof. Exercise.
p ∨ ¬p ( p → q) → (¬ p ∨ q)
¬¬ p → p ¬( p ∧ q) → (¬ p ∨ ¬q)
(( p → q) → p) → p
Problems
Problem 47.1. Prove Proposition 47.3.
Sequent Calculus
48.1 Introduction
The sequent calculus for classical logic is an efficient and simple derivation
system. If a many-valued logic is defined by a matrix with finitely many truth
values, i.e., V is finite, it is possible to provide a sequent calculus for it. The
idea for how to do this comes from considering the meanings of sequents and
the form of inference rules in the classical case.
Now recall that a sequent

φ1 , . . . , φm ⇒ ψ1 , . . . , ψn

can be interpreted as expressing the formula

(φ1 ∧ · · · ∧ φm ) → (ψ1 ∨ · · · ∨ ψn ).
⇒ φ    ψ ⇒
→L
φ → ψ ⇒

φ ⇒ ψ
→R
⇒ φ → ψ
φ, ψ, Γ ⇒ ∆
∧L
φ ∧ ψ, Γ ⇒ ∆

Γ ⇒ ∆, φ, ψ
∨R
Γ ⇒ ∆, φ ∨ ψ
This basic idea, applied to an n-valued logic, then results in a sequent cal-
culus with n instead of two places, one for each truth value. For a three-valued
logic with V = {F, U, T}, a sequent is an expression Γ | Π | ∆. It is satisfied
in a valuation v iff either v( φ) = F for some φ ∈ Γ or v( φ) = T for some
φ ∈ ∆ or v( φ) = U for some φ ∈ Π. Consequently, initial sequents φ | φ | φ
are always satisfied.
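Computationally, the satisfaction condition for a three-sided sequent is just a disjunction of three membership tests. The following sketch, which assumes some evaluation function value(phi, v) for a three-valued matrix, merely restates the definition.

```python
# A sketch of satisfaction for a three-sided sequent Γ | Π | ∆ in a valuation v,
# assuming some function value(phi, v) for a three-valued matrix.
def satisfies_sequent(gamma, pi, delta, v, value):
    return (any(value(phi, v) == "F" for phi in gamma)
            or any(value(phi, v) == "U" for phi in pi)
            or any(value(phi, v) == "T" for phi in delta))

# An initial sequent phi | phi | phi is satisfied in every valuation,
# since value(phi, v) is one of F, U, T.
```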
For each connective of an n-valued logic L, there is a logical rule for each
truth value that this connective can take in L. Derivations in an n-sided se-
quent calculus for L are trees of sequents, where the topmost sequents are
initial sequents, and if a sequent stands below one or more other sequents, it
must follow correctly by a rule of inference for the connectives of L.
Definition 48.3 (Theorems). A sentence φ is a theorem of an n-valued logic L
if there is a derivation of the n-sequent containing φ in each position corre-
sponding to a designated truth value of L. We write ⊢L φ if φ is a theorem
and ⊬L φ if it is not.
Γ1 | . . . | Γi | . . . | Γn
Wi
Γ1 | . . . | φ, Γi | . . . | Γn
Γ1 | . . . | φ, φ, Γi | . . . | Γn
Ci
Γ1 | . . . | φ, Γi | . . . | Γn
Γ1 | . . . | Γi , φ, ψ, Γi′ | . . . | Γn
Xi
Γ1 | . . . | Γi , ψ, φ, Γi′ | . . . | Γn
Γ1 | . . . | φ, Γi | . . . | Γn ∆ 1 | . . . | φ, ∆ j | . . . | ∆ n
Cuti, j
Γ1 , ∆ 1 | . . . | Γn , ∆ n
Rules for ¬
The following rules for ¬ apply to Łukasiewicz and Kleene logics, and their
variants.
Γ | Π | ∆, φ
¬F
¬ φ, Γ | Π | ∆
Γ | φ, Π | ∆
¬U
Γ | ¬ φ, Π | ∆
φ, Γ | Π | ∆
¬T
Γ | Π | ∆, ¬ φ
Γ | φ, Π | ∆, φ
¬G F
¬ φ, Γ | Π | ∆

φ, Γ | Π | ∆
¬G T
Γ | Π | ∆, ¬ φ
(In Gödel logic, ¬ φ can never take the value U, so there is no rule for the
middle position.)
Rules for ∧
These are the rules for ∧ in Łukasiewicz, strong Kleene, and Gödel logic.
φ, ψ, Γ | Π | ∆
∧F
φ ∧ ψ, Γ | Π | ∆
Γ | φ, Π | φ, ∆ Γ | ψ, Π | ψ, ∆ Γ | φ, ψ, Π | ∆
∧U
Γ | φ ∧ ψ, Π | ∆
Γ | Π | ∆, φ Γ | Π | ∆, ψ
∧T
Γ | Π | ∆, φ ∧ ψ
Rules for ∨
These are the rules for ∨ in Łukasiewicz, strong Kleene, and Gödel logic.
φ, Γ | Π | ∆ ψ, Γ | Π | ∆
∨F
φ ∨ ψ, Γ | Π | ∆
φ, Γ | φ, Π | ∆ ψ, Γ | ψ, Π | ∆ Γ | φ, ψ, Π | ∆
∨U
Γ | φ ∨ ψ, Π | ∆
Γ | Π | ∆, φ, ψ
∨T
Γ | Π | ∆, φ ∨ ψ
Rules for →
These are the rules for → in Łukasiewicz logic.
Γ | Π | ∆, φ ψ, Γ | Π | ∆
→ Ł3 F
φ → ψ, Γ | Π | ∆
Γ | φ, ψ, Π | ∆ ψ, Γ | Π | ∆, φ
→Ł3 U
Γ | φ → ψ, Π | ∆
φ, Γ | ψ, Π | ∆, ψ φ, Γ | φ, Π | ∆, ψ
→Ł3 T
Γ | Π | ∆, φ → ψ
Γ | Π | ∆, φ ψ, Γ | Π | ∆
→Ks F
φ → ψ, Γ | Π | ∆
ψ, Γ | ψ, Π | ∆ Γ | φ, ψ, Π | ∆ Γ | φ, Π | ∆, φ
→Ks U
Γ | φ → ψ, Π | ∆
φ, Γ | Π | ∆, ψ
→Ks T
Γ | Π | ∆, φ → ψ
Γ | φ, Π | ∆, φ ψ, Γ | Π | ∆
→ G3 F
φ → ψ, Γ | Π | ∆
Γ | ψ, Π | ∆ Γ | Π | ∆, φ
→ G3 U
Γ | φ → ψ, Π | ∆
φ, Γ | ψ, Π | ∆, ψ φ, Γ | φ, Π | ∆, ψ
→ G3 T
Γ | Π | ∆, φ → ψ
Part XI
49.1 Introduction
Modal logic deals with modal propositions and the entailment relations among
them. Examples of modal propositions are the following:
1. It is necessary that 2 + 2 = 4.
Possibility and necessity are not the only modalities: other unary connectives
are also classified as modalities, for instance, “it ought to be the case that φ,”
“It will be the case that φ,” “Dana knows that φ,” or “Dana believes that φ.”
Modal logic makes its first appearance in Aristotle’s De Interpretatione: he
was the first to notice that necessity implies possibility, but not vice versa; that
possibility and necessity are inter-definable; that if φ ∧ ψ is possibly true then
φ is possibly true and ψ is possibly true, but not conversely; and that if φ → ψ
is necessary, then if φ is necessary, so is ψ.
The first modern approach to modal logic was the work of C. I. Lewis, cul-
minating with Lewis and Langford, Symbolic Logic (1932). Lewis & Langford
were unhappy with the representation of implication by means of the material
conditional: φ → ψ is a poor substitute for “φ implies ψ.” Instead, they pro-
posed to characterize implication as “Necessarily, if φ then ψ,” symbolized
as φ J ψ. In trying to sort out the different properties, Lewis identified five
different modal systems, S1, . . . , S4, S5, the last two of which are still in use.
The approach of Lewis and Langford was purely syntactical: they identi-
fied reasonable axioms and rules and investigated what was provable with
those means. A semantic approach remained elusive for a long time, until a
first attempt was made by Rudolf Carnap in Meaning and Necessity (1947) us-
ing the notion of a state description, i.e., a collection of atomic sentences (those
that are “true” in that state description). After lifting the truth definition to
arbitrary sentences φ, Carnap defines φ to be necessarily true if it is true in all
state descriptions. Carnap’s approach could not handle iterated modalities, in
that sentences of the form “Possibly necessarily . . . possibly φ” always reduce
to the innermost modality.
The major breakthrough in modal semantics came with Saul Kripke’s arti-
cle “A Completeness Theorem in Modal Logic” (JSL 1959). Kripke based his
work on Leibniz’s idea that a statement is necessarily true if it is true “at all
possible worlds.” This idea, though, suffers from the same drawbacks as Car-
nap’s, in that the truth of a statement at a world w (or a state description s) does
not depend on w at all. So Kripke assumed that worlds are related by an ac-
cessibility relation R, and that a statement of the form “Necessarily φ” is true at
a world w if and only if φ is true at all worlds w′ accessible from w. Semantics
that provide some version of this approach are called Kripke semantics and
made possible the tumultuous development of modal logics (in the plural).
When interpreted by the Kripke semantics, modal logic shows us what re-
lational structures look like “from the inside.” A relational structure is just a set
equipped with a binary relation (for instance, the set of students in the class
ordered by their social security number is a relational structure). But in fact re-
lational structures come in all sorts of domains: besides relative possibility of
states of the world, we can have epistemic states of some agent related by epis-
temic possibility, or states of a dynamical system with their state transitions,
etc. Modal logic can be used to model all of these: the first gives us ordinary,
alethic, modal logic; the others give us epistemic logic, dynamic logic, etc.
We focus on one particular angle, known to modal logicians as “corre-
spondence theory.” One of the most significant early discoveries of Kripke’s
is that many properties of the accessibility relation R (whether it is transitive,
symmetric, etc.) can be characterized in the modal language itself by means
of appropriate “modal schemas.” Modal logicians say, for instance, that the
reflexivity of R “corresponds” to the schema “If necessarily φ, then φ”. We
explore mainly the correspondence theory of a number of classical systems of
modal logic (e.g., S4 and S5) obtained by a combination of the schemas D, T,
B, 4, and 5.
Definition 49.2. Formulas of the basic modal language are inductively defined
as follows:
1. ⊥ is an atomic formula.
1. ⊤ abbreviates ¬⊥.
2. φ ↔ ψ abbreviates ( φ → ψ) ∧ (ψ → φ).
♢( p2 → p3 ) → □(♢( p2 → p3 ) ∧ ¬□p1 )
¬□p1 → □(¬□p1 ∧ ♢( p2 → p3 ))
Note that simultaneous substitution is in general not the same as iterated sub-
stitution, e.g., compare φ[θ1 /p1 , θ2 /p2 ] above with ( φ[θ1 /p1 ])[θ2 /p2 ], which
is:
Figure 49.1: a model with worlds w1 (p, ¬q), w2 (p, q), and w3 (¬p, ¬q).
Definition 49.6. A model for the basic modal language is a triple M = ⟨W, R, V ⟩,
where
1. φ ≡ ⊥: Never M, w ⊩ ⊥.
2. M, w ⊩ p iff w ∈ V ( p).
3. φ ≡ ¬ψ: M, w ⊩ φ iff M, w ⊮ ψ.
6. φ ≡ (ψ → χ): M, w ⊩ φ iff M, w ⊮ ψ or M, w ⊩ χ.
2. M, w ⊩ ♢φ iff M, w ⊩ ¬□¬ φ.
2. Exercise.
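The truth definition can be read off as a recursive procedure. Here is a minimal Python sketch; the representation of models and formulas and the name holds are illustrative assumptions, not the book's notation.

```python
# A minimal sketch of evaluating a formula at a world of a finite model.
# A model is a triple (W, R, V): W a set of worlds, R a set of pairs (w, v),
# and V a dict mapping propositional variables to sets of worlds.
# Formulas are nested tuples such as ("box", ("imp", "p", "q")).

def holds(model, w, phi):
    W, R, V = model
    if phi == "bot":
        return False
    if isinstance(phi, str):                       # propositional variable
        return w in V.get(phi, set())
    op, *args = phi
    if op == "not":
        return not holds(model, w, args[0])
    if op == "and":
        return holds(model, w, args[0]) and holds(model, w, args[1])
    if op == "or":
        return holds(model, w, args[0]) or holds(model, w, args[1])
    if op == "imp":
        return (not holds(model, w, args[0])) or holds(model, w, args[1])
    if op == "box":                                # true at all accessible worlds
        return all(holds(model, v, args[0]) for v in W if (w, v) in R)
    if op == "dia":                                # true at some accessible world
        return any(holds(model, v, args[0]) for v in W if (w, v) in R)
    raise ValueError(f"unknown connective: {op}")
```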
49.7 Validity
Formulas that are true in all models, i.e., true at every world in every model,
are particularly interesting. They represent those modal propositions which
are true regardless of how □ and ♢ are interpreted, as long as the interpreta-
tion is “normal” in the sense that it is generated by some accessibility relation
on possible worlds. We call such formulas valid. For instance, □( p ∧ q) → □p
is valid. Some formulas one might expect to be valid on the basis of the alethic
interpretation of □, such as □p → p, are not valid, however. Part of the interest
of relational models is that different interpretations of □ and ♢ can be captured
by different kinds of accessibility relations. This suggests that we should de-
fine validity not just relative to all models, but relative to all models of a certain
kind. It will turn out, e.g., that □p → p is true in all models where every world
is accessible from itself, i.e., R is reflexive. Defining validity relative to classes
of models enables us to formulate this succinctly: □p → p is valid in the class
of reflexive models.
Proof. By induction on φ.
1. φ ≡ ⊥: Both v ⊭ ⊥ and M, w ⊮ ⊥.
2. φ ≡ pi :
v ⊨ p i ⇔ v( p i ) = T
by definition of v ⊨ pi
⇔ M, w ⊩ θi
by assumption
⇔ M, w ⊩ pi [θ1 /p1 , . . . , θn /pn ]
since pi [θ1 /p1 , . . . , θn /pn ] ≡ θi .
3. φ ≡ ¬ψ:
v ⊨ ¬ψ ⇔ v ⊭ ψ
by definition of v ⊨;
⇔ M, w ⊮ ψ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ ¬ψ[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.
4. φ ≡ (ψ ∧ χ):
v ⊨ ψ ∧ χ ⇔ v ⊨ ψ and v ⊨ χ
by definition of v ⊨
⇔ M, w ⊩ ψ[θ1 /p1 , . . . , θn /pn ] and
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ ∧ χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.
5. φ ≡ (ψ ∨ χ):
v ⊨ ψ ∨ χ ⇔ v ⊨ ψ or v ⊨ χ
by definition of v ⊨;
⇔ M, w ⊩ ψ[θ1 /p1 , . . . , θn /pn ] or
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ ∨ χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.
6. φ ≡ (ψ → χ):
v ⊨ ψ → χ ⇔ v ⊭ ψ or v ⊨ χ
by definition of v ⊨
⇔ M, w ⊮ ψ[θ1 /p1 , . . . , θn /pn ] or
M, w ⊩ χ[θ1 /p1 , . . . , θn /pn ]
by induction hypothesis
⇔ M, w ⊩ (ψ → χ)[θ1 /p1 , . . . , θn /pn ]
by definition of M, w ⊩.
Definition 49.18. A schema is true in a model if and only if all of its instances
are; and a schema is valid if and only if it is true in every model.
Proof. We need to show that all instances of the schema are true at every world
in every model. So let M = ⟨W, R, V ⟩ and w ∈ W be arbitrary. To show that
a conditional is true at a world we assume the antecedent is true to show that
consequent is true as well. In this case, let M, w ⊩ □( φ → ψ) and M, w ⊩ □φ.
We need to show M, w ⊩ □ψ. So let w′ be arbitrary such that Rww′ . Then by the
first assumption M, w′ ⊩ φ → ψ and by the second assumption M, w′ ⊩ φ. It
follows that M, w′ ⊩ ψ. Since w′ was arbitrary, M, w ⊩ □ψ.
♢φ ↔ ¬□¬ φ. (DUAL)
Proof. Exercise.
Proposition 49.22. A formula φ is valid iff all its substitution instances are. In
other words, a schema is valid iff its characteristic formula is.
Note, however, that it is not true that a schema is true in a model iff its
characteristic formula is. Of course, the “only if” direction holds: if every
instance of φ is true in M, φ itself is true in M. But it may happen that φ
is true in M but some instance of φ is false at some world in M. For a very
simple counterexample consider p in a model with only one world w and
V ( p) = {w}, so that p is true at w. But ⊥ is an instance of p, and not true at w.
49.10 Entailment
With the definition of truth at a world, we can define an entailment relation
between formulas. A formula ψ entails φ iff, whenever ψ is true, φ is true as
well. Here, “whenever” means both “whichever model we consider” as well
as “whichever world in that model we consider.”
Figure: a model with worlds w1 (¬p), w2 (p), and w3 (p).
Problems
Problem 49.1. Consider the model of Figure 49.1. Which of the following
hold?
1. M, w1 ⊩ q;
2. M, w3 ⊩ ¬q;
3. M, w1 ⊩ p ∨ q;
4. M, w1 ⊩ □( p ∨ q);
5. M, w3 ⊩ □q;
6. M, w3 ⊩ □⊥;
7. M, w1 ⊩ ♢q;
8. M, w1 ⊩ □q;
9. M, w1 ⊩ ¬□□¬q.
Problem 49.5. Consider the following model M for the language comprising
p1 , p2 , p3 as the only propositional variables:
w1: p1 , ¬p2 , ¬p3        w2: p1 , p2 , ¬p3        w3: p1 , p2 , p3
Are the following formulas and schemas true in the model M, i.e., true at
every world in M? Explain.
1. p → ♢p (for p atomic);
2. φ → ♢φ (for φ arbitrary);
3. □p → p (for p atomic);
1. ⊨ □p → □(q → p);
2. ⊨ □¬⊥;
3. ⊨ □p → (□q → □p).
Problem 49.9. Prove the claim in the “only if” part of the proof of Proposi-
tion 49.22. (Hint: use induction on φ.)
Problem 49.10. Show that none of the following formulas are valid:
D: □p → ♢p;
T: □p → p;
B: p → □♢p;
4: □p → □□p;
5: ♢p → □♢p.
Problem 49.11. Prove that the schemas in the first column of Table 49.1 are
valid and those in the second column are not valid.
Problem 49.12. Decide whether the following schemas are valid or invalid:
2. ♢( φ → ψ) ∨ □(ψ → φ).
Problem 49.13. For each of the following schemas find a model M such that
every instance of the formula is true in M:
1. p → ♢♢p;
2. ♢p → □p.
Frame Definability
50.1 Introduction
One question that interests modal logicians is the relationship between the
accessibility relation and the truth of certain formulas in models with that ac-
cessibility relation. For instance, suppose the accessibility relation is reflexive,
i.e., for every w ∈ W, Rww. In other words, every world is accessible from
itself. That means that when □φ is true at a world w, w itself is among the
accessible worlds at which φ must therefore be true. So, if the accessibility
relation R of M is reflexive, then whatever world w and formula φ we take,
□φ → φ will be true there (in other words, the schema □p → p and all its
substitution instances are true in M).
The converse, however, is false. It’s not the case, e.g., that if □p → p is
true in M, then R is reflexive. For we can easily find a non-reflexive model M
where □p → p is true at all worlds: take the model with a single world w,
not accessible from itself, but with w ∈ V ( p). By picking the truth value of p
suitably, we can make □φ → φ true in a model that is not reflexive.
The solution is to remove the variable assignment V from the equation. If
we require that □p → p is true at all worlds in M, regardless of which worlds
are in V ( p), then it is necessary that R is reflexive. For in any non-reflexive
model, there will be at least one world w such that not Rww. If we set V ( p) =
W \ {w}, then p will be true at all worlds other than w, and so at all worlds
accessible from w (since w is guaranteed not to be accessible from w, and w is
the only world where p is false). On the other hand, p is false at w, so □p → p
is false at w.
This suggests that we should introduce a notion of model structures
without a valuation: we call these frames. A frame F is simply a pair ⟨W, R⟩
consisting of a set of worlds with an accessibility relation. Every model ⟨W, R, V ⟩
is then, as we say, based on the frame ⟨W, R⟩. Conversely, a frame determines
the class of models based on it; and a class of frames determines the class of
models which are based on any frame in the class. And we can define F ⊨ φ,
If R is . . .                                          then . . . is true in M:
serial: ∀u∃v Ruv                                       □p → ♢p      (D)
reflexive: ∀w Rww                                      □p → p       (T)
symmetric: ∀u∀v(Ruv → Rvu)                             p → □♢p      (B)
transitive: ∀u∀v∀w((Ruv ∧ Rvw) → Ruw)                  □p → □□p     (4)
euclidean: ∀w∀u∀v((Rwu ∧ Rwv) → Ruv)                   ♢p → □♢p     (5)

Table 50.1: Five correspondence facts.
the notion of a formula being valid in a frame as: M ⊩ φ for all M based on F.
With this notation, we can establish correspondence relations between for-
mulas and classes of frames: e.g., F ⊨ □p → p if, and only if, F is reflexive.
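Since a finite frame admits only finitely many valuations for p, validity of □p → p in a frame can be checked by brute force. The sketch below, with illustrative names only, does exactly that and, in line with the correspondence fact, reports failure for a frame that is not reflexive.

```python
# A brute-force check (illustrative) that □p → p is valid in a finite frame
# (W, R) exactly when R is reflexive: we try every possible valuation V(p) ⊆ W.
from itertools import chain, combinations

def subsets(ws):
    ws = list(ws)
    return chain.from_iterable(combinations(ws, k) for k in range(len(ws) + 1))

def box_p_implies_p_valid(W, R):
    for choice in subsets(W):
        V_p = set(choice)
        for w in W:
            box_p = all(v in V_p for v in W if (w, v) in R)
            if box_p and w not in V_p:       # □p true but p false at w
                return False
    return True

def is_reflexive(W, R):
    return all((w, w) in R for w in W)

W = {1, 2}
R = {(1, 2), (2, 2)}                          # not reflexive: (1, 1) is missing
print(box_p_implies_p_valid(W, R), is_reflexive(W, R))   # False False
```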
Theorem 50.1. Let M = ⟨W, R, V ⟩ be a model. If R has the property on the left side
of Table 50.1, every instance of the formula on the right side is true in M.
Proof. Here is the case for B: to show that the schema is true in a model we
need to show that all of its instances are true at all worlds in the model. So
let φ → □♢φ be a given instance of B, and let w ∈ W be an arbitrary world.
Suppose the antecedent φ is true at w, in order to show that □♢φ is true at
w. So we need to show that ♢φ is true at all w′ accessible from w. Now, for
any w′ such that Rww′ we have, using the hypothesis of symmetry, that also
Rw′ w (see Figure 50.1). Since M, w ⊩ φ, we have M, w′ ⊩ ♢φ. Since w′ was an
arbitrary world such that Rww′ , we have M, w ⊩ □♢φ.
We leave the other cases as exercises.
Notice that the converse implications of Theorem 50.1 do not hold: it’s not
true that if a model verifies a schema, then the accessibility relation of that
model has the corresponding property. In the case of T and reflexive models,
it is easy to give an example of a model in which T itself fails: let W = {w} and
V ( p) = ∅. Then R is not reflexive, but M, w ⊩ □p and M, w ⊮ p. But here we
have just a single instance of T that fails in M, other instances, e.g., □¬ p → ¬ p
Figure 50.1: the worlds w (⊩ φ, ⊩ □♢φ) and w′ (⊩ ♢φ) in the symmetry argument.
Proposition 50.2. Let M = ⟨W, R, V ⟩ be a model such that W = {u, v}, where
worlds u and v are related by R: i.e., both Ruv and Rvu. Suppose that for all p:
u ∈ V ( p) ⇔ v ∈ V ( p). Then:
Since M is not reflexive (it is, in fact, irreflexive), the converse of Theorem 50.1 fails
in the case of T (similar arguments can be given for some—though not all—the other
schemas mentioned in Theorem 50.1).
50.3 Frames
Definition 50.3. A frame is a pair F = ⟨W, R⟩ where W is a non-empty set of
worlds and R a binary relation on W. A model M is based on a frame F =
⟨W, R⟩ if and only if M = ⟨W, R, V ⟩ for some valuation V.
If R is . . .                                                        then . . . is true in M:
partially functional: ∀w∀u∀v((Rwu ∧ Rwv) → u = v)                    ♢p → □p
functional: ∀w∃v∀u(Rwu ↔ u = v)                                      ♢p ↔ □p
weakly dense: ∀u∀v(Ruv → ∃w(Ruw ∧ Rwv))                              □□p → □p
weakly connected: ∀w∀u∀v((Rwu ∧ Rwv) → (Ruv ∨ u = v ∨ Rvu))          □((p ∧ □p) → q) ∨ □((q ∧ □q) → p)   (L)
weakly directed: ∀w∀u∀v((Rwu ∧ Rwv) → ∃t(Rut ∧ Rvt))                 ♢□p → □♢p   (G)

Table 50.2: Five more correspondence facts.
Theorem 50.6. If the formula on the right side of Table 50.1 is valid in a frame F,
then F has the property on the left side.
You’ll notice a difference between the proof for D and the other cases: no
mention was made of the valuation V. In effect, we proved that if M ⊩ D then
M is serial. So D defines the class of serial models, not just frames.
Corollary 50.8. Each formula on the right side of Table 50.1 defines the class of
frames which have the property on the left side.
Proof. In Theorem 50.1, we proved that if a model has the property on the left,
the formula on the right is true in it. Thus, if a frame F has the property on
the left, the formula on the right is valid in F. In Theorem 50.6, we proved
the converse implications: if a formula on the right is valid in F, F has the
property on the left.
Theorem 50.6 also shows that the properties can be combined: for instance
if both B and 4 are valid in F then the frame is both symmetric and transitive,
etc. Many important modal logics are characterized as the set of formulas
valid in all frames that combine some frame properties, and so we can charac-
terize them as the set of formulas valid in all frames in which the correspond-
ing defining formulas are valid. For instance, the classical system S4 is the
set of all formulas valid in all reflexive and transitive frames, i.e., in all those
where both T and 4 are valid. S5 is the set of all formulas valid in all reflexive,
symmetric, and euclidean frames, i.e., all those where all of T, B, and 5 are
valid.
Logical relationships between properties of R in general correspond to re-
lationships between the corresponding defining formulas. For instance, every
reflexive relation is serial; hence, whenever T is valid in a frame, so is D. (Note
that this relationship is not that of entailment. It is not the case that whenever
M, w ⊩ T then M, w ⊩ D.) We record some such relationships.
It turns out that the properties and modal formulas that define them con-
sidered so far are exceptional. Not every formula defines a first-order de-
finable class of frames, and not every first-order definable class of frames is
definable by a modal formula.
A counterexample to the first is given by the Löb formula:

□(□p → p) → □p
those valid in reflexive, symmetric, and transitive frames. There are reflexive,
symmetric, and transitive frames that are not universal, hence every formula
valid in all universal frames is also valid in some non-universal frames.
1. R is an equivalence relation;
Proof. Exercise.
Proposition 50.13. Let R be an equivalence relation, and for each w ∈ W define the
equivalence class of w as the set [w] = {w′ ∈ W : Rww′ }. Then:
Figure 50.2: the equivalence classes [w], [u], [v], and [z]; the class [w] is shaded.
1. w ∈ [w];
1. W ′ = [w];
2. R′ is universal on W ′ ;
3. V ′ ( p) = V ( p) ∩ W ′ .
(So the set W ′ of worlds in M′ is represented by the shaded area in Figure 50.2.)
It is easy to see that R and R′ agree on W ′ . Then one can show by induction
on formulas that for all w′ ∈ W ′ : M′ , w′ ⊩ φ if and only if M, w′ ⊩ φ for each
φ (this makes sense since W ′ ⊆ W). In particular, M′ , w ⊮ ψ, and ψ fails in a
model based on a universal frame.
1. φ ≡ ⊥: STx ( φ) = ⊥.
2. φ ≡ pi : STx ( φ) = Pi ( x ).
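The clauses for the other connectives and for □ and ♢ are not displayed here; the sketch below fills them in in the usual way, introducing a fresh first-order variable for each modal operator. The string representation and the function name ST are illustrative choices, and the reader should check the clauses against the full definition.

```python
# A sketch of the standard translation ST_x, building first-order formulas
# as strings (an illustrative choice of representation).
from itertools import count

def ST(phi, x, fresh=count()):
    if phi == "bot":
        return "⊥"
    if isinstance(phi, str):                       # p_i becomes P_i(x)
        return f"{phi.upper()}({x})"
    op, *args = phi
    if op == "not":
        return f"¬{ST(args[0], x, fresh)}"
    if op in ("and", "or", "imp"):
        sym = {"and": "∧", "or": "∨", "imp": "→"}[op]
        return f"({ST(args[0], x, fresh)} {sym} {ST(args[1], x, fresh)})"
    if op == "box":                                # ∀y (Rxy → ST_y(phi))
        y = f"y{next(fresh)}"
        return f"∀{y} (R({x},{y}) → {ST(args[0], y, fresh)})"
    if op == "dia":                                # ∃y (Rxy ∧ ST_y(phi))
        y = f"y{next(fresh)}"
        return f"∃{y} (R({x},{y}) ∧ {ST(args[0], y, fresh)})"
    raise ValueError(op)

print(ST(("box", ("imp", "p", "q")), "x"))
# prints: ∀y0 (R(x,y0) → (P(y0) → Q(y0)))
```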
M, w ⊩ φ iff M′ , s ⊨ STx ( φ)
Proof. By induction on φ.
F ⊨ φ iff F′ ⊨ φ′.

Proof. F′ ⊨ φ′ iff for every structure M′ where PiM′ ⊆ W for i = 1, . . . , n, and
for every s with s(x) ∈ W, M′ , s ⊨ STx (φ). By Proposition 50.16, that is the
case iff for all models M based on F and every world w ∈ W, M, w ⊩ φ, i.e.,
F ⊨ φ.
Proof. The monadic second-order sentence φ′ of the preceding proof has the
required property.
Problems
Problem 50.1. Complete the proof of Theorem 50.1.
Problem 50.3. Let M = ⟨W, R, V ⟩ be a model. Show that if R satisfies the left-
hand properties of Table 50.2, every instance of the corresponding right-hand
formula is true in M.
Problem 50.4. Show that if the formula on the right side of Table 50.2 is valid
in a frame F, then F has the property on the left side. To do this, consider a
frame that does not satisfy the property on the left, and define a suitable V
such that the formula on the right is false at some world.
2. If R is reflexive, it is serial.
Explain why this suffices for the proof that the conditions are equivalent.
Axiomatic Derivations
51.1 Introduction
We have a semantics for the basic modal language in terms of modal models,
and a notion of a formula being valid—true at all worlds in all models—or
valid with respect to some class of models or frames—true at all worlds in
all models in the class, or based on the frame. Logic usually connects such
semantic characterizations of validity with a proof-theoretic notion of deriv-
ability. The aim is to define a notion of derivability in some system such that
a formula is derivable iff it is valid.
The simplest and historically oldest derivation systems are so-called Hilbert-
type or axiomatic derivation systems. Hilbert-type derivation systems for
many modal logics are relatively easy to construct: they are simple as ob-
jects of metatheoretical study (e.g., to prove soundness and completeness).
However, they are much harder to use to prove formulas in than, say, natural
deduction systems.
In Hilbert-type derivation systems, a derivation of a formula is a sequence
of formulas leading from certain axioms, via a handful of inference rules, to
the formula in question. Since we want the derivation system to match the
semantics, we have to guarantee that the set of derivable formulas are true
in all models (or true in all models in which all axioms are true). We’ll first
isolate some properties of modal logics that are necessary for this to work:
the “normal” modal logics. For normal modal logics, there are only two in-
ference rules that need to be assumed: modus ponens and necessitation. As
axioms we take all (substitution instances) of tautologies, and, depending on
the modal logic we deal with, a number of modal axioms. Even if we are just
interested in the class of all models, we must also count all substitution in-
stances of K and Dual as axioms. This alone generates the minimal normal
modal logic K.
51.2 Normal Modal Logics
φ φ→ψ
MP
ψ
We say the formula ψ follows from the formula φ by necessitation iff ψ ≡ □φ.
With this definition, it will turn out that the set of derivable formulas forms
a normal modal logic, and that any derivable formula is true in every model
in which every axiom is true. This property of derivations is called soundness.
The converse, completeness, is harder to prove.
In order to use the relational semantics for modal logics, we also have to re-
quire that all formulas valid in all modal models are included. It turns out that
this requirement is met as soon as all instances of K and DUAL are derivable,
and whenever a formula φ is derivable, so is □φ. A modal logic that satisfies
these conditions is called normal. (Of course, there are also non-normal modal
logics, but the usual relational models are not adequate for them.)
Proposition 51.6. Every normal modal logic is closed under rule RK,
φ 1 → ( φ 2 → · · · ( φ n −1 → φ n ) · · · )
RK
□φ1 → (□φ2 → · · · (□φn−1 → □φn ) · · · ).
Proof. By induction on n: If n = 1, then the rule is just NEC, and every normal
modal logic is closed under NEC.
Now suppose the result holds for n − 1; we show it holds for n.
Assume
φ 1 → ( φ 2 → · · · ( φ n −1 → φ n ) · · · ) ∈ Σ
4. We have Kφ1 . . . φn ⊢ K, so K ∈ Σ.
51.4 Proofs in K
In order to practice proofs in the smallest modal system, we show the valid
formulas on the left-hand side of Table 49.1 can all be given K-proofs.
Proof.
1. φ → (ψ → φ) TAUT
2. □( φ → (ψ → φ)) NEC , 1
3. □( φ → (ψ → φ)) → (□φ → □(ψ → φ)) K
4. □φ → □(ψ → φ) MP , 2, 3
Proof.
1. ( φ ∧ ψ) → φ TAUT
2. □(( φ ∧ ψ) → φ) NEC
3. □(( φ ∧ ψ) → φ) → (□( φ ∧ ψ) → □φ) K
4. □( φ ∧ ψ) → □φ MP , 2, 3
5. ( φ ∧ ψ) → ψ TAUT
6. □(( φ ∧ ψ) → ψ) NEC
7. □(( φ ∧ ψ) → ψ) → (□( φ ∧ ψ) → □ψ) K
8. □( φ ∧ ψ) → □ψ MP , 6, 7
9. (□( φ ∧ ψ) → □φ) →
((□( φ ∧ ψ) → □ψ) →
(□( φ ∧ ψ) → (□φ ∧ □ψ))) TAUT
10. (□( φ ∧ ψ) → □ψ) →
(□( φ ∧ ψ) → (□φ ∧ □ψ)) MP , 4, 9
11. □( φ ∧ ψ) → (□φ ∧ □ψ) MP , 8, 10.
( p → q) → (( p → r ) → ( p → (q ∧ r ))).
Proof.
1. φ → (ψ → ( φ ∧ ψ)) TAUT
2. □( φ → (ψ → ( φ ∧ ψ))) NEC , 1
3. □( φ → (ψ → ( φ ∧ ψ))) → (□φ → □(ψ → ( φ ∧ ψ))) K
4. □φ → □(ψ → ( φ ∧ ψ)) MP , 2, 3
5. □(ψ → ( φ ∧ ψ)) → (□ψ → □( φ ∧ ψ)) K
6. (□φ → □(ψ → ( φ ∧ ψ))) →
((□(ψ → ( φ ∧ ψ)) → (□ψ → □( φ ∧ ψ))) →
(□φ → (□ψ → □( φ ∧ ψ)))) TAUT
7. (□(ψ → ( φ ∧ ψ)) → (□ψ → □( φ ∧ ψ))) →
(□φ → (□ψ → □( φ ∧ ψ))) MP , 4, 6
8. □φ → (□ψ → □( φ ∧ ψ)) MP , 5, 7
9. (□φ → (□ψ → □( φ ∧ ψ))) →
((□φ ∧ □ψ) → □( φ ∧ ψ)) TAUT
10. (□φ ∧ □ψ) → □( φ ∧ ψ) MP , 8, 9
( p → q) → ((q → r ) → ( p → r ))
( p → (q → r )) → (( p ∧ q) → r )
Proof.
1. ♢¬ p ↔ ¬□¬¬ p DUAL
2. (♢¬ p ↔ ¬□¬¬ p) →
(¬□¬¬ p → ♢¬ p) TAUT
3. ¬□¬¬ p → ♢¬ p MP , 1, 2
4. ¬¬ p → p TAUT
5. □(¬¬ p → p) NEC , 4
6. □(¬¬ p → p) → (□¬¬ p → □p) K
7. (□¬¬ p → □p) MP , 5, 6
8. (□¬¬ p → □p) → (¬□p → ¬□¬¬ p) TAUT
9. ¬□p → ¬□¬¬ p MP , 7, 8
10. (¬□p → ¬□¬¬ p) →
((¬□¬¬ p → ♢¬ p) → (¬□p → ♢¬ p)) TAUT
11. (¬□¬¬ p → ♢¬ p) → (¬□p → ♢¬ p) MP , 9, 10
12. ¬□p → ♢¬ p MP , 3, 11
The formulas on lines 8 and 10 are instances of the tautologies
( p → q) → (¬q → ¬ p)
( p → q) → ((q → r ) → ( p → r )).
( φ → ψ) → ((ψ → χ) → ( φ → χ))
φ1 → ( φ2 → · · · ( φ n → ψ ) . . . )
We will indicate use of this proposition by RK. Let’s illustrate how these
results help establish derivability results more easily.
Proof.
1. K ⊢ φ → (ψ → ( φ ∧ ψ)) TAUT
2. K ⊢ □φ → (□ψ → □( φ ∧ ψ))) RK , 1
3. K ⊢ (□φ ∧ □ψ) → □( φ ∧ ψ) PL , 2
Proof. Exercise.
⊢ χ( φ)    ⊢ φ ↔ ψ
⊢ χ(ψ)     by Proposition 51.19
For instance:
Proof.
1. K ⊢ ♢¬ p ↔ ¬□¬¬ p DUAL
2. K ⊢ ¬□¬¬ p → ♢¬ p PL ,1
3. K ⊢ ¬□p → ♢¬ p p for ¬¬ p
K ⊢ ¬□¬¬ p → ♢¬ p
K ⊢ ¬¬ p ↔ p TAUT
K ⊢ ¬□p → ♢¬ p by Proposition 51.19
The roles of χ(q), φ, and ψ in Proposition 51.19 are played here, respectively,
by ¬□q → ♢¬ p, ¬¬ p, and p.
When a formula contains a sub-formula ¬♢φ, we can replace it by □¬ φ us-
ing Proposition 51.19, since K ⊢ ¬♢φ ↔ □¬ φ. We’ll indicate this and similar
replacements simply by “□¬ for ¬♢.”
The following proposition justifies that we can establish derivability re-
sults schematically. E.g., the previous proposition does not just establish that
K ⊢ ¬□p → ♢¬ p, but K ⊢ ¬□φ → ♢¬ φ for arbitrary φ.
Proof. It is tedious but routine to verify (by induction on the length of the
derivation of ψ) that applying a substitution to an entire derivation also re-
sults in a correct derivation. Specifically, substitution instances of tautological
instances are themselves tautological instances, substitution instances of in-
stances of DUAL and K are themselves instances of DUAL and K, and applica-
tions of MP and NEC remain correct when substituting formulas for proposi-
tional variables in both premise(s) and conclusion.
Proof.
1. K ⊢ ( φ → ψ) → (¬ψ → ¬ φ) PL
2. K ⊢ □( φ → ψ ) → (□¬ ψ → □¬ φ ) RK , 1
3. K ⊢ (□¬ψ → □¬ φ) → (¬□¬ φ → ¬□¬ψ) TAUT
4. K ⊢ □( φ → ψ) → (¬□¬ φ → ¬□¬ψ) PL , 2, 3
5. K ⊢ □( φ → ψ) → (♢φ → ♢ψ) ♢ for ¬□¬.
Proof.
Proof.
1. K ⊢ ¬( φ ∨ ψ) → ¬ φ TAUT
2. K ⊢ □¬( φ ∨ ψ) → □¬ φ RK , 1
3. K ⊢ ¬□¬ φ → ¬□¬( φ ∨ ψ) PL , 2
4. K ⊢ ♢φ → ♢( φ ∨ ψ) ♢ for ¬□¬
5. K ⊢ ♢ψ → ♢( φ ∨ ψ) similarly
6. K ⊢ (♢φ ∨ ♢ψ) → ♢( φ ∨ ψ) PL , 4, 5.
Proof.
1. K ⊢ ¬ φ → (¬ψ → ¬( φ ∨ ψ) TAUT
2. K ⊢ □¬ φ → (□¬ψ → □¬( φ ∨ ψ) RK
3. K ⊢ □¬ φ → (¬□¬( φ ∨ ψ) → ¬□¬ψ)) PL , 2
4. K ⊢ ¬□¬( φ ∨ ψ) → (□¬ φ → ¬□¬ψ) PL , 3
5. K ⊢ ¬□¬( φ ∨ ψ) → (¬¬□¬ψ → ¬□¬ φ) PL , 4
6. K ⊢ ♢( φ ∨ ψ) → (¬♢ψ → ♢φ) ♢ for ¬□¬
7. K ⊢ ♢( φ ∨ ψ) → (♢ψ ∨ ♢φ) PL , 6.
p → ♢p (T♢ )
♢□p → p (B♢ )
♢♢p → ♢p (4♢ )
♢□p → □p (5♢ )
Each of the above dual formulas is obtained from the corresponding for-
mula by substituting ¬ p for p, contraposing, replacing ¬□¬ by ♢, and replac-
ing ¬♢¬ by □. D, i.e., □φ → ♢φ is its own dual in that sense.
1. KT5 ⊢ B;
2. KT5 ⊢ 4;
3. KDB4 ⊢ T;
4. KB4 ⊢ 5;
5. KB5 ⊢ 4;
6. KT ⊢ D.
1. KT5 ⊢ B:
1. KT5 ⊢ ♢φ → □♢φ 5
2. KT5 ⊢ φ → ♢φ T♢
3. KT5 ⊢ φ → □♢φ PL .
2. KT5 ⊢ 4:
3. KDB4 ⊢ T:
1. KDB4 ⊢ ♢□φ → φ B♢
2. KDB4 ⊢ □□φ → ♢□φ D with □φ for p
3. KDB4 ⊢ □□φ → φ PL 1, 2
4. KDB4 ⊢ □φ → □□φ 4
5. KDB4 ⊢ □φ → φ PL , 3, 4.
4. KB4 ⊢ 5:
5. KB5 ⊢ 4:
6. KT ⊢ D:
1. KT ⊢ □φ → φ T
2. KT ⊢ φ → ♢φ T♢
3. KT ⊢ □φ → ♢φ PL , 1, 2
The following proposition shows that the classical system S5 has several
equivalent axiomatizations. This should not be surprising, as the various combina-
tions of axioms all characterize equivalence relations (see Proposition 50.12).
Proposition 51.29. KTB4 = KT5 = KDB4 = KDB5.
Proof. Exercise.
51.9 Soundness
A derivation system is called sound if everything that can be derived is valid.
When considering modal systems, i.e., derivations where in addition to K we
can use instances of some formulas φ1 , . . . , φn , we want every derivable for-
mula to be true in any model in which φ1 , . . . , φn are true.
Theorem 51.30 (Soundness Theorem). If every instance of φ1 , . . . , φn is valid in
the classes of models C1 , . . . , Cn , respectively, then Kφ1 . . . φn ⊢ ψ implies that ψ is
valid in the class of models C1 ∩ · · · ∩ Cn .
Proposition 51.31. KD ⊊ KT
Proof. This is the syntactic counterpart to the semantic fact that all reflexive
relations are serial. To show KD ⊆ KT we need to see that KD ⊢ ψ implies
KT ⊢ ψ, which follows from KT ⊢ D, as shown in Proposition 51.27(6). To
show that the inclusion is proper, by Soundness (Theorem 51.30), it suffices
to exhibit a model of KD where T, i.e., □p → p, fails (an easy task left as an
exercise), for then by Soundness KD ⊬ □p → p.
Figure: a model with worlds w1 (¬p, where □p holds but □□p fails) and w2 (p, where □p fails).
Proof. By Theorem 50.1 we know that all instances of T and B are true in every
reflexive symmetric model (respectively). So by soundness, it suffices to find
a reflexive symmetric model containing a world at which some instance of 4
fails, and similarly for 5. We use the same model for both claims. Consider
the symmetric, reflexive model in Figure 51.2. Then M, w1 ⊮ □p → □□p, so 4
fails at w1 . Similarly, M, w2 ⊮ ♢¬ p → □♢¬ p, so the instance of 5 with φ = ¬ p
fails at w2 .
Figure 51.2: a reflexive, symmetric model with worlds w1 (p), w2 (p), w3 (¬p), and w4 (¬p); at w1 , □p holds but □□p and ♢¬p fail; at w2 , ♢¬p holds but □♢¬p fails.
Figure 51.3: a serial, euclidean model with worlds w1 (¬p), w2 (p), and w3 (p); at w1 , □p holds but □□p fails.
Proof. By Theorem 50.1 we know that all instances of D and 5 are true in all se-
rial euclidean models. So it suffices to find a serial euclidean model containing
a world at which some instance of 4 fails. Consider the model of Figure 51.3,
and notice that M, w1 ⊮ □p → □□p.
2. Reflexivity: If φ ∈ Γ then Γ ⊢Σ φ;
The proof is an easy exercise. Part (5) of Proposition 51.36 gives us that, for
instance, if Γ ⊢Σ φ ∨ ψ and Γ ⊢Σ ¬ φ, then Γ ⊢Σ ψ. Also, in what follows, we
write Γ, φ ⊢Σ ψ instead of Γ ∪ { φ} ⊢Σ ψ.
51.13 Consistency
Consistency is an important property of sets of formulas. A set of formulas is
inconsistent if a contradiction, such as ⊥, is derivable from it; and otherwise
consistent. If a set is inconsistent, its formulas cannot all be true in a model at a
world. For the completeness theorem we prove the converse: every consistent
set is true at a world in a model, namely in the “canonical model.”
So for instance, the set {□( p → q), □p, ¬□q} is consistent relative to propo-
sitional logic, but not K-consistent. Similarly, the set {♢p, □♢p → q, ¬q} is not
K5-consistent.
Proof. These facts follow easily using classical propositional logic. We give the
argument for (3). Proceed contrapositively and suppose neither Γ ∪ { φ} nor
Γ ∪ {¬ φ} is Σ-consistent. Then by (2), both Γ, φ ⊢Σ ⊥ and Γ, ¬ φ ⊢Σ ⊥. By the
deduction theorem Γ ⊢Σ φ → ⊥ and Γ ⊢Σ ¬ φ → ⊥. But ( φ → ⊥) → ((¬ φ →
⊥) → ⊥) is a tautological instance, hence by Proposition 51.36(5), Γ ⊢Σ ⊥.
Problems
Problem 51.1. Prove Proposition 51.7.
1. □¬ p → □( p → q)
2. (□p ∨ □q) → □( p ∨ q)
3. ♢p → ♢( p ∨ q)
2. K ⊢ □( φ ∨ ψ) → (♢φ ∨ □ψ);
Problem 51.7. Give an alternative proof of Theorem 51.34 using a model with
3 worlds.
Problem 51.8. Provide a single reflexive transitive model showing that both
KT4 ⊬ B and KT4 ⊬ 5.
52.1 Introduction
If Σ is a modal system, then the soundness theorem establishes that if Σ ⊢ φ,
then φ is valid in any class C of models in which all instances of all formulas
in Σ are valid. In particular that means that if K ⊢ φ then φ is true in all
models; if KT ⊢ φ then φ is true in all reflexive models; if KD ⊢ φ then φ is
true in all serial models, etc.
Completeness is the converse of soundness: that K is complete means that
if a formula φ is valid, ⊢ φ, for instance. Proving completeness is a lot harder
to do than proving soundness. It is useful, first, to consider the contrapositive:
K is complete iff whenever ⊬ φ, there is a countermodel, i.e., a model M such
that M ⊮ φ. Equivalently (negating φ), we could prove that whenever ⊬
¬ φ, there is a model of φ. In the construction of such a model, we can use
information contained in φ. When we find models for specific formulas we
often do the same: e.g., if we want to find a countermodel to p → □q, we know
that it has to contain a world where p is true and □q is false. And a world
where □q is false means there has to be a world accessible from it where q is
false. And that’s all we need to know: which worlds make the propositional
variables true, and which worlds are accessible from which worlds.
In the case of proving completeness, however, we don’t have a specific
formula φ for which we are constructing a model. We want to establish that
a model exists for every φ such that ⊬Σ ¬ φ. This is a minimal requirement,
since if ⊢Σ ¬ φ, by soundness, there is no model for φ (in which Σ is true).
Now note that ⊬Σ ¬ φ iff φ is Σ-consistent. (Recall that Σ ⊬Σ ¬ φ and φ ⊬Σ ⊥
are equivalent.) So our task is to construct a model for every Σ-consistent
formula.
The trick we’ll use is to find a Σ-consistent set of formulas that contains φ,
but also other formulas which tell us what the world that makes φ true has to
look like. Such sets are complete Σ-consistent sets. It’s not enough to construct
a model with a single world to make φ true, it will have to contain multiple
worlds and an accessibility relation. The complete Σ-consistent set contain-
ing φ will also contain other formulas of the form □ψ and ♢χ. In all accessible
worlds, ψ has to be true; in at least one, χ has to be true. In order to accom-
plish this, we’ll simply take all possible complete Σ-consistent sets as the basis
for the set of worlds. A tricky part will be to figure out when a complete
Σ-consistent set should count as being accessible from another in our model.
We’ll show that in the model so defined, φ is true at a world—which is
also a complete Σ-consistent set—iff φ is an element of that set. If φ is Σ-
consistent, it will be an element of at least one complete Σ-consistent set (a
fact we’ll prove), and so there will be a world where φ is true. So we will have
a single model where every Σ-consistent formula φ is true at some world. This
single model is the canonical model for Σ.
1. Γ is deductively closed in Σ.
2. Σ ⊆ Γ.
3. ⊥ ∉ Γ.

4. ¬ φ ∈ Γ if and only if φ ∉ Γ.
5. φ ∧ ψ ∈ Γ iff φ ∈ Γ and ψ ∈ Γ
6. φ ∨ ψ ∈ Γ iff φ ∈ Γ or ψ ∈ Γ
7. φ → ψ ∈ Γ iff φ ∉ Γ or ψ ∈ Γ
4. If ¬ φ ∈ Γ, then by consistency φ ∉ Γ; and if φ ∉ Γ, then ¬ φ ∈ Γ, since Γ is
complete Σ-consistent.
5. Exercise.
7. Exercise.
Now let ∆ = ⋃∞n=0 ∆n .
We have to show that this definition actually yields a set ∆ with the re-
quired properties, i.e., Γ ⊆ ∆ and ∆ is complete Σ-consistent.
□Γ = {□ψ : ψ ∈ Γ }
♢Γ = {♢ψ : ψ ∈ Γ }
and
□−1 Γ = {ψ : □ψ ∈ Γ }
♢−1 Γ = {ψ : ♢ψ ∈ Γ }
□□−1 Γ = {□ψ : □ψ ∈ Γ }
i.e., it’s just the set of all those formulas of Γ that start with □.
Proof. Suppose □−1 Γ ⊢Σ φ; then by Lemma 52.6, □□−1 Γ ⊢ □φ. But since
□□−1 Γ ⊆ Γ, also Γ ⊢Σ □φ by monotonicity.
Lemma 52.9. Suppose Γ and ∆ are complete Σ-consistent. Then □−1 Γ ⊆ ∆ if and
only if ♢∆ ⊆ Γ.
Definition 52.11. Let Σ be a normal modal logic. The canonical model for Σ is
MΣ = ⟨W Σ , RΣ , V Σ ⟩, where:
1. W Σ = {∆ : ∆ is complete Σ-consistent}.
3. V Σ ( p) = {∆ : p ∈ ∆}.
Proof. By induction on φ.
1. φ ≡ ⊥: MΣ , ∆ ⊮ ⊥ by Definition 49.7, and ⊥ ∉ ∆ by Proposition 52.2(3).
2. φ ≡ p: MΣ , ∆ ⊩ p iff ∆ ∈ V Σ ( p) by Definition 49.7. Also, ∆ ∈ V Σ ( p) iff
p ∈ ∆ by definition of V Σ .
3. φ ≡ ¬ψ: MΣ , ∆ ⊩ ¬ψ iff MΣ , ∆ ⊮ ψ (Definition 49.7) iff ψ ∉ ∆ (by
inductive hypothesis) iff ¬ψ ∈ ∆ (by Proposition 52.2(4)).
4. φ ≡ ψ ∧ χ: Exercise.
5. φ ≡ ψ ∨ χ: MΣ , ∆ ⊩ ψ ∨ χ iff MΣ , ∆ ⊩ ψ or MΣ , ∆ ⊩ χ (by Defini-
tion 49.7) iff ψ ∈ ∆ or χ ∈ ∆ (by inductive hypothesis) iff ψ ∨ χ ∈ ∆ (by
Proposition 52.2(6)).
6. φ ≡ ψ → χ: Exercise.
7. φ ≡ □ψ: First suppose that MΣ , ∆ ⊩ □ψ. By Definition 49.7, for every
∆′ such that RΣ ∆∆′ , MΣ , ∆′ ⊩ ψ. By inductive hypothesis, for every ∆′
such that RΣ ∆∆′ , ψ ∈ ∆′ . By definition of RΣ , for every ∆′ such that
□−1 ∆ ⊆ ∆′ , ψ ∈ ∆′ . By Proposition 52.8, □ψ ∈ ∆.
Now assume □ψ ∈ ∆. Let ∆′ ∈ W Σ be such that RΣ ∆∆′ , i.e., □−1 ∆ ⊆
∆′ . Since □ψ ∈ ∆, ψ ∈ □−1 ∆. Consequently, ψ ∈ ∆′ . By inductive
hypothesis, MΣ , ∆′ ⊩ ψ. Since ∆′ is arbitrary with RΣ ∆∆′ , for all ∆′ ∈ W Σ
such that RΣ ∆∆′ , MΣ , ∆′ ⊩ ψ. By Definition 49.7, MΣ , ∆ ⊩ □ψ.
8. φ ≡ ♢ψ: Exercise.
Corollary 52.15. The basic modal logic K is complete with respect to the class of all
models, i.e., if ⊨ φ then K ⊢ φ.
Theorem 52.16. If a normal modal logic Σ contains one of the formulas on the left-
hand side of Table 52.1, then the canonical model for Σ has the corresponding property
on the right-hand side.
Theorem 52.17. Let CD , CT , CB , C4 , and C5 be the class of all serial, reflexive, sym-
metric, transitive, and euclidean models (respectively). Then for any schemas φ1 , . . . ,
φn among D, T, B, 4, and 5, the system Kφ1 . . . φn is determined by the class of
models C = C φ1 ∩ · · · ∩ C φn .
3. If Σ contains the schema □□φ → □φ then the canonical model for Σ is weakly
dense.
2. This follows immediately from part (1) and the seriality proof in Theo-
rem 52.16.
Γ = □−1 ∆ 1 ∪ ♢∆ 2 .
Suppose for contradiction that Γ is not consistent. Then there are formu-
las □φ1 , . . . , □φn ∈ ∆ 1 and ψ1 , . . . , ψm ∈ ∆ 2 such that
φ1 , . . . , φn , ♢ψ1 , . . . , ♢ψm ⊢Σ ⊥.
φ1 , . . . , φn ,♢ψ1 , . . . , ♢ψm ⊢Σ ⊥
φ1 , . . . , φn ⊢Σ (♢ψ1 ∧ · · · ∧ ♢ψm ) → ⊥
by the deduction theorem
Proposition 51.36(4), and TAUT
φ1 , . . . , φn ⊢Σ ♢(ψ1 ∧ · · · ∧ ψm ) → ⊥
since Σ is normal
φ1 , . . . , φn ⊢Σ ¬♢(ψ1 ∧ · · · ∧ ψm )
by PL
φ1 , . . . , φn ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm )
□¬ for ¬♢
□φ1 , . . . , □φn ⊢Σ □□¬(ψ1 ∧ · · · ∧ ψm )
by Lemma 52.6
□φ1 , . . . , □φn ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm )
by schema □□φ → □φ
∆ 1 ⊢Σ □¬(ψ1 ∧ · · · ∧ ψm )
by monotonicity, Proposition 51.36(1)
□¬(ψ1 ∧ · · · ∧ ψm ) ∈ ∆ 1
by deductive closure;
¬(ψ1 ∧ · · · ∧ ψm ) ∈ ∆ 2
since RΣ ∆ 1 ∆ 2 .
On the strength of these examples, one might think that every system Σ of
modal logic is complete, in the sense that it proves every formula which is valid
in every frame in which every theorem of Σ is valid. Unfortunately, there are
many systems that are not complete in this sense.
Problems
Problem 52.1. Complete the proof of Proposition 52.2.
53.1 Introduction
One important question about a logic is always whether it is decidable, i.e., if
there is an effective procedure which will answer the question “is this formula
valid.” Propositional logic is decidable: we can effectively test if a formula is
a tautology by constructing a truth table, and for a given formula, the truth
table is finite. But we can’t obviously test if a modal formula is true in all
models, for there are infinitely many of them. We can list all the finite models
relevant to a given formula, since only the assignment of subsets of worlds
to propositional variables which actually occur in the formula are relevant. If
the accessibility relation is fixed, the possible different assignments V ( p) are
just all the subsets of W, and if |W | = n there are 2n of those. If our formula φ
contains m propositional variables there are then 2nm different models with n
worlds. For each one, we can test if φ is true at all worlds, simply by comput-
ing the truth value of φ in each. Of course, we also have to check all possible
accessibility relations, but there are only finitely many relations on n worlds
2
as well (specifically, the number of subsets of W × W, i.e., 2n .
If we are not interested in the logic K, but a logic defined by some class of
models (e.g., the reflexive transitive models), we also have to be able to test
if the accessibility relation is of the right kind. We can do that whenever the
frames we are interested in are definable by modal formulas (e.g., by testing if
T and 4 valid in the frame). So, the idea would be to run through all the finite
frames, test each one if it is a frame in the class we’re interested in, then list all
the possible models on that frame and test if φ is true in each. If not, stop: φ
is not valid in the class of models of interest.
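The search procedure just described can be written down in a few lines. The following illustrative sketch assumes an evaluator holds(model, w, phi) like the one sketched in the chapter on relational models and looks for a countermodel among all models with exactly n worlds.

```python
# A sketch of the brute-force search for a countermodel among models with
# exactly n worlds, assuming an evaluator holds(model, w, phi) like the one
# sketched earlier. The two search spaces have the sizes 2^(n^2) and 2^(n·m)
# mentioned above.
from itertools import chain, combinations, product

def all_subsets(xs):
    xs = list(xs)
    return [set(c) for c in
            chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))]

def countermodel_with_n_worlds(phi, variables, n, holds):
    W = set(range(n))
    pairs = [(u, v) for u in W for v in W]
    for R in all_subsets(pairs):                          # accessibility relations
        for choice in product(all_subsets(W), repeat=len(variables)):
            V = dict(zip(variables, choice))              # valuations
            model = (W, R, V)
            for w in W:
                if not holds(model, w, phi):
                    return model, w                       # countermodel found
    return None
```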
There is a problem with this idea: we don’t know when, if ever, we can stop
looking. If the formula has a finite countermodel, our procedure will find it.
But if it has no finite countermodel, we won’t get an answer. The formula may
be valid (no countermodels at all), or it may have only an infinite countermodel,
which we’ll never look at. This problem can be overcome if we can show that
every formula that has a countermodel has a finite countermodel. If this is the
case we say the logic has the finite model property.
But how would we show that a logic has the finite model property? One
way of doing this would be to find a way to turn an infinite (counter)model
of φ into a finite one. If that can be done, then whenever there is a model
in which φ is not true, then the resulting finite model also makes φ not true.
That finite model will show up on our list of all finite models, and we will
eventually determine, for every formula that is not valid, that it isn’t. Our
procedure won’t terminate if the formula is valid. If we can show in addition
that there is some maximum size that the finite model our procedure provides
can have, and that this maximum size depends only on the formula φ, we
will have a size up to which we have to test finite models in our search for
countermodels. If we haven’t found a countermodel by then, there are none.
Then our procedure will, in fact, decide the question “is φ valid?” for any
formula φ.
A strategy that often works for turning infinite structures into finite struc-
tures is that of “identifying” elements of the structure which behave the same
way in relevant respects. If there are infinitely many worlds in M that be-
have the same in relevant respects, then we might hope that there are only
finitely many “classes” of such worlds. In other words, we partition the set
of worlds in the right way. Each partition contains infinitely many worlds,
but there are only finitely many partitions. Then we define a new model M∗
where the worlds are the partitions. Finitely many partitions in the old model
give us finitely many worlds in the new model, i.e., a finite model. Let’s call
the partition a world w is in [w]. We’ll want it to be the case that M, w ⊩ φ iff
M∗ , [w] ⊩ φ, since we want the new model to be a countermodel to φ if the old
one was. This requires that we define the partition, as well as the accessibility
relation of M∗ in the right way.
To see how this would go, first imagine we have no accessibility relation, so
that □ simply expresses truth at every world: M, w ⊩ □ψ iff M, v ⊩ ψ for every
v ∈ W, and the same for M∗ , except with [w] and [v]. As a first idea, let's say that two worlds u and v are equivalent
(belong to the same partition) if they agree on all propositional variables in M,
i.e., M, u ⊩ p iff M, v ⊩ p. Let V ∗ ( p) = {[w] : M, w ⊩ p}. Our aim is to show
that M, w ⊩ φ iff M∗ , [w] ⊩ φ. Obviously, we’d prove this by induction: The
base case would be φ ≡ p. First suppose M, w ⊩ p. Then [w] ∈ V ∗ by
definition, so M∗ , [w] ⊩ p. Now suppose that M∗ , [w] ⊩ p. That means that
[w] ∈ V ∗ ( p), i.e., for some v equivalent to w, M, v ⊩ p. But “w equivalent to v”
means “w and v make all the same propositional variables true,” so M, w ⊩ p.
Now for the inductive step, e.g., φ ≡ ¬ψ. Then M, w ⊩ ¬ψ iff M, w ⊮ ψ
iff M∗ , [w] ⊮ ψ (by inductive hypothesis) iff M∗ , [w] ⊩ ¬ψ. Similarly for the
other non-modal operators. It also works for □: suppose M∗ , [w] ⊩ □ψ. That
means that for every [u], M∗ , [u] ⊩ ψ. By inductive hypothesis, for every u,
M, u ⊩ ψ. Consequently, M, w ⊩ □ψ.
In the general case, where we have to also define the accessibility relation
for M∗ , things are more complicated. We’ll call a model M∗ a filtration if its
accessibility relation R∗ satisfies the conditions required to make the induc-
tive proof above go through. Then any filtration M∗ will make φ true at [w]
iff M makes φ true at w. However, now we also have to show that there are
filtrations, i.e., we can define R∗ so that it satisfies the required conditions. In
order for this to work, however, we have to require that worlds u, v count as
equivalent not just when they agree on all propositional variables, but on all
sub-formulas of φ. Since φ has only finitely many sub-formulas, this will still
guarantee that the filtration is finite. There is not just one way to define a fil-
tration, and in order to make sure that the accessibility relation of the filtration
satisfies the required properties (e.g., reflexive, transitive, etc.) we have to be
inventive with the definition of R∗ .
53.2 Preliminaries
Filtrations allow us to establish the decidability of our systems of modal logic
by showing that they have the finite model property, i.e., that any formula that
is true (false) in a model is also true (false) in a finite model. Filtrations are
defined relative to sets of formulas which are closed under subformulas.
For instance, given a formula φ, the set of all its sub-formulas is closed
under sub-formulas. When we’re defining a filtration of a model through the
set of sub-formulas of φ, it will have the property we’re after: it makes φ true
(false) iff the original model does.
The set of worlds of a filtration of M through Γ is defined as the set of all
equivalence classes of the following equivalence relation: u ≡ v iff for every
φ ∈ Γ, M, u ⊩ φ iff M, v ⊩ φ, i.e., u and v make the same formulas from Γ true.
The equivalence class [w]≡ of a world w, or [w] for short, is the set of all worlds
≡-equivalent to w:
[w] = {v : v ≡ w}.
The relation ≡ is indeed an equivalence relation:
Proof. The relation ≡ is reflexive, since w makes exactly the same formulas
from Γ true as itself. It is symmetric since if u makes the same formulas from Γ
true as v, the same holds for v and u. It is also transitive, since if u makes the
same formulas from Γ true as v, and v as w, then u makes the same formulas
from Γ true as w.
The relation ≡, like any equivalence relation, divides W into partitions, i.e.,
subsets of W which are pairwise disjoint, and together cover all of W. Every
w ∈ W is an element of one of the partitions, namely of [w], since w ≡ w. So
the partitions [w] cover all of W. They are pairwise disjoint, for if u ∈ [w] and
u ∈ [v], then u ≡ w and u ≡ v, and by symmetry and transitivity, w ≡ v, and
so [w] = [v].
53.3 Filtrations
Rather than define “the” filtration of M through Γ, we define when a model M∗
counts as a filtration of M. All filtrations have the same set of worlds W ∗ and
the same valuation V ∗ . But different filtrations may have different accessibil-
ity relations R∗ . To count as a filtration, R∗ has to satisfy a number of condi-
tions, however. These conditions are exactly what we’ll require to prove the
main result, namely that M, w ⊩ φ iff M∗ , [w] ⊩ φ, provided φ ∈ Γ.
1. W ∗ = {[w] : w ∈ W };
2. For any u, v ∈ W: (a) if Ruv then R∗[u][v]; and (b) if R∗[u][v], then: if □φ ∈ Γ and M, u ⊩ □φ, then M, v ⊩ φ; and if ♢φ ∈ Γ and M, v ⊩ φ, then M, u ⊩ ♢φ.
3. V ∗ ( p) = {[u] : u ∈ V ( p)}.
It’s worthwhile thinking about what V ∗ ( p) is: the set consisting of the
equivalence classes [w] of all worlds w where p is true in M. On the one
hand, if w ∈ V ( p), then [w] ∈ V ∗ ( p) by that definition. However, it is not
necessarily the case that if [w] ∈ V ∗ ( p), then w ∈ V ( p). If [w] ∈ V ∗ ( p) we are
only guaranteed that [w] = [u] for some u ∈ V ( p). Of course, [w] = [u] means
that w ≡ u. So, when [w] ∈ V ∗ ( p) we can (only) conclude that w ≡ u for some
u ∈ V ( p ).
4. Exercise.
6. Exercise.
8. Exercise.
What holds for truth at worlds in a model also holds for truth in a model
and validity in a class of models.
1. If □φ ∈ Γ and M, u ⊩ □φ then M, v ⊩ φ;
Proof. Given the definition of R∗ , the only condition that is left to verify is
the implication from Ruv to R∗ [u][v]. So assume Ruv. Suppose □φ ∈ Γ and
M, u ⊩ □φ; then obviously M, v ⊩ φ, and (1) is satisfied. Suppose ♢φ ∈ Γ and
M, v ⊩ φ. Then M, u ⊩ ♢φ since Ruv, and (2) is satisfied.
[Figure: the model on the positive integers 1, 2, 3, 4, . . . , where each n is related to n + 1 and p is true at exactly the even numbers.]
In this model □p is true at all and only the odd numbers, so □p → p is true at all and only the even numbers. In other words,
every odd number makes □p true and p and □p → p false; every even number
makes p and □p → p true, but □p false. So W∗ = {[1], [2]}, where [1] =
{1, 3, 5, . . . } and [2] = {2, 4, 6, . . . }. Since 2 ∈ V(p), [2] ∈ V∗(p); since 1 ∉ V(p), [1] ∉ V∗(p). So V∗(p) = {[2]}.
Any filtration based on W ∗ must have an accessibility relation that in-
cludes ⟨[1], [2]⟩, ⟨[2], [1]⟩: since R12, we must have R∗ [1][2] by Definition 53.4(2a),
and since R23 we must have R∗ [2][3], and [3] = [1]. It cannot include ⟨[1], [1]⟩:
if it did, we’d have R∗ [1][1], M, 1 ⊩ □p but M, 1 ⊮ p, contradicting (2b). Noth-
ing requires or rules out that R∗ [2][2]. So, there are two possible filtrations
of M, corresponding to the two accessibility relations
{⟨[1], [2]⟩, ⟨[2], [1]⟩} and {⟨[1], [2]⟩, ⟨[2], [1]⟩, ⟨[2], [2]⟩}.
In either case, p and □p → p are false and □p is true at [1]; p and □p → p are
true and □p is false at [2].
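The classes and the smallest admissible accessibility relation of this example can be computed directly. The following Python sketch is ours and purely illustrative; it truncates the model at world 5 (which happens not to change the Γ-profiles, since □p is vacuously true at a world without successors, just as it is true at the other odd numbers) and takes Γ = {p, □p, □p → p}.

    # Finite stand-in for the model on the positive integers: Rn(n+1), with p
    # true at exactly the even numbers.
    W = [1, 2, 3, 4, 5]
    R = {(n, n + 1) for n in W if n + 1 in W}
    V = {'p': {n for n in W if n % 2 == 0}}

    def truth(w, f):
        if f[0] == 'var':
            return w in V[f[1]]
        if f[0] == 'imp':
            return (not truth(w, f[1])) or truth(w, f[2])
        if f[0] == 'box':
            return all(truth(v, f[1]) for v in W if (w, v) in R)

    p, box_p = ('var', 'p'), ('box', ('var', 'p'))
    Gamma = [p, box_p, ('imp', box_p, p)]

    # Two worlds are equivalent iff they agree on every formula in Gamma.
    def profile(w):
        return tuple(truth(w, g) for g in Gamma)

    classes = {}
    for w in W:
        classes.setdefault(profile(w), []).append(w)
    print(classes)
    # {(False, True, False): [1, 3, 5], (True, False, True): [2, 4]}

    # The smallest relation allowed by Definition 53.4(2a): relate [u] to [v]
    # whenever some u' in [u] and v' in [v] have Ru'v'.
    R_star = {(tuple(c1), tuple(c2))
              for c1 in classes.values() for c2 in classes.values()
              if any((u, v) in R for u in c1 for v in c2)}
    print(R_star)
    # {((1, 3, 5), (2, 4)), ((2, 4), (1, 3, 5))}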
Proof. The size of W ∗ is the number of different classes [w] under the equiva-
lence relation ≡. Any two worlds u, v in such a class—that is, any u and v such
that u ≡ v—agree on all formulas in Γ: for every φ ∈ Γ, either φ is true at both u and
v, or at neither. So each class [w] corresponds to a subset of Γ, namely the set of
all φ ∈ Γ such that φ is true at the worlds in [w]. No two different classes [u]
and [v] correspond to the same subset of Γ. For if the set of formulas true at u
and that of formulas true at v are the same, then u and v agree on all formulas
in Γ, i.e., u ≡ v. But then [u] = [v]. So, there is an injective function from
W ∗ to ℘( Γ ), and hence |W ∗ | ≤ |℘( Γ )|. Hence if Γ contains n sentences, the
cardinality of W ∗ is no greater than 2n .
Proof. K is the set of valid formulas, i.e., any model is a model of K. By Theo-
rem 53.5, if M, w ⊩ φ, then M∗ , [w] ⊩ φ for any filtration of M through the set Γ
of sub-formulas of φ. Any formula only has finitely many sub-formulas, so Γ
is finite. By Proposition 53.12, |W ∗ | ≤ 2n , where n is the number of formulas
in Γ. And since K imposes no restriction on models, M∗ is a K-model.
To show that a logic L has the finite model property via filtrations it is
essential that the filtration of an L-model is itself an L-model. Often this re-
quires a fair bit of work, and not every filtration yields an L-model. For
universal models, however, this holds automatically.
Proposition 53.15. Let U be the class of universal models (see Proposition 50.14)
and UFin the class of all finite universal models. Then any formula φ is valid in U if
and only if it is valid in UFin .
Proof. Finite universal models are universal models, so the left-to-right direc-
tion is trivial. For the right-to-left direction, suppose that φ is false at some
world w in a universal model M. Let Γ contain φ as well as all of its sub-
formulas; clearly Γ is finite. Take a filtration M∗ of M; then M∗ is finite by
Proposition 53.12, and by Theorem 53.5, φ is false at [w] in M∗ . It remains to
observe that M∗ is also universal: given u and v, by hypothesis Ruv and by
Definition 53.4(2), also R∗ [u][v].
53.7 S5 is Decidable
The finite model property gives us an easy way to show that systems of modal
logic given by schemas are decidable (i.e., that there is a computable procedure
to determine whether a formula is derivable in the system or not).
The above proof works for S5 because filtrations of universal models are
automatically universal. The same holds for reflexivity and seriality, but more
work is needed for other properties.
2. Suppose R∗ [u][v] if and only if C1 (u, v) ∧ C3 (u, v). Then R∗ is transitive, and
M∗ = ⟨W ∗ , R∗ , V ∗ ⟩ is a filtration if M is transitive.
2. Exercise.
3. Exercise.
4. Exercise.
in Figure 53.3. That model isn't euclidean. Moreover, we cannot add arrows
to that model in order to make it euclidean. We would have to add double
arrows between [w2] and [w4], and then also between [w2] and [w5]. But □p is
supposed to be true at [w2], while p is false at [w5].
[Figures: the original model with worlds w1, . . . , w5, where ¬p and □p hold at w1 and w3, p and □p hold at w2, p holds but □p fails at w4, and ¬p holds and □p fails at w5; and its filtration (Figure 53.3), with worlds [w2], [w1] = [w3], [w4], and [w5], at which the same formulas hold.]
1. If M is symmetric, so is M∗ .
2. If M is transitive, so is M∗ .
3. If M is euclidean, so is M∗ .
2. Exercise. Use the fact that both 5 and 5♢ are valid in all euclidean mod-
els.
3. Exercise. Use the fact that B and B♢ are valid in all symmetric models.
Problems
Problem 53.1. Complete the proof of Theorem 53.5
[Figure: a tree model with root 0, worlds 00, 01, 000, 001, 010, 011, and the indicated truth values of p and q.]
Problem 53.4. Show that any filtration of a serial or reflexive model is also
serial or reflexive (respectively).
Modal Tableaux
Draft chapter on prefixed tableaux for modal logic. Needs more ex-
amples, completeness proofs, and discussion of how one can find coun-
termodels from unsuccessful searches for closed tableaux.
54.1 Introduction
Tableaux are certain (downward-branching) trees of signed formulas, i.e., pairs
consisting of a truth value sign (T or F) and a sentence
T φ or F φ.
A closed tableau for {F φ, Tψ1 , . . . , Tψn } shows that ψ1 , . . . , ψn ⊢ φ.
For modal logics, we have to both extend the notion of signed formula
and add rules that cover □ and ♢. In addition to a sign (T or F), formulas in
modal tableaux also have prefixes σ. The prefixes are non-empty sequences of
positive integers, i.e., σ ∈ (Z+)∗ \ {Λ}. When we write such prefixes, we leave out
the surrounding ⟨ ⟩ and separate the individual elements by .'s instead of ,'s.
The propositional rules are as for ordinary tableaux, with the prefix of the premise carried over unchanged to its conclusion(s):
¬T: from σ T ¬φ, infer σ F φ. ¬F: from σ F ¬φ, infer σ T φ.
∧T: from σ T (φ ∧ ψ), infer σ T φ and σ T ψ. ∧F: from σ F (φ ∧ ψ), branch into σ F φ | σ F ψ.
∨T: from σ T (φ ∨ ψ), branch into σ T φ | σ T ψ. ∨F: from σ F (φ ∨ ψ), infer σ F φ and σ F ψ.
→T: from σ T (φ → ψ), branch into σ F φ | σ T ψ. →F: from σ F (φ → ψ), infer σ T φ and σ F ψ.
If σ is a prefix, then σ.n is σ ⌢ ⟨n⟩; e.g., if σ = 1.2.1, then σ.3 is 1.2.1.3. So for
instance,
1.2 T□φ → φ
is a prefixed signed formula (or just a prefixed formula for short).
Intuitively, the prefix names a world in a model that might satisfy the for-
mulas on a branch of a tableau, and if σ names some world, then σ.n names a
world accessible from (the world named by) σ.
Prefixed signed formulas thus have the form σ T φ or σ F φ. The rules for the modal operators are:
□T: from σ T □φ, infer σ.n T φ, where σ.n is already used on the branch.
□F: from σ F □φ, infer σ.n F φ, where σ.n is new to the branch.
♢T: from σ T ♢φ, infer σ.n T φ, where σ.n is new to the branch.
♢F: from σ F ♢φ, infer σ.n F φ, where σ.n is already used on the branch.
The rules for setting up assumptions are also as for ordinary tableaux, ex-
cept that for assumptions we always use the prefix 1. (It does not matter which
prefix we use, as long as it's the same for all assumptions.) So, e.g., we say that
ψ1 , . . . , ψn ⊢ φ iff there is a closed tableau whose assumptions are
1 Tψ1 , . . . , 1 Tψn , 1 F φ.
For the modal operators □ and ♢, the prefix of the conclusion of the rule
applied to a formula with prefix σ is σ.n. However, which n is allowed de-
pends on whether the sign is T or F.
The □T rule extends a branch containing σ T □φ by σ.n T φ. Similarly, the
♢F rule extends a branch containing σ F ♢φ by σ.n F φ. They can only be ap-
plied for a prefix σ.n which already occurs on the branch in which it is applied.
Let's call such a prefix "used" (on the branch).
The □F rule extends a branch containing σ F □φ by σ.n F φ. Similarly, the
♢T rule extends a branch containing σ T ♢φ by σ.n T φ. These rules, however,
can only be applied for a prefix σ.n which does not already occur on the branch
in which it is applied. We call such prefixes "new" (to the branch).
The rules are given in Table 54.2.
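The bookkeeping behind "used" and "new" prefixes is simple enough to spell out. In the illustrative Python sketch below (the function names are ours), prefixes are tuples of positive integers; extensions_on_branch collects the prefixes σ.n that □T and ♢F may use, while new_extension produces a prefix that □F and ♢T may introduce.

    def extensions_on_branch(sigma, prefixes):
        """Prefixes sigma.n already occurring on the branch ("used" extensions)."""
        return sorted(p for p in prefixes
                      if len(p) == len(sigma) + 1 and p[:len(sigma)] == sigma)

    def new_extension(sigma, prefixes):
        """A prefix sigma.n not yet occurring on the branch ("new" to the branch)."""
        n = 1
        while sigma + (n,) in prefixes:
            n += 1
        return sigma + (n,)

    # Prefixes 1, 1.1, 1.2 on a branch, written here as tuples.
    branch_prefixes = {(1,), (1, 1), (1, 2)}

    # boxT and diamondF may only be applied with prefixes already on the branch:
    print(extensions_on_branch((1,), branch_prefixes))   # [(1, 1), (1, 2)]
    # boxF and diamondT must introduce a prefix new to the branch:
    print(new_extension((1,), branch_prefixes))          # (1, 3)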
The restriction that the prefix for □T must already be used is necessary, as
otherwise we would count the following as a closed tableau:
1. 1 T □φ Assumption
2. 1 F ♢φ Assumption
3. 1.1 T φ □T 1
4. 1.1 F φ ♢F 2
⊗
Likewise, the restriction that the prefix for □F (and ♢T) must be new is necessary, as otherwise we would count the following as a closed tableau:
1. 1 T ♢φ Assumption
2. 1 F □φ Assumption
3. 1.1 T φ ♢T 1
4. 1.1 F φ □F 2
⊗
Example 54.1. We give a closed tableau that shows ⊢ (□φ ∧ □ψ) → □( φ ∧ ψ).
1. 1 F (□φ ∧ □ψ) → □(φ ∧ ψ) Assumption
2. 1 T □φ ∧ □ψ →F 1
3. 1 F □(φ ∧ ψ) →F 1
4. 1 T □φ ∧T 2
5. 1 T □ψ ∧T 2
6. 1.1 F φ ∧ ψ □F 3
7. 1.1 F φ | 1.1 F ψ ∧F 6
8. 1.1 T φ | 1.1 T ψ □T 4; □T 5
⊗ ⊗
A similar closed tableau establishes ⊢ ♢(φ ∨ ψ) → (♢φ ∨ ♢ψ):
1. 1 F ♢(φ ∨ ψ) → (♢φ ∨ ♢ψ) Assumption
2. 1 T ♢(φ ∨ ψ) →F 1
3. 1 F ♢φ ∨ ♢ψ →F 1
4. 1 F ♢φ ∨F 3
5. 1 F ♢ψ ∨F 3
6. 1.1 T φ ∨ ψ ♢T 2
7. 1.1 T φ | 1.1 T ψ ∨T 6
8. 1.1 F φ | 1.1 F ψ ♢F 4; ♢F 5
⊗ ⊗
This soundness proof does not reuse the soundness proof for classical propo-
sitional logic, i.e., it proves everything from scratch. That's ok if you want
a self-contained soundness proof. If you already have seen soundness for
ordinary tableaux this will be repetitive. It's planned to make it possible
to switch between self-contained version and a version building on the
non-modal case.
In order to show that prefixed tableaux are sound, we have to show that if
there is a closed tableau for the assumptions
1 Tψ1 , . . . , 1 Tψn , 1 F φ,
then ψ1 , . . . , ψn ⊨ φ.
Definition 54.3. Let P be some set of prefixes, i.e., P ⊆ (Z+ )∗ \ {Λ} and let M
be a model. A function f : P → W is an interpretation of P in M if, whenever σ
and σ.n are both in P, then R f (σ ) f (σ.n).
Relative to an interpretation f of a set of prefixes P we can define:
1. M satisfies σ T φ iff M, f (σ) ⊩ φ;
2. M satisfies σ F φ iff M, f (σ) ⊮ φ.
Definition 54.4. Let Γ be a set of prefixed formulas, and let P( Γ ) be the set of
prefixes that occur in it. If f is an interpretation of P( Γ ) in M, we say that M
satisfies Γ with respect to f , M, f ⊩ Γ, if M satisfies every prefixed formula
in Γ with respect to f . Γ is satisfiable iff there is a model M and interpretation f
of P( Γ ) such that M, f ⊩ Γ.
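Definition 54.3 and Definition 54.4 translate directly into code. In the Python sketch below (ours, and covering only ¬ and □ to keep it short), is_interpretation checks the condition R f(σ) f(σ.n), and satisfies checks whether a model satisfies a set of prefixed signed formulas with respect to f.

    def truth(model, w, f):
        W, R, V = model
        if f[0] == 'var':
            return w in V.get(f[1], set())
        if f[0] == 'not':
            return not truth(model, w, f[1])
        if f[0] == 'box':
            return all(truth(model, v, f[1]) for v in W if (w, v) in R)

    def is_interpretation(P, f, model):
        """f is an interpretation of P in the model iff R f(sigma) f(sigma.n)
        whenever both sigma and sigma.n are in P."""
        W, R, V = model
        return all((f[s[:-1]], f[s]) in R
                   for s in P if len(s) > 1 and s[:-1] in P)

    def satisfies(model, f, prefixed):
        """The model satisfies the prefixed signed formulas with respect to f."""
        return all(truth(model, f[sigma], phi) == (sign == 'T')
                   for sigma, sign, phi in prefixed)

    # A two-world model, and the interpretation 1 -> 0, 1.1 -> 1.
    M = ([0, 1], {(0, 1)}, {'p': {1}})
    f = {(1,): 0, (1, 1): 1}
    print(is_interpretation(set(f), f, M))                        # True
    print(satisfies(M, f, [((1,), 'T', ('box', ('var', 'p'))),
                           ((1, 1), 'T', ('var', 'p'))]))         # True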
Proposition 54.5. If Γ contains both σ T φ and σ F φ, for some formula φ and pre-
fix σ, then Γ is unsatisfiable.
Proof. We call a branch of a tableau satisfiable iff the set of signed formulas
on it is satisfiable, and let’s call a tableau satisfiable if it contains at least one
satisfiable branch.
We show the following: Extending a satisfiable tableau by one of the rules
of inference always results in a satisfiable tableau. This will prove the theo-
rem: any closed tableau results by applying rules of inference to the tableau
consisting only of assumptions from Γ. So if Γ were satisfiable, any tableau
for it would be satisfiable. A closed tableau, however, is clearly not satisfiable,
since all its branches are closed and closed branches are unsatisfiable.
Suppose we have a satisfiable tableau, i.e., a tableau with at least one sat-
isfiable branch. Applying a rule of inference either adds signed formulas to a
branch, or splits a branch in two. If the tableau has a satisfiable branch which
is not extended by the rule application in question, it remains a satisfiable
branch in the extended tableau, so the extended tableau is satisfiable. So we
only have to consider the case where a rule is applied to a satisfiable branch.
Let Γ be the set of signed formulas on that branch, and let σ S φ ∈ Γ be
the signed formula to which the rule is applied. If the rule does not result in a
split branch, we have to show that the extended branch, i.e., Γ together with
the conclusions of the rule, is still satisfiable. If the rule results in a split branch,
we have to show that at least one of the two resulting branches is satisfiable.
First, we consider the possible inferences with only one premise.
T□: from σ T □φ, infer σ T φ. T♢: from σ F ♢φ, infer σ F φ.
D□: from σ T □φ, infer σ T ♢φ. D♢: from σ F ♢φ, infer σ F □φ.
4□: from σ T □φ, infer σ.n T □φ. 4♢: from σ F ♢φ, infer σ.n F ♢φ.
Logic T = KT: R is reflexive: rules T□, T♢.
Logic D = KD: R is serial: rules D□, D♢.
Logic K4: R is transitive: rules 4□, 4♢.
Logic B = KTB: R is reflexive and symmetric: rules T□, T♢, B□, B♢.
Logic S4 = KT4: R is reflexive and transitive: rules T□, T♢, 4□, 4♢.
Logic S5 = KT4B: R is reflexive, transitive, and euclidean: rules T□, T♢, 4□, 4♢, 4r□, 4r♢.
1. 1 F □φ → □♢φ Assumption
2. 1 T □φ →F 1
3. 1 F □♢φ →F 1
4. 1.1 F ♢φ □F 3
5. 1 F ♢φ 4r♢ 4
6. 1.1 F φ ♢F 5
7. 1.1 T φ □T 2
⊗
Proposition 54.14. 4r□ and 4r♢ are sound for euclidean models.
Proof. 1. The branch is expanded by applying 4r□ to σ.n T□ψ ∈ Γ: This re-
sults in a new signed formula σ T□ψ on the branch. Suppose M, f ⊩ Γ,
in particular, M, f (σ.n) ⊩ □ψ. Since f is an interpretation of prefixes on
the branch into M, we know that R f (σ ) f (σ.n). Now let w be any world
such that R f (σ)w. Since R is euclidean, R f (σ.n)w. Since M, f (σ.n) ⊩
□ψ, M, w ⊩ ψ. Hence, M, f (σ) ⊩ □ψ, and M, f satisfies σ T □ψ.
Corollary 54.15. The tableau systems given in Table 54.4 are sound for the respective
classes of models.
The simplified rules for S5 (Table 54.5):
□T: from n T □φ, infer m T φ, where m is used. □F: from n F □φ, infer m F φ, where m is new.
♢T: from n T ♢φ, infer m T φ, where m is new. ♢F: from n F ♢φ, infer m F φ, where m is used.
S5 is sound and complete with respect to the class of universal models, i.e.,
models where every world is accessible from every world. In universal mod-
els the accessibility relation doesn’t matter: “there is a world w where M, w ⊩
φ” is true if and only if there is such a w that’s accessible from u. So in S5, we
can define models as simply a set of worlds and a valuation V. This suggests
that we should be able to simplify the tableau rules as well. In the general
case, we take as prefixes sequences of positive integers, so that we can keep
track of which such prefixes name worlds which are accessible from others:
σ.n names a world accessible from σ. But in S5 any world is accessible from
any world, so there is no need to keep track. Instead, we can use positive
integers as prefixes. The simplified rules are given in Table 54.5.
1. 1 F ♢φ → □♢φ Assumption
2. 1 T ♢φ →F 1
3. 1 F □♢φ →F 1
4. 2 F ♢φ □F 3
5. 3T φ ♢T 2
6. 3F φ ♢F 4
⊗
1. the prefixed formulas that are the corresponding conclusions of the rule,
in the case of propositional stacking rules;
2. at least one of the corresponding conclusions, in the case of propositional
branching rules;
3. at least one possible conclusion in the case of modal rules that require a
new prefix;
4. all possible conclusions, in the case of modal rules that require a used
prefix.
Proposition 54.18. Every finite Γ has a tableau in which every branch is complete.
Proof. Consider an open branch in a tableau for Γ. There are finitely many
prefixed formulas in the branch to which a rule could be applied. In some
fixed order (say, top to bottom), for each of these prefixed formulas for which
the conditions (1)–(4) do not already hold, apply the rules that can be applied
to it to extend the branch. In some cases this will result in branching; apply
the rule at the tip of each resulting branch for all remaining prefixed formu-
las. Since the number of prefixed formulas is finite, and the number of used
prefixes on the branch is finite, this procedure eventually results in (possibly
many) branches extending the original branch. Apply the procedure to each,
and repeat. By construction, every branch in the resulting tableau is then complete.
and
V ( p) = {σ : σ T p ∈ ∆}.
We show by induction on φ that if σ T φ ∈ ∆ then M(∆), σ ⊩ φ, and if σ F φ ∈ ∆
then M(∆), σ ⊮ φ.
3. φ ≡ ψ ∧ χ: Exercise.
5. φ ≡ ψ → χ: Exercise.
7. φ ≡ ♢ψ: Exercise.
Since Γ ⊆ ∆, M(∆) ⊩ Γ.
The tableau is of course not finished yet. In the next step, we consider the
only line without a checkmark: the prefixed formula 1 T□( p ∨ q) on line 2.
The construction of the closed tableau says to apply the □T rule for every
prefix used on the branch, i.e., for both 1.1 and 1.2:
Now lines 2, 8, and 9, don’t have checkmarks. But no new prefix has been
added, so we apply ∨T to lines 8 and 9, on all resulting branches (as long as
they don’t close):
There is one remaining open branch, and it is complete. From it we define the
model with worlds W = {1, 1.1, 1.2} (the only prefixes appearing on the open
branch), the accessibility relation R = {⟨1, 1.1⟩, ⟨1, 1.2⟩}, and the assignment
V ( p) = {1.2} (because line 11 contains 1.2 T p) and V (q) = {1.1} (because
line 10 contains 1.1 Tq). The model is pictured in Figure 54.1, and you can
verify that it is a countermodel to □( p ∨ q) → (□p ∨ □q).
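We can double-check this countermodel mechanically. The short Python sketch below (ours) uses the prefixes themselves as worlds, exactly as described, and evaluates □(p ∨ q) → (□p ∨ □q) at world 1.

    W = ['1', '1.1', '1.2']
    R = {('1', '1.1'), ('1', '1.2')}
    V = {'p': {'1.2'}, 'q': {'1.1'}}

    def truth(w, f):
        if f[0] == 'var':
            return w in V[f[1]]
        if f[0] == 'or':
            return truth(w, f[1]) or truth(w, f[2])
        if f[0] == 'imp':
            return (not truth(w, f[1])) or truth(w, f[2])
        if f[0] == 'box':
            return all(truth(v, f[1]) for v in W if (w, v) in R)

    p, q = ('var', 'p'), ('var', 'q')
    phi = ('imp', ('box', ('or', p, q)), ('or', ('box', p), ('box', q)))
    # False: box (p or q) holds at 1, but neither box p nor box q does.
    print(truth('1', phi))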
Problems
Problem 54.1. Find closed tableaux in K for the following formulas:
1. □¬ p → □( p → q)
2. (□p ∨ □q) → □( p ∨ q)
[Figure 54.1: the countermodel with worlds 1, 1.1, and 1.2, R = {⟨1, 1.1⟩, ⟨1, 1.2⟩}, q true only at 1.1, and p true only at 1.2.]
3. ♢p → ♢( p ∨ q)
4. □( p ∧ q) → □p
1. KT5 ⊢ B;
2. KT5 ⊢ 4;
3. KDB4 ⊢ T;
4. KB4 ⊢ 5;
5. KB5 ⊢ 4;
6. KT ⊢ D.
Intuitionistic Logic
Introduction
Theorem 55.1. There are irrational numbers a and b such that a^b is rational.
Proof. Consider √2^√2. If this is rational, we are done: we can let a = b = √2.
Otherwise, it is irrational. Then we have
(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,
which is rational. So, in this case, let a be √2^√2, and let b be √2.
Does this constitute a valid proof? Most mathematicians feel that it does.
But again, there is something a little bit unsatisfying here: we have proved the
existence of a pair of real numbers with a certain property, without being able
to say which pair of numbers it is. It is possible to prove the same result, but in
such a way that the pair a, b is given in the proof: take a = √3 and b = log₃ 4.
Then
a^b = (√3)^(log₃ 4) = 3^((1/2)·log₃ 4) = (3^(log₃ 4))^(1/2) = 4^(1/2) = 2,
since 3^(log₃ x) = x.
Intuitionistic logic is designed to capture a kind of reasoning where moves
like the one in the first proof are disallowed. Proving the existence of an x
satisfying φ( x ) means that you have to give a specific x, and a proof that it
satisfies φ, like in the second proof. Proving that φ or ψ holds requires that
you can prove one or the other.
Formally speaking, intuitionistic logic is what you get if you restrict a deriva-
tion system for classical logic in a certain way. From the mathematical point
of view, these are just formal deductive systems, but, as already noted, they
are intended to capture a kind of mathematical reasoning. One can take this
to be the kind of reasoning that is justified on a certain philosophical view of
mathematics (such as Brouwer’s intuitionism); one can take it to be a kind of
mathematical reasoning which is more “concrete” and satisfying (along the
lines of Bishop’s constructivism); and one can argue about whether or not
the formal description captures the informal motivation. But whatever philo-
sophical positions we may hold, we can study intuitionistic logic as a formally
presented logic; and for whatever reasons, many mathematical logicians find
it interesting to do so.
1. ⊥ is an atomic formula.
1. ¬ φ abbreviates φ → ⊥.
2. φ ↔ ψ abbreviates ( φ → ψ) ∧ (ψ → φ).
p1 (⟨ N1 , N2 ⟩) = N1
p2 (⟨ N1 , N2 ⟩) = N2
Here is what f does: First it applies p1 to its input M. That yields a con-
struction of φ. Then it applies p2 to M, yielding a construction of φ → ⊥.
Such a construction, in turn, is a function p2 ( M) which, if given as input a
construction of φ, yields a construction of ⊥. In other words, if we apply
p2 ( M) to p1 ( M), we get a construction of ⊥. Thus, we can define f ( M ) =
p2 ( M )( p1 ( M)).
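Read as programs, the constructions involved are just pairing, projection, and application. Here is a tiny, purely illustrative Python sketch of the construction f: given a pair M whose first component is a construction of φ and whose second component is a construction of φ → ⊥, f applies the second to the first.

    def pair(n1, n2):
        return (n1, n2)

    def p1(m):          # first projection
        return m[0]

    def p2(m):          # second projection
        return m[1]

    # The construction described in the text: f(M) = p2(M)(p1(M)).
    def f(m):
        return p2(m)(p1(m))

    # Purely to show the mechanics (the arguments are stand-ins, not genuine
    # constructions): f applies the second component of the pair to the first.
    print(f((42, lambda n: ('absurdity obtained from', n))))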
As you can see, using the BHK interpretation to show the intuitionistic
validity of formulas quickly becomes cumbersome and confusing. Luckily,
there are better derivation systems for intuitionistic logic, and more precise
semantic interpretations.
Conjunction
∧Intro: from φ and ψ, infer φ ∧ ψ.
∧Elim: from φ ∧ ψ, infer φ; likewise, from φ ∧ ψ, infer ψ.
Conditional
→Intro: from a derivation of ψ from the assumption [φ]^u, infer φ → ψ, discharging the assumption (label u).
→Elim: from φ → ψ and φ, infer ψ.
Disjunction
∨Intro: from φ, infer φ ∨ ψ; likewise, from ψ, infer φ ∨ ψ.
∨Elim: from φ ∨ ψ, a derivation of χ from the assumption [φ]^n, and a derivation of χ from the assumption [ψ]^n, infer χ, discharging both assumptions (label n).
Absurdity
⊥I: from ⊥, infer φ.
Rules for ¬
Since ¬φ is defined as φ → ⊥, we strictly speaking do not need rules for ¬.
But if we did, this is what they'd look like:
¬Elim: from ¬φ and φ, infer ⊥.
¬Intro: from a derivation of ⊥ from the assumption [φ]^n, infer ¬φ, discharging the assumption (label n).
Examples of Derivations
1. ⊢ φ → (¬ φ → ⊥), i.e., ⊢ φ → (( φ → ⊥) → ⊥)
From the assumptions [φ]² and [φ → ⊥]¹, →Elim yields ⊥. By →Intro, discharging assumption 1, we obtain (φ → ⊥) → ⊥; by another →Intro, discharging assumption 2, we obtain φ → ((φ → ⊥) → ⊥).
2. ⊢ (( φ ∧ ψ) → χ) → ( φ → (ψ → χ))
From [φ]² and [ψ]¹, ∧Intro yields φ ∧ ψ; together with [(φ ∧ ψ) → χ]³, →Elim yields χ. Three applications of →Intro, discharging assumptions 1, 2, and 3 in turn, yield ψ → χ, then φ → (ψ → χ), and finally ((φ ∧ ψ) → χ) → (φ → (ψ → χ)).
3. ⊢ ¬(φ ∧ ¬φ), i.e., ⊢ (φ ∧ (φ → ⊥)) → ⊥
From the assumption [φ ∧ (φ → ⊥)]¹, two applications of ∧Elim yield φ → ⊥ and φ; →Elim yields ⊥. By →Intro, discharging assumption 1, we obtain (φ ∧ (φ → ⊥)) → ⊥.
4. ⊢ ¬¬(φ ∨ ¬φ), i.e., ⊢ ((φ ∨ (φ → ⊥)) → ⊥) → ⊥
From [φ]¹, ∨Intro yields φ ∨ (φ → ⊥); together with [(φ ∨ (φ → ⊥)) → ⊥]², →Elim yields ⊥. By →Intro, discharging assumption 1, we obtain φ → ⊥; by ∨Intro, φ ∨ (φ → ⊥); together with [(φ ∨ (φ → ⊥)) → ⊥]² again, →Elim yields ⊥. Finally, →Intro, discharging assumption 2, yields ((φ ∨ (φ → ⊥)) → ⊥) → ⊥.
Proof. Every natural deduction rule is also a rule in classical natural deduc-
tion, so every derivation in intuitionistic logic is also a derivation in classical
logic.
1. φi ∈ Γ; or
2. φi is an axiom; or
3. φi follows from some φ j and φk with j < i and k < i by modus ponens,
i.e., φk ≡ φ j → φi .
Definition 55.10 (Axioms). The set Ax0 of axioms for intuitionistic propo-
sitional logic consists of all formulas of the following forms:
( φ ∧ ψ) → φ (55.1)
( φ ∧ ψ) → ψ (55.2)
φ → (ψ → ( φ ∧ ψ)) (55.3)
φ → ( φ ∨ ψ) (55.4)
φ → (ψ ∨ φ) (55.5)
( φ → χ) → ((ψ → χ) → (( φ ∨ ψ) → χ)) (55.6)
φ → (ψ → φ) (55.7)
( φ → (ψ → χ)) → (( φ → ψ) → ( φ → χ)) (55.8)
⊥→φ (55.9)
Problems
Problem 55.1. Give derivations in intuitionistic logic of the following formu-
las:
1. (¬ φ ∨ ψ) → ( φ → ψ)
2. ¬¬¬ φ → ¬ φ
4. ¬( φ ∨ ψ) ↔ (¬ φ ∧ ¬ψ)
5. (¬ φ ∨ ¬ψ) → ¬( φ ∧ ψ)
Semantics
56.1 Introduction
No logic is satisfactorily described without a semantics, and intuitionistic logic
is no exception. Whereas for classical logic, the semantics based on valu-
ations is canonical, there are several competing semantics for intuitionistic
logic. None of them is completely satisfactory, in the sense that none gives an
intuitionistically acceptable account of the meanings of the connectives.
The semantics based on relational models, similar to the semantics for
modal logics, is perhaps the most popular one. In this semantics, proposi-
tional variables are assigned to worlds, and these worlds are related by an
accessibility relation. That relation is always a partial order, i.e., it is reflexive,
antisymmetric, and transitive.
Intuitively, you might think of these worlds as states of knowledge or “ev-
identiary situations.” A state w′ is accessible from w iff, for all we know, w′ is
a possible (future) state of knowledge, i.e., one that is compatible with what’s
known at w. Once a proposition is known, it can’t become un-known, i.e.,
whenever φ is known at w and Rww′ , φ is known at w′ as well. So “knowl-
edge” is monotonic with respect to the accessibility relation.
If we define “φ is known” as in epistemic logic as “true in all epistemic
alternatives,” then φ ∧ ψ is known at w if in all epistemic alternatives, both φ
and ψ are known. But since knowledge is monotonic and R is reflexive, that
means that φ ∧ ψ is known at w iff φ and ψ are known at w. For the same
1. W is a non-empty set,
1. φ ≡ p: M, w ⊩ φ iff w ∈ V ( p).
2. φ ≡ ⊥: not M, w ⊩ φ.
4. φ ≡ ψ ∧ χ: M, w ⊩ φ iff M, w ⊩ ψ and M, w ⊩ χ.
Proof. Exercise.
2. If M ⊩ Γ and Γ ⊨ φ, then M ⊩ φ.
Ww = {u ∈ W : Rwu},
Rw = R ∩ (Ww )2 , and
Vw ( p) = V ( p) ∩ Ww .
We may write X for a topology if the collection of open sets can be inferred
from the context; note that, still, only after X is endowed with open sets can it
be called a topology.
1. [⊥]X = ∅
2. [ p]X = V ( p)
Here, Int(V ) is the function that maps a set V ⊆ X to its interior, that is, the
union of all open sets it contains. In other words,
Int(V) = ⋃{U : U ⊆ V and U ∈ O}.
Note that the interior of any set is always open, since it is a union of open
sets. Thus, [ φ]X is always an open set.
Although topological semantics is highly abstract, there are ways to think
about it that might motivate it. Suppose that the elements, or “points,” of X
are points at which statements can be evaluated. The set of all points where φ
is true is the proposition expressed by φ. Not every set of points is a potential
proposition; only the elements of O are. φ ⊨ ψ iff ψ is true at every point at
which φ is true, i.e., [ φ]X ⊆ [ψ]X , for all X. The absurd statement ⊥ is never
true, so [⊥]X = ∅. How must the propositions expressed by ψ ∧ χ, ψ ∨ χ, and
ψ → χ be related to those expressed by ψ and χ for the intuitionistically valid
laws to hold, i.e., so that φ ⊢ ψ iff [φ]X ⊆ [ψ]X? ⊥ ⊢ φ for any φ, and only
∅ ⊆ U for all U. Since ψ ∧ χ ⊢ ψ, [ψ ∧ χ]X ⊆ [ψ]X , and similarly [ψ ∧ χ]X ⊆
[χ]X . The largest set satisfying W ⊆ U and W ⊆ V is U ∩ V. Conversely,
ψ ⊢ ψ ∨ χ and χ ⊢ ψ ∨ χ, and so [ψ]X ⊆ [ψ ∨ χ]X and [χ]X ⊆ [ψ ∨ χ]X . The
smallest set W such that U ⊆ W and V ⊆ W is U ∪ V. The definition for
→ is tricky: φ → ψ expresses the weakest proposition that, combined with φ,
entails ψ. That φ → ψ combined with φ entails ψ is clear from ( φ → ψ) ∧ φ ⊢ ψ.
So [φ → ψ]X should be the greatest open set such that [φ → ψ]X ∩ [φ]X ⊆ [ψ]X,
leading to our definition.
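For a finite topological space these operations can be computed directly. The Python sketch below is ours and purely illustrative; it computes interiors and the propositions assigned to ∧, ∨, and →, using the clause [ψ → χ]X = Int((X ∖ [ψ]X) ∪ [χ]X), which is one standard way of spelling out "the greatest open set U with U ∩ [ψ]X ⊆ [χ]X".

    # A small topological space: X with the open sets O.
    X = frozenset({1, 2, 3})
    O = [frozenset(), frozenset({1}), frozenset({1, 2}), X]

    def interior(S):
        """The union of all open sets contained in S."""
        return frozenset().union(*[U for U in O if U <= S])

    def meet(A, B):       # proposition expressed by a conjunction
        return A & B

    def join(A, B):       # proposition expressed by a disjunction
        return A | B

    def implies(A, B):    # Int((X \ A) | B)
        return interior((X - A) | B)

    def neg(A):           # [not psi] = [psi -> bottom]
        return implies(A, frozenset())

    # Excluded middle can fail: with [p] = {1}, [p or not p] is not all of X.
    p = frozenset({1})
    print(neg(p))           # frozenset(), since Int({2, 3}) is empty
    print(join(p, neg(p)))  # frozenset({1}), a proper subset of X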
Problems
Problem 56.1. Show that according to Definition 56.2, M, w ⊩ ¬ φ iff M, w ⊩
φ → ⊥.
The soundness proof relies on the fact that all axioms are intuitionisti-
cally valid; this still needs to be proved, e.g., in the Semantics chapter.
4. The derivation ends in ∨Intro: Suppose the premise is ψ, and the undis-
charged assumptions of the derivation ending in ψ are Γ. Then we have
Γ ⊢ ψ and by inductive hypothesis, Γ ⊨ ψ. We have to show that
Γ ⊨ ψ ∨ χ. Suppose M, w ⊩ Γ. Since Γ ⊨ ψ, M, w ⊩ ψ. But then
also M, w ⊩ ψ ∨ χ. Similarly, if the premise is χ, we have that Γ ⊨ χ.
1. Γ is consistent, i.e., Γ ⊬ ⊥;
2. if Γ ⊢ φ then φ ∈ Γ; and
3. if φ ∨ ψ ∈ Γ then φ ∈ Γ or ψ ∈ Γ.
1. Γn ⊢ ψi ∨ χi
2. ψi ∉ Γn and χi ∉ Γn
1. Λ ∈ N∗ .
3. Nothing else is in N∗ .
1. ∆(Λ) = ∆
2. ∆(σ.n) = (∆(σ) ∪ {ψn})∗ if ∆(σ) ∪ {ψn} ⊬ χn, and ∆(σ.n) = ∆(σ) otherwise.
Here by (∆(σ ) ∪ {ψn })∗ we mean the prime set of formulas which exists by
Lemma 57.4 applied to the set ∆(σ ) ∪ {ψn } and the formula χn . Note that by
this definition, if ∆(σ ) ∪ {ψn } ⊬ χn , then ∆(σ.n) ⊢ ψn and ∆(σ.n) ⊬ χn . Note
also that ∆(σ ) ⊆ ∆(σ.n) for any n. If ∆ is prime, then ∆(σ ) is prime for all σ.
Definition 57.5. Suppose ∆ is prime. Then the canonical model M(∆) for ∆ is
defined by:
3. V ( p) = {σ : p ∈ ∆(σ )}.
Proof. By induction on φ.
3. φ ≡ ¬ψ: exercise.
57.7 Decidability
Observe that the proof of the completeness theorem gives us for every Γ ⊬ φ
a model with an infinite number of worlds witnessing the fact that Γ ⊭ φ.
The following proposition shows that to prove ⊨ φ it is enough to prove that
M ⊩ φ for all finite models (i.e., models with a finite set of worlds).
M, w ⊩ φ iff M′ , [w] ⊩ φ
for all formulas φ with only propositional variables from P. This is left as an
exercise for the reader.
Problems
Problem 57.1. Complete the proof of Theorem 57.2. For the cases for ¬Intro
and ¬Elim, use the definition of M, w ⊩ ¬ φ in Definition 56.2, i.e., don’t treat
¬ φ as defined by φ → ⊥.
Problem 57.2. Show that the following formulas are not derivable in intu-
itionistic logic:
1. ( φ → ψ) ∨ (ψ → φ)
2. (¬¬ φ → φ) → ( φ ∨ ¬ φ)
3. ( φ → ψ ∨ χ) → ( φ → ψ) ∨ ( φ → χ)
Problem 57.6. Show that if M is a relational model using a linear order then
M ⊩ ( φ → ψ ) ∨ ( ψ → φ ).
Problem 57.7. Finish the proof of Theorem 57.8 by showing that M, w ⊩ φ iff
M′ , [w] ⊩ φ for all formulas φ with only propositional variables from P.
Propositions as Types
58.1 Introduction
Historically the lambda calculus and intuitionistic logic were developed sepa-
rately. Haskell Curry and William Howard independently discovered a close
similarity: types in a typed lambda calculus correspond to formulas in intu-
itionistic logic in such a way that a derivation of a formula corresponds di-
rectly to a typed lambda term with that formula as its type. Moreover, beta re-
duction in the typed lambda calculus corresponds to certain transformations
of derivations.
For instance, a derivation of φ → ψ corresponds to a term λx φ . N ψ , which
has the function type φ → ψ. The inference rules of natural deduction corre-
spond to typing rules in the typed lambda calculus, e.g.,
the →Intro rule, which infers φ → ψ from a derivation of ψ from the assumption [φ]^x (discharging x), corresponds to the typing rule λ: from x : φ ⇒ N : ψ, infer ⇒ λx^φ. N^ψ : φ → ψ,
where the rule on the right means that if x is of type φ and N is of type ψ, then
λx φ . N is of type φ → ψ.
The →Elim rule, which infers ψ from φ → ψ and φ, corresponds to the typing rule for composition terms, i.e., the rule app: from ⇒ P : φ → ψ and ⇒ Q : φ, infer ⇒ P^{φ→ψ} Q^φ : ψ.
A derivation that ends with →Intro (discharging [φ]^x to obtain φ → ψ) immediately followed by →Elim applied to a derivation of φ reduces to the derivation of ψ obtained by plugging the derivation of φ in for the assumption [φ]^x. In terms of proof terms, this is beta reduction:
(λx^φ. P^ψ) Q −→ P[Q/x].
Similar correspondences hold between the rules for ∧ and “product” types,
and between the rules for ∨ and “sum” types.
This correspondence between terms in the simply typed lambda calculus
and natural deduction derivations is called the “Curry-Howard”, or “propo-
sitions as types” correspondence. In addition to formulas (propositions) cor-
responding to types, and proofs to terms, we can summarize the correspon-
dences as follows:
logic: program
proposition: type
proof: term
assumption: variable
discharged assumption: bound variable
undischarged assumption: free variable
implication: function type
conjunction: product type
disjunction: sum type
absurdity: bottom type
The label ∧Elim hints at the relation with the rule of the same name in natural
deduction.
Likewise, suppose we have Γ, φ ⇒ ψ, meaning we have a derivation with
undischarged assumptions Γ, φ and end-formula ψ. If we apply the →Intro
rule, we have a derivation with Γ as undischarged assumptions and φ → ψ as
the end-formula, i.e., Γ ⇒ φ → ψ. Note how this has made the discharge of
assumptions more explicit.
Γ, φ ⇒ ψ
→Intro
Γ ⇒ φ→ψ
We can draw conclusions from other rules in the same fashion, which is
spelled out as follows:
Γ ⇒ φ ∆ ⇒ ψ
∧Intro
Γ, ∆ ⇒ φ ∧ ψ
Γ ⇒ φ∧ψ Γ ⇒ φ∧ψ
∧Elim1 ∧Elim2
Γ ⇒ φ Γ ⇒ ψ
Γ ⇒ φ Γ ⇒ ψ
∨Intro1 ∨Intro2
Γ ⇒ φ∨ψ Γ ⇒ φ∨ψ
Γ ⇒ φ∨ψ ∆, φ ⇒ χ ∆′ , ψ ⇒ χ
∨Elim
Γ, ∆, ∆′ ⇒ χ
Γ, φ ⇒ ψ ∆ ⇒ φ→ψ Γ ⇒ φ
→Intro →Elim
Γ ⇒ φ→ψ Γ, ∆ ⇒ ψ
Γ ⇒ ⊥ ⊥
I
Γ ⇒ φ
φ ⇒ φ
Together, these rules can be taken as a calculus about what natural deduc-
tion derivations exist. They can also be taken as a notational variant of natural
deduction, in which each step records not only the formula derived but also
the undischarged assumptions from which it was derived.
For instance, the derivation of ((φ ∨ (φ → ⊥)) → ⊥) → ⊥ given above becomes, writing ψ for (φ ∨ (φ → ⊥)) → ⊥:
φ ⇒ φ
φ ⇒ φ ∨ (φ → ⊥) by ∨Intro
φ, ψ ⇒ ⊥ by →Elim, with ψ ⇒ ψ
ψ ⇒ φ → ⊥ by →Intro
ψ ⇒ φ ∨ (φ → ⊥) by ∨Intro
ψ ⇒ ⊥ by →Elim, with ψ ⇒ ψ
⇒ ψ → ⊥ by →Intro
Definition 58.1 (Proof terms). Proof terms are inductively generated by the
following rules:
Definition 58.3 (Typing pair). A typing pair is a pair ⟨ Γ, M⟩, where Γ is a typ-
ing context and M is a proof term.
Since in general terms only make sense with specific contexts, we will
speak simply of “terms” from now on instead of “typing pair”; and it will
be apparent when we are talking about the literal term M.
1. Assumptions discharged in the same step (that is, with the same number
on the square bracket) must be assigned the same variable.
φ into x : φ.
With assumptions all associated with variables (which are terms), we can now
inductively translate the rest of the deduction tree. The modified natural de-
duction rules taking into account context and proof terms are given below.
Given the proof terms for the premise(s), we obtain the corresponding proof
term for conclusion.
M1 : φ1 M2 : φ2
∧Intro
⟨ M1 , M2 ⟩ : φ1 ∧ φ2
M : φ1 ∧ φ2 M : φ1 ∧ φ2
∧Elim1 ∧Elim2
pi ( M ) : φ 1 pi ( M ) : φ 2
In ∧Intro we assume we have φ1 witnessed by term M1 and φ2 witnessed
by term M2 . We pack up the two terms into a pair ⟨ M1 , M2 ⟩ which witnesses
φ1 ∧ φ2 .
In ∧Elimi we assume that M witnesses φ1 ∧ φ2 . The term witnessing φi
is pi ( M). Note that M is not necessarily of the form ⟨ M1 , M2 ⟩, so we cannot
simply assign M1 to the conclusion φi .
Note how this coincides with the BHK interpretation. What the BHK in-
terpretation does not specify is how the function used as proof for φ → ψ is
supposed to be obtained. If we think of proof terms as proofs or functions of
proofs, we can be more explicit.
→Elim: from P : φ → ψ and Q : φ, infer PQ : ψ.
→Intro: from N : ψ, derived under the assumption [x : φ], infer λx^φ. N : φ → ψ (discharging x).
The λ notation should be understood as the same as in the lambda calculus,
and PQ means applying P to Q.
∨Intro1: from M1 : φ1, infer in1(M1) : φ1 ∨ φ2. ∨Intro2: from M2 : φ2, infer in2(M2) : φ1 ∨ φ2.
∨Elim: from M : φ1 ∨ φ2, N1 : χ derived under the assumption [x1 : φ1], and N2 : χ derived under the assumption [x2 : φ2], infer case(M, x1.N1, x2.N2) : χ (discharging x1 and x2).
The proof term in1 (M1) is a term witnessing φ1 ∨ φ2, where M1 witnesses φ1.
The term case( M, x1 .N1 , x2 .N2 ) mimics the case clause in programming
languages: we already have the derivation of φ ∨ ψ, a derivation of χ assum-
ing φ, and a derivation of χ assuming ψ. The case operator thus selects the
appropriate proof depending on M; either way it’s a proof of χ.
N:⊥ ⊥I
contr φ ( N ) : φ
For example, the derivation of ((φ ∨ (φ → ⊥)) → ⊥) → ⊥ given above can be annotated with proof terms as follows. From [x : φ]¹, ∨Intro1 gives in1(x) : φ ∨ (φ → ⊥); with [y : (φ ∨ (φ → ⊥)) → ⊥]², →Elim gives y(in1(x)) : ⊥. By →Intro, discharging x, we obtain λx^φ. y(in1(x)) : φ → ⊥; by ∨Intro2, in2(λx^φ. y(in1(x))) : φ ∨ (φ → ⊥); with [y : (φ ∨ (φ → ⊥)) → ⊥]² again, →Elim gives y(in2(λx^φ. y(in1(x)))) : ⊥; and →Intro, discharging y, gives
λy^{(φ∨(φ→⊥))→⊥}. y(in2(λx^φ. y(in1(x)))) : ((φ ∨ (φ → ⊥)) → ⊥) → ⊥.
The tree has no assumptions, so the context is empty; we get:
⊢ λy^{(φ∨(φ→⊥))→⊥}. y(in2(λx^φ. y(in1(x)))) : ((φ ∨ (φ → ⊥)) → ⊥) → ⊥
If we leave out the last →Intro, the assumption denoted by y would be in the
context and we would get:
y : (φ ∨ (φ → ⊥)) → ⊥ ⊢ y(in2(λx^φ. y(in1(x)))) : ⊥
Another example: ⊢ φ → ( φ → ⊥) → ⊥
[ x : φ ]2 [y : φ → ⊥]1
yx : ⊥
1
λy φ→⊥ . yx : ( φ → ⊥) → ⊥
2
λx φ . λy φ→⊥ . yx : φ → ( φ → ⊥) → ⊥
Again all assumptions are discharged and thus the context is empty, the re-
sulting term is
⊢ λx φ . λy φ→⊥ . yx : φ → ( φ → ⊥) → ⊥
If we leave out the last two →Intro inferences, the assumptions denoted by
both x and y would be in context and we would get
x : φ, y : φ → ⊥ ⊢ yx : ⊥
For each natural deduction rule, the term in the conclusion is always formed
by wrapping some operator around the terms assigned to the premise(s). Rules
correspond uniquely to such operators. For example, from the structure of
the term above we infer that the last rule applied must be →Intro, since it is of the form
λy . . . , and the λ operator corresponds to →Intro. In general we can recover
the skeleton of the derivation solely by the structure of the term, e.g.,
[Figure: the skeleton of the derivation, recovered from the structure of the term λy^{(φ∨(φ→⊥))→⊥}. y(in2(λx^φ. y(in1(x)))): in1 corresponds to ∨Intro1, the applications of y to →Elim, λx^φ to →Intro (discharging x), in2 to ∨Intro2, and λy to →Intro (discharging y); the formulas themselves are left blank.]
Our next step is to recover the formulas these terms witness. We define a
function F ( Γ, M) which denotes the formula witnessed by M in context Γ, by
induction on M as follows:
F(Γ, x) = Γ(x)
F(Γ, ⟨N1, N2⟩) = F(Γ, N1) ∧ F(Γ, N2)
F(Γ, pi(N)) = φi if F(Γ, N) = φ1 ∧ φ2
F(Γ, ini^φ(N)) = F(Γ, N) ∨ φ if i = 1, and φ ∨ F(Γ, N) if i = 2
F(Γ, case(M, x1.N1, x2.N2)) = F(Γ ∪ {xi : φi}, Ni) if F(Γ, M) = φ1 ∨ φ2
F(Γ, λx^φ. N) = φ → F(Γ ∪ {x : φ}, N)
F(Γ, NM) = ψ if F(Γ, N) = φ → ψ
∧Intro: from Γ ⊢ M1 : φ1 and ∆ ⊢ M2 : φ2, infer Γ, ∆ ⊢ ⟨M1, M2⟩ : φ1 ∧ φ2.
∧Elimi: from Γ ⊢ M : φ1 ∧ φ2, infer Γ ⊢ pi(M) : φi.
∨Intro1: from Γ ⊢ M1 : φ1, infer Γ ⊢ in1(M1) : φ1 ∨ φ2. ∨Intro2: from Γ ⊢ M2 : φ2, infer Γ ⊢ in2(M2) : φ1 ∨ φ2.
∨Elim: from Γ ⊢ M : φ1 ∨ φ2, ∆1, x1 : φ1 ⊢ N1 : χ, and ∆2, x2 : φ2 ⊢ N2 : χ, infer Γ, ∆1, ∆2 ⊢ case(M, x1.N1, x2.N2) : χ.
→Intro: from Γ, x : φ ⊢ N : ψ, infer Γ ⊢ λx^φ. N : φ → ψ. →Elim: from ∆ ⊢ P : φ → ψ and Γ ⊢ Q : φ, infer Γ, ∆ ⊢ PQ : ψ.
⊥Elim: from Γ ⊢ M : ⊥, infer Γ ⊢ contr_φ(M) : φ.
These are the typing rules of the simply typed lambda calculus extended
with product, sum and bottom.
In addition, F(Γ, M) is actually a type-checking algorithm: it returns
the type of the term with respect to the context, or is undefined if the term is
ill-typed with respect to the context.
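Here is the type-checking reading of F(Γ, M) as a Python sketch (ours, and purely illustrative). Terms and formulas are nested tuples; the function returns the formula witnessed by the term in the given context, or None if the term is ill-typed. In the clause for case we assign xi the i-th disjunct of the type of M, in line with the ∨Elim typing rule above.

    # Formulas: ('var', 'p'), ('and', A, B), ('or', A, B), ('imp', A, B), ('bot',).
    # Terms: ('v', x), ('pair', M, N), ('proj', i, M), ('inj', i, phi, M),
    #        ('case', M, x1, N1, x2, N2), ('lam', x, phi, N), ('app', P, Q),
    #        ('contr', phi, N).

    def F(ctx, m):
        kind = m[0]
        if kind == 'v':                       # variable: look up its type
            return ctx.get(m[1])
        if kind == 'pair':
            a, b = F(ctx, m[1]), F(ctx, m[2])
            return ('and', a, b) if a and b else None
        if kind == 'proj':                    # p_i(N) for N of conjunctive type
            a = F(ctx, m[2])
            return a[m[1]] if a and a[0] == 'and' else None
        if kind == 'inj':                     # in_i^phi(N)
            a = F(ctx, m[3])
            if a is None:
                return None
            return ('or', a, m[2]) if m[1] == 1 else ('or', m[2], a)
        if kind == 'case':                    # x_i gets the i-th disjunct
            a = F(ctx, m[1])
            if not a or a[0] != 'or':
                return None
            c1 = F({**ctx, m[2]: a[1]}, m[3])
            c2 = F({**ctx, m[4]: a[2]}, m[5])
            return c1 if c1 is not None and c1 == c2 else None
        if kind == 'lam':
            b = F({**ctx, m[1]: m[2]}, m[3])
            return ('imp', m[2], b) if b else None
        if kind == 'app':
            p, q = F(ctx, m[1]), F(ctx, m[2])
            return p[2] if p and p[0] == 'imp' and p[1] == q else None
        if kind == 'contr':
            return m[1] if F(ctx, m[2]) == ('bot',) else None

    # lambda x^phi. lambda y^(phi -> bot). y x   :   phi -> ((phi -> bot) -> bot)
    phi = ('var', 'p')
    term = ('lam', 'x', phi,
            ('lam', 'y', ('imp', phi, ('bot',)),
             ('app', ('v', 'y'), ('v', 'x'))))
    print(F({}, term))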
58.6 Reduction
Consider a derivation that obtains ψ from φ and φ → ψ by →Elim, then applies ∧Intro with an assumption [χ] to obtain ψ ∧ χ, recovers ψ by ∧Elim, and finally concludes χ → ψ by →Intro. The ∧Intro/∧Elim detour can be removed: the reduced derivation obtains ψ from φ and φ → ψ by →Elim and concludes χ → ψ directly by →Intro.
In general, if D1 is a derivation of φ1 and D2 is a derivation of φ2, then the derivation which applies ∧Intro to obtain φ1 ∧ φ2 and then ∧Elimi to obtain φi reduces to Di on its own. In terms of proof terms, pi(⟨M1^{φ1}, M2^{φ2}⟩) −→ Mi^{φi}.
In the typed lambda calculus, this is the beta reduction rule for the product
type.
Note the type annotation on M1 and M2 : while in the standard term syntax
only λx φ . N has such notion, we reuse the notation here to remind us of the
formula the term is associated with in the corresponding natural deduction
derivation, to reveal the correspondence between the two kinds of syntax.
In natural deduction, a pair of inferences such as those on the left, i.e., a
pair that is subject to cancelling is called a cut. In the typed lambda calculus
the term on the left of −→ is called a redex, and the term to the right is called
the reductum. Unlike untyped lambda calculus, where only (λx. N ) Q is con-
sidered to be a redex, in the typed lambda calculus the syntax is extended to
terms involving ⟨N, M⟩, pi(N), ini^φ(N), case(N, x1.M1, x2.M2), and contr_φ(N),
with corresponding redexes.
Similarly we have reduction for disjunction:
If D is a derivation of φi, then the derivation which applies ∨Intro to obtain φ1 ∨ φ2 and then ∨Elim with side derivations D1 of χ from [φ1]^u and D2 of χ from [φ2]^u reduces to the derivation of χ obtained by plugging D in for the assumption [φi]^u in Di. In terms of proof terms, case(ini(M), x1.N1, x2.N2) −→ Ni[M/xi].
This is the beta reduction rule for sum types. Here, M[N/x] means replac-
ing all assumptions denoted by the variable x in M with N.
It would be nice if we pass the context Γ to the substitution function so that
it can check if the substitution makes sense. For example, xy[ ab/y] does not
make sense under the context { x : φ → θ, y : φ, a : ψ → χ, b : ψ}, since then we
would be substituting the term ab, which has type χ, for the variable y, which has type φ.
The reduction for the conditional is as follows: if D is a derivation of ψ from the undischarged assumption [φ]^u and D′ is a derivation of φ, then the derivation which obtains φ → ψ by →Intro (discharging u) and then applies →Elim to D′ reduces to the derivation of ψ obtained by plugging D′ in for the assumption [φ]^u in D. In terms of proof terms:
(Γ, (λx^φ. N^ψ) Q^φ) −→ (Γ, N^ψ[Q^φ/x^φ])
Absurdity has only an elimination rule and no introduction rule, thus there
is no such reduction for it.
Note that the above notion of reduction concerns only deductions with a
cut at the end of a derivation. We would of course like to extend it to reduction
of cuts anywhere in a derivation, or reductions of subterms of proof terms
which constitute redexes. Note, however, that the conclusion of the derivation
does not change under reduction, so we are free to continue applying rules
to both sides of −→. The resulting pairs of trees constitute an extended notion
of reduction; it is analogous to compatibility in the untyped lambda calculus.
It’s easy to see that the context Γ does not change during the reduction
(both the original and the extended version), thus it’s unnecessary to men-
tion the context when we are discussing reductions. In what follows we will
assume that every term is accompanied by a context which does not change
during reduction. We then say “proof term” when we mean a proof term ac-
companied by a context which makes it well-typed.
As in lambda calculus, the notion of normal-form term and normal deduc-
tion is given:
Definition 58.5. A proof term with no redex is said to be in normal form; like-
wise, a derivation without cuts is a normal derivation. A proof term is in normal
form if and only if its counterpart derivation is normal.
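Reduction of proof terms can also be sketched directly. The Python code below (ours, purely illustrative) contracts the three kinds of redexes just discussed: (λx. N)Q, pi(⟨M1, M2⟩), and case(ini(M), x1.N1, x2.N2). Substitution is done by structural recursion and assumes, as is harmless for these examples, that bound variables do not clash with the free variables of the substituted term.

    # Terms as in the type-checking sketch above.

    def subst(m, x, q):
        """m[q/x]; assumes bound variable names do not clash with free ones."""
        kind = m[0]
        if kind == 'v':
            return q if m[1] == x else m
        if kind == 'pair':
            return ('pair', subst(m[1], x, q), subst(m[2], x, q))
        if kind == 'proj':
            return ('proj', m[1], subst(m[2], x, q))
        if kind == 'inj':
            return ('inj', m[1], m[2], subst(m[3], x, q))
        if kind == 'case':
            n1 = m[3] if m[2] == x else subst(m[3], x, q)
            n2 = m[5] if m[4] == x else subst(m[5], x, q)
            return ('case', subst(m[1], x, q), m[2], n1, m[4], n2)
        if kind == 'lam':
            return m if m[1] == x else ('lam', m[1], m[2], subst(m[3], x, q))
        if kind == 'app':
            return ('app', subst(m[1], x, q), subst(m[2], x, q))
        if kind == 'contr':
            return ('contr', m[1], subst(m[2], x, q))

    def step(m):
        """Contract m if it is a redex; otherwise return it unchanged."""
        if m[0] == 'app' and m[1][0] == 'lam':                  # (lam x. N) Q
            return subst(m[1][3], m[1][1], m[2])
        if m[0] == 'proj' and m[2][0] == 'pair':                # p_i(<M1, M2>)
            return m[2][m[1]]
        if m[0] == 'case' and m[1][0] == 'inj':                 # case(in_i(M), ...)
            i, inner = m[1][1], m[1][3]
            return subst(m[3], m[2], inner) if i == 1 else subst(m[5], m[4], inner)
        return m

    # (lam x^phi. x) y  reduces to  y:
    print(step(('app', ('lam', 'x', ('var', 'p'), ('v', 'x')), ('v', 'y'))))
    # p_1(<a, b>)  reduces to  a:
    print(step(('proj', 1, ('pair', ('v', 'a'), ('v', 'b')))))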
58.7 Normalization
In this section we prove that, via some reduction order, any deduction can
be reduced to a normal deduction, which is called the normalization property.
len( p) = 0
len( φ ∧ ψ) = len( φ) + len(ψ) + 1
len( φ ∨ ψ) = len( φ) + len(ψ) + 1
len( φ → ψ) = len( φ) + len(ψ) + 1.
The complexity of a proof term is measured by the most complex redex in it,
and 0 if it is normal:
which is pi (⟨ P1 , P2 ⟩). Its cut rank is equal to cr( x ), which is len( φ).
The cases of case( N, x1 .N1 , x2 .N2 ) and PQ are similar.
Lemma 58.7. If M contracts to M′ , and cr( M ) > cr( N ) for all proper redex sub-
terms N of M, then cr( M ) > mr( M′ ).
sponding redex in Ni with equal cut rank, which is less than cr(M) by
assumption; or the cut rank equals len(φi), which by definition is less
than cr(case(ini(N^{φi}), x1^{φ1}.N1^χ, x2^{φ2}.N2^χ)).
Theorem 58.8. All proof terms reduce to normal form; all derivations reduce to nor-
mal derivations.
Proof. The second follows from the first. We prove the first by complete in-
duction on m = mr( M), where M is a proof term.
1. If m = 0, M is already normal.
2. Otherwise, we proceed by induction on n, the number of redexes in M
with cut rank equal to m.
a) If n = 1, select any redex N such that m = cr( N ) > cr( P) for any
proper sub-term P which is also a redex of course. Such a redex
must exist, since any term only has finitely many subterms.
Let N′ denote the reductum of N. Now by the lemma, mr(N′) <
mr(N), so n, the number of redexes with cut rank equal to m,
is decreased. So m is decreased (by 1 or more), and we can apply
the inductive hypothesis for m.
b) For the induction step, assume n > 1. The process is similar, except
that n is only decreased to a positive number and thus m does not
change. We simply apply the induction hypothesis for n.
Counterfactuals
Chapter 59
Introduction
If Γ, φ ⊨ ψ then Γ ⊨ φ → ψ (59.1)
φ → ψ ⊨ ¬φ ∨ ψ (59.2)
¬φ ∨ ψ ⊨ φ → ψ (59.3)
ψ ⊨ φ→ψ (59.4)
¬φ ⊨ φ → ψ (59.5)
¬( φ → ψ) ⊨ φ ∧ ¬ψ (59.6)
φ ∧ ¬ψ ⊨ ¬( φ → ψ) (59.7)
φ, φ → ψ ⊨ ψ (59.8)
φ → ψ, φ → χ ⊨ φ → (ψ ∧ χ) (59.9)
φ → ψ ⊨ ( φ ∧ χ) → ψ (59.10)
φ → ψ, ψ → χ ⊨ φ → χ (59.11)
φ → ψ ⊨ ¬ψ → ¬ φ (59.12)
¬ψ → ¬ φ ⊨ φ → ψ (59.13)
Lewis introduced the strict conditional J and argued that it, not the material
conditional, corresponds to implication. In alethic modal logic, φ J ψ can
be defined as □( φ → ψ). A strict conditional is thus true (at a world) iff the
corresponding material conditional is necessary.
How does the strict conditional fare vis-a-vis the paradoxes of the material
conditional? A strict conditional with a false antecedent and one with a true
consequent, may be true, or it may be false. Moreover, ( φ J ψ) ∨ (ψ J φ) is
not valid. The strict conditional φ J ψ is also not equivalent to ¬ φ ∨ ψ, so it is
not truth functional.
1 Reading “→” as “implies” is still widely practised by mathematicians and computer scien-
tists, although philosophers try to avoid the confusions Lewis highlighted by pronouncing it as
“only if.”
We have:
φ J ψ ⊨ ¬ φ ∨ ψ but: (59.14)
¬φ ∨ ψ ⊭ φ J ψ (59.15)
ψ⊭ φJψ (59.16)
¬φ ⊭ φ J ψ (59.17)
¬( φ J ψ) ⊭ φ ∧ ¬ψ but: (59.18)
φ ∧ ¬ψ ⊨ ¬( φ J ψ) (59.19)
φ, φ J ψ ⊨ ψ (59.20)
φ J ψ, φ J χ ⊨ φ J (ψ ∧ χ) (59.21)
φ J ψ ⊨ ( φ ∧ χ) J ψ (59.22)
φ J ψ, ψ J χ ⊨ φ J χ (59.23)
φ J ψ ⊨ ¬ψ J ¬ φ (59.24)
¬ψ J ¬ φ ⊨ φ J ψ (59.25)
However, the strict conditional still has its own “paradoxes.” Just as a ma-
terial conditional with a false antecedent or a true consequent is true, a strict
conditional with a necessarily false antecedent or a necessarily true consequent
is true. Moreover, any true strict conditional is necessarily true, and any false
strict conditional is necessarily false. In other words, we have
□¬ φ ⊨ φ J ψ (59.26)
□ψ ⊨ φ J ψ (59.27)
φ J ψ ⊨ □( φ J ψ ) (59.28)
¬( φ J ψ) ⊨ □¬( φ J ψ) (59.29)
These are not problems if you think of J as “implies.” Logical entailment rela-
tionships are, after all, mathematical facts and so can’t be contingent. But they
do raise issues if you want to use J as a logical connective that is supposed to
capture “if . . . then . . . ,” especially the last two. For surely there are “if . . . then
. . . ” statements that are contingently true or contingently false—in fact, they
generally are neither necessary nor impossible.
59.4 Counterfactuals
A very common and important form of “if . . . then . . . ” constructions in En-
glish are built using the past subjunctive form of to be: “if it were the case that
. . . then it would be the case that . . . ” Because usually the antecedent of such
a conditional is false, i.e., counter to fact, they are called counterfactual con-
ditionals (and because they use the subjunctive form of to be, also subjunctive
conditionals). They are distinguished from indicative conditionals, which take
the form of “if it is the case that . . . then it is the case that . . . ” Counterfac-
tual and indicative conditionals differ in truth conditions. Consider Adams’s
famous example:
If Oswald didn't kill Kennedy, then someone else did.
If Oswald hadn't killed Kennedy, then someone else would have.
The first is indicative, the second counterfactual. The first is clearly true: we
know President John F. Kennedy was killed by someone, and if that someone
wasn’t (contrary to the Warren Report) Lee Harvey Oswald, then someone
else killed Kennedy. The second one says something different. It claims that
if Oswald hadn’t killed Kennedy, i.e., if the Dallas shooting had been avoided
or had been unsuccessful, history would have subsequently unfolded in such
a way that another assassination would have been successful. In order for it
to be true, it would have to be the case that powerful forces had conspired to
ensure JFK’s death (as many JFK conspiracy theorists believe).
It is a live debate whether the indicative conditional is correctly captured
by the material conditional, in particular, whether the paradoxes of the ma-
terial conditional can be “explained” in a way that is compatible with it giv-
ing the truth conditions for English indicative conditionals. By contrast, it
is uncontroversial that counterfactual conditionals cannot be symbolized cor-
rectly by the material conditionals. That is clear because, even though gener-
ally the antecedents of counterfactuals are false, not all counterfactuals with
false antecedents are true—for instance, if you believe the Warren Report, and
there was no conspiracy to assassinate JFK, then Adams’s counterfactual con-
ditional is an example.
Counterfactual conditionals play an important role in causal reasoning: a
prime example of the use of counterfactuals is to express causal relationships.
E.g., striking a match causes it to light, and you can express this by saying
“if this match were struck, it would light.” Material, and generally indicative
conditionals, cannot be used to express this: “the match is struck → the match
lights” is true if the match is never struck, regardless of what would happen
if it were. Even worse, “the match is struck → the match turns into a bouquet
of flowers” is also true if it is never struck, but the match would certainly not
turn into a bouquet of flowers if it were struck.
Problems
Problem 59.1. Give S5-counterexamples to the entailment relations which do
not hold for the strict conditional, i.e., for:
1. ¬ p ⊭ □( p → q)
2. q ⊭ □( p → q)
3. ¬□( p → q) ⊭ p ∧ ¬q
4. ⊭ □( p → q) ∨ □(q → p)
Problem 59.2. Show that the valid entailment relations hold for the strict con-
ditional by giving S5-proofs of:
1. □( φ → ψ) ⊨ ¬ φ ∨ ψ
2. φ ∧ ¬ψ ⊨ ¬□( φ → ψ)
3. φ, □( φ → ψ) ⊨ ψ
4. □( φ → ψ), □( φ → χ) ⊨ □( φ → (ψ ∧ χ))
5. □( φ → ψ) ⊨ □(( φ ∧ χ) → ψ)
6. □( φ → ψ), □(ψ → χ) ⊨ □( φ → χ)
7. □( φ → ψ) ⊨ □(¬ψ → ¬ φ)
8. □(¬ψ → ¬ φ) ⊨ □( φ → ψ)
1. □¬ψ ⊨ φ J ψ
2. φ J ψ ⊨ □( φ J ψ)
3. ¬( φ J ψ) ⊨ □¬( φ J ψ)
60.1 Introduction
Stalnaker and Lewis proposed accounts of counterfactual conditionals such
as “If the match were struck, it would light.” Their accounts were propos-
als for how to properly understand the truth conditions for such sentences.
The idea behind both proposals is this: to evaluate whether a counterfactual
conditional is true, we have to consider those possible worlds which are min-
imally different from the way the world actually is to make the antecedent
true. If the consequent is true in these possible worlds, then the counterfac-
tual is true. For instance, suppose I hold a match and a matchbook in my
hand. In the actual world I only look at them and ponder what would hap-
pen if I were to strike the match. The minimal change from the actual world
where I strike the match is that where I decide to act and strike the match. It
is minimal in that nothing else changes: I don’t also jump in the air, striking
the match doesn’t also light my hair on fire, I don’t suddenly lose all strength
in my fingers, I am not simultaneously doused with water in a SuperSoaker
ambush, etc. In that alternative possibility, the match lights. Hence, it’s true
that if I were to strike the match, it would light.
This intuitive account can be paired with formal semantics for logics of
counterfactuals. Lewis introduced the symbol "□→" for the counterfactual
while Stalnaker used the symbol ">". We'll use □→, and add it as a binary
connective to propositional logic. So, we have, in addition to formulas of the
form φ → ψ, also formulas of the form φ □→ ψ. The formal semantics, like the
relational semantics for modal logic, is based on models in which formulas are
evaluated at worlds, and the satisfaction condition defining M, w ⊩ φ □→ ψ is
given in terms of M, w′ ⊩ φ and M, w′ ⊩ ψ for some (other) worlds w′ . Which
w′ ? Intuitively, the one(s) closest to w for which it holds that M, w′ ⊩ φ. This
requires that a relation of “closeness” has to be included in the model as well.
Lewis introduced an instructive way of representing counterfactual situa-
tions graphically. Each possible world is at the center of a set of nested spheres
[Figure: nested spheres around a world w, with the φ-worlds indicated; the closest φ-worlds are shaded.]
The closest φ-worlds are those worlds w′ where φ is satisfied which lie in the
smallest sphere around the center world w (the gray area). Intuitively, φ □→ ψ
is satisfied at w if ψ is true at all closest φ-worlds.
1. Ow is centered on w: {w} ∈ Ow .
The intuition behind Ow is that the worlds “around” w are stratified ac-
cording to how far away they are from w. The innermost sphere is just w by
itself, i.e., the set {w}: w is closer to w than the worlds in any other sphere. If
S ⊊ S′ , then the worlds in S′ \ S are further away from w than the worlds in S:
S′ \ S is the “layer” between the S and the worlds outside of S′ . In particular,
we have to think of the spheres as containing all the worlds within their outer
surface; they are not just the individual layers.
[Figure: a sphere model centered on the world w, with worlds w1, . . . , w7 distributed over the spheres and the p-worlds indicated.]
1. For all u ∈ ⋃Ow, M, u ⊮ φ, or
2. For some S ∈ Ow, some u ∈ S is such that M, u ⊩ φ, and M, v ⊩ φ → ψ for all v ∈ S.
[Figure: four sphere diagrams around w illustrating the possible cases.]
ψ-worlds, so M, v ⊮ φ □→ ψ.
[Figure: a sphere model with worlds u, v, w, w1, w2 and the q-worlds indicated.]
Example 60.3. The sphere semantics invalidates the inference, i.e., we have
p □→ r ⊭ (p ∧ q) □→ r. Consider the model M = ⟨W, O, V⟩ where W =
{w, w1, w2}, Ow = {{w}, {w, w1}, {w, w1, w2}}, V(p) = {w1, w2}, V(q) =
{w2}, and V(r) = {w1}. There is a p-admitting sphere S = {w, w1} and p → r
is true at all worlds in it, so M, w ⊩ p □→ r. There is also a (p ∧ q)-admitting
sphere S′ = {w, w1, w2} but M, w2 ⊮ (p ∧ q) → r, so M, w ⊮ (p ∧ q) □→ r (see
Figure 60.7).
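The model of Example 60.3 can be checked mechanically. The Python sketch below (ours, purely illustrative) implements the truth conditions for φ □→ ψ stated above: either no world in any sphere around w satisfies the antecedent, or some sphere around w contains an antecedent-world and makes the corresponding material conditional true throughout. Conjunctive antecedents are handled by precomputing their extension.

    def counterfactual(spheres, V, w, ant, cons):
        """Truth of  ant []-> cons  at w in a sphere model: either no world in
        any sphere around w satisfies ant, or some sphere around w contains an
        ant-world and makes the material conditional ant -> cons true throughout."""
        def true_at(u, prop):
            return u in V[prop]
        union = set().union(*spheres[w])
        if not any(true_at(u, ant) for u in union):
            return True
        return any(any(true_at(u, ant) for u in S) and
                   all((not true_at(u, ant)) or true_at(u, cons) for u in S)
                   for S in spheres[w])

    # Example 60.3: W = {w, w1, w2}, O_w = {{w}, {w, w1}, {w, w1, w2}}.
    spheres = {'w': [frozenset({'w'}), frozenset({'w', 'w1'}),
                     frozenset({'w', 'w1', 'w2'})]}
    V = {'p': {'w1', 'w2'}, 'q': {'w2'}, 'r': {'w1'}}

    print(counterfactual(spheres, V, 'w', 'p', 'r'))        # True
    V['p_and_q'] = V['p'] & V['q']                          # extension of p and q
    print(counterfactual(spheres, V, 'w', 'p_and_q', 'r'))  # False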
60.5 Transitivity
For the material conditional, the chain rule holds: φ → ψ, ψ → χ ⊨ φ → χ.
In other words, the material conditional is transitive. Is the same true for
counterfactuals? Consider the following example due to Stalnaker:
If J. Edgar Hoover had been born in Russia, he would have been a Communist.
If Hoover had been a Communist, he would have been a traitor.
Therefore, if Hoover had been born in Russia, he would have been a traitor.
If Hoover had been born (at the same time he actually did), not in the United
States, but in Russia, he would have grown up in the Soviet Union and become
a Communist (let’s assume). So the first premise is true. Likewise, the second
premise, considered in isolation is true. The conclusion, however, is false:
in all likelihood, Hoover would have been a fervent Communist if he had
been born in the USSR, and not been a traitor (to his country). The intuitive
assignment of truth values is borne out by the Stalnaker-Lewis account. The
closest possible world to ours with the only change being Hoover’s place of
birth is the one where Hoover grows up to be a good citizen of the USSR.
This is the closest possible world where the antecedent of the first premise
and of the conclusion is true, and in that world Hoover is a loyal member of
the Communist party, and so not a traitor. To evaluate the second premise, we
have to look at a different world, however: the closest world where Hoover is
a Communist, which is one where he was born in the United States, turned,
and thus became a traitor.1
Example 60.4. The sphere semantics invalidates the inference, i.e., we have
p □→ q, q □→ r ⊭ p □→ r. Consider the model M = ⟨W, O, V⟩ where W =
{w, w1, w2}, Ow = {{w}, {w, w1}, {w, w1, w2}}, V(p) = {w2}, V(q) = {w1, w2},
and V(r) = {w1}. There is a p-admitting sphere S = {w, w1, w2} and p → q is
true at all worlds in it, so M, w ⊩ p □→ q. There is also a q-admitting sphere
S′ = {w, w1} and q → r is true at all worlds in it, so M, w ⊩ q □→ r. How-
ever, the p-admitting sphere {w, w1, w2} contains a world, namely w2, where
M, w2 ⊮ p → r.
60.6 Contraposition
Material and strict conditionals are equivalent to their contrapositives. Coun-
terfactuals are not. Here is an example due to Kratzer:
If Goethe hadn't died in 1832, he would still be dead now.
If Goethe weren't dead now, he would have died in 1832.
1 . . . historical and political assumptions, e.g., that it is possible that Hoover could have been born to Russian
parents, or that Communists in the US of the 1950s were traitors to their country.
[Figure: a sphere model with worlds w, w1, w2 and the truth values of p and q indicated.]
The first sentence is true: humans don’t live hundreds of years. The second
is clearly false: if Goethe weren’t dead now, he would be still alive, and so
couldn’t have died in 1832.
Problems
Problem 60.1. Find a convincing, intuitive example for the failure of transi-
tivity of counterfactuals.
Hoover’s being born in Russia is a more remote possibility than him being a
Communist?
Set Theory
Chapter 61
61.1 Extensionality
The very first thing to say is that sets are individuated by their members. More
precisely:
Axiom (Extensionality). For any sets A and B: ∀ x ( x ∈ A ↔ x ∈ B) → A = B
Proof. If R = {x : x ∉ x} exists, then R ∈ R iff R ∉ R, which is a contradic-
tion.
Russell discovered this result in June 1901. (He did not, though, put the
paradox in quite the form we just presented it, since he was considering Frege’s
set theory, as outlined in Grundgesetze. We will return to this in section 61.6.)
Russell wrote to Frege on June 16, 1902, explaining the inconsistency in Frege’s
system. For the correspondence, and a bit of background, see Heijenoort
(1967, pp. 124–8).
It is worth emphasising that this two-line proof is a result of pure logic.
Granted, we implicitly used a (non-logical?) axiom, Extensionality, in our no-
tation {x : x ∉ x}; for {x : φ(x)} is to be the unique (by Extensionality) set of
the φs, if one exists. But we can avoid even the hint of Extensionality, just by
stating the result as follows: there is no set whose members are exactly the non-self-
membered sets. And this has nothing much to do with sets. As Russell himself
observed, exactly similar reasoning will lead you to conclude: no man shaves
exactly the men who do not shave themselves. Or: no pug sniffs exactly the pugs
which don’t sniff themselves. And so on. Schematically, the shape of the result is
just:
¬∃ x ∀z( Rzx ↔ ¬ Rzz).
And that’s just a theorem (scheme) of first-order logic. Consequently, we can’t
avoid Russell’s Paradox just by tinkering with our set theory; it arises before
we even get to set theory. If we’re going to use (classical) first-order logic, we
simply have to accept that there is no set R = {x : x ∉ x}.
The upshot is this. If you want to accept Naı̈ve Comprehension whilst
avoiding inconsistency, you cannot just tinker with the set theory. Instead, you
would have to overhaul your logic.
Of course, set theories with non-classical logics have been presented. But
they are—to say the least—non-standard. The standard approach to Russell’s
Paradox is to treat it as a straightforward non-existence proof, and then to try
to learn how to live with it. That is the approach we will follow.
In the wake of the paradoxes, Whitehead, Russell, Poincaré and Weyl re-
jected such impredicative definitions as “viciously circular”:
An analysis of the paradoxes to be avoided shows that they all
result from a kind of vicious circle. The vicious circles in ques-
tion arise from supposing that a collection of objects may contain
members which can only be defined by means of the collection as
a whole[. . . . ¶]
The principle which enables us to avoid illegitimate totalities may
be stated as follows: ‘Whatever involves all of a collection must not
be one of the collection’; or, conversely: ‘If, provided a certain col-
lection had a total, it would have members only definable in terms
of that total, then the said collection has no total.’ We shall call
this the ‘vicious-circle principle,’ because it enables us to avoid the
vicious circles involved in the assumption of illegitimate totalities.
(Whitehead and Russell, 1910, p. 37)
If we follow them in rejecting such impredicative definitions, then we might attempt to replace the disastrous Naïve Comprehension Scheme (of section 61.2) with something like this:
Ramsey’s point is that “the tallest man in the group” is an impredicative defi-
nition; but it is obviously perfectly kosher.
One might respond that, in this case, we could pick out the tallest person
by predicative means. For example, maybe we could just point at the man in
question. The objection against impredicative definitions, then, would clearly
need to be limited to entities which can only be picked out impredicatively.
But even then, we would need to hear more, about why such “essential im-
predicativity” would be so bad.1
Admittedly, impredicative definitions are extremely bad news, if we want
our definitions to provide us with something like a recipe for creating an ob-
ject. For, given an impredicative definition, one would genuinely be caught in
a vicious circle: to create the impredicatively specified object, one would first
need to create all the objects (including the impredicatively specified object),
since the impredicatively specified object is specified in terms of all the ob-
jects; so one would need to create the impredicatively specified object before
one had created it itself. But again, this is only a serious objection against “es-
sentially impredicatively” specified sets, if we think of sets as things that we
create. And we (probably) don’t.
As such—for better or worse—the approach which became common does
not involve taking a hard line concerning (im)predicativity. Rather, it involves
what is now regarded as the cumulative-iterative approach. In the end, this
will allow us to stratify our sets into “stages”—a bit like the predicative ap-
proach stratifies entities into sets0 , sets1 , sets2 , . . . —but we will not postulate
any difference in kind between them.
Sets are formed in stages. For each stage S, there are certain stages
which are before S. At stage S, each collection consisting of sets
formed at stages before S is formed into a set. There are no sets
other than the sets which are formed at stages. (Shoenfield, 1977,
p. 323)
At stage 0, we have no sets available to us from earlier stages.2 So we form only one set: the set with no elements, ∅. At stage 1, exactly one set is available to us from earlier stages, so the only new set we form is {∅}. At stage 2, two sets are available to us from earlier stages, and we form
two new sets {{∅}} and {∅, {∅}}. At stage 3, four sets are available to us
from earlier stages, so we form twelve new sets. . . . As such, the cumulative-
iterative picture of the sets will look a bit like this (with numbers indicating
stages):
[Figure: the cumulative-iterative hierarchy, pictured as a "V" that widens as one moves up through stages 0–6.]
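Since only finitely many sets are involved at the early stages, the counting in the last paragraph can be checked mechanically. Here is a minimal Python sketch (an informal illustration only, not part of the formal development); the helper all_subsets and the frozenset representation are just conveniences chosen here:

from itertools import chain, combinations

def all_subsets(s):
    """Every subset of the finite collection s, each returned as a frozenset."""
    s = list(s)
    return {frozenset(c)
            for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

available = set()  # before stage 0, no sets are available
for stage in range(4):
    formed = all_subsets(available)   # every collection of previously available sets
    print(f"stage {stage}: {len(formed - available)} new set(s) formed")
    available = formed                # everything formed so far is available later on

# Output: 1, 1, 2 and 12 new sets at stages 0, 1, 2 and 3, matching the description above.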
Now, for any stage S, consider the set

RS = {x : x ∉ x and x was available before S}.
The reasoning involved in proving Russell’s Paradox will now establish that
RS itself is not available before stage S. And that’s not a contradiction. More-
over, if we embrace the cumulative-iterative conception of set, then we shouldn’t
even have expected to be able to form the Russell set itself. For that would be
the set of all non-self-membered sets that “will ever be available”. In short:
the fact that we (provably) can’t form the Russell set isn’t surprising, given the
cumulative-iterative story; it’s what we would predict.
2 Why should we assume that there is a first stage? See the footnote to Stages-are-ordered in
section 62.1.
a ∈ b =df ∃ G (b = ϵx G ( x ) ∧ Ga)
roughly: a ∈ b iff a falls under a concept whose extension is b. (Note that the
quantifier “∃ G” is second-order.) Frege also maintained the following princi-
ple, known as Basic Law V:
ϵx F ( x ) = ϵx G ( x ) ↔ ∀ x ( Fx ↔ Gx )
roughly: concepts have identical extensions iff they are coextensive. (Again,
both “F” and “G” are in predicate position.) Now a simple principle connects
membership with property-satisfaction:
Steps towards Z
Stages-accumulate. For any stage S, and for any sets which were formed
before stage S: a set is formed at stage S whose members are exactly those
sets. Nothing else is formed at stage S.
These are informal principles, but we will be able to use them to vindicate
several of the axioms of Zermelo’s set theory.
(We should offer a word of caution. Although we will be presenting some
completely standard axioms, with completely standard names, the italicized
principles we have just presented have no particular names in the literature.
They are simply monikers which we hope are helpful.)
[Footnote, continued:] . . . explained in chapter 63. This is a substantial assumption. In fact, using a very clever technique due to Scott (1974), this assumption can be avoided and then derived. (This will also explain why we should think that there is an initial stage.) We cannot go into that here; for more, see Button (forthcoming).

62.2 Separation

We start with a principle to replace Naïve Comprehension:

Axiom (Separation Scheme). For any formula φ(x) and any set A, the set {x ∈ A : φ(x)} exists, i.e., the set of all members of A which satisfy φ.
Note that this is not a single axiom. It is a scheme of axioms. There are
infinitely many Separation axioms; one for every formula φ( x ). The scheme
can equally well be (and normally is) written down as follows:
For any formula φ( x ) which does not contain “S”, this is an axiom:
∀ A∃S∀ x ( x ∈ S ↔ ( φ( x ) ∧ x ∈ A)).
In keeping with the convention noted at the start of part XIV, the formu-
las φ in the Separation axioms may have parameters.2
Separation is immediately justified by our cumulative-iterative conception
of sets we have been telling. To see why, let A be a set. So A is formed by some
stage S (by Stages-are-key). Since A was formed at stage S, all of A’s members
were formed before stage S (by Stages-accumulate). Now in particular, consider
all the sets which are members of A and which also satisfy φ; clearly all of
these sets, too, were formed before stage S. So they are formed into a set
{ x ∈ A : φ( x )} at stage S too (by Stages-accumulate).
Unlike Naïve Comprehension, this avoids Russell's Paradox. For we cannot simply assert the existence of the set {x : x ∉ x}. Rather, given some set A, we can assert the existence of the set R_A = {x ∈ A : x ∉ x}. But all this proves is that R_A ∉ R_A and R_A ∉ A, neither of which is very worrying.
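To see this behaviour in miniature, here is a small Python sketch (an informal illustration only): Separation is modelled as a guarded comprehension over an already-existing finite set A, and the relativized Russell set R_A is computed and checked not to be a member of A. The helper name separation is just a convenience chosen here.

def separation(A, phi):
    """{x in A : phi(x)} -- comprehension is always relative to an existing set A."""
    return frozenset(x for x in A if phi(x))

empty = frozenset()
A = frozenset({empty, frozenset({empty})})    # A = {emptyset, {emptyset}}

R_A = separation(A, lambda x: x not in x)     # R_A = {x in A : x not in x}

print(R_A == A)        # True here: no hereditarily finite set is a member of itself
print(R_A not in A)    # True: the Russell-style reasoning only shows that R_A is not in A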
However, Separation has an immediate and striking consequence: there is no universal set. For if there were a set V of all sets, then {x ∈ V : x ∉ x} would exist by Separation, and that is just the Russell set. Separation also guarantees that relative complements exist:

Proposition. For any sets A and B, the set A \ B exists.

Proof. A \ B = {x ∈ A : x ∉ B} exists by Separation.
62.3 Union
Proposition 62.4 gave us intersections. But if we want arbitrary unions to exist,
we need to lay down another axiom:
Axiom (Union). For any set A, the set ⋃A = {x : (∃b ∈ A) x ∈ b} exists.

∀A∃U∀x(x ∈ U ↔ (∃b ∈ A) x ∈ b)
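In the finite setting, the Union axiom corresponds to a one-line operation. A minimal Python sketch (illustrative only; the helper name big_union is ours):

def big_union(A):
    """The union of A: every member of a member of A."""
    return frozenset(x for b in A for x in b)

e = frozenset()
A = frozenset({frozenset({e}), frozenset({e, frozenset({e})})})   # A = {{0}, {0, {0}}}, with 0 = emptyset
print(big_union(A) == frozenset({e, frozenset({e})}))             # union of A is {0, {0}} -> True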
62.4 Pairs
The next axiom to consider is the following:

Axiom (Pairs). For any sets a and b, the set {a, b} exists.

∀a∀b∃P∀x(x ∈ P ↔ (x = a ∨ x = b))
Here is how to justify this axiom, using the iterative conception. Suppose
a is available at stage S, and b is available at stage T. Let M be whichever of
stages S and T comes later. Then since a and b are both available at stage M,
the set { a, b} is a possible collection available at any stage after M (whichever
is the greater).
But hold on! Why assume that there are any stages after M? If there are
none, then our justification will fail. So, to justify Pairs, we will have to add
another principle to the story we told in section 62.1, namely:

Stages-keep-going. For any stage S, there is a stage after S.

With Pairs in hand (together with the axioms already laid down), we can show that, for any sets a and b, the following sets (among others) exist:

2. a ∪ b
3. ⟨a, b⟩
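These constructions can be traced concretely for hereditarily finite sets. A small Python sketch (illustrative only), with the ordered pair taken in the usual Kuratowski style ⟨a, b⟩ = {{a}, {a, b}}; the helper names are ours:

def pair(a, b):
    """Pairs: the set {a, b}."""
    return frozenset({a, b})

def union2(a, b):
    """a ∪ b, obtained as the union of the pair {a, b}."""
    return frozenset(x for s in pair(a, b) for x in s)

def kuratowski(a, b):
    """The ordered pair ⟨a, b⟩ = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

e = frozenset()
one = frozenset({e})
print(pair(e, one))                                           # the pair {0, 1}, printed in frozenset notation
print(union2(one, frozenset({one})) == frozenset({e, one}))   # 1 ∪ {1} = {0, 1} -> True
print(kuratowski(e, one) != kuratowski(one, e))               # order matters for ⟨a, b⟩ -> True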
62.5 Powersets
We will proceed with another axiom:
Axiom (Powersets). For any set A, the set ℘( A) = { x : x ⊆ A} exists.
∀ A∃ P∀ x ( x ∈ P ↔ (∀z ∈ x )z ∈ A)
Proposition. For any sets A and B, the product A × B = {⟨x, y⟩ : x ∈ A and y ∈ B} exists.

Proof. The set ℘(℘(A ∪ B)) exists by Powersets and Proposition 62.5. Since every ⟨x, y⟩ with x ∈ A and y ∈ B is a member of ℘(℘(A ∪ B)), by Separation this set exists:

A × B = {z ∈ ℘(℘(A ∪ B)) : (∃x ∈ A)(∃y ∈ B) z = ⟨x, y⟩}.
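The strategy of the proof can be mirrored concretely: build the double power set and then use Separation to keep just the Kuratowski pairs. A minimal Python sketch (illustrative only; the helper names powerset and kuratowski are ours):

from itertools import chain, combinations

def powerset(s):
    s = list(s)
    return frozenset(frozenset(c)
                     for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

def kuratowski(a, b):
    return frozenset({frozenset({a}), frozenset({a, b})})

e = frozenset()
one = frozenset({e})
A, B = frozenset({e}), frozenset({e, one})

candidates = powerset(powerset(A | B))   # every pair ⟨x, y⟩ lives inside ℘(℘(A ∪ B))
product = frozenset(z for z in candidates
                    if any(z == kuratowski(x, y) for x in A for y in B))   # the Separation step

print(len(product) == len(A) * len(B))   # 1 * 2 = 2 pairs -> True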
62.6 Infinity
We already have enough axioms to ensure that there are infinitely many sets
(if there are any). For suppose some set exists, and so ∅ exists (by Proposi-
tion 62.2). Now for any set x, the set x ∪ { x } exists by Proposition 62.5. So,
applying this a few times, we will get sets as follows:
0. ∅
1. {∅}
2. {∅, {∅}}
Axiom (Infinity). There is a set I such that ∅ ∈ I and, whenever x ∈ I, so is x ∪ {x}.

Definition 62.7. Let I be any set given to us by the Axiom of Infinity. Let s be the function s(x) = x ∪ {x}. Let ω = clo_s(∅). We call the members of ω the natural numbers, and say that n is the result of n-many applications of s to ∅.
You can now look back and check that the set labelled “n”, a few para-
graphs earlier, will be treated as the number n.
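The successor operation s(x) = x ∪ {x} can be iterated concretely. A minimal Python sketch (illustrative only), checking that the set playing the role of n has exactly n members, namely the earlier numbers:

def s(x):
    """The successor operation: s(x) = x ∪ {x}."""
    return frozenset(x | {x})

numbers = [frozenset()]          # 0 = ∅
for _ in range(5):
    numbers.append(s(numbers[-1]))

for n, num in enumerate(numbers):
    assert len(num) == n                      # n has exactly n members ...
    assert num == frozenset(numbers[:n])      # ... namely 0, 1, ..., n-1
print("0 through 5 behave exactly as described")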
We will discuss the significance of this stipulation in section 62.8. For now,
it enables us to prove an intuitive result:
The question remains, though, of how we might justify the Axiom of Infin-
ity. The short answer is that we will need to add another principle to the story
we have been telling. That principle is as follows:

Stages-hit-infinity. There is a stage which is not the first stage, and which does not come immediately after any stage (i.e., there is an infinite stage).
62.7 Z− : a Milestone
We will revisit Stages-hit-infinity in the next section. However, with the Axiom
of Infinity, we have reached an important milestone. We now have all the
axioms required for the theory Z− . In detail:
Definition 62.9. The theory Z− has these axioms: Extensionality, Union, Pairs,
Powersets, Infinity, and all instances of the Separation scheme.
The name stands for Zermelo set theory (minus something which we will
come to later). Zermelo deserves the honour, since he essentially formulated
this theory in his 1908a.3
This theory is powerful enough to allow us to do an enormous amount
of mathematics. In particular, you should look back through part I, and con-
vince yourself that everything we did, naı̈vely, could be done more formally
within Z− . (Once you have done that for a bit, you might want to skip ahead
and read section 62.9.) So, henceforth, and without any further comment, we
will take ourselves to be working in Z− (at least).
3 For interesting comments on the history and technicalities, see Potter (2004, Appendix A).
But this should ring alarm bells: since Naı̈ve Comprehension fails, there is
no guarantee that { X : φ( X )} exists. It looks dangerously, then, like such
definitions are cheating.
Fortunately, they are not cheating; or rather, if they are cheating as they
stand, then we can engage in some honest toil to render them kosher. That
honest toil was foreshadowed in Proposition 62.4, when we explained why ⋃A... rather, why ⋂A exists for any A ≠ ∅. But we will spell it out explicitly. When we define a set C as "the intersection of all sets X such that φ(X)", what we require of C is:

∀x(x ∈ C ↔ ∀X(φ(X) → x ∈ X))    (*)
Now, suppose there is some set, S, such that φ(S). Then to deliver eq. (*), we
can simply define C using Separation, as follows:
C = { x ∈ S : ∀ X ( φ( X ) → x ∈ X )}.
We leave it as an exercise to check that this definition yields eq. (*), as desired.
And this general strategy will allow us to circumvent any apparent use of
Naı̈ve Comprehension in defining intersections. In the particular case which
got us started on this line of thought, namely that of clo_f(o), here is how that would work. We began the proof of Lemma 6.3 by noting that o ∈ ran(f) ∪ {o} and that ran(f) ∪ {o} is f-closed. So, we can define what we want thus:

clo_f(o) = {x ∈ ran(f) ∪ {o} : ∀X((X is f-closed ∧ o ∈ X) → x ∈ X)}.
Problems
Problem 62.1. Show that, for any sets a, b, c, the set { a, b, c} exists.
Problem 62.2. Show that, for any sets a1 , . . . , an , the set { a1 , . . . , an } exists.
Problem 62.3. Show that, for any sets A, B: (i) the set of all relations with
domain A and range B exists; and (ii) the set of all functions from A to B
exists.
Ordinals
63.1 Introduction
In chapter 62, we postulated that there is an infinite-th stage of the hierarchy,
in the form of Stages-hit-infinity (see also our axiom of Infinity). However,
given Stages-keep-going, we can’t stop at the infinite-th stage; we have to keep
going. So: at the next stage after the first infinite stage, we form all possible
collections of sets that were available at the first infinite stage; and repeat; and
repeat; and repeat; . . .
Implicitly what has happened here is that we have started to invoke an
“intuitive” notion of number, according to which there can be numbers after
all the natural numbers. In particular, the notion involved is that of a transfinite
ordinal. The aim of this chapter is to make this idea more rigorous. We will
explore the general notion of an ordinal, and then explicitly define certain sets
to be our ordinals.
The natural numbers in their usual order,

0, 1, 2, 3, 4, . . . ,

are what we call, in the jargon, an ω-sequence. And indeed, this general ordering is mirrored in our initial construction of the stages of the set hierarchy. But, now suppose we move 0 to the end of this sequence, so that it comes after all the other numbers:

1, 2, 3, 4, . . . , 0
We have the same entities here, but ordered in a fundamentally different way:
our first ordering had no last element; our new ordering does. Indeed, our
63.3 Well-Orderings
The fundamental notion is as follows:
Definition 63.1. The relation < well-orders A iff it meets these two conditions:

1. for any distinct a, b ∈ A, either a < b or b < a; and
2. every non-empty subset X ⊆ A has a <-minimal element, i.e., some a ∈ X such that there is no b ∈ X with b < a.

It is easy to see that the three examples we just considered were indeed well-ordering relations.
Here are some elementary but extremely important observations concern-
ing well-ordering.
Proof. We will prove the contrapositive. Suppose ¬(∀ a ∈ A) φ( a), i.e., that
X = { x ∈ A : ¬ φ( x )} ̸= ∅. Then X has an <-minimal element, a. So
(∀b < a) φ(b) but ¬ φ( a).
This last property should remind you of the principle of strong induction on
the naturals, i.e.: if (∀n ∈ ω )((∀m < n) φ(m) → φ(n)), then (∀n ∈ ω ) φ(n).
And this property makes well-ordering into a very robust notion.1
63.4 Order-Isomorphisms
To explain how robust well-ordering is, we will start by introducing a method
for comparing well-orderings.
Corollary 63.6. X ≅ Y is an equivalence relation.
Proposition 63.7. If ⟨ A, <⟩ and ⟨ B, ⋖⟩ are isomorphic well-orderings, then the iso-
morphism between them is unique.
This gives some sense that well-orderings are robust. But to continue explain-
ing this, it will help to introduce some more notation.
Using this notation, we can state and prove that no well-ordering is isomor-
phic to any of its proper initial segments.
Our next result shows, roughly put, that an “initial segment” of an isomor-
phism is an isomorphism:
f [ A a ] = f [{ x ∈ A : x < a}]
= f [{ f −1 (y) ∈ A : f −1 (y) < a}]
= {y ∈ B : y ⋖ f ( a)}
= B f ( a)
Our next two results establish that well-orderings are always comparable:
Proof. We will prove left to right; the other direction is similar. Suppose both ⟨A_a1, <_a1⟩ ≅ ⟨B_b1, ⋖_b1⟩ and ⟨A_a2, <_a2⟩ ≅ ⟨B_b2, ⋖_b2⟩, with f : A_a2 → B_b2 our isomorphism. Let a1 < a2; then ⟨A_a1, <_a1⟩ ≅ ⟨B_f(a1), ⋖_f(a1)⟩ by Lemma 63.10. So ⟨B_b1, ⋖_b1⟩ ≅ ⟨B_f(a1), ⋖_f(a1)⟩, and so b1 = f(a1) by Lemma 63.9. Now b1 ⋖ b2, as f's range is B_b2.
Theorem 63.12. Given any two well-orderings, one is isomorphic to an initial seg-
ment (not necessarily proper) of the other.
Proof. Consider the set

f = {⟨a, b⟩ ∈ A × B : ⟨A_a, <_a⟩ ≅ ⟨B_b, ⋖_b⟩}.
By Lemma 63.11, a1 < a2 iff b1 ⋖ b2 for all ⟨ a1 , b1 ⟩, ⟨ a2 , b2 ⟩ ∈ f . So f : dom( f ) →
ran( f ) is an isomorphism.
If a2 ∈ dom( f ) and a1 < a2 , then a1 ∈ dom( f ) by Lemma 63.10; so dom( f )
is an initial segment of A. Similarly, ran( f ) is an initial segment of B. For
reductio, suppose both are proper initial segments. Then let a be the <-least
element of A \ dom( f ), so that dom( f ) = A a , and let b be the ⋖-least element
of B \ ran( f ), so that ran( f ) = Bb . So f : A a → Bb is an isomorphism, and
hence ⟨ a, b⟩ ∈ f , a contradiction.
On von Neumann's approach, an ordinal is a transitive set which is well-ordered by membership. In what follows, we will use Greek letters for ordinals. It follows immediately from the definition that, if α is an ordinal, then ⟨α, ∈α⟩ is a well-ordering, where ∈α = {⟨x, y⟩ ∈ α² : x ∈ y}. So, abusing notation a little, we can just say that α itself is a well-ordering.

Here are our first few ordinals:

∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, . . .
You will note that these are the first few ordinals that we encountered in our
Axiom of Infinity, i.e., in von Neumann’s definition of ω (see section 62.6).
This is no coincidence. Von Neumann’s definition of the ordinals treats natu-
ral numbers as ordinals, but allows for transfinite ordinals too.
As always, we can now ask: are these the ordinals? Or has von Neumann
simply given us some sets that we can treat as the ordinals? The kinds of
discussions one might have about this question are similar to the discussions
we had in section 2.2, section 5.5, section 6.4, and section 62.8, so we will not
belabour the point. Instead, in what follows, we will simply use “the ordinals”
to speak of “the von Neumann ordinals”.
The rough gist of the next two main results, Theorem 63.16 and Theo-
rem 63.17, is that the ordinals themselves are well-ordered by membership:
Proof. Suppose φ(α), for some ordinal α. If (∀ β ∈ α)¬ φ( β), then we are done.
Otherwise, as α is an ordinal, it has some ∈-least element which is φ, and this
is an ordinal by Lemma 63.14.
The corresponding principle of transfinite induction follows just by taking ¬φ(α) in Theorem 63.16, and then performing elementary logical manipulations.
Proof. The proof is by double induction, i.e., using Theorem 63.16 twice. Say
that x is comparable with y iff x ∈ y ∨ x = y ∨ y ∈ x.
For induction, suppose that every ordinal in α is comparable with every or-
dinal. For further induction, suppose that α is comparable with every ordinal
in β. We will show that α is comparable with β. By induction on β, it will
follow that α is comparable with every ordinal; and so by induction on α, ev-
ery ordinal is comparable with every ordinal, as required. It suffices to assume that α ∉ β and β ∉ α, and show that α = β.
To show that α ⊆ β, fix γ ∈ α; this is an ordinal by Lemma 63.14. So by
the first induction hypothesis, γ is comparable with β. But if either γ = β or
β ∈ γ then β ∈ α (invoking the fact that α is transitive if necessary), contrary
to our assumption; so γ ∈ β. Generalising, α ⊆ β.
Exactly similar reasoning, using the second induction hypothesis, shows
that β ⊆ α. So α = β.
This result, that there is no set of all the ordinals, is named after Burali-Forti. But it was Cantor in 1899, in a letter to Dedekind, who first saw clearly the contradiction in supposing that there is a set of all the ordinals. As van Heijenoort explains:
2 We could write α ∈ β; but that would be wholly non-standard.
Corollary. There is no infinite strictly descending sequence of ordinals.

Proof. Any infinite strictly descending sequence of ordinals α0 > α1 > α2 > . . . has no <-minimal member, contradicting Theorem 63.16.
63.7 Replacement
In section 63.5, we motivated the introduction of ordinals by suggesting that
we could treat them as order-types, i.e., canonical proxies for well-orderings.
In order for that to work, we would need to prove that every well-ordering
is isomorphic to some ordinal. This would allow us to define ord( A, <) as the
ordinal α such that ⟨ A, <⟩ ∼
= α.
Unfortunately, we cannot prove the desired result using only the Axioms we have introduced so far. (We will see why in section 65.2, but for now the point is: we can't.) We need a new thought, and here it is:

Axiom (Replacement Scheme). For any formula φ(x, y): if, for every member x of a set A, there is a unique y such that φ(x, y), then the set {y : (∃x ∈ A)φ(x, y)} exists.

As with Separation, this is a scheme: it yields infinitely many axioms, for each
of the infinitely many different φ’s. And it can equally well be (and normally
is) written down thus:
For any formula φ(x, y) which does not contain "B", the following is an axiom:

∀A((∀x ∈ A)∃!y φ(x, y) → ∃B∀y(y ∈ B ↔ (∃x ∈ A)φ(x, y)))
Corollary 63.24. For any term τ ( x ), and any set A, this set exists:
{τ ( x ) : x ∈ A} = {y : (∃ x ∈ A)y = τ ( x )}.
This suggests that “Replacement” is a good name for the Axiom: given a set
A, you can form a new set, {τ ( x ) : x ∈ A}, by replacing every member of A
with its image under τ. Indeed, following the notation for the image of a set
under a function, we might write τ [ A] for {τ ( x ) : x ∈ A}.
Crucially, however, τ is a term. It need not be (a name for) a function, in
the sense of section 3.3, i.e., a certain set of ordered pairs. After all, if f is a
function (in that sense), then the set f [ A] = { f ( x ) : x ∈ A} is just a particular
subset of ran( f ), and that is already guaranteed to exist, just using the axioms
of Z− .3 Replacement, by contrast, is a powerful addition to our axioms, as we
will see in chapter 65.
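In the finite setting, applying Replacement with a term τ is just taking the image of a set under a Python function. A minimal sketch (illustrative only), with τ(x) = {x}; the helper name replacement is ours:

def replacement(A, tau):
    """{tau(x) : x in A} -- replace each member of A by its image under the term tau."""
    return frozenset(tau(x) for x in A)

e = frozenset()
A = frozenset({e, frozenset({e})})            # A = {0, 1}

singletons = replacement(A, lambda x: frozenset({x}))
print(singletons == frozenset({frozenset({e}), frozenset({frozenset({e})})}))   # {{0}, {1}} -> True
print(len(singletons) <= len(A))              # the image is never bigger than A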
Definition 63.25. The theory ZF− has these axioms: Extensionality, Union,
Pairs, Powersets, Infinity, and all instances of the Separation and Replacement
schemes. Otherwise put, ZF− adds Replacement to Z− .
This stands for Zermelo–Fraenkel set theory (minus something which we will
come to later). Fraenkel gets the honour, since he is credited with the formu-
lation of Replacement in 1922, although the first precise formulation was due
to Skolem (1922).
3 Just consider {y ∈ ⋃⋃f : (∃x ∈ A) y = f(x)}.
Definition 63.29. For any ordinal α, its successor is α+ = α ∪ {α}. We say that
α is a successor ordinal if β+ = α for some ordinal β. We say that α is a limit
ordinal iff α is neither empty nor a successor ordinal.
The following result shows that this is the right notion of successor:

Lemma 63.30. For any ordinal α:

1. α ∈ α+;
2. α+ is an ordinal;
3. α+ is the least ordinal greater than α.

Theorem 63.31 (Simple Transfinite Induction). Suppose that:

1. φ(∅); and
2. φ(α+) whenever φ(α); and
3. φ(γ) whenever γ is a limit ordinal such that (∀β ∈ γ)φ(β).

Then ∀αφ(α).

Proof. We prove the contrapositive. So, suppose there is some ordinal which is ¬φ; let γ be the least such ordinal. Then either γ = ∅, or γ = α+ for some α such that φ(α), or γ is a limit ordinal and (∀β ∈ γ)φ(β). In each case, one of the three conditions fails.
Definition 63.32. If X is a set of ordinals, then lsub(X) = ⋃_{α∈X} α+.
Here, “lsub” stands for “least strict upper bound”.4 The following result ex-
plains this:
4 Some books use "sup(X)" for this. But other books use "sup(X)" for the least non-strict upper bound, i.e., simply ⋃X. If X has a greatest element, α, these notions come apart: the least strict upper bound is α+, whereas the least non-strict upper bound is just α.
Proposition 63.33. If X is a set of ordinals, lsub( X ) is the least ordinal greater than
every ordinal in X.
Problems
Problem 63.1. Section 63.2 presented three example orderings on the natural
numbers. Check that each is a well-ordering.
Problem 63.3. Complete the “exactly similar reasoning” in the proof of Theo-
rem 63.17.
Problem 63.4. Prove that, if every member of X is an ordinal, then ⋃X is an ordinal.
Stages and Ranks

Definition 64.1.

V∅ = ∅
Vα+ = ℘(Vα) for any ordinal α
Vα = ⋃_{γ<α} Vγ when α is a limit ordinal
Lemma 64.3 (Bounded Recursion). For any term τ ( x ) and any ordinal α, there
is a unique α-approximation for τ.
σ ( α ) = f α+ ( α )
= τ ( f α + ↾α )
= τ ({⟨ β, f α+ ( β)⟩ : β ∈ α})
= τ ({⟨ β, f α ( β)⟩ : β ∈ α})
= τ (σ↾α )
Theorem 64.5 (Simple Recursion). For any terms τ ( x ) and θ ( x ) and any set A,
we can explicitly define a term σ ( x ) such that:
σ(∅) = A
σ(α+ ) = τ (σ (α)) for any ordinal α
σ (α) = θ (ran(σ↾α )) when α is a limit ordinal
By Theorem 64.4, there is a term σ ( x ) such that σ (α) = ξ (σ↾α ) for every or-
dinal α; moreover, σ↾α is a function with domain α. We show that σ has the
required properties, by simple transfinite induction (Theorem 63.31).
First, σ (∅) = ξ (∅) = A.
Next, σ (α+ ) = ξ (σ↾α+ ) = τ (σ↾α+ (α)) = τ (σ (α)).
Last, σ (α) = ξ (σ↾α ) = θ (ran(σ↾α )), when α is a limit.
1. Each Vα is transitive.
2. Each Vα is potent.
3 There’s no standard terminology for “potent”; this is the name used by Button (forthcom-
ing).
All of this allows us to think of each Vα as the αth stage of the hierarchy. Here
is why.
Certainly our Vα s can be thought of as being formed in an iterative process,
for our use of ordinals tracks the notion of iteration. Moreover, if one stage is
formed before the other, i.e., Vβ ∈ Vα , i.e., β ∈ α, then our process of forma-
tion is cumulative, since Vβ ⊆ Vα . Finally, we are indeed forming all possible
collections of sets that were available at any earlier stage, since any successor
stage Vα+ is the power-set of its predecessor Vα .
In short: with ZF− , we are almost done, in articulating our vision of the
cumulative-iterative hierarchy of sets. (Though, of course, we still need to
justify Replacement.)
64.4 Foundation
We are only almost done—and not quite finished—because nothing in ZF−
guarantees that every set is in some Vα , i.e., that every set is formed at some
stage.
Now, there is a fairly straightforward (mathematical) sense in which we
don’t care whether there are sets outside the hierarchy. (If there are any there,
we can simply ignore them.) But we have motivated our concept of set with
the thought that every set is formed at some stage (see Stages-are-key in sec-
tion 62.1). So we will want to preclude the possibility of sets which fall outside
of the hierarchy. Accordingly, we must add a new axiom, which ensures that
every set occurs somewhere in the hierarchy.
Since the Vα s are our stages, we might simply consider adding the follow-
ing as an axiom:
Regularity. ∀ A∃α A ⊆ Vα
The axiom we will in fact adopt, which does not mention the Vαs, is Foundation: every non-empty set has an ∈-minimal member, i.e., ∀A(A ≠ ∅ → (∃x ∈ A) A ∩ x = ∅). With some effort, we can show (in ZF−) that Foundation entails Regularity.
To show this, we first need the notion of the transitive closure of a set A:

cl0(A) = A,
cl_{n+1}(A) = ⋃ cl_n(A),
trcl(A) = ⋃_{n<ω} cl_n(A).
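For hereditarily finite sets, the transitive closure can be computed by unfolding this definition directly. A minimal Python sketch (illustrative only; the helper names are ours):

def big_union(A):
    return frozenset(x for b in A for x in b)

def trcl(A):
    """Transitive closure: keep adding members of members until nothing new appears."""
    layer, result = A, frozenset(A)
    while True:
        layer = big_union(layer)        # cl_{n+1}(A) = union of cl_n(A)
        if layer <= result:             # nothing new: every later cl_n(A) is already included
            return result
        result |= layer

e = frozenset()
two = frozenset({e, frozenset({e})})    # 2 = {0, 1}
A = frozenset({two})                    # A = {2}
print(trcl(A) == frozenset({two, frozenset({e}), e}))   # trcl({2}) = {2, 1, 0} -> True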
Proof. Recalling the definition of “lsub( X )” from Definition 63.32, define two
sets:
D = { x ∈ A : ∀δ x ⊈ Vδ }
α = lsub{δ : (∃ x ∈ A)( x ⊆ Vδ ∧ (∀γ ∈ δ) x ⊈ Vγ )}
These results show that ZF− proves the conditional Foundation ⇒ Regularity.
In Proposition 64.22, we will show that ZF− proves Regularity ⇒ Foundation.
As such, Foundation and Regularity are equivalent (modulo ZF− ). But this
means that, given ZF− , we can justify Foundation by noting that it is equiva-
lent to Regularity. And we can justify Regularity immediately on the basis of
Stages-are-key.
Definition 64.14. The theory Z adds Foundation to Z− . So its axioms are Ex-
tensionality, Union, Pairs, Powersets, Infinity, Foundation, and all instances of
the Separation scheme.
The theory ZF adds Foundation to ZF− . Otherwise put, ZF adds all in-
stances of Replacement to Z.
Setting aside historical reasons (to do with who formulated what and when),
the basic reason is that Foundation can be presented without employing the
definition of the Vα s. That definition relied upon all of the work of section 64.2:
we needed to prove Transfinite Recursion, to show that it was justified. But
our proof of Transfinite Recursion employed Replacement. So, whilst Foun-
dation and Regularity are equivalent modulo ZF− , they are not equivalent
modulo Z− .
Indeed, the matter is more drastic than this simple remark suggests. Though
it goes well beyond this book’s remit, it turns out that both Z− and Z are too
weak to define the Vα s. So, if you are working only in Z, then Regularity (as
we have formulated it) does not even make sense. This is why our official
axiom is Foundation, rather than Regularity.
From now on, we will work in ZF (unless otherwise stated), without any
further comment.
64.6 Rank
Now that we have defined the stages as the Vα ’s, and we know that every set
is a subset of some stage, we can define the rank of a set. Intuitively, the rank
of A is the first moment at which A is formed. More precisely:
Definition 64.15. For each set A, rank( A) is the least ordinal α such that A ⊆
Vα .
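For hereditarily finite sets, this definition unwinds into a simple recursion: the rank of ∅ is 0, and otherwise the rank of A is one more than the largest rank of its members. A minimal Python sketch (illustrative only):

def rank(A):
    """rank(A) = the least ordinal α with A ⊆ Vα, i.e., sup of rank(x) + 1 over members x of A."""
    return 0 if not A else max(rank(x) for x in A) + 1

e = frozenset()
one = frozenset({e})
two = frozenset({e, one})
print(rank(e), rank(one), rank(two))      # 0 1 2: each von Neumann number n has rank n
print(rank(frozenset({two})))             # 3: the singleton {2} is formed one stage later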
Using this fact, we can establish a result which allows us to prove things about
all sets by a form of induction:
Proof. We will prove the contrapositive. So, suppose ¬∀ Aφ( A). By Transfi-
nite Induction (Theorem 63.16), there is some non-φ of least possible rank;
i.e. some A such that ¬ φ( A) and ∀ x (rank( x ) ∈ rank( A) → φ( x )). Now
if x ∈ A then rank( x ) ∈ rank( A), by Proposition 64.18, so that φ( x ); i.e.
(∀ x ∈ A) φ( x ) ∧ ¬ φ( A).
Here is an informal way to gloss this powerful result. Say that φ is hereditary
iff whenever every element of a set is φ, the set itself is φ. Then ∈-Induction
tells you the following: if φ is hereditary, every set is φ.
To wrap up the discussion of ranks (for now), we’ll prove a few claims
which we have foreshadowed a few times.
Proof. Suppose for transfinite induction that rank( β) = β for all β ∈ α. Now
rank(α) = lsubβ∈α rank( β) = lsubβ∈α β = α by Proposition 64.20.
Finally, here is a quick proof of the result promised at the end of sec-
tion 64.4, that ZF− proves the conditional Regularity ⇒ Foundation. (Note
that the notion of “rank” and Proposition 64.18 are available for use in this
proof since—as mentioned at the start of this section—they can be presented
using ZF− + Regularity.)
Problems
Problem 64.1. Prove Proposition 64.16.
Replacement
65.1 Introduction
Definition 65.1. For any set M, and any formula φ, let φ M be the formula
which results by restricting all of φ’s quantifiers to M. That is, replace “∃ x”
with “(∃ x ∈ M )”, and replace “∀ x” with “(∀ x ∈ M)”.
It can be shown that, for every axiom φ of Z, we have that ZF ⊢ φVω+ω . But
ω + ω is not in Vω +ω , by Corollary 64.21. So Z is consistent with the non-
existence of ω + ω.
This is why we said, in section 63.7, that Theorem 63.26 cannot be proved
without Replacement. For it is easy, within Z, to define an explicit well-
ordering which intuitively should have order-type ω + ω. Indeed, we gave
an informal example of this in section 63.2, when we presented the ordering on the natural numbers which puts all the even numbers first, in their usual order, followed by all the odd numbers:

0, 2, 4, 6, . . . , 1, 3, 5, 7, . . .
But if ω + ω does not exist, this well-ordering is not isomorphic to any ordinal.
So Z does not prove Theorem 63.26.
Flipping things around: Replacement allows us to prove the existence of
ω + ω, and hence must allow us to prove the existence of Vω +ω . And not just
that. For any well-ordering we can define, Theorem 63.26 tells us that there
is some α isomorphic with that well-ordering, and hence that Vα exists. In a
straightforward way, then, Replacement guarantees that the hierarchy of sets
must be very tall.
Over the next few sections, and then again in section 68.5, we'll get a better sense of just how tall Replacement forces the hierarchy to be. The simple
point, for now, is that Replacement really does stand in need of justification!
The gist of Boolos’s idea is that we should justify Replacement by its fruits.
And the specific fruits he mentions are the things we have discussed in the
past few chapters. Replacement allowed us to prove that the von Neumann
ordinals were excellent surrogates for the idea of a well-ordering type (this is
our “satisfactory if not ideal theory of infinite numbers”). Replacement also al-
lowed us to define the Vα s, establish the notion of rank, and prove ∈-Induction
(this amounts to our “theorems about the iterative conception”). Finally, Re-
placement allows us to prove the Transfinite Recursion Theorem (this is the
“inductive definitions on well-founded relations”).
These are, indeed, desirable consequences. But do these desirable conse-
quences suffice to justify Replacement? No. Or at least, not straightforwardly.
Here is a simple problem. Whilst we have stated some desirable conse-
quences of Replacement, we could have obtained many of them via other
means. This is not as well known as it ought to be, though, so we should
pause to explain the situation.
There is a simple theory of sets, Level Theory, or LT for short.1 LT’s axioms
are just Extensionality, Separation, and the claim that every set is a subset of
some level, where “level” is cunningly defined so that the levels behave like
our friends, the Vα s. So ZF proves LT; but LT is much weaker than ZF. In fact,
LT does not give you Pairs, Powersets, Infinity, or Replacement. Let Zr be the
result of adding Infinity and Powersets to LT; this delivers Pairs too, so, Zr is
at least as strong as Z. But, in fact, Zr is strictly stronger than Z, since it adds
the claim that every set has a rank (hence my suggestion that we call it Zr).
Indeed, Zr delivers: a perfectly satisfactory theory of ordinals; results which
stratify the hierarchy into well-ordered stages; a proof of ∈-Induction; and a
version of Transfinite Recursion.
In short: although Boolos didn’t know this, all of the desirable conse-
quences which he mentions could have been arrived at without Replacement;
he simply needed to use Zr rather than Z.
(Given all of this, why did we follow the conventional route, of teaching
you ZF, rather than LT and Zr? There are two reasons. First: for purely
historical reasons, starting with LT is rather nonstandard; we wanted to equip
you to be able to read more standard discussions of set theory. Second: when
you are ready to appreciate LT and Zr, you can simply read Potter 2004 and
Button forthcoming.)
Of course, since Zr is strictly weaker than ZF, there are results which ZF
proves which Zr leaves open. So one could try to justify Replacement on
extrinsic grounds by pointing to one of these results. But, once you know
how to use Zr, it is quite hard to find many examples of things that are (a)
settled by Replacement but not otherwise, and (b) are intuitively true. (For
more on this, see Potter 2004, §13.2.)
The bottom line is this. To provide a compelling extrinsic justification for
Replacement, one would need to find a result which cannot be achieved with-
out Replacement. And that’s not an easy enterprise.
Let’s consider a further problem which arises for any attempt to offer a
purely extrinsic justification for Replacement. (This problem is perhaps more
1 The first versions of LT are offered by Montague (1965) and Scott (1974); this was simpli-
fied, and given a book-length treatment, by Potter (2004); and Button (forthcoming) has recently
simplified LT further.
fundamental than the first.) Boolos does not just point out that Replacement
has many desirable consequences. He also states that Replacement has “(ap-
parently) no undesirable" consequences. But this parenthetical caveat, "apparently," is surely absolutely crucial.
Recall how we ended up here: Naı̈ve Comprehension ran into inconsis-
tency, and we responded to this inconsistency by embracing the cumulative-
iterative conception of set. This conception comes equipped with a story
which, we hope, assures us of its consistency. But if we cannot justify Replace-
ment from within that story, then we have (as yet) no reason to believe that
ZF is consistent. Or rather: we have no reason to believe that ZF is consistent,
apart from the (perhaps merely contingent) fact that no one has discovered
a contradiction yet. In exactly that sense, Boolos’s comment seems to come
down to this: “(apparently) ZF is consistent”. We should demand greater re-
assurance of consistency than this.
This issue will affect any purely extrinsic attempt to justify Replacement,
i.e., any justification which is couched solely in terms of the (known) conse-
quences of ZF. As such, we will want to look for an intrinsic justification of
Replacement, i.e., a justification which suggests that the story which we told
about sets somehow “already” commits us to Replacement.
65.4 Limitation-of-size
Perhaps the most common attempt to offer an “intrinsic” justification of Re-
placement comes via the following notion:
Limitation-of-size. Any things form a set, provided that there are not too
many of them.
This principle will immediately vindicate Replacement. After all, any set
formed by Replacement cannot be any larger than any set from which it was
formed. Stated precisely: suppose you form a set τ [ A] = {τ ( x ) : x ∈ A} using
Replacement; then τ [ A] ⪯ A; so if the elements of A were not too numerous
to form a set, their images are not too numerous to form τ [ A].
The obvious difficulty with invoking Limitation-of-size to justify Replace-
ment is that we have not yet laid down any principle like Limitation-of-size.
Moreover, when we told our story about the cumulative-iterative conception
of set in chapters 61 to 62, nothing ever hinted in the direction of Limitation-of-
size. This, indeed, is precisely why Boolos at one point wrote: “Perhaps one
may conclude that there are at least two thoughts ‘behind’ set theory” (1989,
p. 19). On the one hand, the ideas surrounding the cumulative-iterative con-
ception of set are meant to vindicate Z. On the other hand, Limitation-of-size is
meant to vindicate Replacement.
But the issue is not just that we have thus far been silent about Limitation-of-size. Rather, the issue is that Limitation-of-size (as just formulated) seems to
sit quite badly with the cumulative-iterative notion of set. After all, it men-
tions nothing about the idea of sets as formed in stages.
This is really not much of a surprise, given the history of these “two thoughts”
(i.e., the cumulative-iterative conception of set, and Limitation-of-size). These
“two thoughts” ultimately amount to two rather different projects for block-
ing the set-theoretic paradoxes. The cumulative-iterative notion of set blocks
Russell’s paradox by saying, roughly: we should never have expected a Russell
set to exist, because it would not be “formed” at any stage. By contrast, Limitation-
of-size is meant to rule out the Russell set, by saying, roughly: we should never
have expected a Russell set to exist, because it would have been too big.
Put like this, then, let’s be blunt: considered as a reply to the paradoxes,
Limitation-of-size stands in need of much more justification. Consider, for ex-
ample, this version of Russell’s Paradox: no pug sniffs exactly the pugs which
don’t sniff themselves (see section 61.2). If you ask “why is there no such pug?”,
it is not a good answer to be told that such a pug would have to sniff too many
pugs. So why would it be a good intuitive explanation, of the non-existence
of a Russell set, that it would have to be “too big” to exist?
In short, it’s forgivable if you are a bit mystified concerning the “intuitive”
motivation for Limitation-of-size.
[Footnote: . . . since Z on its own is not strong enough to define the stages, it is not clear how one would formalise Stages-are-super-cofinal. One option, though, is to work in some extension of LT, as discussed in section 65.3.]
Suppose, for reductio, that Stages-are-super-cofinal fails. Then, for some set A, we have a map τ such that for any stage S there is some x ∈ A such that
S ∈ τ ( x ). In that case, we do have a way to get a handle on the supposed “ab-
solute infinity” of the hierarchy: it is exhausted by the range of τ applied to A.
And that compromises the thought that the hierarchy is “absolutely infinite”.
Contraposing: Stages-are-inexhaustible entails Stages-are-super-cofinal, which in
turn justifies Replacement.
This represents a genuinely promising attempt to provide an intrinsic jus-
tification for Replacement. But whether it ultimately works, or not, we will
have to leave to you to decide.
We’ll start with a lemma which, for brevity, employs the notational device
of overlining to deal with sequences of variables or objects. So: “ak ” abbrevi-
ates “ak1 , . . . , akn ”, where n is determined by context.
Lemma 65.3. For each 1 ≤ i ≤ k, let φi (vi , x ) be a formula. Then for each α there
is some β > α such that, for any a1 , . . . , ak ∈ Vβ and each 1 ≤ i ≤ k:
∃ xφi ( ai , x ) → (∃ x ∈ Vβ ) φi ( ai , x )
∃ xφi ( ai , x ) → (∃ x ∈ V ) φi ( ai , x ))
It is easy to confirm that µ( a1 , . . . , ak ) exists for all a1 , . . . , ak . Now, using Re-
placement and our recursion theorem, define:
S0 = Vα+1
S_{n+1} = S_n ∪ ⋃{μ(a1, . . . , ak) : a1, . . . , ak ∈ Sn}
S = ⋃_{n<ω} S_n.
To use this to prove Replacement, we will first follow Lévy (1960, first part
of Theorem 2) and show that we can “reflect” two formulas at once:
(∀z, x ∈ S)( φ ↔ φS )
i.e. (∀z, x ∈ S)(((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ)) ↔
((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ))S )
i.e. (∀z, x ∈ S)(((z = 0 ∧ ψ) ∨ (z = 1 ∧ χ)) ↔
((z = 0 ∧ ψS ) ∨ (z = 1 ∧ χS )))
i.e. (∀ x ∈ S)((ψ ↔ ψS ) ∧ (χ ↔ χS ))
The second claim entails the third because “z = 0” and “z = 1” are absolute
for S; the fourth claim follows since 0 ̸= 1.
We can now obtain Replacement, just by following and simplifying Lévy (1960,
Theorem 6):
Theorem 65.5 (in Z + Weak-Reflection). For any formula φ(v, w), and any A, if (∀x ∈ A)∃!y φ(x, y), then {y : (∃x ∈ A) φ(x, y)} exists.
6 More formally, letting ξ be either of these formulas, ξ (z) ↔ ξ S (z).
T ⊢ ∃ M(ψ( M ) ∧ (∃ Nψ( N )) M )
In particular, then:
T ⊢ ∃ M(ψ( M ) ∧ (∃ N ∈ M )( N is transitive ∧ θ N ))
So that T is inconsistent.7
Proposition 65.7. Let T extend Z with finitely many new axioms. If T ⊢ ZF, then
T is inconsistent. (Here we use the same tacit restrictions as for Theorem 65.6.)
Proof. Use θ for the conjunction of all of T’s axioms except for the (infinitely
many) instances of Separation. Defining ψ from θ as in Theorem 65.6, we can
show that T ⊢ ∃ Mψ( M).
As in Theorem 65.6, we can establish the schema that, whenever T ⊢ σ,
we have that T ⊢ ∀ X (ψ( X ) → σ X ). We then finish our proof, exactly as in
Theorem 65.6.
However, establishing the schema involves a little more work than in The-
orem 65.6. After all, the Separation-instances are in T, but they are not con-
juncts of θ. However, we can overcome this obstacle by proving that T ⊢
∀ X ( X is transitive → σ X ), for every Separation-instance σ. We leave this to
the reader.
Problems
Problem 65.1. Formalize Stages-are-super-cofinal within ZF.
Problem 65.4. Confirm the remaining schematic results invoked in the proofs
of Theorem 65.6 and Proposition 65.7.
Ordinal Arithmetic
66.1 Introduction
In chapter 63, we developed a theory of ordinal numbers. We saw in chap-
ter 64 that we can think of the ordinals as a spine around which the remainder
of the hierarchy is constructed. But that is not the only role for the ordinals.
There is also the task of performing ordinal arithmetic.
We already gestured at this, back in section 63.2, when we spoke of ω,
ω + 1 and ω + ω. At the time, we spoke informally; the time has come to
spell it out properly. However, we should mention that there is not much phi-
losophy in this chapter; just technical developments, coupled with a (mildly)
interesting observation that we can do the same thing in two different ways.
But here is a striking fact. To define ordinal addition, we could instead have
simply used the Transfinite Recursion Theorem, and laid down the recursion
equations, exactly as given in Lemma 66.6 (though using “β+ ” rather than
“β + 1”).
There are, then, two different ways to define operations on the ordinals.
We can define them synthetically, by explicitly constructing a well-ordered set
and considering its order type. Or we can define them recursively, just by
laying down the recursion equations. Done correctly, though, the outcome is
identical. For Theorem 63.26 guarantees that these recursion equations pin
down unique ordinals.
In many ways, ordinal arithmetic behaves just like addition of the natural
numbers. For example, we can prove the following:
2. if α + β = α + γ, then β = γ;
3. (α + β) + γ = α + (β + γ);
4. if α ≤ β, then α + γ ≤ β + γ.
Proof. We prove (3), leaving the rest as an exercise. The proof is by Simple Transfinite Induction on γ, using Lemma 66.6. When γ = 0:

(α + β) + 0 = α + β = α + (β + 0)

When γ = δ + 1:

(α + β) + (δ + 1) = ((α + β) + δ) + 1
                  = (α + (β + δ)) + 1
                  = α + ((β + δ) + 1)
                  = α + (β + (δ + 1))

When γ is a limit ordinal:

(α + β) + γ = lsub_{δ<γ} ((α + β) + δ)
            = lsub_{δ<γ} (α + (β + δ))
            = α + lsub_{δ<γ} (β + δ)
            = α + (β + γ)
In these ways, ordinal addition should be very familiar. But, there is a cru-
cial way in which ordinal addition is not like addition on the natural numbers.
Whilst this may initially come as a surprise, it shouldn’t. On the one hand,
when you consider 1 + ω, you are thinking about the order type you get by
putting an extra element before all the natural numbers. Reasoning as we did
with Hilbert’s Hotel in section 6.1, intuitively, this extra first element shouldn’t
make any difference to the overall order type. On the other hand, when you
consider ω + 1, you are thinking about the order type you get by putting an
extra element after all the natural numbers. And that’s a radically different
beast!
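For illustration, here is that asymmetry spelled out with the recursion equations of Lemma 66.6 (a routine check):

1 + ω = lsub_{n<ω} (1 + n) = lsub {1, 2, 3, . . .} = ω,

whereas

ω + 1 = (ω + 0)+ = ω+ = ω ∪ {ω} ≠ ω, since ω + 1 has a last element and ω does not.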
Lemma 66.9. Suppose rank(A) = α and rank(B) = β. Then:

1. rank(℘(A)) = α + 1
3. rank(A ∪ B) = max(α, β)
5. rank(A × B) ≤ max(α, β) + 2
6. rank(⋃A) = α when α is empty or a limit; rank(⋃A) = γ when α = γ + 1
The following are equivalent, for any ordinal α:

1. α ∉ ω, i.e., α is not a natural number
2. ω ≤ α
3. 1 + α = α
5. α is Dedekind infinite

In particular, writing α = β + γ with β a limit ordinal, we can compute:

1 + α = 1 + (β + γ) = (1 + β) + γ = (lsub_{δ<β} (1 + δ)) + γ = β + γ = α.
α · 0 = 0
α · (β + 1) = (α · β) + α
α · β = lsub_{δ<β} (α · δ) when β is a limit ordinal.
Indeed, just as in the case of addition, we could have defined ordinal multi-
plication via these recursion equations, rather than offering a direct definition.
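For illustration, computing with these recursion equations (a routine check):

2 · ω = lsub_{n<ω} (2 · n) = lsub {0, 2, 4, . . .} = ω,

whereas

ω · 2 = ω · (1 + 1) = (ω · 1) + ω = ω + ω.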
Equally, as with addition, certain behaviour is familiar:
2. if α ̸= 0 and α · β = α · γ, then β = γ;
3. α · ( β · γ) = (α · β) · γ;
4. If α ≤ β, then α · γ ≤ β · γ;
5. α · ( β + γ) = (α · β) + (α · γ).
You can prove (or look up) other results, to your heart’s content. But, given
Proposition 66.8, the following should not come as a surprise:
Definition 66.16.

α^(0) = 1
α^(β+1) = α^(β) · α
α^(β) = ⋃_{δ<β} α^(δ) when β is a limit ordinal
There is not much to add, except to note the unsurprising fact that ordinal exponentiation does not commute. Thus 2^(ω) = ⋃_{δ<ω} 2^(δ) = ω, whereas ω^(2) = ω · ω. But then, we should not expect exponentiation to commute, since it does not commute even on the natural numbers: 2^(3) = 8 < 9 = 3^(2).
Problems
Problem 66.1. Prove the remainder of Lemma 66.7.
Problem 66.2. Produce sets A and B such that rank(A × B) = max(rank(A), rank(B)). Produce sets A and B such that rank(A × B) = max(rank(A), rank(B)) + 2. Are any other ranks possible?
Problem 66.3. Prove Lemma 66.12, Lemma 66.13, and Lemma 66.14
Cardinals
Cast your mind back to section 63.5. We were discussing well-ordered sets,
and suggested that it would be nice to have objects which go proxy for well-
orders. With this is mind, we introduced ordinals, and then showed in Corol-
lary 63.28 that these behave as we would want them to, i.e.:
Cast your mind back even further, to section 4.8. There, working naı̈vely, we
introduced the notion of the “size” of a set. Specifically, we said that two sets
are equinumerous, A ≈ B, just in case there is a bijection f : A → B. This
is an intrinsically simpler notion than that of a well-ordering: we are only
interested in bijections, and not (as with order-isomorphisms) whether the
bijections “preserve any structure”.
This all gives rise to an obvious thought. Just as we introduced certain
objects, ordinals, to calibrate well-orders, we can introduce certain objects, car-
dinals, to calibrate size. That is the aim of this chapter.
Before we say what these cardinals will be, we should lay down a principle
which they ought to satisfy. Writing | X | for the cardinality of the set X, we
would want them to obey:
| A| = | B| iff A ≈ B.
We’ll call this Cantor’s Principle, since Cantor was probably the first to have it
very clearly in mind. (We’ll say more about its relationship to Hume’s Principle
in section 67.5.) So our aim is to define | X |, for each X, in such a way that it
delivers Cantor’s Principle.
In particular, |A| ≈ A: every set is equinumerous with its own cardinality.
The next result guarantees Cantor’s Principle, and more besides. (Note
that cardinals inherit their ordering from the ordinals, i.e., a < b iff a ∈ b. In
formulating this, we will use Fraktur letters for objects we know to be cardi-
nals. This is fairly standard. A common alternative is to use Greek letters,
since cardinals are ordinals, but to choose them from the middle of the alpha-
bet, e.g.: κ, λ.):
A ≈ B iff | A| = | B|
A ⪯ B iff | A| ≤ | B|
A ≺ B iff | A| < | B|
Proof. We will prove the left-to-right direction of the second claim (the other
cases are similar, and left as an exercise). So, consider the following diagram:
[Diagram: A and B on the top row, |A| and |B| beneath them, with the relevant bijections and injections drawn between them.]
We can also use Lemma 67.3 to re-prove Schröder–Bernstein. This is the claim
that if A ⪯ B and B ⪯ A then A ≈ B. We stated this as Theorem 4.25, but first
proved it—with some effort—in section 6.5. Now consider:
Whilst this is a very simple proof, it implicitly relies on both Replacement (to
secure Theorem 63.26) and on Well-Ordering (to guarantee Lemma 67.3). By
contrast, the proof of section 6.5 was much more self-standing (indeed, it can
be carried out in Z− ).
Definition 67.4. The theory ZFC has these axioms: Extensionality, Union,
Pairs, Powersets, Infinity, Foundation, Well-Ordering and all instances of the
Separation and Replacement schemes. Otherwise put, ZFC adds Well-Ordering
to ZF.
ZFC stands for Zermelo-Fraenkel set theory with Choice. Now this might
seem slightly odd, since the axiom we added was called “Well-Ordering”, not
“Choice”. But, when we later formulate Choice, it will turn out that Well-
Ordering is equivalent (modulo ZF) to Choice (see Theorem 69.6). So which
to take as our “basic” axiom is a matter of indifference. And the name “ZFC”
is entirely standard in the literature.
Proof. Immediate.
It also follows that several reasonable notions of what it might mean to de-
scribe a cardinal as “finite” or “infinite” coincide:
1. | A| ∈
/ ω, i.e., A is not a natural number;
2. ω ≤ | A|;
3. A is Dedekind infinite.
This licenses the following definition of some notions which we used rather
informally in part I:
But note that this definition is presented against the background of ZFC. After
all, we needed Well-Ordering to guarantee that every set has a cardinality.
And indeed, without Well-Ordering, there can be a set which is neither finite
nor Dedekind infinite. We will return to this sort of issue in chapter 69. For
now, we continue to rely upon Well-Ordering.
Let us now turn from the finite cardinals to the infinite cardinals. Here are
two elementary points:
So in all cases, | A| ≤ ω.
Indeed, ω has a special place. Whilst there are many countable ordinals:
Of course, there are infinitely many cardinals. So we might ask: How many
cardinals are there? The following results show that we might want to recon-
sider that question.
Proposition 67.13. If every member of X is a cardinal, then ⋃X is a cardinal.
Proof. For any cardinal a, Cantor’s Theorem (Theorem 4.24) and Lemma 67.2
entail that a < |℘(a)|.
You should compare this with both Russell’s Paradox and Burali-Forti.
| A| = | B| iff A ≈ B.
This is very similar to what is now called Hume’s Principle, which says:
#x F ( x ) = #x G ( x ) iff F ∼ G
where ‘F ∼ G’ abbreviates that there are exactly as many Fs as Gs, i.e., the Fs
can be put into a bijection with the Gs, i.e.:
When two numbers are so combined as that the one has always
an unit answering to every unit of the other, we pronounce them
equal. (Hume, 1740, Pt.III Bk.1 §1)
And Hume’s Principle was brought to contemporary mathematico-logical promi-
nence by Frege (1884, §63), who quoted this passage from Hume, before (in
effect) sketching (what we have called) Hume’s Principle.
You should note the structural similarity between Hume’s Principle and
Basic Law V. We formulated this in section 61.6 as follows:
ϵx F(x) = ϵx G(x) iff ∀x(F(x) ↔ G(x)).
Cardinal Arithmetic

a ⊕ b = |a ⊔ b|
a ⊗ b = |a × b|
a^b = |ᵇa| (where ᵇa is the set of all functions from b to a)
It might help to explain this definition. Concerning addition: this uses the
notion of disjoint sum, ⊔, as defined in Definition 66.1; and it is easy to see
that this definition gives the right verdict for finite cases. Concerning mul-
tiplication: Proposition 1.27 tells us that if A has n members and B has m
members then A × B has n · m members, so our definition simply generalises
the idea to transfinite multiplication. Exponentiation is similar: we are simply
generalising the thought from the finite to the transfinite. Indeed, in certain
ways, transfinite cardinal arithmetic looks much more like “ordinary” arith-
metic than does transfinite ordinal arithmetic:
This explains why we need to use different symbols for ordinal versus car-
dinal addition/multiplication: these are genuinely different operations. This
next pair of results shows that ordinal versus cardinal exponentiation are also
different operations. (Recall that Definition 62.7 entails that 2 = {0, 1}):
Lemma 68.4. ℘(A) ≈ ᴬ2 and |℘(A)| = 2^|A|, for any set A.

Proof. For each B ⊆ A, the characteristic function χB (with χB(x) = 1 if x ∈ B, and χB(x) = 0 otherwise) is a function from A to 2, i.e., a member of ᴬ2. Now let f(B) = χB; this defines a bijection f : ℘(A) → ᴬ2. So ℘(A) ≈ ᴬ2, and hence |℘(A)| = |ᴬ2| = 2^|A|.
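The bijection in this proof, and the finite cases of the cardinal operations defined at the start of the chapter, can be checked by brute force. A minimal Python sketch (an illustration only; the sample sets and the tuple representation of characteristic functions are just conveniences chosen here):

from itertools import chain, combinations, product

A = {"a", "b", "c"}

# The map B |-> chi_B from the proof above: each subset of A is matched with its
# characteristic function, represented here as a tuple of 0/1 values.
elements = sorted(A)
subsets = [frozenset(c) for c in chain.from_iterable(combinations(elements, r)
                                                     for r in range(len(A) + 1))]
char_functions = {B: tuple(1 if x in B else 0 for x in elements) for B in subsets}

print(len(set(char_functions.values())) == len(subsets))        # distinct subsets get distinct functions
print(len(subsets) == 2 ** len(A))                               # |P(A)| = 2^|A| = 8 -> True

# Finite cases of the cardinal operations:
B = {0, 1}
print(len(set(product(A, B))) == len(A) * len(B))                # |A x B| = 3 * 2
print(len(list(product(A, repeat=len(B)))) == len(A) ** len(B))  # functions from B to A: 3^2 = 9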
This snappy proof essentially subsumes the discussion of section 4.13. There,
we showed how to “reduce” the uncountability of ℘(ω ) to the uncountability
of the set of infinite binary strings, Bω . In effect, Bω is just ω 2; and the pre-
ceding proof showed that the reasoning we went through in section 4.13 will
go through using any set A in place of ω. The result also yields a quick fact
about cardinal exponentiation:
Proof skeleton. There are plenty of ways to prove this. The most straightfor-
ward is to argue that ℘(ω ) ⪯ R and R ⪯ ℘(ω ), and then use Schröder-
Bernstein to infer that R ≈ ℘(ω ), and Lemma 68.4 to infer that |R| = 2ω . We
leave it as an (illuminating) exercise to define injections f : ℘(ω ) → R and
g : R → ℘(ω ).
And now we will put all this to work, in proving a crucial lemma:
Proof. For reductio, let α be the least infinite ordinal for which this is false.
Proposition 4.12 shows that ω ≈ ω × ω, so ω ∈ α. Moreover, α is a cardi-
nal: suppose otherwise, for reductio; then |α| ∈ α, so that |α| ≈ |α| × |α|, by
hypothesis; and |α| ≈ α by definition; so that α ≈ α × α by Proposition 68.9.
Now, for each ⟨γ1 , γ2 ⟩ ∈ α × α, consider the segment:
For any infinite cardinals a and b:

a ⊗ b = a ⊕ b = max(a, b).
Proposition 68.13. a^(b⊕c) = a^b ⊗ a^c and (a^b)^c = a^(b⊗c), for any cardinals a, b, c.

Proof. For the first claim, consider a function f : (b ⊔ c) → a. Now "split" it, by defining f_b(β) = f(β, 0) for each β ∈ b, and f_c(γ) = f(γ, 1) for each γ ∈ c. The map f ↦ ⟨f_b, f_c⟩ is a bijection between the functions from b ⊔ c to a and the pairs consisting of a function from b to a and a function from c to a.

For the second claim, consider a function f : c → (ᵇa); so for each γ ∈ c we have some function f(γ) : b → a. Now define f*(β, γ) = (f(γ))(β) for each ⟨β, γ⟩ ∈ b × c. The map f ↦ f* is a bijection between the functions from c to (ᵇa) and the functions from b × c to a.
1 How are these “fixed”? See section 69.5.
Now, what we would like is an easy way to compute ab when we are deal-
ing with infinite cardinals. Here is a nice step in this direction:
Lemma. If 2 ≤ a ≤ b and b is infinite, then a^b = 2^b.

Proof.

2^b ≤ a^b, as 2 ≤ a
    ≤ (2^a)^b, by Lemma 68.4
    = 2^(a⊗b), by Proposition 68.13
    = 2^b, by Theorem 68.11
We should not really expect to be able to simplify this any further, since
b < 2b by Lemma 68.4. However, this does not tell us what to say about ab
when b < a. Of course, if b is finite, we know what to do.
Definition 68.17. Where a⊕ is the least cardinal strictly greater than a, we de-
fine two infinite sequences:
ℵ0 = ω                         ℶ0 = ω
ℵα+1 = (ℵα)⊕                   ℶα+1 = 2^ℶα
ℵα = ⋃_{β<α} ℵβ                ℶα = ⋃_{β<α} ℶβ        when α is a limit ordinal.
The definition of a⊕ is in order, since Theorem 67.14 tells us that, for each cardinal a, there is some cardinal greater than a, and Transfinite Induction guarantees that there is a least cardinal greater than a. The rest of the definition is provided by transfinite recursion.
Cantor introduced this “ℵ” notation; this is aleph, the first letter in the He-
brew alphabet and the first letter in the Hebrew word for “infinite”. Peirce
introduced the “ℶ” notation; this is beth, which is the second letter in the He-
brew alphabet.2 Now, these notations provide us with infinite cardinals.
κ0 = 0
κ_{n+1} = ℵ_{κ_n}
κ = ⋃_{n<ω} κ_n
Boolos once wrote an article about exactly the ℵ-fixed-point we just con-
structed. After noting the existence of κ, at the start of his article, he said:
If we have, indeed, outrun “anything that is the case”, then we must point the
finger of blame directly at Replacement. For it is this axiom which allows our
proof to work. In which case, one assumes, Boolos would need to revisit the
claim he made, a few decades earlier, that Replacement has “no undesirable”
consequences (see section 65.3).
But is the existence of κ so bad? It might help, here, to consider Russell’s
Tristram Shandy paradox. Tristram Shandy documents his life in his diary, but
it takes him a year to record a single day. With every passing year, Tristram
falls further and further behind: after one year, he has recorded only one day,
and has lived 364 days unrecorded days; after two years, he has only recorded
two days, and has lived 728 unrecorded days; after three years, he has only
recorded three days, and lived 1092 unrecorded days . . . 5 Still, if Tristram is
immortal, Tristram will manage to record every day, for he will record the nth
day on the nth year of his life. And so, “at the end of time”, Tristram will have
a complete diary.
Now: why is this so different from the thought that α is smaller than ℵα —
and indeed, increasingly, desperately smaller—up until κ, at which point, we
catch up, and κ = ℵκ ?
5 Forgetting about leap years.
Setting that aside, and assuming we accept ZFC, let’s close with a little
more fun concerning fixed-point constructions. The next three results estab-
lish, intuitively, that there is a (non-trivial) point at which the hierarchy is as
wide as it is tall:
Proposition 68.21. There is a ℶ-fixed-point, i.e., a κ such that κ = ℶκ .
Proof. The first claim holds by a simple transfinite induction. The second
claim follows, since if ω · ω ≤ α then ω + α = α. To establish this, we use facts
about ordinal arithmetic from chapter 66. First note that ω · ω = ω · (1 + ω ) =
(ω · 1) + (ω · ω ) = ω + (ω · ω ). Now if ω · ω ≤ α, i.e., α = (ω · ω ) + β for some
β, then ω + α = ω + ((ω · ω ) + β) = (ω + (ω · ω )) + β = (ω · ω ) + β = α.
W0 = 0
Wα+1 = τ(Wα)
Wα = ⋃_{β<α} Wβ, when α is a limit
The construction is defined for all ordinals. Intuitively, then, W is “an injec-
tion” from the ordinals to ℶ-fixed points. And, exactly as before, VWα is as
wide as it is tall, for any α.
Problems
Problem 68.1. Prove in Z− that ^X Y (the set of all functions from X to Y) exists for any sets X and Y. Working in ZF,
compute rank(^X Y) from rank(X) and rank(Y), in the manner of Lemma 66.9.
Problem 68.3. Complete the proof of Theorem 68.6, by showing that ℘(ω ) ⪯
R and R ⪯ ℘(ω ).
Choice
69.1 Introduction
In chapters 67 to 68, we developed a theory of cardinals by treating cardinals
as ordinals. That approach depends upon the Axiom of Well-Ordering. It
turns out that Well-Ordering is equivalent to another principle—the Axiom of
Choice—and there has been serious philosophical discussion of its acceptabil-
ity. Our questions for this chapter are: How is the Axiom used, and can it be
justified?
{ x : A ≈ x }.
You might want to compare this with Frege’s definition of #xFx, sketched at
the very end of section 67.5. And, for reasons we gestured at there, this defi-
nition fails. Any singleton set is equinumerous with {∅}. But new singleton
sets are formed at every successor stage of the hierarchy (just consider the sin-
gleton of the previous stage). So { x : A ≈ x } does not exist, since it cannot
have a rank.
To get around this problem, we use a trick due to Tarski and Scott:1
The definition of a TS-cardinal does not use Well-Ordering. But, even with-
out that Axiom, we can show that TS-cardinals behave rather like cardinals
as defined in Definition 67.1. For example, if we restate Lemma 67.3 and
Lemma 68.4 in terms of TS-cardinals, the proofs go through just fine in ZF,
without assuming Well-Ordering.
Whilst we are on the topic, it is worth noting that we can also develop
a theory of ordinals using the Tarski-Scott trick. Where ⟨ A, <⟩ is a well-
ordering, let tso(A, <) = [⟨X, R⟩ : ⟨A, <⟩ ≅ ⟨X, R⟩]. For more on this treat-
ment of cardinals and ordinals, see Potter (2004, chs. 9–12).
Lemma 69.3 (in ZF). For any set A, there is an ordinal α such that α ⪯̸ A
α = {ord( B, R) : ⟨ B, R⟩ ∈ C }.
B = ran( f )
R = {⟨ f (α), f ( β)⟩ ∈ A × A : α ∈ β}.
Proof. (1) ⇒ (2). Fix A and B. Invoking (1), there are well-orderings ⟨ A, R⟩
and ⟨ B, S⟩. Invoking Theorem 63.26, let f : α → ⟨ A, R⟩ and g : β → ⟨ B, S⟩ be
isomorphisms. By Proposition 63.22, either α ⊆ β or β ⊆ α. If α ⊆ β, then
g ◦ f −1 : A → B is an injection, and hence A ⪯ B; similarly, if β ⊆ α then
B ⪯ A.
(2) ⇒ (1). Fix A; by Lemma 69.3 there is some ordinal β such that β ⪯̸ A.
Invoking (2), we have A ⪯ β. So there is some injection f : A → β, and we
can use this injection to well-order the elements of A, by defining an order
{⟨ a, b⟩ ∈ A × A : f ( a) ∈ f (b)}.
g(0) = f(A)
g(α) = stop!            if A = g[α]
g(α) = f(A \ g[α])      otherwise
So Well-Ordering and Choice stand or fall together. But the question re-
mains: do they stand or fall?
But matters get murkier as soon as we consider infinite sets. For example,
consider this “minimal” extension to the above:
This is a special case of Choice. And it transpires that this principle was
invoked fairly frequently, without an obvious awareness of its use. Here are
two nice examples.2
Example 69.8. Here is a natural thought: for any set A, either ω ⪯ A, or
A ≈ n for some n ∈ ω. This is one way to state the intuitive idea, that every set
is either finite or infinite. Cantor, and many other mathematicians, made this
claim without proving it. Cautious as we are, we proved this in Theorem 67.7.
But in that proof we were working in ZFC, since we were assuming that any
set A can be well-ordered, and hence that | A| is guaranteed to exist. That is:
we explicitly assumed Choice.
In fact, Dedekind (1888) offered his own proof of this claim, as follows:
Theorem 69.9 (in Z− + Countable Choice). For any A, either ω ⪯ A or A ≈ n
for some n ∈ ω.
Proof. Suppose A ̸≈ n for all n ∈ ω. Then in particular for each n < ω there is
a subset An ⊆ A with exactly 2^n elements. Using this sequence A0, A1, A2, . . .,
we define for each n:
Bn = An \ ⋃_{i<n} Ai.
Now note the following:
|⋃_{i<n} Ai| ≤ |A0| + |A1| + . . . + |An−1|
            = 1 + 2 + . . . + 2^(n−1)
            = 2^n − 1
            < 2^n = |An|
Hence each Bn has at least one member, cn . Moreover, the Bn s are pairwise
disjoint; so if cn = cm then n = m. But every cn ∈ A. So the function f (n) = cn
is an injection ω → A.
2 Due to Potter (2004, §9.4) and Luca Incurvati.
Dedekind did not flag that he had used Countable Choice. But, did you spot
its use? Look again. (Really: look again.)
The proof used Countable Choice twice. We used it once, to obtain our
sequence of sets A0 , A1 , A2 , . . . We then used it again to select our elements
cn from each Bn . Moreover, this use of Choice is ineliminable. Cohen (1966,
p. 138) proved that the result fails if we have no version of Choice. That is: it
is consistent with ZF that there are sets which are incomparable with ω.
Example 69.10. In 1878, Cantor stated that a countable union of countable sets
is countable. He did not present a proof, perhaps indicating that he took the
proof to be obvious. Now, cautious as we are, we proved a more general
version of this result in Proposition 68.12. But our proof explicitly assumed
Choice. And even the proof of the less general result requires Countable
Choice.
Theorem 69.11 (in Z− + Countable Choice). If An is countable for each n ∈ ω,
then ⋃_{n<ω} An is countable.
Did you spot the use of Countable Choice? It is used to choose our se-
quence of functions f 0 , f 1 , f 2 , . . . 3 And again, the result fails in the absence of
any Choice principle. Specifically, Feferman and Levy (1963) proved that it is
consistent with ZF that a countable union of countable sets has cardinality ℶ1 .
But here is a much funnier statement of the point, from Russell:
This is illustrated by the millionaire who bought a pair of socks
whenever he bought a pair of boots, and never at any other time,
and who had such a passion for buying both that at last he had ℵ0
pairs of boots and ℵ0 pairs of socks. . . Among boots we can distin-
guish right and left, and therefore we can make a selection of one
out of each pair, namely, we can choose all the right boots or all the
left boots; but with socks no such principle of selection suggests it-
self, and we cannot be sure, unless we assume the multiplicative
axiom [i.e., in effect Choice], that there is any class consisting of
one sock out of each pair. (Russell, 1919, p. 126)
In short, some form of Choice is needed to prove the following: If you have
countably many pairs of socks, then you have (only) countably many socks.
And in fact, without Countable Choice (or something equivalent), a countable
union of countable sets can fail to be countable.
3 A similar use of Choice occurred in Proposition 68.12, when we gave the instruction “For
The moral is that Countable Choice was used repeatedly, without much
awareness on the part of its users. The philosophical question is: How could we justify
Countable Choice?
An attempt at an intuitive justification might invoke an appeal to a super-
task. Suppose we make the first choice in 1/2 a minute, our second choice in
1/4 a minute, . . . , our n-th choice in 1/2^n a minute, . . . Then within 1 minute,
we will have made an ω-sequence of choices, and defined a choice function.
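Just to check that this supertask really does finish within the minute: the waiting times form a geometric series, and
1/2 + 1/4 + 1/8 + . . . = ∑_{n≥1} 1/2^n = 1.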
But what, really, could such a thought-experiment tell us? For a start, it
relies upon taking this idea of “choosing” rather literally. For another, it seems
to bind up mathematics in metaphysical possibility.
More important: it is not going to give us any justification for Choice tout
court, rather than mere Countable Choice. For if we need every set to have a
choice function, then we’ll need to be able to perform a “supertask of arbitrary
ordinal length.” Bluntly, that idea is laughable.
Stages-accumulate. For any stage S, and for any sets which were formed
before stage S: a set is formed at stage S whose members are exactly those
sets. Nothing else is formed at stage S.
In fact, many authors have suggested that the Axiom of Choice can be justified
via (something like) this principle. We will briefly provide a gloss on that
approach.
We will start with a simple little result, which offers yet another equivalent
for Choice:
Theorem 69.12 (in ZF). Choice is equivalent to the following principle. If the el-
ements of A are disjoint and non-empty, then there is some C such that C ∩ x is a
singleton for every x ∈ A. (We call such a C a choice set for A.)
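For a toy illustration with made-up finite sets (just to fix ideas): if A = {{0}, {1, 2}, {3, 4, 5}}, then C = {0, 1, 3} is a choice set for A, since C ∩ x is a singleton for each x ∈ A.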
available before stage S. Now, by Stages-accumulate, for any sets which were
formed before S, a set is formed whose members are exactly those sets. Other-
wise put: every possible collection of earlier-available sets will exist at S. But
it is certainly possible to select objects which could be formed into a choice set
for A; that is just some very specific subset of ⋃A. So: some such choice set
exists, as required.
Well, that’s a very quick attempt to offer a justification of Choice on intrin-
sic grounds. But, to pursue this idea further, you should read Potter’s (2004,
§14.8) neat development of it.
Theorem 69.13 (Banach-Tarski Paradox (in ZFC)). Any ball can be decomposed
into finitely many pieces, which can be reassembled (by rotation and transportation)
to form two copies of that ball.
At first glance, this is a bit amazing. Clearly the two balls have twice the vol-
ume of the original ball. But rigid motions—rotation and transportation—do
not change volume. So it looks as if Banach-Tarski allows us to magick new
matter into existence.
It gets worse.4 Similar reasoning shows that a pea can be cut into finitely
many pieces, which can then be reassembled (by rotation and transportation)
to form an entity the shape and size of Big Ben.
None of this, however, holds in ZF on its own.5 So we face a decision:
reject Choice, or learn to live with the “paradox”.
We’re going to suggest that we should learn to live with the “paradox”.
Indeed, we don’t think it’s much of a paradox at all. In particular, we don’t
see why it is any more or less paradoxical than any of the following results:6
None of these three results require Choice. Indeed, we now just regard them
as surprising, lovely, bits of mathematics. Maybe we should adopt the same
attitude to the Banach-Tarski Paradox.
To be sure, a technical observation is required here; but it only requires
keeping a level head. Rigid motions preserve volume. Consequently, the five7
pieces into which the ball is decomposed cannot all be measurable. Roughly
put, then, it makes no sense to assign a volume to these individual pieces. You
should think of these as unpicturable, “infinite scatterings” of points. Now,
maybe it is “weird” to conceive of such “infinitely scattered” sets. But their
existence seems to fall out from the injunction, embodied in Stages-accumulate,
that you should form all possible collections of earlier-available sets.
If none of that convinces, here is a final (extrinsic) argument in favour of
embracing the Banach-Tarski Paradox. It immediately entails the best math
joke of all time:
Theorem 69.14 (Vitali’s Paradox (in ZFC)). Any circle can be decomposed into
countably many pieces, which can be reassembled (by rotation and transportation)
to form two copies of that circle.
that the decomposition can be achieved with five pieces (but no fewer). For a proof, see Tomkow-
icz and Wagon (2016, pp. 66–7).
8 For a much fuller treatment, see Weston (2003) or Tomkowicz and Wagon (2016).
Proof. Writing 0R for the rotation by 0 radians, this is an identity element for
R, since ρ ◦ 0R = 0R ◦ ρ = ρ for any ρ ∈ R.
Every element has an inverse. Where ρ ∈ R rotates by r radians, ρ−1 ∈ R
rotates by 2π − r radians, so that ρ ◦ ρ−1 = 0R .
Composition is associative: (τ ◦ σ) ◦ ρ = τ ◦ (σ ◦ ρ) for any ρ, σ, τ ∈ R.
Composition is commutative: σ ◦ ρ = ρ ◦ σ for any ρ, σ ∈ R.
In fact, we can split our group R in half, and then use either half to recover
the whole group:
Lemma 69.16. There is a partition of R into two disjoint sets, R1 and R2 , both of
which are a basis for R.
Proof. Let R1 consist of the rotations by rational radian values in [0, π ); let
R2 = R \ R1 . By elementary algebra, {ρ ◦ ρ : ρ ∈ R1 } = R. A similar result
can be obtained for R2 .
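Spelled out, the elementary algebra is just a matter of halving angles (a quick sketch): if σ ∈ R rotates by the rational value q ∈ [0, 2π), then the rotation ρ by q/2 lies in R1, since q/2 is rational and belongs to [0, π), and ρ ◦ ρ = σ.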
We will use this fact about groups to establish Theorem 69.14. Let S be
the unit circle, i.e., the set of points exactly 1 unit away from the origin of
the plane, i.e., {⟨r, s⟩ ∈ R² : √(r² + s²) = 1}. We will split S into parts by
considering the following relation on S:
That is, the points of S are linked by this relation iff you can get from one to
the other by a rational-valued rotation about the origin. Unsurprisingly:
E = {[r ]∼ : r ∈ S},
and let C = ran( f ). For each rotation ρ ∈ R, the set ρ[C ] consists of the points
obtained by applying the rotation ρ to each point in C. These next two results
show that these sets cover the circle completely and without overlap:
Lemma 69.18. S = ⋃_{ρ∈R} ρ[C].
Proof. Fix s ∈ S; there is some r ∈ C such that r ∈ [s]∼ , i.e., r ∼ s, i.e., ρ(r ) = s
for some ρ ∈ R.
Lemma 69.20. There is a partition of S into two disjoint sets, D1 and D2 , such that
D1 can be partitioned into countably many sets which can be rotated to form a copy
of S (and similarly for D2 ).
Proof. Using the partition of R from Lemma 69.16, let D1 = ⋃_{ρ∈R1} ρ[C] and D2 = ⋃_{ρ∈R2} ρ[C]. This is a partition of S, by Lemma 69.18, and D1 and D2 are disjoint by Lemma 69.19.
By construction, D1 can be partitioned into countably many sets, ρ[C ] for each
ρ ∈ R1. And these can be rotated to form a copy of S, since S = ⋃_{ρ∈R} ρ[C] =
⋃_{ρ∈R1} (ρ ◦ ρ)[C] by Lemma 69.16 and Lemma 69.18. The same reasoning ap-
plies to D2.
This immediately entails Vitali’s Paradox. For we can generate two copies
of S from S, just by splitting it up into countably many pieces (the various
ρ[C ]’s) and then rigidly moving them (simply rotate each piece of D1 , and
first transport and then rotate each piece of D2 ).
9 Since R is enumerable, each element of E is enumerable. Since S is non-enumerable, it
follows from Lemma 69.18 and Proposition 68.12 that E is non-enumerable. So this is a use of
uncountable Choice.
Let’s recap the proof-strategy. We started with some algebraic facts about
the group of rotations on the plane. We used this group to partition S into
equivalence classes. We then arrived at a “paradox”, by using Choice to select
elements from each class.
We use exactly the same strategy to prove Banach–Tarski. The main differ-
ence is that the algebraic facts used to prove Banach–Tarski are significantly
more complicated than those used to prove Vitali’s Paradox. But those alge-
braic facts have nothing to do with Choice. We will summarise them quickly.
To prove Banach–Tarski, we start by establishing an analogue of Lemma 69.16:
any free group can be split into four pieces, which intuitively we can “move
around” to recover two copies of the whole group.10 We then show that we
can use two particular rotations around the origin of R3 to generate a free
group of rotations, F.11 (No Choice yet.) We now regard points on the surface
of the sphere as “similar” iff one can be obtained from the other by a rotation
in F. We then use Choice to select exactly one point from each equivalence class
of “similar” points. Applying our division of F to the surface of the sphere, as
in Lemma 69.20, we split that surface into four pieces, which we can “move
around” to obtain two copies of the surface of the sphere. And this establishes
(Hausdorff, 1914):
Theorem 69.21 (Hausdorff’s Paradox (in ZFC)). The surface of any sphere can
be decomposed into finitely many pieces, which can be reassembled (by rotation and
transportation) to form two disjoint copies of that sphere.
A couple of further algebraic tricks are needed to obtain the full Banach-
Tarski Theorem (which concerns not just the sphere’s surface, but its interior
too). Frankly, however, this is just icing on the algebraic cake. Hence Weston
writes:
[. . . ] the result on free groups is the key step in the proof of the
Banach-Tarski paradox. From this point of view, the Banach-Tarski
paradox is not a statement about R3 so much as it is a statement
about the complexity of the group [of translations and rotations in
R3 ]. (Weston, 2003, p. 16)
Tomkowicz and Wagon (2016, Theorem 5.2). We follow Weston (2003, p. 3) in describing this
as “moving” the pieces of the group.
11 See Tomkowicz and Wagon (2016, Theorem 2.1).
Corollary 69.22 (Vitali). Let µ be a measure such that µ(S) = 1, and such that
µ( X ) = µ(Y ) if X and Y are congruent. Then ρ[C ] is unmeasurable for all ρ ∈ R.
1 = µ(S) = ∑_{ρ∈R} µ(ρ[C]) = ∑_{ρ∈R} r
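To spell out why the displayed equation is absurd (a sketch of the reasoning it relies on, writing r for the common value that congruence forces every µ(ρ[C]) to take): a countably infinite sum of one and the same real number r is
∑_{ρ∈R} r = 0 if r = 0, and ∑_{ρ∈R} r = ∞ if r > 0,
so in neither case can it equal 1.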
Problems
Problem 69.1. Prove Theorem 69.12. If you struggle, you can find a proof in
(Potter, 2004, pp. 242–3).
Methods
Proofs
70.1 Introduction
Based on your experiences in introductory logic, you might be comfortable
with a derivation system—probably a natural deduction or Fitch style deriva-
tion system, or perhaps a proof-tree system. You probably remember doing
proofs in these systems, either proving a formula or showing that a given argu-
ment is valid. In order to do this, you applied the rules of the system un-
til you got the desired end result. In reasoning about logic, we also prove
things, but in most cases we are not using a derivation system. In fact, most
of the proofs we consider are done in English (perhaps, with some symbolic
language thrown in) rather than entirely in the language of first-order logic.
When constructing such proofs, you might at first be at a loss—how do I prove
something without a derivation system? How do I start? How do I know if
my proof is correct?
Before attempting a proof, it’s important to know what a proof is and how
to construct one. As implied by the name, a proof is meant to show that some-
thing is true. You might think of this in terms of a dialogue—someone asks
you if something is true, say, if every prime other than two is an odd number.
To answer “yes” is not enough; they might want to know why. In this case,
you’d give them a proof.
In everyday discourse, it might be enough to gesture at an answer, or give
an incomplete answer. In logic and mathematics, however, we want rigorous
proof—we want to show that something is true beyond any doubt. This means
that every step in our proof must be justified, and the justification must be
cogent (i.e., the assumption you’re using is actually assumed in the statement
of the theorem you’re proving, the definitions you apply must be correctly
applied, the justifications appealed to must be correct inferences, etc.).
Usually, we’re proving some statement. We call the statements we’re prov-
ing by various names: propositions, theorems, lemmas, or corollaries. A
proposition is a basic proof-worthy statement: important enough to record,
but perhaps not particularly deep nor applied often. A theorem is a signifi-
cant, important proposition. Its proof often is broken into several steps, and
sometimes it is named after the person who first proved it (e.g., Cantor’s The-
orem, the Löwenheim-Skolem theorem) or after the fact it concerns (e.g., the
completeness theorem). A lemma is a proposition or theorem that is used
in the proof of a more important result. Confusingly, sometimes lemmas are
important results in themselves, and also named after the person who intro-
duced them (e.g., Zorn’s Lemma). A corollary is a result that easily follows
from another one.
A statement to be proved often contains assumptions that clarify which
kinds of things we’re proving something about. It might begin with “Let φ
be a formula of the form ψ → χ” or “Suppose Γ ⊢ φ” or something of the
sort. These are hypotheses of the proposition, theorem, or lemma, and you may
assume these to be true in your proof. They restrict what we’re proving, and
also introduce some names for the objects we’re talking about. For instance, if
your proposition begins with “Let φ be a formula of the form ψ → χ,” you’re
proving something about all formulas of a certain sort only (namely, condi-
tionals), and it’s understood that ψ → χ is an arbitrary conditional that your
proof will talk about.
In order to even start the proof, we need to know what it means for two sets
to be identical; i.e., we need to know what the “=” in that equation means for
sets. Sets are defined to be identical whenever they have the same elements.
So the definition we have to unpack is:
and the same set, even though we use different letters for it on the left and the right side. But the
ways in which that set is picked out may be different, and that makes the definition non-trivial.
Within the proof we are dealing with set-theoretic notions such as union,
and so we must also know the meanings of the symbol ∪ in order to under-
stand how the proof should proceed. And sometimes, unpacking the defini-
tion gives rise to further definitions to unpack. For instance, A ∪ B is defined
as {z : z ∈ A or z ∈ B}. So if you want to prove that x ∈ A ∪ B, unpacking
the definition of ∪ tells you that you have to prove x ∈ {z : z ∈ A or z ∈ B}.
Now you also have to remember that x ∈ {z : . . . z . . .} iff . . . x . . . . So, further
unpacking the definition of the {z : . . . z . . .} notation, what you have to show
is: x ∈ A or x ∈ B. So, “every element of A ∪ B is also an element of B ∪ A”
really means: “for every x, if x ∈ A or x ∈ B, then x ∈ B or x ∈ A.” If we fully
unpack the definitions in the proposition, we see that what we have to show
is this:
Proposition 70.3. For any sets A and B: (a) for every x, if x ∈ A or x ∈ B, then
x ∈ B or x ∈ A, and (b) for every x, if x ∈ B or x ∈ A, then x ∈ A or x ∈ B.
Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as conclusion one of
the conjuncts of a conjunction. In other words: if we have assumed or already
proved that p and q, then we’re entitled to infer that p (and also that q). This is
such a basic inference that it is often not mentioned. For instance, once we’ve
unpacked the definition of D = E we’ve established that every element of D is
an element of E and vice versa. From this we can conclude that every element
of E is an element of D (that’s the “vice versa” part).
Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a conjunc-
tion; you will be asked to “prove p and q.” In this case, you simply have
to do two things: prove p, and then prove q. You could divide your proof
into two sections, and for clarity, label them. When you’re making your first
notes, you might write “(1) Prove p” at the top of the page, and “(2) Prove q”
in the middle of the page. (Of course, you might not be explicitly asked to
prove a conjunction but find that your proof requires that you prove a con-
junction. For instance, if you’re asked to prove that D = E you will find that,
after unpacking the definition of =, you have to prove: every element of D is
an element of E and every element of E is an element of D).
Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it is a state-
ment of the form “p or q”), it is enough to show that one of the disjuncts is true.
However, it basically never happens that either disjunct just follows from the
assumptions of your theorem. More often, the assumptions of your theorem
are themselves disjunctive, or you’re showing that all things of a certain kind
have one of two properties, but some of the things have the one and others
have the other property. This is where proof by cases is useful (see below).
Conditional Proof
Many theorems you will encounter are in conditional form (i.e., show that if
p holds, then q is also true). These cases are nice and easy to set up—simply
assume the antecedent of the conditional (in this case, p) and prove the con-
clusion q from it. So if your theorem reads, “If p then q,” you start your proof
with “assume p” and at the end you should have proved q.
Conditionals may be stated in different ways. So instead of “If p then q,”
a theorem may state that “p only if q,” “q if p,” or “q, provided p.” These all
mean the same and require assuming p and proving q from that assumption.
Recall that a biconditional (“p if and only if (iff) q”) is really two conditionals
put together: if p then q, and if q then p. All you have to do, then, is two
instances of conditional proof: one for the first conditional and another one
for the second. Sometimes, however, it is possible to prove an “iff” statement
by chaining together a bunch of other “iff” statements so that you start with
“p” and end with “q”—but in that case you have to make sure that each step
really is an “iff.”
Universal Claims
Using a universal claim is simple: if something is true for anything, it’s true
for each particular thing. So if, say, the hypothesis of your proof is A ⊆ B, that
means (unpacking the definition of ⊆), that, for every x ∈ A, x ∈ B. Thus, if
you already know that z ∈ A, you can conclude z ∈ B.
Proving a universal claim may seem a little bit tricky. Usually these state-
ments take the following form: “If x has P, then it has Q” or “All Ps are Qs.”
Of course, it might not fit this form perfectly, and it takes a bit of practice to
figure out what you’re asked to prove exactly. But: we often have to prove
that all objects with some property have a certain other property.
The way to prove a universal claim is to introduce names or variables, for
the things that have the one property and then show that they also have the
other property. We might put this by saying that to prove something for all Ps
you have to prove it for an arbitrary P. And the name introduced is a name
for an arbitrary P. We typically use single letters as these names for arbitrary
things, and the letters usually follow conventions: e.g., we use n for natural
numbers, φ for formulas, A for sets, f for functions, etc.
The trick is to maintain generality throughout the proof. You start by as-
suming that an arbitrary object (“x”) has the property P, and show (based only
on definitions or what you are allowed to assume) that x has the property Q.
Because you have not stipulated what x is specifically, other than that it has the
property P, you can assert that every P has the property Q. In short,
x is a stand-in for all things with property P.
Proof by Cases
Suppose you have a disjunction as an assumption or as an already established
conclusion—you have assumed or proved that p or q is true. You want to
prove r. You do this in two steps: first you assume that p is true, and prove r,
then you assume that q is true and prove r again. This works because we
assume or know that one of the two alternatives holds. The two steps establish
that either one is sufficient for the truth of r. (If both are true, we have not one
but two reasons for why r is true. It is not necessary to separately prove that
r is true assuming both p and q.) To indicate what we’re doing, we announce
that we “distinguish cases.” For instance, suppose we know that x ∈ B ∪ C.
B ∪ C is defined as { x : x ∈ B or x ∈ C }. In other words, by definition, x ∈ B
or x ∈ C. We would prove that x ∈ A from this by first assuming that x ∈ B,
and proving x ∈ A from this assumption, and then assume x ∈ C, and again
prove x ∈ A from this. You would write “We distinguish cases” under the
assumption, then “Case (1): x ∈ B” underneath, and “Case (2): x ∈ C” halfway
down the page. Then you’d proceed to fill in the top half and the bottom half
of the page.
Proof by cases is especially useful if what you’re proving is itself disjunc-
tive. Here’s a simple example:
Since x ∈ A, A ̸= ∅.
have to go into all this detail when you write down your own proofs.
Let a ∈ A.
It’s maybe good practice to keep bound variables like “x” separate from
hypothetical names like a, like we did. In practice, however, we often don’t
and just use x, like so:
However, when you do this, you have to be extra careful that you use different
x’s and y’s for different existential claims. For instance, the following is not a
correct proof of “If A ̸= ∅ and B ̸= ∅ then A ∩ B ̸= ∅” (which is not true).
Can you spot where the incorrect step occurs and explain why the result does
not hold?
70.5 An Example
Our first example is the following simple fact about unions and intersections
of sets. It will illustrate unpacking definitions, proofs of conjunctions, of uni-
versal claims, and proof by cases.
This completes the first case of the proof by cases. Now we want
to derive the conclusion in the second case, where z ∈ B ∩ C.
Again, we are working with the intersection of two sets. Let’s ap-
ply the definition of ∩:
Now for the second case, z ∈ B. Here we’ll unpack the second ∪
and do another proof-by-cases:
Case 2: Suppose that z ∈ B. Since z ∈ A ∪ C, either z ∈ A or z ∈ C. We
distinguish cases further:
Case 2a: z ∈ A. Then, again, z ∈ A ∪ ( B ∩ C ).
Ok, this was a bit weird. We didn’t actually need the assumption
that z ∈ B for this case, but that’s ok.
Case 2b: z ∈ C. Then z ∈ B and z ∈ C, so z ∈ B ∩ C, and consequently,
z ∈ A ∪ ( B ∩ C ).
This concludes both proofs-by-cases and so we’re done with the
second half.
So, if z ∈ ( A ∪ B) ∩ ( A ∪ C ) then z ∈ A ∪ ( B ∩ C ).
Here we’ve used the fact recorded earlier which followed from the
hypothesis of the proposition that A ⊆ C. The first case is com-
plete, and we turn to the second case, z ∈ (C \ A). Recall that
C \ A denotes the difference of the two sets, i.e., the set of all ele-
ments of C which are not elements of A. But any element of C not
in A is in particular an element of C.
Great, we’ve proved the first direction. Now for the second direc-
tion. Here we prove that C ⊆ A ∪ (C \ A). So we assume that
z ∈ C and prove that z ∈ A ∪ (C \ A).
Either z ∈ A or z ∈
/ A. In the former case, z ∈ A ∪ (C \ A). In the latter case,
z ∈ C and z ∈
/ A, so z ∈ C \ A. But then z ∈ A ∪ (C \ A).
A has no elements iff it’s not the case that there is an x such that x ∈ A.
Since A ⊆ B, x ∈ B.
A ∩ ( A ∪ B) = A
This is the first half of the proof of the identity: it establishes that if an
arbitrary z is an element of the left side, it is also an element of the right, i.e.,
A ∩ ( A ∪ B) ⊆ A. Assume that z ∈ A ∩ ( A ∪ B). Since z is an element of
the intersection of two sets iff it is an element of both sets, we can conclude
that z ∈ A and also z ∈ A ∪ B. In particular, z ∈ A, which is what we
wanted to show. Since that’s all that has to be done for the first half, we know
that the rest of the proof must be a proof of the second half, i.e., a proof that
A ⊆ A ∩ ( A ∪ B ).
3. Ask for help. You have many resources available to you—your instructor
and teaching assistant are there for you and want you to succeed. They
should be able to help you work out a problem and identify where in
the process you’re struggling.
4. Take a break. If you’re stuck, it might be because you’ve been staring at the
problem for too long. Take a short break, have a cup of tea, or work on
a different problem for a while, then return to the problem with a fresh
mind. Sleep on it.
Notice how these strategies require that you’ve started to work on the
proof well in advance? If you’ve started the proof at 2am the day before it’s
due, these might not be so helpful.
This might sound like doom and gloom, but solving a proof is a challenge
that pays off in the end. Some people do this as a career—so there must be
something to enjoy about it. Like basically everything, solving problems and
doing proofs is something that requires practice. You might see classmates
who find this easy: they’ve probably just had lots of practice already. Try not
to give in too easily.
If you do run out of time (or patience) on a particular problem: that’s ok. It
doesn’t mean you’re stupid or that you will never get it. Find out (from your
instructor or another student) how it is done, and identify where you went
wrong or got stuck, so you can avoid doing that the next time you encounter
a similar issue. Then try to do it without looking at the solution. And next
time, start (and ask for help) earlier.
Motivational Videos
Feel like you have no motivation to do your homework? Feeling down? These
videos might help!
• https://www.youtube.com/watch?v=ZXsQAXx_ao0
• https://www.youtube.com/watch?v=BQ4yd2W50No
• https://www.youtube.com/watch?v=StTqXEQ2l-Y
Problems
Problem 70.1. Suppose you are asked to prove that A ∩ B ̸= ∅. Unpack all
the definitions occurring here, i.e., restate this in a way that does not mention
“∩”, “=”, or “∅”.
Induction
71.1 Introduction
Induction is an important proof technique which is used, in different forms,
in almost all areas of logic, theoretical computer science, and mathematics. It
is needed to prove many of the results in logic.
Induction is often contrasted with deduction, and characterized as the in-
ference from the particular to the general. For instance, if we observe many
green emeralds, and nothing that we would call an emerald that’s not green,
we might conclude that all emeralds are green. This is an inductive inference,
in that it proceeds from many particular cases (this emerald is green, that emer-
ald is green, etc.) to a general claim (all emeralds are green). Mathematical
induction is also an inference that concludes a general claim, but it is of a very
different kind than this “simple induction.”
Very roughly, an inductive proof in mathematics concludes that all math-
ematical objects of a certain sort have a certain property. In the simplest case,
the mathematical objects an inductive proof is concerned with are natural
numbers. In that case an inductive proof is used to establish that all natural
numbers have some property, and it does this by showing that
Induction on natural numbers can then also often be used to prove general
claims about mathematical objects that can be assigned numbers. For instance,
finite sets each have a finite number n of elements, and if we can use induction
to show that every number n has the property “all finite sets of size n are . . . ”
then we will have shown something about all finite sets.
Induction can also be generalized to mathematical objects that are induc-
tively defined. For instance, expressions of a formal language such as those of
first-order logic are defined inductively. Structural induction is a way to prove
71.2 Induction on N
In its simplest form, induction is a technique used to prove results for all nat-
ural numbers. It uses the fact that by starting from 0 and repeatedly adding 1
we eventually reach every natural number. So to prove that something is true
for every number, we can (1) establish that it is true for 0 and (2) show that
whenever it is true for a number n, it is also true for the next number n + 1. If
we abbreviate “number n has property P” by P(n) (and “number k has prop-
erty P” by P(k), etc.), then a proof by induction that P(n) for all n ∈ N consists
of:
1. a proof of P(0), and
Proof. Let P(n) be the claim: “It is possible to throw any number between n
and 6n using n dice.” To use induction, we prove:
1. The induction basis P(1), i.e., with just one die, you can throw any num-
ber between 1 and 6.
(1) is proved by inspecting a 6-sided die. It has 6 sides, and every num-
ber between 1 and 6 shows up on one of the sides. So it is possible to throw
any number between 1 and 6 using a single die.
To prove (2), we assume the antecedent of the conditional, i.e., P(k). This
assumption is called the inductive hypothesis. We use it to prove P(k + 1). The
hard part is to find a way of thinking about the possible values of a throw of
k + 1 dice in terms of the possible values of throws of k dice plus of throws of
the extra k + 1-st die—this is what we have to do, though, if we want to use
the inductive hypothesis.
The inductive hypothesis says we can get any number between k and 6k
using k dice. If we throw a 1 with our (k + 1)-st die, this adds 1 to the total.
So we can throw any value between k + 1 and 6k + 1 by throwing k dice and
then rolling a 1 with the (k + 1)-st die. What’s left? The values 6k + 2 through
6k + 6. We can get these by rolling k 6s and then a number between 2 and 6
with our (k + 1)-st die. Together, this means that with k + 1 dice we can throw
any of the numbers between k + 1 and 6(k + 1), i.e., we’ve proved P(k + 1)
using the assumption P(k), the inductive hypothesis.
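If you would like to see the claim in action for small numbers of dice, here is a brute-force check in Python (an illustration of ours, not part of the text; the proof above needs no computer):

from itertools import product

def totals(n):
    # All totals obtainable by throwing n six-sided dice.
    return {sum(roll) for roll in product(range(1, 7), repeat=n)}

# P(n): the achievable totals with n dice are exactly n, n+1, ..., 6n.
for n in range(1, 5):
    assert totals(n) == set(range(n, 6 * n + 1))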
s0 = 0
sn+1 = sn + (n + 1)
s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.
Proof. We have to prove (1) that s0 = 0 · (0 + 1)/2 and (2) if sk = k(k + 1)/2
then sk+1 = (k + 1)(k + 2)/2. (1) is obvious. To prove (2), we assume the
inductive hypothesis: sk = k (k + 1)/2. Using it, we have to show that sk+1 =
(k + 1)(k + 2)/2.
sk+1 = k(k + 1)/2 + (k + 1)
     = k(k + 1)/2 + 2(k + 1)/2
     = (k(k + 1) + 2(k + 1))/2
     = (k + 2)(k + 1)/2.
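As a quick computational sanity check of this closed form (again just an illustration, in the same spirit as the dice example above):

def s(n):
    # The sequence from the text: s_0 = 0 and s_{n+1} = s_n + (n + 1).
    total = 0
    for k in range(1, n + 1):
        total += k
    return total

# The closed form s_n = n(n + 1)/2 holds for these small cases.
assert all(s(n) == n * (n + 1) // 2 for n in range(100))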
The important lesson here is that if you’re proving something about some
inductively defined sequence an , induction is the obvious way to go. And
even if it isn’t (as in the case of the possibilities of dice throws), you can use
induction if you can somehow relate the case for k + 1 to the case for k.
This variant is useful if establishing the claim for k can’t be made to just
rely on the claim for k − 1 but may require the assumption that it is true for
one or more l < k.
Definition 71.3 (Nice terms). The set of nice terms is inductively defined as
follows:
This definition tells us that something counts as a nice term iff it can be
constructed according to the two conditions (1) and (2) in some finite number
of steps. In the first step, we construct all nice terms just consisting of letters
by themselves, i.e.,
a, b, c, d
In the second step, we apply (2) to the terms we’ve constructed. We’ll get
for all combinations of two letters. In the third step, we apply (2) again, to any
two nice terms we’ve constructed so far. We get new nice terms such as [a ◦ [a ◦
Proposition 71.4. For any n, the number of [ in a nice term of length n is < n/2.
Proof. To prove this result by (strong) induction, we have to show that the
following conditional claim is true:
If for every l < k, any nice term of length l has < l/2 [’s, then any
nice term of length k has < k/2 [’s.
To show this conditional, assume that its antecedent is true, i.e., assume that
for any l < k, nice terms of length l contain < l/2 [’s. We call this assumption
the inductive hypothesis. We want to show the same is true for nice terms of
length k.
So suppose t is a nice term of length k. Because nice terms are inductively
defined, we have two cases: (1) t is a letter by itself, or (2) t is [s1 ◦ s2 ] for some
nice terms s1 and s2 .
2. t is [s1 ◦ s2 ] for some nice terms s1 and s2 . Let’s let l1 be the length of s1
and l2 be the length of s2 . Then the length k of t is l1 + l2 + 3 (the lengths
of s1 and s2 plus three symbols [, ◦, ]). Since l1 + l2 + 3 is always greater
than l1 , l1 < k. Similarly, l2 < k. That means that the induction hypothe-
sis applies to the terms s1 and s2 : the number m1 of [ in s1 is < l1 /2, and
the number m2 of [ in s2 is < l2 /2.
The number of [ in t is m1 + m2 + 1, and
m1 + m2 + 1 < l1/2 + l2/2 + 1 = (l1 + l2 + 2)/2 < (l1 + l2 + 3)/2 = k/2.
In each case, we’ve shown that the number of [ in t is < k/2 (on the basis of
the inductive hypothesis). By strong induction, the proposition follows.
o(s1, s2) = [s1 ◦ s2]
You can even think of the natural numbers N themselves as being given by an
inductive definition: the initial object is 0, and the operation is the successor
function x + 1.
In order to prove something about all elements of an inductively defined
set, i.e., that every element of the set has a property P, we must:
1. Prove that every initial object has P, and
2. Prove that for each operation o, if the arguments have P, so does the
result.
For instance, in order to prove something about all nice terms, we would
prove that it is true about all letters, and that it is true about [s1 ◦ s2 ] provided
it is true of s1 and s2 individually.
Proposition 71.5. The number of [ equals the number of ] in any nice term t.
Proof. We use structural induction. Nice terms are inductively defined, with
letters as initial objects and the operation o for constructing new nice terms
out of old ones.
1. The claim is true for every letter, since the number of [ in a letter by itself
is 0 and the number of ] in it is also 0.
Proof. By induction on t:
1. t is a letter by itself: Then t has no proper initial segments.
This definition, for instance, will tell us that a ⊑ [b ◦ a]. For (2) says that
a ⊑ [b ◦ a] iff a = [b ◦ a], or a ⊑ b, or a ⊑ a. The first two are false: a
clearly isn’t identical to [b ◦ a], and by (1), a ⊑ b iff a = b, which is also false.
However, also by (1), a ⊑ a iff a = a, which is true.
It’s important to note that the success of this definition depends on a fact
that we haven’t proved yet: every nice term t is either a letter by itself, or there
are uniquely determined nice terms s1 and s2 such that t = [s1 ◦ s2 ]. “Uniquely
determined” here means that if t = [s1 ◦ s2 ] it isn’t also = [r1 ◦ r2 ] with s1 ̸= r1
or s2 ̸= r2 . If this were the case, then clause (2) may come in conflict with
itself: reading t2 as [s1 ◦ s2 ] we might get t1 ⊑ t2 , but if we read t2 as [r1 ◦ r2 ]
we might get not t1 ⊑ t2 . Before we prove that this can’t happen, let’s look at
an example where it can happen.
Definition 71.8. Define bracketless terms inductively by
1. Every letter is a bracketless term.
2. If s1 and s2 are bracketless terms, then s1 ◦ s2 is a bracketless term.
3. Nothing else is a bracketless term.
s1 = b and s2 = a ◦ b.
r1 = b ◦ a and r2 = b.
We can also define functions inductively: e.g., we can define the function f
that maps any nice term to the maximum depth of nested [. . . ] in it as follows:
Definition 71.10. The depth of a nice term, f (t), is defined inductively as fol-
lows:
f(t) = 0                            if t is a letter
f(t) = max(f(s1), f(s2)) + 1        if t = [s1 ◦ s2].
For instance,
f([a ◦ b]) = max(f(a), f(b)) + 1 = max(0, 0) + 1 = 1, and
f([[a ◦ b] ◦ c]) = max(f([a ◦ b]), f(c)) + 1 = max(1, 0) + 1 = 2.
Here, of course, we assume that s1 and s2 are nice terms, and make use
of the fact that every nice term is either a letter or of the form [s1 ◦ s2 ]. It
is again important that it can be of this form in only one way. To see why,
consider again the bracketless terms we defined earlier. The corresponding
“definition” would be:
g(t) = 0                            if t is a letter
g(t) = max(g(s1), g(s2)) + 1        if t = s1 ◦ s2.
Now consider the bracketless term a ◦ b ◦ c ◦ d. It can be read in more than
one way, e.g., as s1 ◦ s2 with
s1 = a and s2 = b ◦ c ◦ d,
or as r1 ◦ r2 with
r1 = a ◦ b and r2 = c ◦ d.
Calculating g according to the first way of reading it would give
g(s1 ◦ s2) = max(g(a), g(b ◦ c ◦ d)) + 1 = max(0, 2) + 1 = 3,
while calculating it according to the second way of reading would give
g(r1 ◦ r2) = max(g(a ◦ b), g(c ◦ d)) + 1 = max(1, 1) + 1 = 2.
So g is not well-defined.
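For readers who like to experiment, here is a small Python sketch of the depth function from Definition 71.10 (an illustration of ours: a letter is represented as a one-character string, and [s1 ◦ s2] as a pair; note that the pair representation builds in the unique readability which the bracketless terms lack):

def depth(t):
    # A nice term is either a letter ("a", "b", ...) or a pair (s1, s2)
    # standing for [s1 ∘ s2].
    if isinstance(t, str):
        return 0
    s1, s2 = t
    return max(depth(s1), depth(s2)) + 1

print(depth(("a", "b")))          # 1, matching f([a ∘ b]) above
print(depth((("a", "b"), "c")))   # 2, matching f([[a ∘ b] ∘ c]) above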
Problems
Problem 71.1. Define the set of supernice terms by
Problem 71.2. Prove by structural induction that no nice term starts with ].
Problem 71.3. Give an inductive definition of the function l, where l (t) is the
number of symbols in the nice term t.
Problem 71.4. Prove by structural induction on nice terms t that f (t) < l (t)
(where l (t) is the number of symbols in t and f (t) is the depth of t as defined
in Definition 71.10).
History
Chapter 72
Biographies
Further Reading For full biographies of Cantor, see Dauben (1990) and Grattan-
Guinness (1971). Cantor’s radical views are also described in the BBC Radio 4
program A Brief History of Mathematics (du Sautoy, 2014). If you’d like to hear
about Cantor’s theories in rap form, see Rose (2012).
Further Reading For a brief biography of Church, see Enderton (2019). Church’s
original writings on the lambda calculus and the Entscheidungsproblem (Church’s
Thesis) are Church (1936a,b). Aspray (1984) records an interview with Church
about the Princeton mathematics community in the 1930s. Church wrote a se-
ries of book reviews of the Journal of Symbolic Logic from 1936 until 1979. They
are all archived on John MacFarlane’s website (MacFarlane, 2015).
Study in Princeton, New Jersey. Despite his introversion and eccentric nature,
Gödel’s time at Princeton was collaborative and fruitful. He published essays
in set theory, philosophy and physics. Notably, he struck up a particularly
strong friendship with his colleague at the IAS, Albert Einstein.
In his later years, Gödel’s mental health deteriorated. His wife’s hospi-
talization in 1977 meant she was no longer able to cook his meals for him.
Having suffered from mental health issues throughout his life, he succumbed
to paranoia. Deathly afraid of being poisoned, Gödel refused to eat. He died
of starvation on January 14, 1978, in Princeton.
Further Reading For a biography of Noether, see Dick (1981). The Perime-
ter Institute for Theoretical Physics has their lectures on Noether’s life and
influence available online (Institute, 2015). If you’re tired of reading, Stuff You
Missed in History Class has a podcast on Noether’s life and influence (Frey and
Wilson, 2015). The collected works of Noether are available in the original
German (Jacobson, 1983).
Rózsa Péter was born Rósza Politzer, in Budapest, Hungary, on February 17,
1905. She is best known for her work on recursive functions, which was es-
sential for the creation of the field of recursion theory.
Péter was raised during harsh political times—WWI raged when she was a
teenager—but was able to attend the affluent Maria Terezia Girls’ School in
Budapest, from where she graduated in 1922. She then studied at Pázmány
Péter University (later renamed Loránd Eötvös University) in Budapest. She
began studying chemistry at the insistence of her father, but later switched to
mathematics, and graduated in 1927.

Figure 72.6: Rózsa Péter

Although she had the credentials to teach high school mathematics, the
economic situation at the time was dire as the Great Depression affected the
world economy. During this time, Péter took odd jobs as a tutor and private
teacher of mathematics. She eventually returned to university to take up graduate
studies in mathematics. She had originally planned to work in number the-
ory, but after finding out that her results had already been proven, she almost
gave up on mathematics altogether. She was encouraged to work on Gödel’s
incompleteness theorems, and unknowingly proved several of his results in
different ways. This restored her confidence, and Péter went on to write her
first papers on recursion theory, inspired by David Hilbert’s foundational pro-
gram. She received her PhD in 1935, and in 1937 she became an editor for the
Journal of Symbolic Logic.
Péter’s early papers are widely credited as founding contributions to the
field of recursive function theory. In Péter (1935a), she investigated the rela-
tionship between different kinds of recursion. In Péter (1935b), she showed
that a certain recursively defined function is not primitive recursive. This
simplified an earlier result due to Wilhelm Ackermann. Péter’s simplified
function is what’s now often called the Ackermann function—and sometimes,
more properly, the Ackermann–Péter function. She wrote the first book on re-
cursive function theory (Péter, 1951).
Despite the importance and influence of her work, Péter did not obtain a
full-time teaching position until 1945. During the Nazi occupation of Hungary
during World War II, Péter was not allowed to teach due to anti-Semitic laws.
In 1944 the government created a Jewish ghetto in Budapest; the ghetto was
cut off from the rest of the city and attended by armed guards. Péter was
forced to live in the ghetto until 1945 when it was liberated. She then went on
to teach at the Budapest Teachers Training College, and from 1955 onward at
Eötvös Loránd University. She was the first female Hungarian mathematician
to become an Academic Doctor of Mathematics, and the first woman to be
elected to the Hungarian Academy of Sciences.
Péter was known as a passionate teacher of mathematics, who preferred
to explore the nature and beauty of mathematical problems with her students
rather than to merely lecture. As a result, she was affectionately called “Aunt
Rosa” by her students. Péter died in 1977 at the age of 71.
Further Reading For more biographical reading, see (O’Connor and Robert-
son, 2014) and (Andrásfai, 1986). Tamassy (1994) conducted a brief interview
with Péter. For a fun read about mathematics, see Péter’s book Playing With
Infinity (Péter, 2010).
there was substantial scar tissue build up on her heart due to the rheumatic
fever she suffered as a child. Due to the severity of the scar tissue, the doctor
predicted that she would not live past forty and she was advised not to have
children (Reid, 1986, 13).
Robinson was depressed for a long time, but eventually decided to con-
tinue studying mathematics. She returned to Berkeley and completed her PhD
in 1948 under the supervision of Alfred Tarski. The first-order theory of the
real numbers had been shown to be decidable by Tarski, and from Gödel’s
work it followed that the first-order theory of the natural numbers is unde-
cidable. It was a major open problem whether the first-order theory of the
rationals is decidable or not. In her thesis (1949), Robinson proved that it was
not.
Interested in decision problems, Robinson next attempted to find a solu-
tion to Hilbert’s tenth problem. This problem was one of a famous list of
23 mathematical problems posed by David Hilbert in 1900. The tenth prob-
lem asks whether there is an algorithm that will answer, in a finite amount of
time, whether or not a polynomial equation with integer coefficients, such as
3x² − 2y + 3 = 0, has a solution in the integers. Such questions are known as
Diophantine problems. After some initial successes, Robinson joined forces with
Martin Davis and Hilary Putnam, who were also working on the problem.
They succeeded in showing that exponential Diophantine problems (where
the unknowns may also appear as exponents) are undecidable, and showed
that a certain conjecture (later called “J.R.”) implies that Hilbert’s tenth prob-
lem is undecidable (Davis et al., 1961). Robinson continued to work on the
problem throughout the 1960s. In 1970, the young Russian mathematician
Yuri Matijasevich finally proved the J.R. hypothesis. The combined result
is now called the Matijasevich–Robinson–Davis–Putnam theorem, or MRDP
theorem for short. Matijasevich and Robinson became friends and collabo-
rated on several papers. In a letter to Matijasevich, Robinson once wrote that
“actually I am very pleased that working together (thousands of miles apart)
we are obviously making more progress than either one of us could alone”
(Matijasevich, 1992, 45).
Robinson was the first female president of the American Mathematical So-
ciety, and the first woman to be elected to the National Academy of Science.
She died on July 30, 1985 at the age of 65 after being diagnosed with leukemia.
with Robinson, and her influence on his work, see (Matijasevich, 1992).
Tarski could not return. His wife and children remained in Poland until the
end of the war, but were then able to emigrate to the United States as well.
Tarski taught at Harvard, the College of the City of New York, and the Insti-
tute for Advanced Study at Princeton, and finally the University of California,
Berkeley. There he founded the multidisciplinary program in Logic and the
Methodology of Science. Tarski died on October 26, 1983 at the age of 82.
Further Reading For more on Tarski’s life, see the biography Alfred Tarski:
Life and Logic (Feferman and Feferman, 2004). Tarski’s seminal works on logi-
cal consequence and truth are available in English in (Corcoran, 1983). All of
Tarski’s original works have been collected into a four volume series, (Tarski,
1981).
the team the ability to crack the code by creating a de-crypting machine called
a “bombe.” His ideas also helped in the creation of the world’s first pro-
grammable electronic computer, the Colossus, also used at Bletchley park to
break the German Lorenz cypher.
Turing was gay. Nevertheless, in 1942 he proposed to Joan Clarke, one
of his teammates at Bletchley Park, but later broke off the engagement and
confessed to her that he was homosexual. He had several lovers throughout
his lifetime, although homosexual acts were then criminal offences in the UK.
In 1952, Turing’s house was burgled by a friend of his lover at the time, and
when filing a police report, Turing admitted to having a homosexual relation-
ship, under the impression that the government was on their way to legalizing
homosexual acts. This was not true, and he was charged with gross indecency.
Instead of going to prison, Turing opted for a hormone treatment that reduced
libido. Turing was found dead on June 8, 1954, of a cyanide overdose—most
likely suicide. He was given a royal pardon by Queen Elizabeth II in 2013.
This chapter includes the historical prelude from Tim Button’s Open
Set Theory text.
[Figure: graph of a function f(x), plotted for x between 1 and 4.]
971
CHAPTER 73. HISTORY AND MYTHOLOGY OF SET THEORY
(f(1/2 + β) − f(1/2))/β.
So the gradient of our red triangle, with base length 3, is exactly 1. The hy-
potenuse of a smaller triangle, the blue triangle with base length 2, gives a
better approximation; its gradient is 3/4. A yet smaller triangle, the green tri-
angle with base length 1, gives a yet better approximation; with gradient 1/2.
Ever-smaller triangles give us ever-better approximations. So we might
say something like this: the hypotenuse of a triangle with an infinitesimal base
length gives us the gradient at c = 1/2 itself. In this way, we would obtain a
formula for the (first) derivative of the function f at the point c:
f ′(c) = (f(c + β) − f(c))/β, where β is infinitesimal.
I admit that signs may be made to denote either any thing or noth-
ing: and consequently that in the original notation c + β, β might
have signified either an increment or nothing. But then which of
these soever you make it signify, you must argue consistently with
such its signification, and not proceed upon a double meaning:
Which to do were a manifest sophism. (Berkeley 1734, §XIII, vari-
ables changed to match preceding text)
To defend the infinitesimal calculus against Berkeley, one might reply that the
talk of “infinitesimals” is merely figurative. One might say that, so long as
we take a really small triangle, we will get a good enough approximation to the
tangent. Berkeley had a reply to this too: whilst that might be good enough
for engineering, it undermines the status of mathematics, for
we are told that in rebus mathematicis errores quàm minimi non sunt
contemnendi. [In the case of mathematics, the smallest errors are
not to be neglected.] (Berkeley, 1734, §IX)
lim_{x→c} g(x) = ℓ.
In the 19th century, building upon earlier work by Cauchy, Weierstrass offered
a perfectly rigorous definition of this expression. The idea is indeed that we
can make g( x ) as close as we like to ℓ, by making x suitably close to c. More
precisely, we stipulate that lim_{x→c} g(x) = ℓ will mean: for any ε > 0, there is some δ > 0 such that, whenever 0 < |x − c| < δ, we have |g(x) − ℓ| < ε.
The vertical bars here indicate absolute magnitude. That is, | x | = x when
x ≥ 0, and | x | = − x when x < 0; you can depict that function as follows:
|x|
x
−2 −1 1 2
So the definition says roughly this: you can make your “error” less than ε (i.e.,
| g( x ) − ℓ| < ε) by choosing arguments which are no more than δ away from c
(i.e., | x − c| < δ).
f ′(c) = lim_{x→0} (f(c + x) − f(c))/x, where a limit exists.
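To see the definition at work on a standard example of our own choosing, take f(x) = x². Then
f ′(c) = lim_{x→0} ((c + x)² − c²)/x = lim_{x→0} (2c + x) = 2c.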
It is important, though, to realise why our definition needs the caveat “where
a limit exists”. To take a simple example, consider f ( x ) = | x |, whose graph we
just saw. Evidently, f ′ (0) is ill-defined: if we approach 0 “from the right”, the
gradient is always 1; if we approach 0 “from the left”, the gradient is always
−1; so the limit is undefined. As such, we might add that a function f is
differentiable at x iff such a limit exists.
We have seen how to handle differentiation using the notion of a limit. We
can use the same notion to define the idea of a continuous function. (Bolzano
had, in effect, realised this by 1817.) The Cauchy–Weierstrass treatment of
continuity is as follows. Roughly: a function f is continuous (at a point) pro-
vided that, if you demand a certain amount of precision concerning the output
of the function, you can guarantee this by insisting upon a certain amount of
precision concerning the input of the function. More precisely: f is continu-
ous at c provided that, as x tends to zero, the difference between f (c + x ) and
f(c) itself tends to 0. Otherwise put: f is continuous at c iff f(c) = \lim_{x → c} f(x).
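For instance, the function f(x) = |x| from above is continuous at 0: as x tends to 0, |x| tends to 0 = f(0), so f(0) = \lim_{x → 0} f(x). Since that same function is not differentiable at 0, continuity is strictly weaker than differentiability.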
To go any further would just lead us off into real analysis, when our subject
matter is set theory. So now we should pause, and state the moral. During
the 19th century, mathematicians learnt how to do without infinitesimals, by
invoking a rigorously defined notion of a limit.
73.3 Pathologies
However, the definition of a limit turned out to allow for some rather “patho-
logical” constructions.
In the 1830s, Bolzano discovered a function which was continuous everywhere, but differentiable nowhere. (Unfortunately, Bolzano never published this; mathematicians first encountered the idea only in 1872, thanks to Weierstrass’s independent discovery of the same phenomenon.)1 This was, to say the
least, rather surprising. It is easy to find functions, such as | x |, which are con-
tinuous everywhere but not differentiable at a particular point. But a function
which is continuous everywhere but differentiable nowhere is a very different
beast. Consider, for a moment, how you might try to draw such a function.
To ensure it is continuous, you must be able to draw it without ever removing
your pen from the page; but to ensure it is differentiable nowhere, you would
have to abruptly change the direction of your pen, constantly.
1 The history is documented in extremely thorough footnotes to the Wikipedia article on the
Weierstrass function.
But, in 1877, Cantor proved that he had been wrong. In fact, a line and a
square have exactly the same number of points. He wrote on 29 June 1877 to
Dedekind “je le vois, mais je ne le crois pas”; that is, “I see it, but I don’t believe
it”. In the “received history” of mathematics, this is often taken to indicate
just how literally incredible these new results were to the mathematicians of the
time. (The correspondence is presented in Gouvêa (2011), and we return to it
in section 73.4. Cantor’s proof is outlined in section 73.5.)
Inspired by Cantor’s result, Peano started to consider whether it might be
possible to map a line smoothly onto a plane. This would be a curve which
fills space. In 1890, Peano constructed just such a curve. This is truly counter-
intuitive: Euclid had defined a line as “breadthless length” (Book I, Definition
2), but Peano had shown that, by curling up a line appropriately, its length can
be turned into breadth. In 1891, Hilbert described a slightly more intuitive
space-filling curve, together with some pictures illustrating it. The curve is
constructed in sequence, and here are the first six stages of the construction:
[Figure: the first six stages of the construction of Hilbert’s space-filling curve.]
Cantor knew his result was “so unexpected, so new”. But it is doubtful that
he ever found his result unbelievable. As Gouvêa points out, he was simply
asking Dedekind to check the proof he had offered.
On the question of geometric intuition: Peano published his space-filling
curve without including any diagrams. But when Hilbert published his curve,
he explained his purpose: he would provide readers with a clear way to un-
derstand Peano’s result, if they “help themselves to the following geometric
intuition”; whereupon he included a series of diagrams just like those provided
in section 73.3.
More generally: whilst diagrams have fallen rather out of fashion in pub-
lished proofs, there is no getting round the fact that mathematicians frequently
use diagrams when proving things. (Roughly put: good mathematicians know
when they can rely upon geometric intuition.)
In short: don’t believe the hype; or at least, don’t just take it on trust. For
more on this, you could read Giaquinto (2007).
Theorem 73.1. L ≈ S
Proof: first part. Fix a, b ∈ L. Write them in binary notation, so that we have infinite sequences of 0s and 1s, a_1, a_2, . . . , and b_1, b_2, . . . , such that:
a = 0.a_1 a_2 a_3 a_4 . . .
b = 0.b_1 b_2 b_3 b_4 . . .
Now interleave these two sequences, setting:
f(a, b) = 0.a_1 b_1 a_2 b_2 a_3 b_3 a_4 b_4 . . .
a = 0.1̇1̇ = 0.111111 . . .
b=0
Proof: completed. So, we have shown that S ⪯ L. But there is obviously an in-
jection from L to S: just lay the line flat along one side of the square. So L ⪯ S
and S ⪯ L. By Schröder–Bernstein (Theorem 4.25), L ≈ S.
But of course, Cantor could not complete the last line in these terms, for
the Schröder-Bernstein Theorem was not yet proved. Indeed, although Cantor
would subsequently formulate this as a general conjecture, it was not satisfac-
torily proved until 1897. (And so, later in 1877, Cantor offered a different
proof of Theorem 73.1, which did not go via Schröder–Bernstein.)
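To make the interleaving map from the first part of the proof concrete (an illustrative computation, not part of Cantor’s correspondence): if a = 0.1100 . . . and b = 0.1010 . . . , then
f(a, b) = 0.a_1 b_1 a_2 b_2 a_3 b_3 a_4 b_4 . . . = 0.11100100 . . .
Conversely, the odd-numbered digits of f(a, b) recover a and the even-numbered digits recover b; so, once we fix a binary expansion for each number, distinct pairs (a, b) are sent to distinct points of L.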
2 For a more rigorous explanation, see Rose (2010). The tweak amounts to the inclusion of the
red parts of the curves below. This makes it slightly easier to check that the curve is continuous.
The different colours will help explain how h_2 was constructed. We first place scaled-down copies of the non-red bit of h_1 into the bottom left, top left, top right, and bottom right of our square (drawn in black). We then connect these four figures (with green lines). Finally, we connect our figure to the boundary of the square (with red lines).
Now to h_3. Just as h_2 was made from four connected, scaled-down copies of the non-red bit of h_1, so h_3 is made up of four scaled-down copies of the non-red bit of h_2 (drawn in black), which are then joined together (with green lines) and finally connected to the boundary of the square (with red lines).
And now we see the general pattern for defining h_{n+1} from h_n. At last we define the curve h itself by considering the point-by-point limit of these successive functions h_1, h_2, . . . That is, for each x ∈ L:
h(x) = \lim_{n → ∞} h_n(x)
We now show that this curve fills space. When we draw the curve h_n, we impose a 2^n × 2^n grid onto S. By Pythagoras’s Theorem, the diagonal of each grid-location is of length:
\sqrt{(1/2^n)^2 + (1/2^n)^2} = 2^{1/2 − n}
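The grid picture can also be made computational. The sketch below is not the map h_n defined above (in particular, it ignores the red connecting segments and works with grid cells rather than a continuous parametrisation); it is the standard discrete Hilbert-curve ordering, written here in Python with the conventional, merely assumed, name d2xy. It visits every cell of a 2^n × 2^n grid exactly once, moving only between adjacent cells, which is why every point of S lies within one cell-diagonal, 2^{1/2 − n}, of the n-th stage of the curve.

def d2xy(order, d):
    # Map the index d (with 0 <= d < 4**order) along the order-n Hilbert
    # curve to the (x, y) coordinates of the corresponding grid cell.
    x = y = 0
    t = d
    s = 1
    side = 2 ** order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the four sub-curves join up correctly.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

order = 3
side = 2 ** order
cells = [d2xy(order, d) for d in range(side * side)]
# Every cell of the 2^n x 2^n grid is visited exactly once ...
assert len(set(cells)) == side * side
# ... and consecutive cells are adjacent, so the path is connected.
assert all(abs(x1 - x2) + abs(y1 - y2) == 1
           for (x1, y1), (x2, y2) in zip(cells, cells[1:]))

As n grows, the cell-diagonal 2^{1/2 − n} shrinks to 0, so the successive stages come arbitrarily close to every point of the square.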
This is fairly intuitive: a curve is, intuitively, a “smooth” map which takes a canonical line onto the plane R^2. Our function, h, is indeed a map from L onto S.
Chapter 74
The Greek Alphabet
Alpha α A Nu ν N
Beta β B Xi ξ Ξ
Gamma γ Γ Omicron o O
Delta δ ∆ Pi π Π
Epsilon ε E Rho ρ P
Zeta ζ Z Sigma σ Σ
Eta η H Tau τ T
Theta θ Θ Upsilon υ Υ
Iota ι I Phi φ Φ
Kappa κ K Chi χ X
Lambda λ Λ Psi ψ Ψ
Mu µ M Omega ω Ω
Chapter 75
A A a a N N n n
B B b b O O o o
C C c c P P p p
D D d d Q Q q q
E E e e R R r r
F F f f S S s s
G G g g T T t t
H H h h U U u u
I I i i V V v v
J J j j W W w w
K K k k X X x x
L L l l Y Y y y
M M m m Z Z z z
Photo Credits
Georg Cantor, p. 957: Portrait of Georg Cantor by Otto Zeth courtesy of the
Universitätsarchiv, Martin-Luther Universität Halle–Wittenberg. UAHW Rep. 40-
VI, Nr. 3 Bild 102.
Alonzo Church, p. 958: Portrait of Alonzo Church, undated, photogra-
pher unknown. Alonzo Church Papers; 1924–1995, (C0948) Box 60, Folder 3.
Manuscripts Division, Department of Rare Books and Special Collections, Prince-
ton University Library. © Princeton University. The Open Logic Project has
Bibliography
Andrásfai, Béla. 1986. Rózsa (Rosa) Péter. Periodica Polytechnica Electrical En-
gineering 30(2-3): 139–145. URL http://www.pp.bme.hu/ee/article/
view/4651.
Banach, Stefan and Alfred Tarski. 1924. Sur la décomposition des ensembles
de points en parties respectivement congruentes. Fundamenta Mathematicae
6: 244–77.
Benacerraf, Paul. 1965. What numbers could not be. The Philosophical Review
74(1): 47–73.
Berkeley, George. 1734. The Analyst; or, a Discourse Addressed to an Infidel Mathe-
matician.
Boolos, George. 1971. The iterative conception of set. The Journal of Philosophy
68(8): 215–31.
Boolos, George. 2000. Must we believe in set theory? In Between Logic and Intu-
ition: Essays in Honor of Charles Parsons, eds. Gila Sher and Richard Tieszen,
257–68. Cambridge: Cambridge University Press.
Burali-Forti, Cesare. 1897. Una questione sui numeri transfiniti. Rendiconti del
Circolo Matematico di Palermo 11: 154–64.
Button, Tim. forthcoming. Level theory, part 1: Axiomatizing the bare idea of
a cumulative hierarchy of sets. Bulletin of Symbolic Logic.
Cantor, Georg. 1878. Ein Beitrag zur Mannigfaltigkeitslehre. Journal für die
reine und angewandte Mathematik 84: 242–58.
Cohen, Paul J. 1966. Set Theory and the Continuum Hypothesis. Reading, MA:
Benjamin.
Conway, John. 2006. The power of mathematics. In Power, eds. Alan Blackwell
and David MacKay, Darwin College Lectures. Cambridge: Cambridge Uni-
versity Press. URL http://www.cs.toronto.edu/˜mackay/conway.
pdf.
Csicsery, George. 2016. Zala films: Julia Robinson and Hilbert’s tenth problem.
URL http://www.zalafilms.com/films/juliarobinson.html.
Dauben, Joseph. 1990. Georg Cantor: His Mathematics and Philosophy of the Infi-
nite. Princeton: Princeton University Press.
Davis, Martin, Hilary Putnam, and Julia Robinson. 1961. The decision prob-
lem for exponential Diophantine equations. Annals of Mathematics 74(3):
425–436. URL http://www.jstor.org/stable/1970289.
Dedekind, Richard. 1888. Was sind und was sollen die Zahlen? Braunschweig:
Vieweg.
Duncan, Arlene. 2015. The Bertrand Russell Research Centre. URL http:
//russell.mcmaster.ca/.
Enderton, Herbert B. 2019. Alonzo Church: Life and Work. In The Collected
Works of Alonzo Church, eds. Tyler Burge and Herbert B. Enderton. Cam-
bridge, MA: MIT Press.
Feferman, Anita and Solomon Feferman. 2004. Alfred Tarski: Life and Logic.
Cambridge: Cambridge University Press.
Feferman, Solomon and Azriel Levy. 1963. Independence results in set theory
by Cohen’s method II. Notices of the American Mathematical Society 10: 593.
Fraenkel, Abraham. 1922. Über den Begriff ‘definit’ und die Unabhängigkeit
des Auswahlaxioms. Sitzungsberichte der Preussischen Akademie der Wis-
senschaften, Physikalisch-mathematische Klasse 253–257.
Frege, Gottlob. 1884. Die Grundlagen der Arithmetik: Eine logisch mathematische
Untersuchung über den Begriff der Zahl. Breslau: Wilhelm Koebner. Transla-
tion in Frege (1953).
Frey, Holly and Tracy V. Wilson. 2015. Stuff you missed in history class:
Emmy Noether, mathematics trailblazer. URL https://www.iheart.
com/podcast/stuff-you-missed-in-history-cl-21124503/
episode/emmy-noether-mathematics-trailblazer-30207491/.
Podcast audio.
Gentzen, Gerhard. 1935a. Untersuchungen über das logische Schließen I.
Mathematische Zeitschrift 39: 176–210. English translation in Szabo (1969),
pp. 68–131.
Gentzen, Gerhard. 1935b. Untersuchungen über das logische Schließen II.
Mathematische Zeitschrift 39: 176–210, 405–431. English translation in Szabo
(1969), pp. 68–131.
Giaquinto, Marcus. 2007. Visual Thinking in Mathematics. Oxford: Oxford Uni-
versity Press.
Gödel, Kurt. 1929. Über die Vollständigkeit des Logikkalküls [On the com-
pleteness of the calculus of logic]. Dissertation, Universität Wien. Reprinted
and translated in Feferman et al. (1986), pp. 60–101.
Gödel, Kurt. 1931. Über formal unentscheidbare Sätze der Principia Mathe-
matica und verwandter Systeme I [On formally undecidable propositions
of Principia Mathematica and related systems I]. Monatshefte für Mathematik
und Physik 38: 173–198. Reprinted and translated in Feferman et al. (1986),
pp. 144–195.
Gödel, Kurt. 1938. The consistency of the axiom of choice and the generalized
continuum hypothesis. Proceedings of the National Academy of Sciences of the
United States of America 24: 556–57.
Gouvêa, Fernando Q. 2011. Was Cantor surprised? American Mathematical
Monthly 118(3): 198–209.
Grattan-Guinness, Ivor. 1971. Towards a biography of Georg Cantor. Annals
of Science 27(4): 345–391.
Hammack, Richard. 2013. Book of Proof. Richmond, VA: Virginia Common-
wealth University. URL http://www.people.vcu.edu/˜rhammack/
BookOfProof/BookOfProof.pdf.
Hartogs, Friedrich. 1915. Über das Problem der Wohlordnung. Mathematische
Annalen 76: 438–43.
Hausdorff, Felix. 1914. Bemerkung über den Inhalt von Punktmengen. Math-
ematische Annalen 75: 428–34.
Heijenoort, Jean van. 1967. From Frege to Gödel: A Source Book in Mathematical
Logic, 1879–1931. Cambridge, MA: Harvard University Press.
Hilbert, David. 1891. Über die stetige Abbildung einer Linie auf ein
Flächenstück. Mathematische Annalen 38(3): 459–460.
Perimeter Institute. 2015. Emmy Noether: Her life, work, and influence. URL
https://www.youtube.com/watch?v=tNNyAyMRsgE. Video Lecture.
Dawson, John W., Jr. 1997. Logical Dilemmas: The Life and Work of Kurt Gödel. Boca
Raton: CRC Press.
Katz, Karin Usadi and Mikhail G. Katz. 2012. Stevin numbers and reality.
Foundations of Science 17(2): 109–23.
Lévy, Azriel. 1960. Axiom schemata of strong infinity in axiomatic set theory.
Pacific Journal of Mathematics 10(1): 223–38.
Maddy, Penelope. 1988a. Believing the axioms I. Journal of Symbolic Logic 53(2):
481–511.
Maddy, Penelope. 1988b. Believing the axioms II. Journal of Symbolic Logic
53(3): 736–64.
Menzler-Trott, Eckart. 2007. Logic’s Lost Genius: The Life of Gerhard Gentzen.
Providence: American Mathematical Society.
Montague, Richard. 1965. Set theory and higher-order logic. In Formal systems
and recursive functions, eds. John Crossley and Michael Dummett, 131–48.
Amsterdam: North-Holland. Proceedings of the Eighth Logic Colloquium,
July 1963.
Peano, Giuseppe. 1890. Sur une courbe, qui remplit toute une aire plane. Math-
ematische Annalen 36(1): 157–60.
Péter, Rózsa. 1935a. Über den Zusammenhang der verschiedenen Begriffe der
rekursiven Funktion. Mathematische Annalen 110: 612–632.
Péter, Rózsa. 2010. Playing with Infinity. New York: Dover. URL
https://books.google.ca/books?id=6V3wNs4uv_4C&lpg=PP1&
ots=BkQZaHcR99&lr&pg=PP1#v=onepage&q&f=false.
Potter, Michael. 2004. Set Theory and its Philosophy. Oxford: Oxford University
Press.
Reid, Constance. 1986. The autobiography of Julia Robinson. The College Math-
ematics Journal 17: 3–21.
Robinson, Julia. 1996. The Collected Works of Julia Robinson. Providence: Amer-
ican Mathematical Society.
Scott, Dana. 1974. Axiomatizing set theory. In Axiomatic Set Theory II, ed.
Thomas Jech, 207–14. American Mathematical Society. Proceedings of the
Symposium in Pure Mathematics of the American Mathematical Society,
July–August 1967.
Solow, Daniel. 2013. How to Read and Do Proofs. Hoboken, NJ: Wiley.
Steinhart, Eric. 2018. More Precisely: The Math You Need to Do Philosophy. Pe-
terborough, ON: Broadview, 2nd ed.
Sykes, Christopher. 1992. BBC Horizon: The strange life and death of Dr. Tur-
ing. URL https://www.youtube.com/watch?v=gyusnGbBSHE.
Takeuti, Gaisi, Nicholas Passell, and Mariko Yasugi. 2003. Memoirs of a Proof
Theorist: Gödel and Other Logicians. Singapore: World Scientific.
Tamassy, Istvan. 1994. Interview with Róza Péter. Modern Logic 4(3): 277–280.
Tarski, Alfred. 1981. The Collected Works of Alfred Tarski, vol. I–IV. Basel:
Birkhäuser.
Tomkowicz, Grzegorz and Stan Wagon. 2016. The Banach-Tarski Paradox. Cam-
bridge: Cambridge University Press.
Vitali, Giuseppe. 1905. Sul problema della misura dei gruppi di punti di una retta.
Bologna: Gamberini e Parmeggiani.
Zermelo, Ernst. 1904. Beweis, daß jede Menge wohlgeordnet werden kann.
Mathematische Annalen 59: 514–516. English translation in (Ebbinghaus
et al., 2010, pp. 115–119).