NaturalProof
Evan Chen《陳誼廷》
30 November 2024
Contents
1 The tl;dr
1.1 Roadmap
4 How formal proofs are done in theory but not in real life
4.1 Solution: specify axioms you want to be true
4.2 First-order logic
4.3 Real life
Evan Chen《陳誼廷》 — 30 November 2024 Intro to Proofs for the Morbidly Curious
§1 The tl;dr
For your purposes, a mathematical proof is an informal but convincing natural-language
explanation for why a statement is true.
If you are learning how to write proofs, I want to emphasize you could stop reading
here and, in the hands of a competent mentor, you would probably be okay. But I realize
the sentence above is evasive: what exactly does “convincing” mean?
So I’ll tell you exactly what “convincing” means, in full gory detail that could leave
you sorry that you asked. However, I want to emphasize that this verbosity is mostly to
satisfy your curiosity. It shouldn’t change how you approach mathematics.
§1.1 Roadmap
The thesis of this article is captured in the following “definition”:
Definition 1.1. A mathematical proof is an informal natural-language explanation
that in principle could be compiled into a formal machine-verifiable proof.
In Section 2 we provide some analogies that try to explain what “compiled”
means in this big picture. Then in Section 3 we describe how “informal natural-language
explanation” is done in practice. Finally, in Section 4 we talk a bit about what “formal
machine-verifiable proof” means, just as a curiosity. At the end, Section 5 gives some
pragmatic advice for actually learning how to write proofs.
What happens after you write this code? Well, after that, the compiler will translate the
program into machine code. And this machine code is incomprehensible to humans. For
example, an intermediate step in the compilation process might start with:
.file "hello.cpp"
.text
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.section .rodata
.LC0:
.string "Hello World!"
.text
.globl main
.type main, @function
main:
.LFB1761:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rax
movq %rax, %rsi
leaq _ZSt4cout(%rip), %rax
movq %rax, %rdi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@PLT
Holy cow! And that’s just one of the intermediate steps. The final output is going to be
some bytes run by machine.
It’s good that programmers are told this is vaguely what’s happening underneath the
hood, but for practical day-to-day work, they don’t need to actually understand how the
machine code works. Programmers never read the generated machine code — and they
don’t need to. Instead, these programmers “think” in C++, and then simply trust
the compiler will do the work of producing the machine code. These programmers
neither know nor care how that compiler works. That tough work is deferred to people
who write compilers.
Proofs in math operate on a similar principle, except the compilers are imaginary
humans that understand English. That is: you provide a natural-language explanation
of why a statement is true. This explanation should, in principle, be able to be compiled
into a “formal machine-verifiable proof” – the analog of machine code. We describe in
Section 4 what this is (spoiler: it’s not fun to read). But for a working student, the exact
details of the compilation process are unimportant; it only matters that it could be done
in theory.
and so on, ending once one of the kings is checkmated (or draw, resignation, etc.). And
there is a list of rules that govern what moves are legal — rooks may move only along a
row/column, bishops along diagonals, and so on. The list of legal moves is not up for
debate; it’s just a social convention.
However, when people talk about strategy for chess, they don’t speak primarily in
algebraic notation, or spend a lot of time thinking about whether moves are legal. Coaches
for chess speak in much broader strokes, saying things like “try to control the center”
and “develop pieces”, and the chess community gives names to common openings like
“Morphy defence” to capture entire common sequences of moves into a single unit. In
other words, people can naturally operate at a layer of abstraction above individual
moves.
The advantage of this analogy over the programming analogy is that it gives a better
idea of how vague notions like “control the center” still make sense for humans. The
disadvantage of the analogy is that algebraic notation does turn out to be pretty understandable
by humans, whereas formal proofs (like machine code) are incomprehensible.
• What’s the definition of an “altitude”? We define the altitude to be the line through
a vertex perpendicular to the opposite side.
• What’s the definition of “perpendicular”? The two lines should meet at a 90° angle.
At some point, you have to stop somewhere. And there is a way to stop somewhere,
which we’ll write about more in Section 4.
However, for day-to-day work, these foundations are out-of-scope. Instead, what
people usually do is accept some concepts as “obvious”, and don’t give a definition for
them, instead just trusting intuition. So, for example, probably no one ever gave a perfect
definition of an “angle”, even though we’re happy to say that a 10° angle and 40° angle
should add up to a 50° angle.
What counts as “obvious” for you is essentially a social convention, because the
“compiler” is an imaginary human and not a real one. If you’re in high school, for
example, the statement 2 + 2 = 4 is OK to take for granted, even though probably no
one ever defined what plus, equals, 2, or 4 meant for you.
As a less obvious example, consider the statement
0.99999… = 1.
This statement is true, and it comes as a surprise to a lot of people. But the reason
they’re confused is that nobody ever told them a proper definition of a repeating
decimal (or “real number”, for that matter). In fact, the definition of 0.99999… means
that it’s equal to the infinite sum
$$\frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \cdots$$
and the definition of an infinite sum is “the limit of the partial sums” (and the definition
of “limit” is the one given in any analysis textbook). If you agree with all the definitions,
you can check that the limit indeed equals 1.
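For the curious, here is a sketch of that check, assuming the standard definitions mentioned above. The name $s_n$ for the $n$-th partial sum is my own label, introduced just for this computation.

```latex
% s_n denotes the n-th partial sum (a label introduced for this sketch).
\[
  s_n \;=\; \sum_{k=1}^{n} \frac{9}{10^k}
      \;=\; 9 \cdot \frac{\frac{1}{10}\left(1 - 10^{-n}\right)}{1 - \frac{1}{10}}
      \;=\; 1 - 10^{-n}.
\]
% The gap 10^{-n} shrinks below any positive tolerance as n grows, so
\[
  \lim_{n \to \infty} s_n \;=\; \lim_{n \to \infty} \left(1 - 10^{-n}\right) \;=\; 1.
\]
```

The middle step is just the finite geometric series formula with ratio 1/10; nothing deeper is going on once the definitions are accepted.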
That shows that definitions themselves are really a social convention, too. In mathematics,
once the definitions are chosen, the conclusions that follow from them are no longer
up for debate. If you want to insist that 0.99999… ≠ 1, then trying to debate the proof
of this result is futile. The proof is correct. Instead, you need to propose a different
definition of the left-hand side and then argue that your definitions are (subjectively)
better than the standard definitions.
§3.2 Proving things after you have a few things taken for granted
As we said, in practice what we actually do is accept some objects as “undefined but
obvious”, and then build new definitions from there.
Let’s give an example. In a number theory classroom, the instructor decrees that the
set of integers ℤ, and the operations +, −, ×, will be taken for granted, and are not up
for debate. If you accept this, then you could make new definitions like:
Definition 3.2. An integer n is even if n = 2 · k for some integer k.
From this, you can deduce basic facts like:
Fact 3.3. The product of two even integers is even.
Again, enemies of the concept that 0 is an even number (and there are many) will
make no progress arguing about the proof, and instead need to propose a new definition
of “even” and argue why their definition (subjectively) makes more sense.
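As a sketch of how short these deductions are once Definition 3.2 is accepted, here is one way they might be written out. The letters $m$, $n$, $a$, $b$ are names chosen here for illustration.

```latex
% Fact 3.3: the product of two even integers is even.
% Suppose m and n are even, so by Definition 3.2 there are integers a, b with
\[
  m = 2a, \qquad n = 2b .
\]
% Then, using only the operations on integers that were taken for granted,
\[
  mn \;=\; (2a)(2b) \;=\; 2 \cdot (2ab),
\]
% and 2ab is an integer, so mn is even by Definition 3.2.

% The number 0 is even, because Definition 3.2 is satisfied with k = 0:
\[
  0 \;=\; 2 \cdot 0 .
\]
```

Both arguments only unpack the definition and use arithmetic in the integers, which is why disputing the proofs goes nowhere: the only thing left to dispute is the definition.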
§4 How formal proofs are done in theory but not in real life
So what happens at the foundations? For example, earlier I mentioned that the statement
2 + 2 = 4 is not subject to proof in a normal high-school classroom, because the definitions
of 2, 4, +, = were never given. I’m sure that leaves you wondering just what those
definitions are.
As we mentioned, the problem you face when you drill all the way down is that
definitions usually depend on earlier definitions, and the buck has to stop somewhere. So
what’s the plan?
1
For concreteness, the solution to the problem that I had written looks like this:
Replace 1010 by $N$ in the obvious way and proceed by induction on $N \ge 0$ with the base case $N = 0$
being vacuous. Notice that for any index $k \ne 0, 3N-1, 3N$ we have
$$a_k = 2a_{k+1} - 4a_{k-1} = 2(2a_{k+2} - 4a_k) - 4a_{k-1} = 4(a_{k+2} - 2a_k - a_{k-1})$$
so it follows that $(a_1, a_2, \dots, a_{3N-2})$ are all divisible by 4 and satisfy the same relations. But then
$(a_1/4, a_2/4, \dots, a_{3N-2}/4)$ has length $3(N-1)+1$ and so one of them is divisible by $4^{N-1}$; hence some
term of our original sequence is divisible by $4^N$.
• logical symbols like ⇐⇒ (for “if and only if”), ¬ (for “not”), parentheses, and so
on;
• an equality symbol =;
• variable names,
• ...
That means, for example, the “empty set” axiom earlier could be written as the sentence
∃A∀x¬(x ∈ A).
Mathematical machine code! And like before, logicians then specify a “wishlist” of
properties they want these sentences to satisfy. Again, they don’t define what the
symbols mean; instead, they specify some rules about how sentences can be transformed.
For example, the “reflexive property” a = a is taken as an axiom, as is the “symmetric
property” (a = b) ⇐⇒ (b = a), and so on.
This lets you give a truly airtight definition of proof, in the sense that it could be
verified by a computer:
Definition 4.1. A formal proof is a sequence of sentences, each of which is either an
axiom or follows from preceding sentences by a specified transformation rule.
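To make Definition 4.1 a bit more concrete, here is a minimal sketch of a checker in Python. It is a toy under loud assumptions: sentences are plain strings or tuples of the form ("->", p, q) meaning "p implies q", the axiom list is made up for illustration, and the only transformation rule is modus ponens (from p and "p implies q", conclude q). Real first-order logic has more rules than this, and the function names are hypothetical.

```python
# Toy checker for Definition 4.1 (an illustration, not a real logic system).
# A sentence is either a string (an atom) or a tuple ("->", p, q).
# ASSUMPTION: the only transformation rule is modus ponens.

def follows_by_modus_ponens(sentence, earlier):
    """True if some earlier line p and the earlier line ("->", p, sentence) both appear."""
    return any(("->", p, sentence) in earlier for p in earlier)


def is_formal_proof(lines, axioms):
    """Check that every line is an axiom or follows from preceding lines."""
    earlier = []
    for sentence in lines:
        if sentence not in axioms and not follows_by_modus_ponens(sentence, earlier):
            return False  # this line is unjustified, so the whole proof is rejected
        earlier.append(sentence)
    return True


# Made-up axioms for the example: A, "A implies B", "B implies C".
axioms = {"A", ("->", "A", "B"), ("->", "B", "C")}

good = ["A", ("->", "A", "B"), "B", ("->", "B", "C"), "C"]
bad = ["A", "C"]  # "C" is neither an axiom nor derivable from "A" alone

print(is_formal_proof(good, axioms))  # True
print(is_formal_proof(bad, axioms))   # False
```

Notice that the checker never asks what "A" or "->" mean. It only matches symbols against the rules, which is the sense in which such a proof could be verified by a computer.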
2
Actually, this is a white lie. There is some effort to make machine-verification of proofs a reality; systems
like Coq and Lean were built to try to make this process at least plausible in some areas of math. And
they’re making steady progress!
3
Available at https://store.doverpublications.com/0486453065.html.
4
Also, I better make a comment on style. Since you’re writing in natural language, there are cosmetic ways
you can make your proof easier to read, such as being clear with definitions, adding paragraph breaks,
and isolating clear claims, and so on.
Good mentors will make notes of this to help you, and you should listen, but I also want to say that
you should try to keep a distinction in your head between stylistic comments on how to be clearer about
what you mean, versus real errors in logic and math.
§5.4 Is there a way to verify entire proofs without having to check every
individual step?
Again, there will be no surefire way to do this rapidly. However, like in the last section,
there are some things you can use in practice to quickly sanity-check your work. Passing
all the pre-tests isn’t a guarantee there’s no mistake, but these pre-tests seem to be pretty
robust in practice and can give you a bit more well-placed confidence.
(While I was writing this, I realized that a lot of these actually appear in Scott
Aaronson’s blog post Ten Signs a Claimed Mathematical Breakthrough is Wrong. That
blog post is about verifying research papers, but some of the advice still applies.)
• Look for counterexamples to statements you doubt (we said this already). I
can’t emphasize this enough.
5
General heuristic: if you’re doing anything related to probability or infinite sums, and you don’t know
any university-level math, then you probably don’t understand the definitions.
• In fact look for nearby statements which are false too. In philosophy, we call
this “proving too much”. In other words, you should ask yourself the question: “if
this approach did work, would it also imply a similar result that’s false?”.
For example, in a recent episode of Twitch Solves ISL, there was a set $A$ of positive
integers which we needed to prove had a property $P$. We had defined a certain
notion of “density”, and we knew that $d(A) \approx \frac{1}{13}$. So one audience member tried to
suggest an argument using the fact that $d(A) < \frac{1}{10}$ to deduce $P$. We immediately
replied by giving a counterexample of a set $B$ not satisfying $P$ where $d(B) = \frac{1}{11}$.
Even though $B$ was unrelated to the problem at hand, it showed that $d(A) < \frac{1}{10}$
could not by itself solve the problem. A grader who sees a student trying to only
use $d(A) < \frac{1}{10}$ already knows the proof is wrong without reading it.
• For longer and more difficult problems, try to isolate key claims and their proofs,
when possible. If you look at any solutions published by me, you will often see
that there are big claims marked visibly in green boxes, and self-contained proofs
of these claims.
Why is this useful for checking your work? There are two reasons.
– First, when you identify what the key claims are, you can hold them up to
scrutiny by, e.g. searching for counterexamples to them.
– Secondly, small components are easier to check than their union. The work of
verifying a proof grows exponentially with the number of moving parts, so any
time you can “divide and conquer”, it’s almost always worth doing.
• Write your solutions out, do not just keep them in your head. I don’t trust
myself to have solved a problem completely anymore until I have the solution in
writing. I cannot tell you how many times I only found a mistake once I started
actually typing up what I thought was a complete solution.
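When a doubted statement is about small integers, the search for counterexamples in the first bullet can sometimes be handed to a computer. Here is a minimal sketch in Python. The doubted claim (that n² + n + 41 is prime for every n ≥ 0) is an example I picked for illustration, and the helper names are hypothetical.

```python
# Brute-force counterexample search (a sketch; only useful for small cases).

def is_prime(n):
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_counterexample(claim, candidates):
    """Return the first candidate for which claim(candidate) is False, else None."""
    for x in candidates:
        if not claim(x):
            return x
    return None


# ASSUMED example claim: n^2 + n + 41 is prime for every n >= 0.
def doubted(n):
    return is_prime(n * n + n + 41)


print(first_counterexample(doubted, range(100)))  # 40, since 40^2 + 40 + 41 = 41^2
```

Finding nothing in such a search proves nothing, of course, which matches the caveat above that passing the pre-tests isn’t a guarantee. But a single hit settles the matter immediately.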
It’s like learning the “Sicilian defense” in chess. You’re playing the same game of chess,
the rules of chess didn’t change. The Sicilian defense is just a pattern that you can use
which is so common that it earned its own name.
https://web.evanchen.cc/handouts/english/english.pdf