(Lecture Notes) Alexandru Buium - Introduction To Mathematical Thinking (2019) PDF
Alexandru Buium
Department of Mathematics and Statistics, University of New Mexico,
Albuquerque, NM 87131, USA
E-mail address: [email protected]
This is a (drastically) simplified version of my book:
Mathematics: a minimal introduction, CRC Press, 2013.
For a more complete treatment of the topics one may refer to the book. However
one should be aware of the fact that there are some key conceptual differences
between the present text and the book, especially when it comes to the material
that pertains to logic (e.g., to witnesses, quantifier axioms, and the structure of
theories).
Contents
Part 1. Logic
Chapter 1. Languages
Chapter 2. Metalanguage
Chapter 3. Syntax
Chapter 4. Tautologies
Chapter 5. Proofs
Chapter 6. Theories
Chapter 7. ZFC
Chapter 8. Sets
Chapter 9. Maps
Logic
CHAPTER 1
Languages
1) “Socrates is a man”
2) “Caesar killed Brutus”
3) “The killer of Caesar is Brutus”
4) “Brutus killed Caesar and Socrates is a man”
5) “Brutus is not a man or Caesar is a killer”
6) “If Brutus killed Caesar then Brutus is a killer”
7) “Brutus did not kill Caesar”
8) “A man killed Caesar”
9) “If a man killed another man then the first man is a killer”
10) “A man is a killer if and only if that man killed another man”
Furthermore let us introduce a rule (called translation) that attaches to each symbol
in Formal a symbol in English as follows:
Then the English sentences 1-10 are translations of the following Formal sentences;
equivalently the following Formal sentences are translations (called formalizations)
of the corresponding English sentences:
1’) “m(S)”
2’) “C † B”
3’) “↓ (C) = B”
4’) “(B † C) ∧ m(S)”
5’) “(¬(m(B))) ∨ k(C)”
6’) “(B † C) → (k(B))”
7’) “¬(B † C)”
8’) “∃x(x † C)”
9’) “∀x((m(x) ∧ (∃y(m(y) ∧ ¬(x = y) ∧ (x † y))) → k(x))”
10’) “∀x(k(x) ↔ (m(x) ∧ (∃y(m(y) ∧ ¬(x = y) ∧ (x † y))))”
Descartes, Leibniz, Gödel; cf. Aquinas and Kant for criticism). See 6.12 for more
on this. The sentence 4 is a version of the “cosmological argument” (Aquinas).
Remark 1.17. (Declarative/imperative/interrogative sentences) All sentences
considered so far were declarative (they declare their content). Natural languages
have other types of sentences: imperative (giving a command like: “Lift this
weight!”) and interrogative (asking a question such as: “Is the electron in this
portion of space-time?”). In principle, from now on, we will only consider declarative
sentences in our languages. An exception to this is the language called Argot;
see below.
Example 1.18. For a language L (such as Formal) we may introduce a new
language called Argotic L (or simply Argot), denoted sometimes by L_Argot. Most
mathematics books, for instance, are written in such a language. The language
L_Argot has as symbols all the symbols of English together with all the symbols of
the language L, to which one adds one more category of symbols,
• commands: “consider,” “assume,” “let...be,” “let us...,” etc.
Examples of sentences in Argot are
1) “Since (s ∈ w) → (ρ(s)) it follows that ρ(t)”
2) “Let c be such that ρ(c).”
We will not insist on explaining the syntax of Argot, which is rather different from
that of both English and Formal. It suffices to note that the symbols in Formal do
not appear between quotation marks inside sentences of Argot; loosely speaking
the sentences in Argot often appear as obtained from sentences in Metalanguage
via disquotation.
CHAPTER 2
Metalanguage
CHAPTER 3
Syntax
2) If t, s, ... are terms and f is a functional symbol then f(t, s, ...) is a term.
Remark 3.3. Functional symbols may be unary f(t), binary f(t, s), ternary
f(t, s, u), etc. When we write f(t, s) we simply mean a string of 5 symbols; there
is no “substitution” involved here. Substitution will play a role later, though; cf.
3.12.
Example 3.4. If a, b, ... are constants, x, y, ... are variables, f is a unary functional
symbol, and g is a binary functional symbol, all of them in L, then
f(g(f(b), g(x, g(x, y))))
is a term. The latter should be simply viewed, again, as a string of symbols.
For the next metaaxiom we introduce the predicate is a formula into metalan-
guage.
Metaaxiom 3.5.
1) If t, s are terms then t = s is a formula.
2) If t, s, ... are terms and ρ is a predicate then ρ(t, s, ...) is a formula.
3) If Q, Q′ are formulae then Q ∧ Q′, Q ∨ Q′, ¬Q, Q → Q′, Q ↔ Q′, ∃xQ, ∀xQ
are formulae.
Formulae of the form 1) or 2) above are called atomic formulae.
Recall our convention that if we have a different number of symbols (written
differently) we make similar metadefinitions for them; in particular some of the
symbols may be missing altogether. For instance if equality is missing from L we
ignore 1).
We introduce a predicate “is in L^f” into metalanguage and we introduce the
metaaxiom: P is a formula if and only if P is in L^f.
Remark 3.6. Predicates can be unary ρ(t), binary ρ(t, s), ternary ρ(t, s, u),
etc. Again, ρ(t, s) simply means a string of 5 symbols ρ, (, t, s, ) and nothing else.
Sometimes one uses another syntax for predicates: instead of ρ(t, s) one writes tρs
or ρts; instead of ρ(t, s, u) one may write ρtsu, etc. All of this is in the language L.
On the other hand if some variables x, y, ... appear in a formula P we sometimes
write in metalanguage P (x, y, ...) instead of P . In particular if x appears in P
(there may be other variables in P as well) we sometimes write P (x) instead of P .
Formulas of the form (i.e., which are equal to one of) ∀xP , ∀xP (x) are referred
to as universal formulas. Formulas of the form ∃xP , ∃xP (x) are referred to as
existential formulas. Formulas of the form P → Q are referred to as conditional
formulas. Formulas of the form P ↔ Q are referred to as biconditional formulas.
Example 3.7. Let p and q be a ternary and a binary predicate respectively.
The following is a formula: (∀y(∃x q(x, y))) → ¬p(x, y, z).
Example 3.8. Assume L contains a constant c, a unary predicate ρ, and a
unary functional symbol f . Then the following is a formula:
(∀x(f (x) = c)) → (ρ(f (x)))
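The three clauses of Metaaxiom 3.5 can be sketched as a recursive check in Python. The encoding is an assumption made here for illustration: formulae are nested tuples whose first entry is "=", a predicate name, a connective, or a quantifier, and the symbol tables below (with the predicate ρ spelled rho) are hypothetical choices matching Examples 3.7 and 3.8.

```python
# Hypothetical symbol tables for a small language L.
CONSTANTS = {"c"}
VARIABLES = {"x", "y", "z"}
FUNCTIONS = {"f": 1}
PREDICATES = {"rho": 1, "p": 3, "q": 2}
BINARY = {"and", "or", "->", "<->"}
QUANTIFIERS = {"exists", "forall"}


def is_term(t):
    if isinstance(t, str):
        return t in CONSTANTS or t in VARIABLES
    if not (isinstance(t, tuple) and t and isinstance(t[0], str)):
        return False
    return FUNCTIONS.get(t[0]) == len(t) - 1 and all(is_term(s) for s in t[1:])


def is_formula(q):
    if not (isinstance(q, tuple) and q and isinstance(q[0], str)):
        return False
    head, args = q[0], q[1:]
    if head == "=":                    # clause 1: t = s
        return len(args) == 2 and all(is_term(t) for t in args)
    if head in PREDICATES:             # clause 2: rho(t, s, ...)
        return len(args) == PREDICATES[head] and all(is_term(t) for t in args)
    if head == "not":                  # clause 3: connectives and quantifiers
        return len(args) == 1 and is_formula(args[0])
    if head in BINARY:
        return len(args) == 2 and all(is_formula(a) for a in args)
    if head in QUANTIFIERS:
        return len(args) == 2 and args[0] in VARIABLES and is_formula(args[1])
    return False


# Example 3.7: (∀y(∃x q(x, y))) → ¬p(x, y, z)
example_3_7 = ("->", ("forall", "y", ("exists", "x", ("q", "x", "y"))),
               ("not", ("p", "x", "y", "z")))
# Example 3.8: (∀x(f(x) = c)) → (ρ(f(x)))
example_3_8 = ("->", ("forall", "x", ("=", ("f", "x"), "c")), ("rho", ("f", "x")))

print(is_formula(example_3_7), is_formula(example_3_8))  # True True
print(is_formula(("and", ("f", "x"), ("rho", "x"))))     # False: f(x) is a term
```

Atomic formulae are exactly those accepted by the first two branches.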
For what follows we need to add a predicate “x is free in”. If x is free in P we
also say x is a free variable in P .
Metaaxiom 3.9.
CHAPTER 4
Tautologies
We start now the analysis of inference within a given language (which is also
referred to as deduction or proof). In order to introduce the general notion of proof
we need to first introduce tautologies; in their turn tautologies are introduced via
certain arrays of symbols in metalanguage called tables.
Metadefinition 4.1. Let T and F be two symbols in metalanguage. We also
allow separators in metalanguage that are frames of tables. Using the above plus
arbitrary constants P and Q in metalanguage we introduce the following strings of
symbols in metalanguage (which are actually arrays rather than strings but which
can obviously be rearranged in the form of strings). They are referred to as the
truth tables of the 5 standard connectives.
P  Q  P ∧ Q      P  Q  P ∨ Q      P  Q  P → Q      P  Q  P ↔ Q
T  T    T        T  T    T        T  T    T        T  T    T
T  F    F        T  F    T        T  F    F        T  F    F
F  T    F        F  T    T        F  T    T        F  T    F
F  F    F        F  F    F        F  F    T        F  F    T
P ¬P
T F
F T
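The five tables can be reproduced mechanically. The Python sketch below is an illustration only: it models the symbols T and F by Python's True and False (a convenience of this sketch; in the text T and F carry no meaning), and the names CONNECTIVES, letter and table are assumptions introduced here.

```python
from itertools import product

# Each binary connective is modelled as a function on pairs of Booleans.
CONNECTIVES = {
    "P ∧ Q": lambda p, q: p and q,
    "P ∨ Q": lambda p, q: p or q,
    "P → Q": lambda p, q: (not p) or q,
    "P ↔ Q": lambda p, q: p == q,
}


def letter(value):
    """Turn a Python Boolean back into the symbol T or F."""
    return "T" if value else "F"


def table(name):
    """Rows (P, Q, value) in the order TT, TF, FT, FF used in the text."""
    op = CONNECTIVES[name]
    return [(letter(p), letter(q), letter(op(p, q)))
            for p, q in product([True, False], repeat=2)]


for name in CONNECTIVES:
    print(name, table(name))
print("¬P", [(letter(p), letter(not p)) for p in (True, False)])
```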
Remark 4.2. If in the tables above P is the sentence “p....” and Q is the
sentence “q....” we allow ourselves, as usual, to identify the symbols P, Q, P ∧ Q,
etc. with the corresponding sentences “p...,” “q...,” “(p...)∧(q...),” etc. Also: the
letters T and F evoke “truth” and “falsehood”; but they should be viewed as devoid
of any meaning.
Fix in what follows a language L that has the 5 standard connectives ∧, ∨, ¬,
→, ↔ (but does not necessarily have quantifiers or equality).
Metadefinition 4.3. Let P, Q, ..., R be sentences in L. By a Boolean string
generated by P, Q, ..., R we mean a string of sentences
P
Q
...
R
U
...
W
such that for any sentence V among U, ..., W we have that V is preceded by V′, V″
(with V′, V″ among P, ..., W) and V equals one of the following:
V′ ∧ V″, V′ ∨ V″, ¬V′, V′ → V″, V′ ↔ V″.
P → (Q ∨ ¬R)
P ∧ R
(P ∧ R) ↔ (P → (Q ∨ ¬R))
Remark 4.6. The same sentence may appear as the last sentence in two dif-
ferent Boolean strings; cf. the last 2 examples.
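Whether a given list of sentences is a Boolean string generated by P, Q, ..., R can be checked mechanically. The following Python sketch assumes an encoding that is not in the notes: generators are Python strings, compound sentences are tuples such as ("and", V1, V2) or ("not", V1), and is_boolean_string is a hypothetical helper name.

```python
BINARY = {"and", "or", "->", "<->"}


def is_boolean_string(generators, string):
    """Check that `string` starts with the generators and that every later
    sentence is a connective applied to sentences occurring earlier."""
    n = len(generators)
    if list(string[:n]) != list(generators):
        return False
    for i in range(n, len(string)):
        v, earlier = string[i], string[:i]
        if not isinstance(v, tuple) or not v:
            return False
        if v[0] == "not" and len(v) == 2:
            ok = v[1] in earlier
        elif v[0] in BINARY and len(v) == 3:
            ok = v[1] in earlier and v[2] in earlier
        else:
            ok = False
        if not ok:
            return False
    return True


# Two different Boolean strings ending in the same sentence (P ∧ Q) ∨ P.
last = ("or", ("and", "P", "Q"), "P")
first_string = ["P", "Q", ("and", "P", "Q"), last]
second_string = ["P", "Q", ("not", "P"), ("and", "P", "Q"), last]
# Not a Boolean string: ¬Q is used before it appears.
bad_string = ["P", "Q", ("and", "P", ("not", "Q"))]

print(is_boolean_string(["P", "Q"], first_string))   # True
print(is_boolean_string(["P", "Q"], second_string))  # True
print(is_boolean_string(["P", "Q"], bad_string))     # False
```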
P ¬P P ∨ ¬P
T T T
T F T
F T T
F F F
and the last column in the latter table does not consist of T's only. This does not
change the fact that P ∨ ¬P is a tautology. Morally, in this latter computation we
had to treat P and ¬P as “independent”; this is not a mistake but rather a failed
attempt to metaprove that P ∨ ¬P is a tautology.
Example 4.12. (P ∧ (P → Q)) → Q is a tautology; it is called modus ponens.
To metaprove this consider the following Boolean string generated by P, Q:
P
Q
P →Q
P ∧ (P → Q)
(P ∧ (P → Q)) → Q
Its truth table, in which S stands for the last sentence (P ∧ (P → Q)) → Q, is:
P  Q  P → Q  P ∧ (P → Q)  S
T  T    T         T       T
T  F    F         F       T
F  T    T         F       T
F  F    T         F       T
Exercise 4.13. Explain how the table above was computed.
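The column-by-column computation behind such a table can be sketched in Python. The encoding (generators as strings, compound sentences as tuples) and the names OPS, truth_table and is_tautology_by are assumptions of this sketch, and T and F are again modelled by True and False.

```python
from itertools import product

OPS = {
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "->": lambda a, b: (not a) or b,
    "<->": lambda a, b: a == b,
}


def truth_table(generators, string):
    """One row per assignment of T/F to the generators, one column per sentence."""
    rows = []
    for values in product([True, False], repeat=len(generators)):
        column = dict(zip(generators, values))  # sentence -> value in this row
        for v in string[len(generators):]:
            if v[0] == "not":
                column[v] = not column[v[1]]
            else:
                column[v] = OPS[v[0]](column[v[1]], column[v[2]])
        rows.append([column[v] for v in string])
    return rows


def is_tautology_by(generators, string):
    """True if the last column of the table consists of T's only."""
    return all(row[-1] for row in truth_table(generators, string))


# Modus ponens, with the Boolean string of Example 4.12.
p_implies_q = ("->", "P", "Q")
conj = ("and", "P", p_implies_q)
modus_ponens = ["P", "Q", p_implies_q, conj, ("->", conj, "Q")]
print(is_tautology_by(["P", "Q"], modus_ponens))  # True

# P ∨ ¬P: generated by P alone the last column is all T's; if ¬P is replaced by
# an independent generator N, the same computation fails to show this.
print(is_tautology_by(["P"], ["P", ("not", "P"), ("or", "P", ("not", "P"))]))  # True
print(is_tautology_by(["P", "N"], ["P", "N", ("or", "P", "N")]))               # False
```

Each later sentence of the string only looks up the columns of sentences that precede it, which is exactly why the order in a Boolean string matters.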
Exercise 4.14. Give a metaproof of the fact that each of the sentences below
is a tautology:
1) (P → Q) ↔ (¬P ∨ Q).
2) (P ↔ Q) ↔ ((P → Q) ∧ (Q → P )).
Exercise 4.15. Give a metaproof of the fact that each of the sentences below
is a tautology:
1) (P ∧ Q) → P .
2) P → (P ∨ Q).
3) ((P ∧ Q) ∧ R) ↔ (P ∧ (Q ∧ R)).
4) (P ∧ Q) ↔ (Q ∧ P ).
5) (P ∧ (Q ∨ R)) ↔ ((P ∧ Q) ∨ (P ∧ R)).
6) (P ∨ (Q ∧ R)) ↔ ((P ∨ Q) ∧ (P ∨ R)).
Metadefinition 4.16.
1) Q → P is called the converse of P → Q.
2) ¬Q → ¬P is called the contrapositive of P → Q.
Exercise 4.17. Give a metaproof of the fact that each of the sentences below
is a tautology:
1) ((P ∨ Q) ∧ (¬P )) → Q (modus ponens, variant).
2) (P → Q) ↔ (¬Q → ¬P ) (contrapositive argument).
3) (¬(P ∧ Q)) ↔ (¬P ∨ ¬Q) (de Morgan law).
4) (¬(P ∨ Q)) ↔ (¬P ∧ ¬Q) (de Morgan law).
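A brute-force evaluation over all rows is not a substitute for writing out the metaproofs, but it is a quick way to see the difference between the converse and the contrapositive. The Python sketch below uses True and False for T and F; the helper names are hypothetical.

```python
from itertools import product


def implies(a, b):
    """Truth table of →: F only in the row T F."""
    return (not a) or b


def always_true(sentence, n):
    """True if sentence(...) is T in all 2**n rows for its n generators."""
    return all(sentence(*row) for row in product([True, False], repeat=n))


def contrapositive_law(p, q):   # (P → Q) ↔ (¬Q → ¬P)
    return implies(p, q) == implies(not q, not p)


def converse_law(p, q):         # (P → Q) ↔ (Q → P), which is not a tautology
    return implies(p, q) == implies(q, p)


def de_morgan_and(p, q):        # (¬(P ∧ Q)) ↔ (¬P ∨ ¬Q)
    return (not (p and q)) == ((not p) or (not q))


def de_morgan_or(p, q):         # (¬(P ∨ Q)) ↔ (¬P ∧ ¬Q)
    return (not (p or q)) == ((not p) and (not q))


print(always_true(contrapositive_law, 2))  # True
print(always_true(converse_law, 2))        # False
print(always_true(de_morgan_and, 2), always_true(de_morgan_or, 2))  # True True
```

So a conditional sentence may be replaced by its contrapositive in an argument, but not by its converse.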
CHAPTER 5
Proofs
A weak theory that only has axioms is a theory. A weak theory that is a
development of a theory is a theory.
Axioms are specific to the theory and are referred to as specific axioms; they
are given either as a finite list or by a rule to form them which may create an
indefinitely growing list. When a definition is added to T one also needs to add to
L the newly introduced corresponding symbol. If ∃xP(x) is a theorem or an axiom
one can add a new constant c_P to L and one can add to the theory the Theorem
P(c_P); c_P is called an existential witness for the sentence ∃xP(x). Apart from the
above procedure of addition of theorems a theorem can be added to T only if a
proof for it is also being supplied; a proof of a sentence U in L is a sequence of
sentences in Argot that is formed as in the examples below (or “combining” the
examples below). There is a precise set of rules which one can use to check if a
sequence of sentences is a proof of a given theorem; we will not go into that but
rather present examples of applications of the various rules.
Metadefinition 5.3. A theory is inconsistent if it has a development that
contains a sentence of the form A∧¬A. A theory is consistent if it is not inconsistent.
A theory is complete if for any sentence A in L there is a development of the theory
in which either A belongs to the development or ¬A belongs to the development.
A theory is incomplete if it is not complete.
Remark 5.4. Clearly no metaproof can be given by inspection of the metasen-
tence that a given theory is consistent. However, if one finds a sentence of the
form A ∧ ¬A then a metaproof was found that the theory is inconsistent. Finally
note that no metaproof can be given in general for the metasentence saying that a
theory is complete or incomplete.
The concepts of logic developed here can be imitated by concepts in set theory
(i.e., in mathematics) and then theorems about set theoretic completeness and
consistency can be proved in set theory (Gödel’s theorems for instance); however
these latter theorems are not metatheorems in logic (i.e., about sentences) but
rather theorems in set theory (i.e., about nothing).
In what follows we clarify what proofs are by examples.
Example 5.5. (Direct proof). Fix a theory T in L with specific axioms A, B, ...
and definitions D, E, .... Say we want to prove H → C. Then H is usually called
hypothesis and C is called conclusion. A proof of H → C can be presented as follows.
Theorem 5.6. H → C.
Proof. Assume H. Since H and A it follows that Q. Since Q and B it follows
that R. Hence C.
The above counts as a proof if
H ∧ A → Q, Q ∧ B → R, R→C
are either tautologies or axioms or theorems previously included in the theory T .
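For readers who know a proof assistant, the skeleton of this direct proof can be written in Lean 4. This is a sketch in a different formal system from the one in these notes; the hypotheses a and b stand for the axioms A and B, and step1, step2, step3 are hypothetical names for the three sentences that must already be available.

```lean
example (H A B Q R C : Prop) (a : A) (b : B)
    (step1 : H ∧ A → Q) (step2 : Q ∧ B → R) (step3 : R → C) : H → C :=
  fun h =>                        -- "Assume H."
    have q : Q := step1 ⟨h, a⟩    -- "Since H and A it follows that Q."
    have r : R := step2 ⟨q, b⟩    -- "Since Q and B it follows that R."
    step3 r                       -- "Hence C."
```

Each line of the Argot proof corresponds to one line of the term.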
Next we explain the rules that govern the use of quantifiers in proofs. Here
they are:
1) If in a proof one writes “∀xP (x)” at some point then anywhere after that
one is allowed to write “P (c)” where c is any constant (that has or has not been
used before in the proof).
2) If in a proof one writes “∃xP (x)” at some point then anywhere after that
one is allowed to write “Let c be such that P (c)” (where c is a NEW constant i.e.,
a constant that has NOT been used before in the proof, that needs to be added to
the language). Such a constant is called an existential witness.
3) If one wants to prove a sentence of the form “∀xP (x)” then it is sufficient
to proceed as follows. One first writes “Let c be arbitrary” (where c is a NEW
constant that needs to be added to the language). Such a constant is called a
universal witness. Then one proceeds to proving P (c).
Example 5.9. Here is an example of proof by contradiction that involves quan-
tifiers.
Theorem 5.10. (∀x(¬P(x))) → (¬(∃xP(x))).
Proof. Assume ∀x(¬P (x)), ¬¬(∃xP (x)), and seek a contradiction. Since
¬¬(∃xP (x)) it follows that ∃xP (x) (a tautology). Let c be such that P (c). Now
since ∀x(¬P (x)) we get in particular ¬P (c), a contradiction. This ends the proof.
The constant c introduced in the proof is a new constant and is an existential
witness.
A proof can start as a direct proof and involve later an embedded argument by
contradiction. Here is an example.
Theorem 5.11. (¬(∃xP (x))) → (∀x(¬P (x))).
Proof. Assume ¬(∃xP (x)). We want to show that ∀x(¬P (x)). Let c be arbi-
trary; we want to show that ¬P (c). Assume ¬¬P (c) and seek a contradiction. Since
¬¬P (c) it follows that P (c) (a tautology). So ∃xP (x); but we assumed ¬(∃xP (x)),
a contradiction. This ends the proof.
The constant c in the above proof is a universal witness.
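The roles of the two kinds of witnesses in Theorems 5.10 and 5.11 can also be seen in a Lean 4 rendering. This is a sketch outside the framework of the notes: α is an assumed type of objects, and Lean proves both statements without the double negation step used above.

```lean
-- Theorem 5.10: the pattern ⟨c, hc⟩ names an existential witness c with hc : P c.
theorem thm_5_10 {α : Type} (P : α → Prop) :
    (∀ x, ¬ P x) → ¬ (∃ x, P x) :=
  fun hall ⟨c, hc⟩ => hall c hc

-- Theorem 5.11: `fun ... c ...` introduces an arbitrary c, i.e. a universal witness.
theorem thm_5_11 {α : Type} (P : α → Prop) :
    ¬ (∃ x, P x) → ∀ x, ¬ P x :=
  fun hnex c hc => hnex ⟨c, hc⟩
```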
Example 5.12. In order to prove a theorem U of the form P ↔ Q one first
proves P → Q and then one proves Q → P .
Exercise 5.13. Prove the following:
1) (¬(∀xP (x))) ↔ (∃x(¬P (x)))
2) (¬(∀x∀y∃zP (x, y, z))) ↔ (∃x∃y∀z¬P (x, y, z))
Example 5.14. Direct proofs and proofs by contradiction can be given to
sentences which are not necessarily of the form H → C. Here is an example of a
proof by contradiction for:
Theorem 5.15. C.
Proof. Assume C is false, and seek a contradiction. Since ¬C and A it follows
that Q. Since A and Q we get R. On the other hand since B and ¬C we get ¬R,
a contradiction. This ends the proof.
The above counts as a proof if
(¬C) ∧ A → Q, A ∧ Q → R, B ∧ ¬C → ¬R
are sentences previously included in the theory T .
Example 5.16. (Case by case proof) Say we want to prove a theorem of the
form:
Theorem 5.17. (H′ ∨ H″) → C.
Proof. There are two cases: Case 1 is H′; Case 2 is H″. Assume first that H′.
Then by axiom A we get P. So C. Now assume that H″. Since B it follows that
Q. So, again, we get C. So in either case we get C. This ends the proof.
The above counts as a proof if
A ∧ H′ → P, P → C, H″ ∧ B → Q, Q → C
are sentences previously included in the theory T .
The above “case by case” strategy applies more generally to theorems of the
form
Theorem 5.18. H → C
Proof. There are two cases:
Case 1: W holds.
Case 2: ¬W holds.
Assume first we are in Case 1. Then by axiom A we get P . So C.
Now assume we are in Case 2. Since B it follows that Q. So, again, we get C.
So in either case we get C. This ends the proof.
The above counts as a proof if W is any sentence and
A ∧ W → P, P → C, (¬W ) ∧ B → Q, Q→C
are either tautologies or axioms or theorems previously included in the theory T .
Note that, in the latter proof, finding a sentence W and dividing the proof in
two cases according as W or ¬W holds is usually a creative act: one needs to guess
what W will work.
Here is an example that combines proof by contradiction with “case by case”
proof. Say we want to prove:
Theorem 5.19. H → C.
Proof. Assume H and ¬C and seek a contradiction. There are two cases:
Case 1. W holds.
Case 2. ¬W holds.
In case 1, by ... we get ... hence a contradiction.
In case 2, by ... we get ... hence a contradiction.
This ends the proof.
Example 5.20. Sometimes a theorem U has the statement:
the role of an existential witness for Q. But these existential witnesses are not the
same.
Exercise 5.34. Give examples of wrong proofs of each of the above types. If
you can’t solve this now, wait until we get to discuss the integers.
Remark 5.35. Later, when we discuss induction we will discuss another typical
fallacy; cf. Example 13.7.
CHAPTER 6
Theories
We analyze in what follows a few toy examples of theories and proofs of the-
orems in these theories. Later we will present the main example of theory in this
course which is set theory (identified with mathematics itself).
Example 6.1. The first example is that of group theory. The language L of
the theory has a constant e, variables x, y, ..., and a binary functional symbol ⋆.
We introduce the following definition in L:
Definition 6.2. z is a neutral element if
∀x((z ⋆ x = x) ∧ (x ⋆ z = x)).
So we added “is a neutral element” as a new unary predicate.
We also introduce the following
Definition 6.3. y is an inverse of x if x ⋆ y = y ⋆ x = e.
So we added “is an inverse of” as a new binary predicate.
The axioms of the theory are:
Axiom 6.4. For all x, y, z we have x ⋆ (y ⋆ z) = (x ⋆ y) ⋆ z.
Axiom 6.5. e is a neutral element.
Axiom 6.6. For any x there is y such that y is an inverse of x.
We prove the following:
Theorem 6.7. For any z if z is a neutral element then z = e.
The sentence that needs to be proved is of the form: “If H then C.” Recall that
in general for such a sentence H is called hypothesis and C is called conclusion.
Here is a direct proof:
Proof. Let f be a neutral element. Since e is a neutral element it follows
that ∀x(e ⋆ x = x). By the latter e ⋆ f = f. Since f is a neutral element we get
∀x(x ⋆ f = x). So e ⋆ f = e. Hence we get e = e ⋆ f. So e = f.
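The direct proof uses only the definition of a neutral element, not associativity or inverses. As a sanity check (not a proof), one can enumerate every binary operation on a 3-element collection and count neutral elements. The Python sketch below does this; the names ELEMENTS, neutral_elements and max_number_of_neutral_elements are hypothetical.

```python
from itertools import product

ELEMENTS = (0, 1, 2)
PAIRS = list(product(ELEMENTS, repeat=2))


def neutral_elements(op):
    """All z with z*x = x and x*z = x for every x, where op maps (x, y) to x*y."""
    return [z for z in ELEMENTS
            if all(op[(z, x)] == x and op[(x, z)] == x for x in ELEMENTS)]


def max_number_of_neutral_elements():
    """Largest number of neutral elements over all 3**9 operations on ELEMENTS."""
    best = 0
    for values in product(ELEMENTS, repeat=len(PAIRS)):
        op = dict(zip(PAIRS, values))
        best = max(best, len(neutral_elements(op)))
    return best


print(max_number_of_neutral_elements())  # 1
```

No operation, associative or not, has two distinct neutral elements, in agreement with the argument e = e ⋆ f = f.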
Here is a proof by contradiction of the same theorem:
Proof. Let a, b, c be such that b and c are inverses of a. We want to show that
b = c. By Axioms 6.4, 6.5 and the definition of inverse we have
b = b ⋆ e = b ⋆ (a ⋆ c) = (b ⋆ a) ⋆ c = e ⋆ c = c.
Group theory can be developed independently of what later we will call set
theory (which is identified with mathematics); a better approach however is to
replace the group theory above with a series of theorems in set theory.
Proof of Theorem 6.11. Assume w(p). There are two cases: the first case is
∃xg(x); the second case is ¬(∃xg(x)). Assume first that ∃xg(x). Since w(p), by
The next example is the famous “ontological argument” for the existence of
God (cf. Anselm, Descartes, Leibniz, Gödel). The version below is, in some sense,
a “baby version” of the argument; Gödel’s formalization (which he never published)
is considerably subtler. Cf. (Wang 1996).
Example 6.12. The structure of the classical ontological argument for the
existence of God is as follows. Let us assume that qualities (same as properties)
are either positive or negative (and none is both). Let us think of existence as
having 2 kinds: existence in mind (which shall be referred to as belonging to mind)
and existence in reality (which shall be referred to as belonging to reality). It is
not important that we do not know what mind and reality are; we just see them
as English words here. The 2 kinds are not necessarily related: belonging to mind
does not imply (and is not implied by) belonging to reality. (In particular we do
not necessarily view mind as part of reality, nor should we: unicorns belong
to mind but not to reality.) The constants and variables refer to things (myself,
my cat, God,...) or qualities (red, omnipresent, deceiving, eternal, murderous,
mortal,...); we identify the latter with their extensions which are, again, things (the
Red, the Omnipresent, the Deceiving, the Eternal, the Murderous, the Mortal,...).
In particular we consider the following constants: reality, mind, God, the Positive
Qualities. We also consider the binary predicate belongs to. We say a thing has
a certain quality (e.g. my cat is eternal) if that thing belongs to the extension of
that quality (e.g. my cat belongs to the Eternal). Assume the following axioms:
A1) There exists a thing belonging to mind that has all the positive qualities
and no negative quality. (Call it God.)
A2) “Being real” is a positive quality.
A3) Two things belonging to mind that have exactly the same qualities are
identical.
(Axiom A3 is Leibniz’s famous principle of identity of indiscernibles. It implies
that God is unique.) Then one can prove the following:
Theorem 6.13. God belongs to reality.
In other words God is real.
The above sentences are written in the English language L′. Let us formalize
the above in a language L and prove a formal version of Theorem 6.13 in L whose
translation is Theorem 6.13. Assume L contains among its constants the constants
r, m, p and a binary predicate E. We consider a translation of L into L′ such that
r is translated as “reality”;
m is translated as “mind”;
p is translated as “the positive qualities”;
xEy is translated as “x belongs to y”.
The specific axioms are:
A1) ∃x((xEm) ∧ (∀z((zEp) ↔ (xEz)))).
A2) rEp.
A3) ∀x∀y(((xEm) ∧ (yEm)) → ((∀z(xEz ↔ yEz)) → (x = y))).
Note that later, in set theory, we will have a predicate ∈ which, like E, will be
translated as “belongs to” (as in an object belongs to the collection of objects that
have a certain quality); but the axioms are different.
By A1 we can make the following:
Definition 6.14. g = c_{(xEm)∧(∀z((zEp)↔(xEz)))}.
So g is defined to be equal to a certain existential witness.
Also we can add to the theory the following
Theorem 6.15. (gEm) ∧ (∀z((zEp) ↔ (gEz))).
We will translate g as “God”. We have the following theorem expressing the
uniqueness of God:
Theorem 6.16. ∀x(((xEm) ∧ (∀z((zEp) ↔ (xEz)))) → (x = g)).
The translation of the above in English is: “If something in my mind has all
the positive qualities and no negative quality then that thing is God.”
Proof. A trivial exercise using axiom A3 only.
Exercise 6.17. Prove Theorem 6.16.
On the other hand, and more importantly, we have the following Theorem
whose translation in L′ is “God belongs to reality”:
Theorem 6.18. gEr.
Proof. By axiom A1 we have gEm and
∀z((zEp) ↔ (gEz)).
Hence we have, in particular,
(rEp) ↔ (gEr).
By axiom A2, rEp. Hence gEr.
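The formal core of this proof is a single instantiation followed by one use of ↔. A Lean 4 sketch makes this visible; the type T of “things” and the hypothesis names A2 and hg are assumptions of the sketch, with hg standing for the second half of Theorem 6.15.

```lean
example {T : Type} (E : T → T → Prop) (r p g : T)
    (A2 : E r p)                     -- axiom A2: rEp
    (hg : ∀ z, E z p ↔ E g z) :      -- ∀z((zEp) ↔ (gEz))
    E g r :=
  (hg r).mp A2                       -- take z to be r, then use A2
```

Neither the constant m nor axiom A3 is needed for this step.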
The argument above is, of course, correct. What is questionable is the choice
of the axioms and the reference of L. Also recall that, in our notes, the question
of truth was not addressed; so it does not make sense to ask whether the English
sentence “God has existence in reality” is true or false. For criticism of the relevance
of this argument (or similar ones) see, for instance, (Kant 1991) and (Wang 1996).
However, the mere fact that some of the most distinguished logicians of all times (in
particular Leibniz and Gödel) took this argument seriously shows that the argument
has merit and, in particular, cannot be dismissed on trivial grounds.
Example 6.19. The next example is again a toy example and comes from
physics. In order to present this example we do not need to introduce any physical
concepts. But it would help to keep in mind the two slit experiment in quantum
mechanics (for which we refer to Feynman’s Physics course, say). Now there are
two types of physical theories that can be referred to as phenomenological and
explanatory. They are intertwined but very different in nature. Phenomenological
theories are simply descriptions of phenomena/effects of (either actual or possible)
experiments; examples of such theories are those of Ptolemy, Copernicus, or that of
pre-quantum experimental physics of radiation. Explanatory theories are systems
postulating transcendent causes that act from behind phenomena; examples of such
theories are those of Newton, Einstein, or quantum theory. The theory below is
a baby example of the phenomenological (pre-quantum) theory of radiation; our
discussion is therefore not a discussion of quantum mechanics but rather it suggests
the necessity of introducing quantum mechanics. The language L′ and definitions
are those of experimental/phenomenological (rather than theoretical/explanatory)
physics. We will not make them explicit. Later we will move to a simplified language
L and will not care about definitions.
Consider the following specific axioms (which are the translation in English of
the phenomenological predictions of classical particle mechanics and classical wave
mechanics, respectively):
A1) If radiation in the 2 slit experiment consists of a beam of particles then the
impact pattern on the photographic plate consists of a series of successive flashes
and the pattern has 2 local maxima.
A2) If radiation in the 2 slit experiment is a wave then the impact pattern on
the photographic plate is not a series of successive flashes and the pattern has more
than 2 local maxima.
We want to prove the following
Theorem 6.20. If in the 2 slit experiment the impact consists of a series of
successive flashes and the impact pattern has more than 2 local maxima then in this
experiment radiation is neither a beam of particles nor a wave.
The sentence reflects one of the elementary puzzles that quantum phenomena
exhibit: radiation is neither particles nor waves but something else! And that
something else requires a new theory which is quantum mechanics. (A common
fallacy would be to conclude that radiation is both particles and waves!!!) Rather
than analyzing the language L′ of physics in which our axioms and sentence are
stated (and the semantics that goes with it) let us introduce a simplified language
L as follows.
We consider the language L with constants a, b, ..., variables x, y, ..., and unary
predicates p, w, f, m. Then there is a translation of L into L′ such that:
p is translated as “is a beam of particles”
w is translated as “is a wave”
f is translated as “produces a series of successive flashes”
m is translated as “produces a pattern with 2 local maxima”
Then we consider the specific axioms
A1) ∀x(p(x) → (f (x) ∧ m(x))).
A2) ∀x(w(x) → ((¬f(x)) ∧ (¬m(x)))).
Here we tacitly assume that the number of maxima cannot be 1. Theorem 6.20
above is the translation of the following theorem in L:
Theorem 6.21. ∀x((f (x) ∧ (¬m(x))) → ((¬p(x)) ∧ (¬w(x)))).
So it is enough to prove Theorem 6.21. The proof below is, as we shall see, a
combination of proof by contradiction and case by case.
Proof. Let a be arbitrary. Assume
f(a) ∧ (¬m(a))
and
¬(¬p(a) ∧ (¬w(a)))
and seek a contradiction. Since ¬(¬p(a)∧(¬w(a))) we get p(a)∨w(a). There are two
cases. The first case is p(a); the second case is w(a). We will get a contradiction
in each of these cases separately. Assume first p(a). Then by axiom A1 we get
f (a) ∧ m(a), hence m(a). But we assumed f (a) ∧ (¬m(a)), hence ¬m(a), so we
get a contradiction. Assume now w(a). By axiom A2 we get (¬f (a)) ∧ (¬m(a))
hence ¬f (a). But we assumed f (a) ∧ (¬m(a)), hence f (a), so we get again a
contradiction.
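Since all four predicates are unary, the instance of Theorem 6.21 at a fixed object a involves only the four sentences p(a), w(a), f(a), m(a). The Python sketch below, an informal check rather than a proof in L, runs through all 16 ways of assigning T or F to them and looks for an assignment satisfying the instances of A1 and A2 but not the instance of the theorem. The function names are hypothetical.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def counterexamples():
    """Values (p, w, f, m) for one object a that satisfy the instances of A1 and
    A2 at a but violate the instance of Theorem 6.21 at a."""
    bad = []
    for p, w, f, m in product([True, False], repeat=4):
        a1 = implies(p, f and m)                    # p(a) → (f(a) ∧ m(a))
        a2 = implies(w, (not f) and (not m))        # w(a) → ((¬f(a)) ∧ (¬m(a)))
        thm = implies(f and (not m), (not p) and (not w))
        if a1 and a2 and not thm:
            bad.append((p, w, f, m))
    return bad


print(counterexamples())  # []
```

The empty list reflects the two cases of the proof: p(a) clashes with ¬m(a), and w(a) clashes with f(a).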
Exercise 6.22. Consider the specific axioms A1 and A2 above and also the
specific axioms:
A3) ∃x(f (x) ∧ (¬m(x))).
A4) ∀x(p(x) ∨ w(x)).
Metaprove that the theory with specific axioms A1, A2, A3, A4 is inconsistent.
A3 is translated as saying that in some experiments one sees a series of successive
flashes and, at the same time, one has more than 2 maxima. Axiom A4 is translated
as saying that any type of radiation is either particles or waves. The inconsistency
of the theory says that classical (particle and wave) mechanics is not consistent
with experiment. (So a new mechanics, quantum mechanics, is needed.) Note that
none of the above discussion has anything to do with any concrete proposal for a
quantum mechanical theory; all that the above suggests is the necessity of such a
theory.
Example 6.23. The next example is a logical puzzle from the Mahabharata.
King Yudhishthira loses his kingdom to Sakuni at a game of dice; then he stakes
himself and he loses himself; then he stakes his wife Draupadi and loses her too.
She objects by saying that her husband could not have staked her because he did
not own her anymore after losing himself. Here is a possible formalization of her
argument.
We use a language with constants i, d, ..., variables x, y, z, ..., the binary predicate
“owns,” quantifiers, and equality =. We define a predicate ≠ by (x ≠ y) ↔
(¬(x = y)). Consider the following specific axioms:
A1) For all x, y, z if x owns y and y owns z then x owns z.
A2) For all y there exists x such that x owns y.
A3) For all x, y, z if y owns x and z owns x then y = z.
We will prove the following
Theorem 6.24. If i does not own himself then i does not own d.
Proof. We proceed by contradiction. So we assume i does not own i and i owns
d and seek a contradiction. There are two cases: first case is d owns i; the second
case is d does not own i. We prove that in each case we get a contradiction. Assume
first that d owns i; since i owns d, by axiom A1, i owns i, a contradiction. Assume
now d does not own i. By axiom A2 we know that there exists j such that j owns
i. Since i does not own i it follows that j ≠ i. Since j owns i and i owns d, by
axiom A1, j owns d. But i also owns d. By axiom A3, i = j, a contradiction.
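As an informal check of Theorem 6.24, one can list every possible “owns” relation on collections of at most 3 objects, keep those satisfying A1, A2, A3, and look for one in which some i does not own itself and yet owns some d. The Python sketch below does this; the encoding of “owns” as a set of pairs and all function names are assumptions of the sketch.

```python
from itertools import product


def satisfies_axioms(domain, owns):
    """owns is a set of pairs (x, y) meaning 'x owns y'."""
    a1 = all((x, z) in owns
             for x, y, z in product(domain, repeat=3)
             if (x, y) in owns and (y, z) in owns)
    a2 = all(any((x, y) in owns for x in domain) for y in domain)
    a3 = all(y == z
             for x, y, z in product(domain, repeat=3)
             if (y, x) in owns and (z, x) in owns)
    return a1 and a2 and a3


def theorem_holds(domain, owns):
    """For all i, d: if i does not own i then i does not own d."""
    return all((i, d) not in owns
               for i, d in product(domain, repeat=2)
               if (i, i) not in owns)


def search(max_size=3):
    """Count the relations satisfying A1-A3 and those among them violating the theorem."""
    models, violations = 0, 0
    for n in range(1, max_size + 1):
        domain = range(n)
        pairs = list(product(domain, repeat=2))
        for bits in product([False, True], repeat=len(pairs)):
            owns = {pair for pair, bit in zip(pairs, bits) if bit}
            if satisfies_axioms(domain, owns):
                models += 1
                if not theorem_holds(domain, owns):
                    violations += 1
    return models, violations


print(search()[1])  # 0
```

A finite search of this kind cannot replace the proof, but it would expose a wrongly stated axiom or theorem quickly.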
Example 6.25. This example illustrates the logical structure of the Newto-
nian theory of gravitation that unified Galileo’s phenomenological theory of falling
bodies (the physics on Earth) with Kepler’s phenomenological theory of planetary
G) ∀x∀y((c(x) ∧ c(y)) → (a(x, E) = a(y, E))),
K3) ∀x∀y((p(x) ∧ p(y)) → (d^3(x, S)/T^2(x, S) = d^3(y, S)/T^2(y, S))),
N) ∀x∀y∀z((f(x, z) ∧ f(y, z)) → (a(x, z)/(1/d^2(x, z)) = a(y, z)/(1/d^2(y, z)))).
G represents Galileo’s great empirical discovery that all cannonballs (by which
we mean here terrestrial airborne objects with no self-propulsion) have the same
acceleration towards the Earth. K3 is Kepler’s third law, which is his great
empirical discovery that the cubes of distances of planets to the Sun are in the same
proportion as the squares of their periods of revolution. Kepler’s second law about
equal areas being swept in equal times is somewhat hidden in axiom A above.
N is Newton’s law of gravitation saying that the accelerations of any two bodies
moving freely towards a fixed body are in the same proportion as the inverses of
the squares of the respective distances to the (center of the) fixed body. Newton’s
great invention is the creation of a binary predicate f (where f (x, y) is translated
into English as “x is in free fall with respect to y”) equipped with the following
axioms
F1) ∀x(c(x) → f (x, E))
F2) f (M, E)
F3) ∀x(p(x) → f (x, S))
expressing the idea that cannonballs and the Moon moving relative to the Earth
and planets moving relative to the Sun are instances of a more general predicate
expressing “free falling.” Finally let us consider the definition
g = a(r, E)
and the following sentence:
X) g = 4π^2 d^3(M, E) / (R^2 T^2(M, E)).
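Sentence X can be read numerically. The Python sketch below evaluates its right-hand side under assumptions that are not spelled out in this part of the text: d(M, E) is taken to be the mean Earth-Moon distance, T(M, E) the sidereal period of the Moon, and R the mean radius of the Earth, with approximate modern values supplied here.

```python
import math

# Approximate modern values, supplied as assumptions of this sketch.
D_MOON = 3.844e8              # mean Earth-Moon distance d(M, E), in metres
R_EARTH = 6.371e6             # mean radius of the Earth, read here as R, in metres
T_MOON = 27.32 * 24 * 3600    # sidereal period T(M, E) of the Moon, in seconds


def g_from_moon(d, radius, period):
    """Right-hand side of X: 4 π^2 d^3 / (R^2 T^2)."""
    return 4 * math.pi ** 2 * d ** 3 / (radius ** 2 * period ** 2)


print(g_from_moon(D_MOON, R_EARTH, T_MOON))
```

With these inputs the result is roughly 9.9 metres per second squared, within about one percent of the acceleration of falling bodies measured at the surface of the Earth.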
CHAPTER 7
ZFC
One finally adds a technical list of axioms (indexed by formulas P (x, y, z))
about the “images of maps with parameters z”:
Axiom 7.21. (Axiom of replacement) If for any z and any u we have that
P (x, y, z) “defines y as a function of x ∈ u” (i.e., for any x ∈ u there exists a unique
y such that P (x, y, z)) then for all z there is a set v which is the “image of this
map” (i.e., v consists of all y’s with the property that there is an x ∈ u such that
P (x, y, z)). Here x, z may be tuples of variables.
Exercise 7.22. Write the axioms of choice, infinity, foundation, and replace-
ment in the language of sets.
Metadefinition 7.23. All of the above axioms form the ZFC system of axioms
(Zermelo-Fraenkel+Choice). Set theory Tset is the theory in Lset with ZFC axioms.
Unless otherwise specified all theorems in the rest of the course are understood to
be theorems in Tset . By abuse of terminology we continue to denote by Tset any
development of Tset .
Remark 7.24. Note the important fact that the axioms did not involve con-
stants. In the next chapter we investigate the constants, i.e., the sets.
Set theory
CHAPTER 8
Sets
We will start here our discussion of sets and prove our first theorems in set
theory. Recall that we introduced mathematics/set theory as being a specific theory
Tset in the language Lset , with axioms ZFC described in the last chapter.
Recall the following:
Metadefinition 8.1. A set is a constant in the language of set theory.
Sets will be denoted by a, b, ..., A, B, ..., 𝒜, ℬ, ..., α, β, γ, ....
In what follows all definitions will be definitions in the language Lset of sets.
Sometimes definitions are given in Argotic Lset . Recall that some definitions intro-
ducing new constants are also simply referred to as notation.
We start by defining a new constant ∅ (called the empty set) as being equal to
the witness for the axiom ∃x∀y(y ∉ x); in other words ∅ is defined by
Definition 8.2. ∅ = c_{∀y(y ∉ x)}.
Note that ∀y(y ∉ ∅) is a theorem. In Argot we say that ∅ is the “unique” set
that contains no element.
Next if a is a set we introduce a new constant {a} defined to be the witness for
the sentence ∃yP where
P equals “(a ∈ y) ∧ (∀z((z ∈ y) → (z = a))).”
In other words {a} is defined by
Definition 8.3. {a} = c_P = c_{(a∈y)∧(∀z((z∈y)→(z=a)))}.
The sentence ∃yP is a theorem (use the singleton axiom) so the following is a
theorem:
(a ∈ {a}) ∧ (∀z((z ∈ {a}) → (z = a))).
We can say (and we will usually say, by abuse of terminology) that {a} is “the
unique” set containing a only among its elements; we will often use this kind of
abuse of terminology. In particular {{a}} denotes the set whose only element is
the set {a}, etc. Similarly, for any two sets a, b with a ≠ b denote by {a, b} the set
that only has a and b as elements; the set {a, b} is a witness for a theorem that
follows from the unordered pair axiom. Also whenever we write {a, b} we implicitly
understand that a ≠ b.
Next, for any set A and any formula P (x) in the language of sets, having one
free variable x we denote by A(P ) or {a ∈ A; P (a)} or {x ∈ A; P (x)} the set whose
elements are the elements a ∈ A such that P (a); so the set A(P ) equals by definition
the witness for the separation axiom that corresponds to A and P . More precisely
we have:
Definition 8.4. A(P) = {x ∈ A; P(x)} = c_{∃z∀x((x∈z)↔((x∈A)∧P(x)))}.
So the dots indicate that there may be other elements in d other than a, b, c; also
note that when we write {a, b, c, ...} we implicitly imply that a, b, c are pairwise
distinct.
Exercise 8.14.
1) Prove that {∅} ≠ ∅.
2) Prove that {{∅}} ≠ {∅}.
Exercise 8.15. Prove that A = B if and only if A ⊂ B and B ⊂ A.
Exercise 8.16. Prove that:
1) {a, b, c} = {b, c, a}.
2) {a, b} ≠ {a, b, c}. Hint: Use c ≠ a, c ≠ b.
3) {a, b, c} = {a, b, d} if and only if c = d.
Exercise 8.17. Let A = {a, b, c} and B = {c, d}. Prove that
1) A ∪ B = {a, b, c, d},
2) A ∩ B = {c}, A\B = {a, b}.
Exercise 8.18. Let A = {a, b, c, d, e, f }, B = {d, e, f, g, h}. Compute
1) A ∩ B,
2) A ∪ B,
3) A\B,
4) B\A,
5) (A\B) ∪ (B\A).
Exercise 8.19. Prove the following:
1) A ∩ B ⊂ A,
2) A ⊂ A ∪ B,
3) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
4) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
5) (A\B) ∩ (B\A) = ∅.
Definition 8.20. For any set A we define the set P(A) as the set whose ele-
ments are the subsets of A; we call P(A) the power set of A; P(A) is a witness for
(a theorem obtained from) the power set axiom.
Exercise 8.21. Explain in detail the definition of P(A) (using the witness
notation).
Example 8.22. If A = {a, b, c} then
P(A) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.
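The enumeration in Example 8.22 can be reproduced mechanically. The following is a minimal sketch, assuming Python frozensets as a stand-in for sets (the function name power_set is illustrative and not part of the text):

```python
from itertools import combinations

def power_set(A):
    """Return P(A) as a set of frozensets (frozensets are hashable, so they can be elements of a set)."""
    elems = list(A)
    return {frozenset(c)
            for k in range(len(elems) + 1)
            for c in combinations(elems, k)}

P = power_set({"a", "b", "c"})
print(len(P))  # 8
```

Since the elements of P(A) are again hashable, power_set can be applied to its own output.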
Exercise 8.23. Let A = {a, b, c, d}. Write down the set P(A).
Exercise 8.24. Let A = {a, b}. Write down the set P(P(A)).
Definition 8.25. (Ordered pairs) Let A and B be sets and let a ∈ A, b ∈ B.
If a ≠ b the ordered pair (a, b) is the set {{a}, {a, b}}. We sometimes say “pair”
instead of “ordered pair.” If a = b the pair (a, b) = (a, a) is the set {{a}}. Note
that (a, b) ∈ P(P(A ∪ B)).
Definition 8.26. For any sets A and B we define the product of A and B as
the set
A × B = {c ∈ P(P(A ∪ B)); ∃x∃y((x ∈ A) ∧ (y ∈ B) ∧ (c = (x, y)))}.
This is the set whose elements are exactly the pairs (a, b) with a ∈ A and b ∈ B.
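Definitions 8.25 and 8.26 can be tried out on small sets. The sketch below assumes Python frozensets as a model of sets; the names pair and product are illustrative:

```python
def pair(a, b):
    """Ordered pair as the set {{a}, {a, b}}; it collapses to {{a}} when a == b."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def product(A, B):
    """The set of all pairs (a, b) with a in A and b in B."""
    return {pair(a, b) for a in A for b in B}

print(pair(1, 2) == pair(2, 1))               # False: the order matters
print(pair(1, 1) == frozenset({frozenset({1})}))  # True
print(len(product({1, 2}, {2, 3})))           # 4
```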
Maps
The concept of map (or function) has a long history. Originally functions were
understood to be given by more or less explicit “formulae” (polynomial, ratio-
nal, algebraic, and later by series). Controversies around what the “most general”
functions should be arose, for instance, in connection with solving partial differ-
ential equations (by means of trigonometric series); this is somewhat parallel to
the controversy around what the “most general” numbers should be that arose in
connection with solving algebraic equations (such as x^2 = 2, x^2 = −1, or higher
degree equations with no solutions expressed by radicals, etc.). The notion of “com-
pletely arbitrary” function gradually arose through the work of Dirichlet, Riemann,
Weierstrass, Cantor, etc. Here is the definition:
Definition 9.1. A map (or function) from a set A to a set B is a subset
F ⊂ A × B such that for any a ∈ A there is a unique b ∈ B with (a, b) ∈ F . If
(a, b) ∈ F we write F(a) = b or a ↦ b or a ↦ F(a). We also write F : A → B or
A \xrightarrow{F} B.
Remark 9.2. The above defines a new (ternary) predicate µ equal to “...is a
map from ... to ....” Also we may introduce a new functional symbol F̂ by
∀x∀y((µ(F, A, B) ∧ (x ∈ A) ∧ (y ∈ B)) → ((F̂(x) = y) ↔ ((x, y) ∈ F ))).
Here F, A, B can be constants or variables depending on the context. We will
usually drop the hat (or think of the Argot translation as dropping the hat). Also
note that what we call a map F ⊂ A × B corresponds to what in elementary
mathematics is called the graph of a map.
Example 9.3. The set
(9.1) F = {(a, a), (b, c)} ⊂ {a, b} × {a, b, c}
is a map and F (a) = a, F (b) = c. On the other hand the subset
F = {(a, b), (a, c)} ⊂ {a, b} × {a, b, c}
is not a map.
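The two subsets in Example 9.3 can be tested against Definition 9.1 directly. This is a minimal sketch in which pairs are modelled by Python tuples rather than by the sets of Definition 8.25 (an assumption made for readability); is_map is an illustrative name:

```python
def is_map(F, A, B):
    """F is a set of tuples (a, b); check F ⊂ A × B and that each a in A has exactly one image."""
    if not all(a in A and b in B for (a, b) in F):
        return False
    return all(sum(1 for (x, _) in F if x == a) == 1 for a in A)

A, B = {"a", "b"}, {"a", "b", "c"}
print(is_map({("a", "a"), ("b", "c")}, A, B))  # True
print(is_map({("a", "b"), ("a", "c")}, A, B))  # False: a has two images and b has none
```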
Definition 9.4. For any A the identity map I : A → A is defined as I(a) = a,
i.e.,
I = IA = {(a, a); a ∈ A} ⊂ A × A.
Definition 9.5. A map F : A → B is injective (or an injection, or one-to-one)
if F (a) = F (c) implies a = c.
Definition 9.6. A map F : A → B is surjective (or a surjection, or onto) if
for any b ∈ B there exists an a ∈ A such that F (a) = b.
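For finite sets, Definitions 9.5 and 9.6 can be checked by inspection of the values of F. The sketch below assumes a map is stored as a Python dict whose keys are the elements of A and whose values lie in B; the function names are illustrative:

```python
def is_injective(F):
    """F(a) = F(c) implies a = c: no value is taken twice."""
    values = list(F.values())
    return len(values) == len(set(values))

def is_surjective(F, B):
    """Every b in B is a value of F (assumes all values of F lie in B)."""
    return set(F.values()) == set(B)

F = {"a": "a", "b": "c"}                   # the map of Example 9.3
print(is_injective(F))                     # True
print(is_surjective(F, {"a", "b", "c"}))   # False: b is not a value of F
```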
Relations
Exercise 10.30. Prove that â = b̂ if and only if a ∼ b.
Exercise 10.31. Prove that:
1) if â ∩ b̂ ≠ ∅ then â = b̂;
2) A = ⋃_{a∈A} â.
Definition 10.32. If A is a set a partition of A is a family (A_i)_{i∈I} of subsets
A_i ⊂ A such that:
1) if i ≠ j then A_i ∩ A_j = ∅;
2) A = ⋃_{i∈I} A_i.
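The two conditions of Definition 10.32, and the fact that equivalence classes satisfy them, can be checked on a finite example. This is a sketch under the assumption that the relation is given as a Python predicate; the names classes and is_partition are illustrative:

```python
def classes(A, related):
    """The set of equivalence classes of A under the predicate `related`."""
    return {frozenset(b for b in A if related(a, b)) for a in A}

def is_partition(family, A):
    """family: a list of sets A_i; check pairwise disjointness and that the union is A."""
    blocks = [set(b) for b in family]
    disjoint = all(blocks[i].isdisjoint(blocks[j])
                   for i in range(len(blocks))
                   for j in range(i + 1, len(blocks)))
    union = set().union(*blocks)
    return disjoint and union == set(A)

A = range(6)
blocks = classes(A, lambda a, b: (a - b) % 3 == 0)   # congruence modulo 3
print(is_partition(list(blocks), A))                 # True
```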
Remark 10.42. Note that we have not defined 2 or 3 yet; this will be done
later when we introduce integers. The meaning of these axioms is, however, clearly
expressible in terms that were already defined. For instance axiom 2 says that for
any points P and Q with P ≠ Q there exists a line through P and Q; we do not
need to define the symbol 2 to express this. The same holds for the use of the
symbol 3.
Exercise 10.43. Prove that any two distinct non-parallel lines intersect in
exactly one point.
Exercise 10.44. Let A = {a, b} × {a, b} and let L ⊂ P(A) consist of all subsets
of 2 elements; there are 6 of them. Prove that (A, L) is an affine plane. (Again
one can reformulate everything without reference to the symbols 2 or 6; one simply
uses 2 or 6 letters and writes that they are pairwise unequal.)
Exercise 10.45. Let A = {a, b, c} × {a, b, c} . Find all subsets L ⊂ P(A) such
that (A, L) is an affine plane. (This is tedious !)
Definition 10.46. A projective plane is a pair (A, ℒ) where A is a set and
ℒ ⊂ P(A) is a set of subsets of A satisfying a series of axioms which we now
explain. Again it is convenient to introduce some terminology as follows. A is
called the projective plane. The elements P of A are called points. The elements
L of ℒ are called lines; so each such L ⊂ A. We say a point P lies on a line L if
P ∈ L; we also say that L passes through P . We say that two lines intersect if they
have a point in common; we say that two lines are parallel if they either coincide
or they do not intersect. We say that 3 points are collinear if they lie on the same
line. Here are the axioms that we impose:
1) There exist 3 points which are not collinear and any line has at least 3 points.
2) Any 2 distinct points lie on exactly one line.
3) Any 2 distinct lines meet in exactly one point.
Example 10.47. One can attach to any affine plane (A, ℒ) a projective plane
(Ā, ℒ̄) as follows. We introduce the relation ∥ on ℒ by letting L ∥ L′ if and only
if L and L′ are parallel. This is an equivalence relation (check!). Denote by L̂ the
equivalence class of L. Then we consider the set of equivalence classes, L∞ = ℒ/∥;
call this set the line at infinity. There exists a set Ā such that Ā = A ∪ L∞ and
A ∩ L∞ = ∅. Define a line in Ā to be either L∞ or a set of the form L̄ = L ∪ {L̂}.
Finally define ℒ̄ to be the set of all lines in Ā.
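The construction above can be carried out mechanically for the four-point affine plane of Exercise 10.44 and the result tested against axioms 1) to 3) of Definition 10.46. The following is a sketch, assuming Python frozensets as a model of sets; all names are illustrative and the code is a finite check, not a proof:

```python
from itertools import combinations

points = {(x, y) for x in "ab" for y in "ab"}             # A = {a, b} × {a, b}
lines = {frozenset(c) for c in combinations(points, 2)}   # all 2-element subsets of A

def parallel(L1, L2):
    return L1 == L2 or L1.isdisjoint(L2)

def cls(L):
    """Equivalence class of the line L under parallelism."""
    return frozenset(M for M in lines if parallel(L, M))

line_at_infinity = frozenset(cls(L) for L in lines)
proj_points = points | line_at_infinity
proj_lines = {L | {cls(L)} for L in lines} | {line_at_infinity}

def is_projective_plane(P, Ls):
    three_per_line = all(len(L) >= 3 for L in Ls)
    noncollinear = any(not any(set(t) <= L for L in Ls) for t in combinations(P, 3))
    two_points = all(sum(1 for L in Ls if p in L and q in L) == 1
                     for p, q in combinations(P, 2))
    two_lines = all(len(L & M) == 1 for L, M in combinations(Ls, 2))
    return three_per_line and noncollinear and two_points and two_lines

print(is_projective_plane(proj_points, proj_lines))  # True
```

Here the points at infinity are frozensets of lines while the affine points are tuples, so the two kinds of points cannot coincide; this is one concrete way to arrange A ∩ L∞ = ∅.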
Exercise 10.48. Explain why Ā exists. Check that (Ā, ℒ̄) is a projective plane.
Exercise 10.49. Describe the projective plane attached to the affine plane in
Exercise 10.44; how many points does it have? How many lines?
CHAPTER 11
Operations
3) ψa (X∆Y ) = ψa (X) + ψa (Y ),
4) ψa (¬X) = ¬ψa (X).
Next note that X = Y if and only if ψa (X) = ψa (Y ) for all a ∈ A. Use these
functions to reduce the present exercise to Exercise 11.12.
Definition 11.16. Given a subset X ⊂ A one can define the characteristic
function χ_X : A → {0, 1} by letting χ_X(a) = 1 if and only if a ∈ X; in other words
χ_X(a) = ψ_a(X).
Exercise 11.17. Prove that
1) χ_{X∨Y}(a) = χ_X(a) ∨ χ_Y(a),
2) χ_{X∧Y}(a) = χ_X(a) ∧ χ_Y(a),
3) χ_{X∆Y}(a) = χ_X(a) + χ_Y(a),
4) χ_{¬X}(a) = ¬χ_X(a).
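The four identities can be spot-checked on a small example. The sketch below rests on assumptions that are not stated in this excerpt: on subsets, ∨, ∧, ∆, ¬ are read as union, intersection, symmetric difference, and complement in A; on {0, 1}, ∨ is max, ∧ is min, + is addition modulo 2, and ¬ is x ↦ 1 − x. The names chi and check_identities are illustrative:

```python
def chi(X):
    """Characteristic function of X: returns 1 on elements of X and 0 otherwise."""
    return lambda a: 1 if a in X else 0

def check_identities(A, X, Y):
    """Check identities 1)-4) at every point of A."""
    return all(
        chi(X | Y)(a) == max(chi(X)(a), chi(Y)(a))
        and chi(X & Y)(a) == min(chi(X)(a), chi(Y)(a))
        and chi(X ^ Y)(a) == (chi(X)(a) + chi(Y)(a)) % 2
        and chi(A - X)(a) == 1 - chi(X)(a)
        for a in A
    )

print(check_identities({1, 2, 3, 4}, {1, 2}, {2, 3}))  # True
```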
Definition 11.18. An algebraic structure is a tuple (A, ⋆, •, ..., ¬, −, ..., 0, 1, ...)
where A is a set, ⋆, •, ... are binary operations, ¬, −, ... are unary operations, and
0, 1, ... are given elements of A. (Some of these may be missing; for instance we
could have only one binary operation, one given element, and no unary operations.)
Assume we are given two algebraic structures
(A, ⋆, •, ..., ¬, −, ..., 0, 1, ...) and (A′, ⋆′, •′, ..., ¬′, −′, ..., 0′, 1′, ...)
(with the same number of corresponding operations). A map F : A → A′ is called
a homomorphism if for all a, b ∈ A we have:
1) F(a ⋆ b) = F(a) ⋆′ F(b), F(a • b) = F(a) •′ F(b),...
2) F(¬a) = ¬′F(a), F(−a) = −′F(a),...
3) F(0) = 0′, F(1) = 1′,....
Example 11.19. A map F : A → A′ between two commutative unital rings is
called a homomorphism (of commutative unital rings) if for all a, b ∈ A we have:
1) F (a + b) = F (a) + F (b) and F (ab) = F (a)F (b),
2) F (−a) = −F (a) (prove that this is automatic !),
3) F (0) = 0 (prove that this is automatic !) and F (1) = 1.
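Conditions 1) to 3) can be tested exhaustively for maps between small finite rings. The sketch below assumes the rings of integers modulo n and modulo m, modelled with Python's % operator; these rings are not introduced in this excerpt, and is_ring_hom is an illustrative name:

```python
def is_ring_hom(F, n, m):
    """Check a function F from the integers mod n to the integers mod m against conditions 1)-3)."""
    R = range(n)
    additive = all(F((a + b) % n) == (F(a) + F(b)) % m for a in R for b in R)
    multiplicative = all(F((a * b) % n) == (F(a) * F(b)) % m for a in R for b in R)
    negation = all(F((-a) % n) == (-F(a)) % m for a in R)
    units = F(0) == 0 and F(1) == 1 % m
    return additive and multiplicative and negation and units

print(is_ring_hom(lambda a: a % 3, 6, 3))  # True
print(is_ring_hom(lambda a: 0, 6, 3))      # False: F(1) is not 1
```

The second call illustrates why F(1) = 1 is not marked as automatic: the zero map is additive and multiplicative but does not preserve 1.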
Exercise 11.20. Prove that if F : A → A′ is a homomorphism of algebraic
structures and F is bijective then its inverse F^{−1} : A′ → A is a homomorphism.
Such an F will be called an isomorphism.
Definition 11.21. A subset 𝒜 ⊂ P(A) is called a Boolean algebra of sets if
the following hold:
1) ∅ ∈ 𝒜, A ∈ 𝒜;
2) If X, Y ∈ 𝒜 then X ∩ Y ∈ 𝒜, X ∪ Y ∈ 𝒜, CX ∈ 𝒜.
(Hence (𝒜, ∨, ∧, C, ∅, A) is a Boolean algebra.)
Exercise 11.22. Prove that if 𝒜 is a Boolean algebra of sets then for any
X, Y ∈ 𝒜 we have X∆Y ∈ 𝒜. Prove that (𝒜, ∆, ∩, I, ∅, A) is a Boolean ring.
Definition 11.23. A subset ℬ ⊂ P(A) is called a Boolean ring of sets if the
following properties hold:
1) ∅ ∈ ℬ, A ∈ ℬ;
2) If X, Y ∈ ℬ then X ∩ Y ∈ ℬ, X∆Y ∈ ℬ.
(Hence (ℬ, ∆, ∩, I, ∅, A) is a Boolean ring.)
Exercise 11.24. Prove that any Boolean ring of sets B is a Boolean algebra
of sets.
Definition 11.25. A commutative unital ordered ring (or simply an ordered
ring) is a tuple
(R, +, ×, −, 0, 1, ≤)
where
(R, +, ×, −, 0, 1)
is a ring, ≤ is a total order on R, and for all a, b, c ∈ R the following axioms are
satisfied
1) If a < b then a + c < b + c;
2) If a < b and c > 0 then ac < bc.
We say that a ∈ R is positive if a > 0; and that a is negative if a < 0. We say
a is non-negative if a ≥ 0.
Exercise 11.26. Prove that the ring ({0, 1}, +, ×, −, 0, 1) has no structure of
ordered ring i.e., there is no order ≤ on {0, 1} such that ({0, 1}, +, ×, −, 0, 1, ≤) is
an ordered ring.
Remark 11.27. We cannot give examples yet of ordered rings. Later we will
see that the rings of integers, rationals, and reals have natural structures of ordered
rings.
Definition 11.28. Let R be an ordered ring and let R_+ = {a ∈ R; a ≥ 0}. A
finite measure space is a triple (A, 𝒜, µ) where A is a set, 𝒜 ⊂ P(A) is a Boolean
algebra of sets, and µ : 𝒜 → R_+ is a map satisfying the property that for any
X, Y ∈ 𝒜 with X ∩ Y = ∅ we have
µ(X ∪ Y ) = µ(X) + µ(Y ).
If in addition µ(A) = 1 we say (A, 𝒜, µ) is a finite probability measure. We say
that X, Y ∈ 𝒜 are independent if µ(X ∩ Y ) = µ(X) · µ(Y ).
Exercise 11.29. Prove that in a finite measure space µ(∅) = 0 and for any
X, Y ∈ 𝒜 we have
µ(X ∪ Y ) = µ(X) + µ(Y ) − µ(X ∩ Y ).
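A concrete instance may help before attempting the proof. The sketch below assumes the counting measure µ(X) = |X| on the full power set of a three-element set, with values in the integers (whose ordered ring structure is only announced in Remark 11.27); it checks additivity on disjoint sets and the formula of Exercise 11.29. The names are illustrative:

```python
from itertools import combinations

A = frozenset({1, 2, 3})
algebra = [frozenset(c) for k in range(len(A) + 1) for c in combinations(A, k)]
mu = len  # counting measure: mu(X) is the number of elements of X

def check_measure():
    for X in algebra:
        for Y in algebra:
            # additivity on disjoint sets (the defining property)
            if not (X & Y) and mu(X | Y) != mu(X) + mu(Y):
                return False
            # the formula of Exercise 11.29
            if mu(X | Y) != mu(X) + mu(Y) - mu(X & Y):
                return False
    return mu(frozenset()) == 0

print(check_measure())  # True
```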
Exercise 11.30. Let (A, ∨, ∧, ¬, 0, 1) be a Boolean algebra. For any a, b ∈ A
set
a + b = (a ∨ b) ∧ (¬(a ∧ b)).
Prove that (A, +, ∧, I, 0, 1) is a Boolean ring (I the identity map).
Exercise 11.31. Let (A, +, ×, −, 0, 1) be a Boolean ring. For any a, b ∈ A let
a ∨ b = a + b − ab
a ∧ b = ab
¬a = 1 − a.
Prove that (A, ∨, ∧, ¬, 0, 1) is a Boolean algebra.
Exercise 11.32. Let X be a set and (R, +, ·, −, 0, 1) a commutative unital
ring. Let R^X be the set of all functions X → R. For F, G ∈ R^X we define
F + G, F · G, −F, 0, 1 ∈ R^X by the formulae
(F + G)(x) = F (x) + G(x), (F · G)(x) = F (x) · G(x),
The discrete
CHAPTER 12
Integers
In this Chapter we introduce the ring Z of integers and we prove some easy
theorems about this concept.
Definition 12.1. A well ordered ring is an ordered ring (R, +, ×, 0, 1, ≤) with
1 ≠ 0 having the property that any non-empty subset of R which is bounded from
below has a minimum element.
Remark 12.2. If (R, +, ×, 0, 1, ≤) is a well ordered ring then (R, ≤) is not a
priori a well ordered set. But if R_{>0} = {a ∈ R; a > 0} then (R_{>0}, ≤) is a well
ordered set.
We have the following remarkable theorem in set theory Tset :
Theorem 12.3. There exists a well ordered ring.
Remark 12.4. The above theorem is formulated, as usual, in Argot; but it
should be understood as being a sentence Z in Lset of the form
∃r∃s∃p∃o∃u∃l(...)
where we take a variable r to stand for the ring, a variable s for the sum, p for
the product, o for 0, u for 1, l for ≤, and the dots stand for the corresponding
conditions in the definition of a well ordered ring, written in the language of sets.
The sentence Z is complicated so we preferred to give the theorem not as a sentence
in Lset but as a sentence in Argot. This kind of abuse is very common.
Remark 12.5. We are going to sketch the proof of Theorem 12.3 in an exercise
below. The proof is involved. A cheap way to avoid the proof of this theorem is as
follows: add this theorem to the ZFC axioms and let ZFC’ be the resulting enriched
system of axioms. Then replace Tset by the theory T′set with axioms ZFC’. This is
what all working mathematicians essentially do anyway.
Definition 12.6. We let Z, +, ×, 0, 1, ≤ be the witnesses for the sentence Z
above; we call Z the ring of integers. In particular the conditions in the definition
of rings (associativity, commutativity, etc.) and order (transitivity, etc.) become
theorems for Z. We also set N = {a ∈ Z; a > 0} and we call N the set of natural
numbers. Later we will prove the “essential uniqueness” of Z.
Remark 12.7. The only predicate in the language Lset of sets is ∈ and the con-
stants in this language are called sets. In particular when we consider the ordered
ring of integers (Z, +, ×, 0, 1, ≤) the symbols Z, +, ×, 0, 1, ≤, N are all constants
(they are sets). In particular +, × are not originally functional symbols and ≤ is
not originally a predicate. But, according to our conventions, we may introduce
functional symbols (still denoted by +, ×) and a predicate (still denoted by ≤) via
Induction
clearly G is a bijection. Now consider the map H : {1, ..., m}\{i} → {1, ..., m − 1}
defined by H(j) = j for 1 ≤ j ≤ i − 1 and H(j) = j − 1 for i + 1 ≤ j ≤ m. (The
definition is correct because for any j ∈ {1, ..., m}\{i} either j ≤ i − 1 or j ≥ i + 1;
cf. Exercise 12.10.) Clearly H is a bijection. We get a bijection
H ◦ G : {1, ..., n − 1} → {1, ..., m − 1}.
Since P (n − 1) is true we get n − 1 = m − 1. Hence n = m and we are done.
Exercise 13.5. Check that P (1) is true in the above Proposition.
Remark 13.6. Note the general strategy of proofs by induction. Say P (n)
is “about n objects.” There are two steps. The first step is the verification of
P (1) i.e., one verifies the statement “for one object.” For the second step (called
the induction step) one considers a situation with n objects; one “removes” from
that situation “one object” to get a “situation with n − 1 objects”; one uses the
“induction hypothesis” P (n − 1) to conclude the claim for the “situation with
n − 1 objects.” Then one tries to “go back” and prove that the claim is true for
the situation with n objects. So the second step is performed by “removing” one
object from an arbitrary situation with n objects and NOT by adding one object
to an arbitrary situation with n − 1 objects. Below is an example of a fallacious
reasoning by induction based on “adding” instead of “subtracting” an object.
Example 13.7. Here is a wrong argument for the induction step in the proof
of Proposition 13.4.
“Proof.” Let G : {1, ..., n − 1} → {1, ..., m − 1} be any bijection and let F :
{1, ..., n} → {1, ..., m} be defined by F (i) = G(i) for i ≤ n − 1 and F (n) = m.
Clearly F is a bijection. Now by the induction hypothesis n − 1 = m − 1. Hence
n = m. This ends the proof.
The mistake is that the above does not end the proof: the above argument
only covers bijections F : {1, ..., n} → {1, ..., m} constructed from bijections G :
{1, ..., n − 1} → {1, ..., m − 1} in the special way described above. In other words
an arbitrary bijection F : {1, ..., n} → {1, ..., m} does not always arise the way we
defined F in the above “proof.” In some sense the mistake we just pointed out is
that of defining the same constant twice (cf. Example 12.28): we were supposed to
define the symbol F as being an arbitrary bijection but then we redefined F in a
special way through an arbitrary G. The point is that if G is arbitrary and F is
defined as above in terms of G then F will not be arbitrary (because F will always
send n into m).
Definition 13.8. A set A is finite if there exists an integer n ≥ 0 and a
bijection F : {1, ..., n} → A. (Note that n is then unique by Proposition 13.4.)
We write |A| = n and we call this number the cardinality of A or the number of
elements of A. (Note that |∅| = 0.) If F(i) = a_i we write A = {a_1, ..., a_n}. A set is
infinite if it is not finite.
Exercise 13.9. Prove that |{2, 4, −6, 9, −100}| = 5.
Exercise 13.10. For any finite sets A and B we have that A ∪ B is finite and
|A ∪ B| + |A ∩ B| = |A| + |B|.
Hint: Reduce to the case A ∩ B = ∅. Then if F : {1, ..., a} → A and G : {1, ..., b} →
B are bijections prove that H : {1, ..., a + b} → A ∪ B defined by H(i) = F (i) for
1 ≤ i ≤ a and H(i) = G(i − a) for a + 1 ≤ i ≤ a + b is a bijection.
Exercise 13.11. For any finite sets A and B we have that A × B is finite and
|A × B| = |A| × |B|.
Hint: Induction on |A|.
Exercise 13.12. Let F : {1, ..., n} → Z be a map and write F(i) = a_i.
We refer to such a map as a (finite) family of numbers. Prove that there exists
a unique map G : {1, ..., n} → Z such that G(1) = a_1 and G(k) = G(k − 1) + a_k for
2 ≤ k ≤ n. Hint: Induction on n.
Definition 13.13. In the notation of the above Exercise define the (finite)
sum ∑_{i=1}^{n} a_i as the number G(n). We also write a_1 + ... + a_n for this sum. If
a_1 = ... = a_n = a the sum a_1 + ... + a_n is written as a + ... + a (n times).
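The recursion G(1) = a_1, G(k) = G(k − 1) + a_k that defines the finite sum translates directly into a loop. This is a minimal sketch in which a family is stored as a non-empty Python list (so a_k is a[k − 1]); the function names are illustrative:

```python
def partial_sums(a):
    """a: list [a_1, ..., a_n] with n >= 1; return [G(1), ..., G(n)]."""
    G = [a[0]]                       # G(1) = a_1
    for k in range(1, len(a)):
        G.append(G[k - 1] + a[k])    # G(k) = G(k - 1) + a_k, shifted to 0-based indices
    return G

def finite_sum(a):
    """The number G(n) of Definition 13.13."""
    return partial_sums(a)[-1]

print(partial_sums([2, 4, -6, 9]))  # [2, 6, 0, 9]
print(finite_sum([5, 5, 5]))        # 15
```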
Exercise 13.14. Prove that for any a, b ∈ N we have
a × b = a + ... + a (b times) = b + ... + b (a times).
Exercise 13.15. Define in a similar way the (finite) product ∏_{i=1}^{n} a_i (which
is also denoted by a_1...a_n = a_1 × ... × a_n). Prove the analogues of associativity and
distributivity for sums and products of families of numbers. Define a^b for a, b ∈ N
and prove that a^{b+c} = a^b × a^c and (a^b)^c = a^{bc}.
Exercise 13.16. Prove that if a is an integer and n is a natural number then
a^n − 1 = (a − 1)(a^{n−1} + a^{n−2} + ... + a + 1).
Hint: Induction on n.
Exercise 13.17. Prove that if a is an integer and n is an integer then
a^{2n+1} + 1 = (a + 1)(a^{2n} − a^{2n−1} + a^{2n−2} − ... − a + 1).
Hint: Set a = −b.
Exercise 13.18. Prove that a subset A ⊂ N is bounded if and only if it is
finite. Hint: To prove that bounded sets are finite assume this is false and let b
be the minimum natural number with the property that there is a set A bounded
from above by b and infinite. If b ∉ A then A is bounded from above by b − 1
(Exercise 12.10) and we are done. If b ∈ A then, by minimality of b, there is a
bijection A\{b} → {1, ..., m} and one constructs a bijection A → {1, ..., m + 1}
which is a contradiction. To prove that finite sets are bounded assume this is false
and let n be the minimum natural number with the property that there is a finite subset
A ⊂ N of cardinality n which is not bounded. Let F : {1, ..., n} → A be a bijection,
a_i = F(i). Then {a_1, ..., a_{n−1}} is bounded from above by some b and conclude that
A is bounded from above by either b or a_n.
Exercise 13.19. Prove that any subset of a finite set is finite. Hint: Use the
previous exercise.
Definition 13.20. Let A be a set and n ∈ N. Define the set A^n to be the set
A^{{1,...,n}} of all maps {1, ..., n} → A. Call
A⋆ = ⋃_{n=1}^{∞} A^n
Rationals
With the integers at our disposal one can use the axioms of set theory to
construct a whole array of familiar sets of numbers such as the rationals, the reals,
the imaginaries, etc. We start here with the rationals.
Definition 14.1. For any a, b ∈ Z with b ≠ 0 define the fraction a/b to be the
set of all pairs (c, d) with c, d ∈ Z, d ≠ 0 such that ad = bc. Denote by Q the set of
all fractions. So
a/b = {(c, d) ∈ Z × Z; d ≠ 0, ad = bc},
Q = {a/b; a, b ∈ Z, b ≠ 0}.
Example 14.2.
6/10 = {(6, 10), (−3, −5), (9, 15), ...} ∈ Q.
Exercise 14.3. Prove that a/b = c/d if and only if ad = bc. Hint: Assume ad = bc
and let us prove that a/b = c/d. We need to show that a/b ⊂ c/d and that c/d ⊂ a/b. Now if
(x, y) ∈ a/b then xb = ay; hence xbd = ayd. Since ad = bc we get xbd = bcy. Hence
b(xd − cy) = 0. Since b ≠ 0 we have xd − cy = 0 hence xd = cy hence (x, y) ∈ c/d.
We proved that a/b ⊂ c/d. The other inclusion is proved similarly. So the equality
a/b = c/d is proved. Conversely if one assumes a/b = c/d one needs to prove ad = bc; we
leave this to the reader.
Exercise 14.4. On the set A = Z × (Z\{0}) one can consider the relation:
(a, b) ∼ (c, d) if and only if ad = bc. Prove that ∼ is an equivalence relation. Then
observe that a/b is the equivalence class \widehat{(a, b)} of (a, b). Also observe
that Q = A/∼ is the quotient of A by the relation ∼.
Exercise 14.5. Prove that the map Z → Q, a ↦ a/1 is injective.
Definition 14.6. By abuse we identify a ∈ Z with a/1 ∈ Q and write a/1 = a;
this identifies Z with a subset of Q. Such identifications are very common and will
be done later in similar contexts.
Definition 14.7. Define a/b + c/d = (ad + bc)/(bd), a/b × c/d = (ac)/(bd).
Exercise 14.8. Show that the above definition is correct (i.e., if a/b = a′/b′, c/d = c′/d′
then (ad + bc)/(bd) = (a′d′ + b′c′)/(b′d′) and similarly for the product).
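The point of Exercise 14.8 is that the result of an operation must not depend on the chosen representatives. The sketch below works with representatives only, stored as Python tuples (a, b) with b ≠ 0, and uses the criterion ad = bc of Exercise 14.3 to compare them; the function names are illustrative and the check is a spot test, not a proof:

```python
def same_fraction(p, q):
    """(a, b) and (c, d) represent the same fraction if and only if ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

# (6, 10) and (-3, -5) represent the same fraction; so do (1, 2) and (4, 8).
print(same_fraction(add((6, 10), (1, 2)), add((-3, -5), (4, 8))))  # True
print(same_fraction(mul((6, 10), (1, 2)), mul((-3, -5), (4, 8))))  # True
```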
Exercise 14.9. Prove that Q (with the operations + and × defined above and
with the elements 0, 1) is a field.
1 + 2 + ... + n = n(n + 1)/2.
1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1)/6.
1^3 + 2^3 + ... + n^3 = n^2 (n + 1)^2/4.
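The three closed formulas above can be compared with the sums they describe for small n. This is a quick numerical check, not a proof (which would proceed by induction on n); the function name is illustrative and denominators are cleared so that only integer arithmetic is used:

```python
def check_power_sums(N):
    """Compare the three closed formulas with direct summation for n = 1, ..., N."""
    for n in range(1, N + 1):
        ks = range(1, n + 1)
        if 2 * sum(ks) != n * (n + 1):
            return False
        if 6 * sum(k ** 2 for k in ks) != n * (n + 1) * (2 * n + 1):
            return False
        if 4 * sum(k ** 3 for k in ks) != n ** 2 * (n + 1) ** 2:
            return False
    return True

print(check_power_sums(100))  # True
```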
CHAPTER 15
Combinatorics
Exercise 15.7. (Subsets) Prove that if |A| = n then |P(A)| = 2^n. (A set with
n elements has 2^n subsets.) Hint: Induction on n; if A = {a_1, ..., a_{n+1}} use
P(A) = {B ∈ P(A); a_{n+1} ∈ B} ∪ {B ∈ P(A); a_{n+1} ∉ B}.
Exercise 15.8. (Combinations) Let A be a set with |A| = n, let 0 ≤ k ≤ n,
and set
Comb(k, A) = {B ∈ P(A); |B| = k}.
Prove that
|Comb(k, A)| = \binom{n}{k}.
In other words a set of n elements has exactly \binom{n}{k} subsets with k elements. A
subset of A having k elements is called a combination of k elements from the set A.
Hint: Fix k and proceed by induction on n. If A = {a_1, ..., a_{n+1}} use Exercise
15.4 plus the fact that Comb(k, A) can be written as
{B ∈ P(A); |B| = k, a_{n+1} ∈ B} ∪ {B ∈ P(A); |B| = k, a_{n+1} ∉ B}.
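The count in Exercise 15.8 and the decomposition used in the hint can both be observed on a small set. The sketch below assumes Python's math.comb as the binomial coefficient and frozensets as subsets; comb_set is an illustrative name:

```python
from itertools import combinations
from math import comb

def comb_set(k, A):
    """Comb(k, A): the set of all k-element subsets of A."""
    return {frozenset(c) for c in combinations(A, k)}

A = {1, 2, 3, 4, 5}
print([len(comb_set(k, A)) for k in range(6)])  # [1, 5, 10, 10, 5, 1]

# The decomposition of the hint, with 5 playing the role of a_{n+1}:
with_last = {B for B in comb_set(2, A) if 5 in B}
without_last = {B for B in comb_set(2, A) if 5 not in B}
print(len(with_last), len(without_last))        # 4 6
```

The two pieces are disjoint and have comb(4, 1) and comb(4, 2) elements respectively, which is the shape of the induction step.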
Exercise 15.9. (Permutations) For a set A let Perm(A) ⊂ A^A be the set of all
bijections F : A → A. A bijection F : A → A is also called a permutation. Prove
that if |A| = n then
|Perm(A)| = n!.
So the exercise says that a set of n elements has n! permutations. Hint: Let
|A| = |B| = n and let Bij(A, B) be the set of all bijections F : A → B; it is enough
to show that |Bij(A, B)| = n!. Proceed by induction on n; if A = {a_1, ..., a_{n+1}},
B = {b_1, ..., b_{n+1}} then use the fact that
Bij(A, B) = ⋃_{k=1}^{n+1} {F ∈ Bij(A, B); F(a_1) = b_k}.
For d ∈ N and X a set let X^d be the set of all maps {1, ..., d} → X. We identify
a map i ↦ a_i with the tuple (a_1, ..., a_d).
Exercise 15.10. (Combinations with repetition) Let
Combrep(n, d) = {(x_1, ..., x_d) ∈ Z^d; x_i ≥ 0, x_1 + ... + x_d = n}.
Prove that
|Combrep(n, d)| = \binom{n+d−1}{d−1}.
Hint: Let A = {1, ..., n + d − 1}. Prove that there is a bijection
Comb(d − 1, A) → Combrep(n, d).
The bijection is given by attaching to any subset
{i_1, ..., i_{d−1}} ⊂ {1, ..., n + d − 1}
(where i_1 < ... < i_{d−1}) the tuple (x_1, ..., x_d) where
1) x_1 = |{i ∈ Z; 1 ≤ i < i_1}|,
2) x_k = |{i ∈ Z; i_{k−1} < i < i_k}|, for 2 ≤ k ≤ d − 1, and
3) x_d = |{i ∈ Z; i_{d−1} < i ≤ n + d − 1}|.
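The bijection of the hint sends a subset to the sizes of the gaps it leaves in {1, ..., n + d − 1}. The following is a sketch of that map, assuming Python tuples for the elements of Combrep(n, d); the names to_tuple and combrep are illustrative:

```python
from itertools import combinations
from math import comb

def to_tuple(subset, n, d):
    """Map {i_1 < ... < i_{d-1}} ⊂ {1, ..., n+d-1} to the gap sizes (x_1, ..., x_d)."""
    marks = [0] + sorted(subset) + [n + d]   # sentinels just outside {1, ..., n+d-1}
    return tuple(marks[k + 1] - marks[k] - 1 for k in range(d))

def combrep(n, d):
    """Image of Comb(d - 1, A) under the map above, with A = {1, ..., n+d-1}."""
    A = range(1, n + d)
    return {to_tuple(s, n, d) for s in combinations(A, d - 1)}

print(sorted(combrep(3, 2)))                 # [(0, 3), (1, 2), (2, 1), (3, 0)]
print(len(combrep(3, 3)) == comb(5, 2))      # True
```

Each gap size is a count of integers strictly between two consecutive marks, so the entries are non-negative and add up to (n + d − 1) − (d − 1) = n.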
CHAPTER 16
Sequences
For any set A if there exists an injection A → P(N) then either there exists an
injection A → N or there exists a bijection A → P(N).
One can ask if the above is a theorem. Answering this question (raised by
Cantor) leads to important investigations in set theory. The answer (given by two
theorems of Gödel and Cohen in the framework of mathematical logic rather than
logic) turned out to be rather surprising.
Part 4
The continuum
CHAPTER 17
Reals
Real numbers have been implicitly around throughout the history of mathe-
matics as an expression of the idea of continuity of magnitudes. What amounts to
an axiomatic introduction of the reals can be found in Euclid (and is attributed
to Eudoxus). The first construction of the reals from the “discrete” (i.e., from the
rationals) is due to Dedekind.
Definition 17.1. (Dedekind) A real number is a subset u ⊂ Q of the set Q of
rational numbers with the following properties:
1) u ≠ ∅ and u ≠ Q;
2) u has no minimum;
3) if x ∈ u, y ∈ Q, and x ≤ y then y ∈ u.
Denote by R the set of real numbers.
Example 17.2.
1) Any rational number x ∈ Q can be identified with the real number
u_x = {y ∈ Q; x < y}.
It is clear that u_x = u_{x′} for x, x′ ∈ Q implies x = x′. We identify any rational
number x with u_x. So we may view Q ⊂ R.
2) One defines, for instance, for any r ∈ Q with r ≥ 0,
√r = {x ∈ Q; x ≥ 0, x^2 > r}.
Definition 17.3. A real number u ∈ R is called irrational if u ∉ Q.
Definition 17.4. If u and v are real numbers we write u ≤ v if and only if
v ⊂ u. For u, v ≥ 0 define
u+v = {x + y; x ∈ u, y ∈ v}
u × v = uv = {xy; x ∈ u, y ∈ v}.
Note that this extends addition and multiplication on the non-negative ratio-
nals.
Exercise 17.5.
1) Prove that ≤ is a total order on R.
2) Prove that √r is a real number for r ∈ Q, r ≥ 0.
3) Prove that u + v and u × v are real numbers.
4) Naturally extend the definition of addition + and multiplication × of real
numbers to the case when the numbers are not necessarily ≥ 0. Prove that
(R, +, ×, −, 0, 1) is a field. Naturally extend the order ≤ on Q to an order on
R and prove that R with ≤ is an ordered ring.
Exercise 17.6. Define the sum and the product of a family of real (or complex)
numbers indexed by a finite set. Hint: Use the already defined concept for integers
(and hence for the rationals).
Exercise 17.7. Prove that for r ∈ Q, r ≥ 0, we have (√r)^2 = r. (Hint: The
harder part is to show that if z ∈ Q satisfies z > r then there exist x, y ∈ Q such
that x ≥ 0, y ≥ 0, z = xy, x^2 > r, y^2 > r. It is enough to show that there exists a
rational number ρ with z > ρ^2 > r because then we can write z = xy with x = ρ
and y = z/ρ > ρ. To show this it is enough to prove that for any n natural there
exist rational numbers t_n and s_n such that t_n^2 < r < s_n^2 and s_n − t_n ≤ (s_1 − t_1)/2^{n−1}.
For n = 1 take t_1 = 0 and s_1 such that s_1^2 > r. Assuming the above is true for
n one sets t_{n+1} = t_n and s_{n+1} = (t_n + s_n)/2 if t_n^2 ≤ r < ((t_n + s_n)/2)^2 and
one sets t_{n+1} = (t_n + s_n)/2 and s_{n+1} = s_n if ((t_n + s_n)/2)^2 < r < s_n^2; the case
r = ((t_n + s_n)/2)^2 is left to the reader.)
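The halving procedure of the hint can be run with exact rational arithmetic. The sketch below assumes Python's fractions.Fraction for rationals and the starting value s_1 = r + 1 (one admissible choice, since (r + 1)^2 > r for r ≥ 0); it keeps t^2 ≤ r rather than the strict inequality, so the case where the midpoint squares to r is absorbed into the left endpoint. The name bracket_sqrt is illustrative:

```python
from fractions import Fraction

def bracket_sqrt(r, steps):
    """Return rationals (t, s) with t^2 <= r < s^2 after `steps` halvings of [0, r + 1]."""
    r = Fraction(r)
    t, s = Fraction(0), r + 1          # t_1 = 0 and s_1 = r + 1
    for _ in range(steps):
        mid = (t + s) / 2
        if mid * mid > r:
            s = mid                    # keep r < s^2
        else:
            t = mid                    # keep t^2 <= r
    return t, s

t, s = bracket_sqrt(2, 20)
print(t * t <= 2 < s * s)              # True
print(s - t == Fraction(3, 2 ** 20))   # True: the width is halved at every step
```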
Exercise 17.8. Prove that for any r ∈ R with r > 0 there exists a unique
number √r ∈ R such that √r > 0 and (√r)^2 = r.
Exercise 17.9. Prove that √2 is irrational i.e., √2 ∉ Q. Hint: Assume there
exists a rational number x such that x^2 = 2 and seek a contradiction. Let a ∈ N
be minimal with the property that x = a/b for some b. Now a^2/b^2 = 2 hence 2b^2 = a^2.
Hence a^2 is even. Hence a is even (because if a were odd then a^2 would be odd).
Hence a = 2c for some integer c. Hence 2b^2 = (2c)^2 = 4c^2. Hence b^2 = 2c^2. Hence
b^2 is even. Hence b is even. Hence b = 2d for some integer d. Hence x = 2c/2d = c/d
and c < a. This contradicts the minimality of a which ends the proof.
Remark 17.10. The above proof is probably one of the “first” proofs by
contradiction in the history of mathematics; this proof appears, for instance, in Aristotle,
and it is believed to have been discovered by the Pythagoreans. The irrationality
of √2 was interpreted by the Greeks as evidence that arithmetic is insufficient to
control geometry (√2 is the length of the diagonal of a square with side 1) and
arguably created the first crisis in the history of mathematics, leading to a separation
of algebra and geometry that lasted until Fermat and Descartes.
Exercise 17.11. Prove that the set
{r ∈ Q; r > 0, r^2 < 2}
has no supremum in Q.
Remark 17.12. Later we will prove that R is not countable.
Definition 17.13. For any a ∈ R we let |a| be a or −a according as a ≥ 0 or
a ≤ 0, respectively.
Exercise 17.14. Prove the so-called triangle inequality:
|a + b| ≤ |a| + |b|
for all a, b ∈ R.
Definition 17.15. For a < b in R we define the open interval
(a, b) = {c ∈ R; a < c < b} ⊂ R.
(Not to be confused with the pair (a, b) ∈ R × R which is denoted by the same
symbol.)
CHAPTER 18
Topology
Definition 18.8. If T_0 ⊂ P(X) is a subset of the power set then the intersection
T = ⋂_{T′ ⊃ T_0} T′
continuous.
Exercise 18.15. Give an example of two topological spaces X, X′ and of a
bijection F : X → X′ such that F is continuous but F^{−1} is not continuous. (This
is to be contrasted with the situation of algebraic structures to be discussed later.
See Exercise 11.20.)
Motivated by the above phenomenon, one gives the following
Definition 18.16. A homeomorphism between two topological spaces is a
continuous bijection whose inverse is also continuous.
Definition 18.17. If X is a topological space and Y ⊂ X is a subset then the
set of all subsets of Y of the form U ∩ Y with U open in X form a topology on Y
called the induced topology.
Exercise 18.18. Prove that if X is a topological space and Y ⊂ X is open
then the induced topology on Y consists of all open sets of X that are contained in
Y.
Definition 18.19. Let X be a topological space and let A ⊂ X be a subset.
We say that A is connected if whenever U and V are open in X with U ∩ V ∩ A = ∅
and A ⊂ U ∪ V it follows that U ∩ A = ∅ or V ∩ A = ∅.
Exercise 18.20. Prove that if F : X → X′ is continuous and A ⊂ X is
connected then F(A) ⊂ X′ is connected.
Definition 18.21. Let X be a topological space and let A ⊂ X be a subset.
A point x ∈ X is called an accumulation point of A if for any open set U in X
containing x the set U \{x} contains a point of A.
Exercise 18.22. Let X be a topological space and let A ⊂ X be a subset.
Prove that A is closed if and only if A contains all its accumulation points.
Definition 18.23. Let X be a topological space and A ⊂ X. We say A is
compact if whenever
\[ A \subset \bigcup_{i \in I} U_i \]
with (U_i)_{i∈I} a family of open sets in X indexed by some set I there exists a finite
subset J ⊂ I such that
\[ A \subset \bigcup_{j \in J} U_j. \]
We sometimes refer to (U_i)_{i∈I} as an open cover of A and to (U_j)_{j∈J} as a finite open
subcover. So A is compact if and only if any open cover of A has a finite open
subcover.
Exercise 18.24. Prove that if X is a topological space and X is a finite set
then it is compact.
Exercise 18.25. Prove that R is not compact in the Euclidean topology. Hint:
Consider the open cover
\[ \mathbb{R} = \bigcup_{n \in \mathbb{N}} (-n, n) \]
and show it has no finite open subcover.
Exercise 18.26. Prove that no open interval (a, b) in R is compact (a < b).
Exercise 18.27. Prove that if F : X → X′ is a continuous map of topological
spaces and A ⊂ X is compact then F(A) ⊂ X′ is compact.
Definition 18.28. A topological space X is a Hausdorff space if for any two
distinct points x, y ∈ X there exist open sets U ⊂ X and V ⊂ X such that x ∈ U,
y ∈ V, and U ∩ V = ∅.
Exercise 18.29. Prove that R with the Euclidean topology is a Hausdorff space.
Exercise 18.30. Prove that if X is a Hausdorff space, A ⊂ X is compact, and
x ∈ X\A then there exist open sets U ⊂ X and V ⊂ X such that x ∈ U, A ⊂ V, and
U ∩ V = ∅. In particular any compact subset of a Hausdorff space is closed.
Hint: For any a ∈ A let U_a ⊂ X and V_a ⊂ X be open sets such that x ∈ U_a,
a ∈ V_a, U_a ∩ V_a = ∅. Then (V_a)_{a∈A} is an open covering of A. Select (V_b)_{b∈B} a finite
subcover of A where B ⊂ A is a finite set, B = {b_1, ..., b_n}. Then let
U = U_{b_1} ∩ ... ∩ U_{b_n},
V = V_{b_1} ∪ ... ∪ V_{b_n}.
Definition 18.31. Let X, X′ be topological spaces. Then the set X × X′
may be equipped with the topology generated by the family of all sets of the form
U × U′ where U and U′ are open in X and X′ respectively. This is called the
product topology on X × X′. Iterating this we get a product topology on a product
X_1 × ... × X_n of n topological spaces.
Exercise 18.32. Prove that for any r ∈ R with r > 0, the set
D = {(x, y) ∈ R^2; x^2 + y^2 < r^2}
is open in the product topology of R^2.
Definition 18.33. A topological manifold is a topological space X such that for
any point x ∈ X there exists an open set U ⊂ X containing x and a homeomorphism
F : U → V where V ⊂ R^n is an open set in R^n for the Euclidean topology. (Here
U and V are viewed as topological spaces with the topologies induced from X and
R^n, respectively.)
Remark 18.34. If 𝒳 is a set of topological manifolds then one can consider the
following relation ∼ on 𝒳: for X, X′ ∈ 𝒳 we let X ∼ X′ if and only if there exists a
homeomorphism X → X′. Then ∼ is an equivalence relation on 𝒳 and one of the
basic problems of topology is to “describe” the set 𝒳/∼ of equivalence classes in
various specific cases.
Imaginaries
Algebra
CHAPTER 20
Arithmetic
Our main aim here is to introduce some of the basic “arithmetic” of Z. In its
turn arithmetic can be used to introduce the finite rings Z/mZ of residue classes
modulo m and, in particular, the finite fields Fp = Z/pZ, where p is a prime. The
arithmetic of Z to be discussed below already appears in Euclid. Congruences and
residue classes were introduced by Gauss.
Definition 20.1. For integers a and b we say a divides b if there exists an
integer n such that b = an. We write a|b. We also say a is a divisor of b. If a does
not divide b we write a ∤ b.
Example 20.2. 4|20; −4|20; 6 ∤ 20.
Exercise 20.3. Prove that
1) if a|b and b|c then a|c;
2) if a|b and a|c then a|(b + c);
3) a|b defines an order relation on N but not on Z.
Theorem 20.4. (Euclid division) For any a ∈ Z and b ∈ N there exist unique
q, r ∈ Z such that a = bq + r and 0 ≤ r < b.
Proof. We prove the existence of q, r. The uniqueness is left to the reader. We
may assume a ∈ N. We proceed by contradiction. So assume there exist b and
a ∈ N such that for all q, r ∈ Z with 0 ≤ r < b we have a ≠ qb + r. Fix such a
b. We may assume a is minimum with the above property. If a < b we can write
a = 0 × b + a, a contradiction. If a = b we can write a = 1 × b + 0, a contradiction.
If a > b set a′ = a − b. Since a′ < a, there exist q′, r ∈ Z such that 0 ≤ r < b and
a′ = q′b + r. But then a = qb + r, where q = q′ + 1, a contradiction.
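The existence argument above can be read as a procedure: keep subtracting b until the remainder drops below b. The following Python sketch illustrates this under stated assumptions; the function name euclid_division is ours, not part of the text, and negative a is handled by adding b instead.

```python
def euclid_division(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < b, assuming b >= 1."""
    if b < 1:
        raise ValueError("b must be a positive integer")
    q, r = 0, a
    # Mirror the proof: replace a by a - b until the remainder is below b.
    while r >= b:
        q, r = q + 1, r - b
    # For negative a, add b until the remainder becomes non-negative.
    while r < 0:
        q, r = q - 1, r + b
    return q, r
```

For b ≥ 1 this should agree with Python's built-in divmod(a, b), which is the practical choice; the loop is only meant to make the proof concrete.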
Definition 20.5. For a ∈ Z denote by ⟨a⟩ the set {na; n ∈ Z} of integers divisible
by a. For a, b ∈ Z denote by ⟨a, b⟩ the set {ma + nb; m, n ∈ Z} of all numbers
expressible as a multiple of a plus a multiple of b.
Proposition 20.6. For any integers a, b there exists an integer c such that
⟨a, b⟩ = ⟨c⟩.
Proof. If a = b = 0 we can take c = 0. Assume a, b are not both 0. Then the
set S = ⟨a, b⟩ ∩ N is non-empty. Let c be the minimum of S. Clearly ⟨c⟩ ⊂ ⟨a, b⟩.
Let us prove that ⟨a, b⟩ ⊂ ⟨c⟩. Let u = ma + nb and let us prove that u ∈ ⟨c⟩. By
Euclidean division u = cq + r with 0 ≤ r < c. We want to show r = 0. Assume
r ≠ 0 and seek a contradiction. Write c = m′a + n′b. Then r ∈ N and also
r = u − cq = (ma + nb) − (m′a + n′b)q = (m − m′q)a + (n − n′q)b ∈ ⟨a, b⟩.
Hence r ∈ S. But r < c. So c is not the minimum of S, a contradiction.
Proposition 20.7. If a and b are integers and have no common divisor > 1
then there exist integers m and n such that ma + nb = 1.
Proof. By the above Proposition ⟨a, b⟩ = ⟨c⟩ for some c ≥ 1. In particular c|a
and c|b. The hypothesis implies c = 1 hence 1 ∈ ⟨a, b⟩.
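The proof above is non-constructive (it picks a minimum), but the integers m and n can also be computed. A minimal sketch, assuming the standard extended Euclidean algorithm, which is a different technique from the minimality argument in the text; the name bezout is hypothetical.

```python
def bezout(a: int, b: int) -> tuple[int, int, int]:
    """Return (c, m, n) with m*a + n*b == c and c >= 0 a generator of <a, b>."""
    # Invariant: old_r == old_m*a + old_n*b and r == m*a + n*b.
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    if old_r < 0:  # normalise so that the generator is non-negative
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n
```

When a and b have no common divisor > 1 the returned c is 1, so (m, n) is a pair as in Proposition 20.7; such a pair is not unique.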
One of the main definitions of number theory is
Definition 20.8. An integer p is prime if p > 1 and if its only positive divisors
are 1 and p.
Proposition 20.9. If p is a prime and a is an integer such that p ∤ a then
there exist integers m, n such that ma + np = 1.
Proof. a and p have no common divisor > 1 and we conclude by Proposition
20.7.
Proposition 20.10. (Euclid Lemma) If p is a prime and p|ab for integers a
and b then either p|a or p|b.
Proof. Assume p|ab, p ∤ a, p ∤ b, and seek a contradiction. By Proposition 20.9
ma + np = 1 for some integers m, n and m′b + n′p = 1 for some integers m′, n′. We
get
1 = (ma + np)(m′b + n′p) = mm′ab + p(nm′b + n′ma + nn′p).
Since p|ab we get p|1, a contradiction.
Theorem 20.11. (Fundamental Theorem of Arithmetic) Any integer n > 1
can be written uniquely as a product of primes, i.e., there exist primes p1 , p2 , ..., ps ,
where s ≥ 1, such that
n = p1 p2 ...ps .
Moreover any such representation is unique in the following sense: if
p1 p2 ...ps = q1 q2 ...qt
with pi and qj prime and p1 ≤ p2 ≤ ..., q1 ≤ q2 ≤ ... then s = t and p1 = q1 ,
p2 = q2 , ....
Proof. Uniqueness follows from Euclid’s Lemma 20.10. To prove the existence
part let S be the set of all integers > 1 which are not products of primes. We
want to show S = ∅. Assume the contrary and seek a contradiction. Let n be the
minimum of S. Then n is not prime. So n = ab with a, b > 1 integers. So a < n
and b < n. So a ∉ S and b ∉ S. So a and b are products of primes. So n is a
product of primes, a contradiction.
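The existence part can be made concrete by trial division, which repeatedly splits off the smallest divisor > 1 (necessarily a prime). This is only an illustrative sketch; prime_factors is a hypothetical name and the method is practical only for small n.

```python
def prime_factors(n: int) -> list[int]:
    """Return the primes p_1 <= p_2 <= ... <= p_s with n == p_1 * ... * p_s, for n > 1."""
    if n <= 1:
        raise ValueError("n must be an integer > 1")
    factors = []
    d = 2
    while d * d <= n:
        # Any d reached here that divides n is prime, since smaller primes were removed.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # what is left has no divisor d with d*d <= n, so it is prime
        factors.append(n)
    return factors
```

The output list is sorted, matching the normalisation p_1 ≤ p_2 ≤ ... used in the uniqueness statement.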
Exercise 20.12. Prove the uniqueness part in the above theorem.
Definition 20.13. Fix an integer m ≠ 0. Define a relation ≡_m on Z by a ≡_m b
if and only if m|(a − b). Say a is congruent to b mod m (or modulo m). Instead of
a ≡_m b one usually writes (following Gauss):
a ≡ b (mod m).
Example 20.14. 3 ≡ 17 (mod 7).
Exercise 20.15. Prove that ≡_m is an equivalence relation. Prove that the
equivalence class ā of a consists of all the numbers of the form mb + a where b ∈ Z.
Example 20.16. If m = 7 then 3̄ = 1̄0̄ = {..., −4, 3, 10, 17, ...}.
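As a quick sanity check of the definition, the sketch below tests congruence directly and lists a finite window of a residue class. Both function names are hypothetical and the window size is an arbitrary choice.

```python
def congruent(a: int, b: int, m: int) -> bool:
    """True when a ≡ b (mod m), i.e. when m divides a - b (m != 0)."""
    return (a - b) % m == 0

def residue_class_window(a: int, m: int, span: int) -> list[int]:
    """Elements m*b + a of the class of a, for b = -span, ..., span."""
    return [m * b + a for b in range(-span, span + 1)]
```

For instance, residue_class_window(3, 7, 2) lists five consecutive members of the class of 3 modulo 7, including the numbers shown in Example 20.16.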
Groups
Our next chapters investigate a few topics in algebra. Recall that algebra is the
study of algebraic structures, i.e., sets with operations on them. We already intro-
duced, and constructed, some elementary examples of algebraic structures such as
rings and, in particular, fields. With rings/fields at our disposal one can study some
other fundamental algebraic objects such as groups, vector spaces, polynomials. In
what follows we briefly survey some of these. We begin with groups. In some sense
groups are more fundamental than rings and fields; but in order to be able to look
at more interesting examples we found it convenient to postpone the discussion of
groups until this point. Groups appeared in mathematics in the context of symme-
tries of roots of polynomial equations; cf. the work of Galois that involved finite
groups. Galois’ work inspired Lie who investigated differential equations in place of
polynomial equations; this led to (continuous) Lie groups, in particular groups of
matrices. Groups eventually penetrated most of mathematics and physics (Klein,
Poincaré, Einstein, Cartan, Weyl).
Definition 21.1. A group is a tuple (G, ⋆, ′, e) consisting of a set G, a binary
operation ⋆ on G, a unary operation ′ on G (write ′(x) = x′), and an element e ∈ G
(called the identity element) such that for any x, y, z ∈ G the following axioms are
satisfied:
1) x ⋆ (y ⋆ z) = (x ⋆ y) ⋆ z;
2) x ⋆ e = e ⋆ x = x;
3) x ⋆ x′ = x′ ⋆ x = e.
If in addition x ⋆ y = y ⋆ x for all x, y ∈ G we say G is commutative (or Abelian in
honor of Abel).
Remark 21.2. For any group G, any element g ∈ G, and any n ∈ Z one defines
g^n ∈ G exactly as in Exercise 13.15.
Exercise 21.3. Check the above.
Definition 21.4. Sometimes one writes e = 1, x ⋆ y = xy, x′ = x^{−1}, x ⋆ ... ⋆ x =
x^n (n ≥ 1 times). In the Abelian case one sometimes writes e = 0, x ⋆ y = x + y,
x′ = −x, x ⋆ ... ⋆ x = nx (n ≥ 1 times). These notations depend on the context and
are justified by the following examples.
Example 21.5. If R is a ring then R is an Abelian group with e = 0, x ⋆
y = x + y, x′ = −x. Hence Z, Z/mZ, F_p, Z_p, Q, R, C are groups “with respect to
addition.”
Example 21.6. If R is a field then R^× = R\{0} is an Abelian group with
e = 1, x ⋆ y = xy, x′ = x^{−1}. Hence Q^×, R^×, C^×, F_p^× are groups “with respect to
multiplication.”
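For finite examples such as Z/mZ and F_p^× the axioms of Definition 21.1 can be checked by brute force. The sketch below is an illustration only; is_group is a hypothetical helper, residues are represented by the integers 0, ..., m − 1, and the inverse in F_p^× is computed as x^{p−2} mod p, which relies on Fermat's Little Theorem.

```python
from itertools import product

def is_group(elements, op, inv, e) -> bool:
    """Brute-force check of axioms 1)-3) on a finite set, plus closure."""
    G = set(elements)
    if e not in G:
        return False
    for x in G:
        if inv(x) not in G:
            return False
        if op(x, e) != x or op(e, x) != x:            # axiom 2
            return False
        if op(x, inv(x)) != e or op(inv(x), x) != e:  # axiom 3
            return False
    for x, y, z in product(G, repeat=3):
        if op(x, y) not in G:                         # the operation stays in G
            return False
        if op(x, op(y, z)) != op(op(x, y), z):        # axiom 1
            return False
    return True

m, p = 6, 7
additive = is_group(range(m), lambda x, y: (x + y) % m, lambda x: (-x) % m, 0)
multiplicative = is_group(range(1, p), lambda x, y: (x * y) % p,
                          lambda x: pow(x, p - 2, p), 1)
```

The check is cubic in the size of G, so it is only meant for very small groups.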
Order
Vectors
does not generate V . We claim that u1 , ..., un are linearly independent. Assume
not. Hence there exists (a_1, ..., a_n) ≠ (0, ..., 0) such that a_1u_1 + ... + a_nu_n = 0.
We may assume a1 = 1. Then one checks that u2 , ..., un generate V , contradicting
minimality.
Exercise 23.9. Assume R = F_p and V has a basis with n elements. Then
|V| = p^n.
Theorem 23.10. If V has a basis u1 , ..., un and a basis v1 , ..., vm then n = m.
Proof. We prove m ≤ n; similarly one has n ≤ m. Assume m > n and seek
a contradiction. Since u_1, ..., u_n generate V we may write v_1 = a_1u_1 + ... + a_nu_n
with not all a_1, ..., a_n zero. Renumbering u_1, ..., u_n we may assume a_1 ≠ 0. Hence
v_1, u_2, ..., u_n generate V. Hence v_2 = b_1v_1 + b_2u_2 + ... + b_nu_n. But not all b_2, ..., b_n
can be zero because v_1, v_2 are linearly independent. So renumbering u_2, ..., u_n
we may assume b_2 ≠ 0. So v_1, v_2, u_3, ..., u_n generate V. Continuing (one needs
induction) we get that v_1, v_2, ..., v_n generate V. So v_{n+1} = d_1v_1 + ... + d_nv_n. But
this contradicts the fact that v_1, ..., v_m are linearly independent.
Exercise 23.11. Give a quick proof of the above theorem in case R = F_p.
Hint: We have p^n = p^m hence n = m.
Definition 23.12. We say V is finite dimensional (or that it has a finite basis)
if there exists a basis u1 , ..., un of V . Then we define the dimension of V to be n;
write dim V = n. (The definition is correct due to Theorem 23.10.)
Definition 23.13. If V and W are vector spaces a map F : V → W is called
linear if for all a ∈ R, u, v ∈ V we have:
1) F(au) = aF(u),
2) F(u + v) = F(u) + F(v).
Example 23.14. If a, b, c, d, e, f ∈ R then the map F : R^3 → R^2 given by
F(u, v, w) = (au + bv + cw, du + ev + fw)
is a linear map.
Exercise 23.15. Prove that if F : V → W is a linear map of vector spaces
then V′ = F^{−1}(0) and V″ = F(V) are vector spaces (with respect to the obvious
operations). If in addition V and W are finite dimensional then V′ and V″ are
finite dimensional and
dim V = dim V′ + dim V″.
Hint: Construct corresponding bases.
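Over R = F_p the dimension formula can be checked by counting: by Exercise 23.9 a space of dimension d has p^d elements, so the formula amounts to |V| = |V′| · |V″|. The sketch below enumerates a linear map F_p^n → F_p^k given by a coefficient matrix; the name kernel_and_image and the sample matrix are assumptions made for illustration.

```python
from itertools import product

def kernel_and_image(matrix, p):
    """Enumerate V' = F^{-1}(0) and V'' = F(V) for the map u -> matrix * u over F_p."""
    n = len(matrix[0])
    kernel, image = [], set()
    for u in product(range(p), repeat=n):
        w = tuple(sum(a * x for a, x in zip(row, u)) % p for row in matrix)
        image.add(w)
        if not any(w):  # w is the zero vector
            kernel.append(u)
    return kernel, image

# A map F_5^3 -> F_5^2 whose second row is twice the first.
kernel, image = kernel_and_image([[1, 2, 3], [2, 4, 6]], 5)
```

Here len(kernel) * len(image) can be compared with 5**3, the number of elements of V.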
Exercise 23.16. Give an example of a vector space that is not finite dimen-
sional.
CHAPTER 24
Matrices
Matrices appeared in the context of linear systems of equations and were studied
in the work of Leibniz, Cramer, Cayley, Eisenstein, Hamilton, Sylvester, Jordan,
etc. They were later rediscovered and applied in the context of Heisenberg’s matrix
mechanics. Nowadays they are a standard concept in linear algebra courses.
Definition 24.1. Let m, n ∈ N. An m × n matrix with coefficients in a field
R is a map
A : {1, ..., m} × {1, ..., n} → R.
If A(i, j) = a_{ij} for 1 ≤ i ≤ m, 1 ≤ j ≤ n then we write
\[ A = (a_{ij}) = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}. \]
We denote by
R^{m×n} = M_{m×n}(R)
the set of all m × n matrices. We also write M_n(R) = M_{n×n}(R). Note that R^{1×n}
identifies with R^n; its elements are of the form
(a_1, ..., a_n)
and are called row matrices. Similarly the elements of R^{m×1} are of the form
\[ \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix} \]
and are called column matrices. If A = (a_{ij}) ∈ R^{m×n} then
\[ u_1 = \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \ \dots, \ u_n = \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix} \]
are called the columns of A and we also write
A = (u_1, ..., u_n).
Similarly
(a_{11}, ..., a_{1n}), ..., (a_{m1}, ..., a_{mn})
are called the rows of A.
Definition 24.2. Let 0 ∈ R^{m×n} be the matrix 0 = (z_{ij}) with z_{ij} = 0 for all i, j;
0 is called the zero matrix. Let I ∈ R^{n×n} be the matrix I = (δ_{ij}) where δ_{ii} = 1 for
all i and δ_{ij} = 0 for all i ≠ j; I is called the identity matrix and δ_{ij} is called the
Kronecker symbol.
Definition 24.3. If A = (a_{ij}), B = (b_{ij}) ∈ R^{m×n} we define the sum
C = A + B ∈ R^{m×n}
as
C = (c_{ij}), c_{ij} = a_{ij} + b_{ij}.
If A = (a_{is}) ∈ R^{m×k}, B = (b_{sj}) ∈ R^{k×n}, we define the product
C = AB ∈ R^{m×n}
as
\[ C = (c_{ij}), \quad c_{ij} = \sum_{s=1}^{k} a_{is} b_{sj}. \]
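The sum and product formulas translate directly into code. A minimal sketch, assuming matrices are stored as lists of rows with 0-based indices (so a_{ij} is A[i-1][j-1]); the names mat_add and mat_mul are ours.

```python
def mat_add(A, B):
    """Entrywise sum c_ij = a_ij + b_ij of two m x n matrices."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Product c_ij = sum over s of a_is * b_sj, for A of size m x k and B of size k x n."""
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    n = len(B[0])
    return [[sum(A[i][s] * B[s][j] for s in range(k)) for j in range(n)]
            for i in range(len(A))]
```

Multiplying by the identity matrix of Definition 24.2 on either side should return the original matrix, which gives a simple check of the indexing.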
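Exercise 24.9. Consider a matrix A = (a_{ij}) ∈ R^{m×n} and consider the map
F : R^{n×1} → R^{m×1}, F(u) = Au (product of matrices).
Then the matrix of F with respect to the canonical bases of R^{n×1} and R^{m×1} is A
itself.
Hint: Let e^1, ..., e^n be the standard basis of R^{n×1} and let f^1, ..., f^m be the
standard basis of R^{m×1}. Then
\[ F(e^1) = A e^1 = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} = a_{11} f^1 + \dots + a_{m1} f^m. \]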
Determinants
One easily checks that fn is multilinear, alternating, and takes value 1 on the
identity matrix In .
Exercise 25.8. Check the last sentence in the proof above.
Lemma 25.9. If f and g are multilinear alternating maps R^{n×n} → R and
f(I) ≠ 0 then there exists c ∈ R such that g(A) = cf(A) for all A.
Proof. Let A = (a_{ij}). Let e^1, ..., e^n be the standard basis of R^{n×1}. Then
\[ g(A) = g\Big(\sum_{i_1} a_{i_1 1} e^{i_1}, \dots, \sum_{i_n} a_{i_n n} e^{i_n}\Big) = \sum_{i_1} \dots \sum_{i_n} a_{i_1 1} \dots a_{i_n n}\, g(e^{i_1}, \dots, e^{i_n}). \]
The terms for which i_1, ..., i_n are not distinct are zero. The terms for which i_1, ..., i_n
are distinct are indexed by permutations σ. By Exercise 25.6 we get
\[ g(A) = \Big(\sum_{\sigma} \epsilon(\sigma)\, a_{\sigma(1)1} \dots a_{\sigma(n)n}\Big) g(I). \]
A similar formula holds for f(A) and the Lemma follows.
By Lemmas 25.7 and 25.9 we get:
Theorem 25.10. There exists a unique multilinear alternating map (called de-
terminant)
det : Rn×n → R
such that det(I) = 1.
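The proof of Lemma 25.9 expresses any such map as a sum over permutations σ of signed products a_{σ(1)1}...a_{σ(n)n}. The sketch below evaluates that sum directly. It assumes the sign of σ is computed by counting inversions (one standard way, not necessarily the definition used in the text), uses 0-based indices, and is feasible only for small n since it visits all n! permutations.

```python
from itertools import permutations

def sign(perm) -> int:
    """Sign of a permutation of 0..n-1, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Sum over permutations s of sign(s) * a[s(0)][0] * ... * a[s(n-1)][n-1]."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]  # entry in row sigma(col), column col
        total += term
    return total
```

Two properties named in Theorem 25.10 can be spot-checked with it: the value on the identity matrix, and the change of sign when two columns are swapped.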
Exercise 25.11. Using the notation in the proof of Lemma 25.7 prove that:
1) For all i we have
\[ \det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(A_{ij}). \]
Polynomials
Determining the roots of polynomials was one of the most important motivat-
ing problems in the development of algebra, especially in the work of Cardano,
Lagrange, Gauss, Abel, and Galois. Here we introduce polynomials and discuss
some basic facts about their roots.
Definition 26.1. Let R be a ring. We define the ring of polynomials R[x]
in one variable with coefficients in R as follows. An element of R[x] is a map
f : N ∪ {0} → R, i ↦ a_i with the property that there exists i_0 ∈ N such that for
all i ≥ i_0 we have a_i = 0; we also write such a map as
f = (a_0, a_1, a_2, a_3, ...).
We define 0, 1 ∈ R[x] by
0 = (0, 0, 0, 0, ...),
1 = (1, 0, 0, 0, ...).
If f is as above and g = (b_0, b_1, b_2, b_3, ...) then addition and multiplication are
defined by
f + g = (a_0 + b_0, a_1 + b_1, a_2 + b_2, a_3 + b_3, ...),
fg = (a_0b_0, a_0b_1 + a_1b_0, a_0b_2 + a_1b_1 + a_2b_0, a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0, ...).
We define the degree of f = (a_0, a_1, a_2, a_3, ...) as
deg(f) = max{i; a_i ≠ 0}
if f ≠ 0 and deg(0) = 0. We also define
x = (0, 1, 0, 0, ...)
and we write
a = (a, 0, 0, 0, ...)
for any a ∈ R.
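The multiplication rule says that the i-th coefficient of fg is the sum of all a_j b_k with j + k = i. A small sketch, assuming a polynomial is stored as the finite list [a_0, a_1, ..., a_n] of its coefficients (the trailing zeros of the infinite tuple are simply omitted); poly_add and poly_mul are hypothetical names.

```python
def poly_add(f, g):
    """Coefficientwise sum; the shorter list is padded with zeros."""
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Coefficient i of the product is the sum of f[j] * g[k] over j + k == i."""
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for j, a in enumerate(f):
        for k, b in enumerate(g):
            h[j + k] += a * b
    return h
```

With x represented as [0, 1], poly_mul([0, 1], [0, 1]) gives the list for x^2.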
Exercise 26.2.
1) Prove that R[x] with the operations above is a ring.
2) Prove that the map R → R[x], a ↦ a = (a, 0, 0, 0, ...) is a ring homomorphism.
3) Prove that x^2 = (0, 0, 1, 0, ...), x^3 = (0, 0, 0, 1, 0, ...), etc.
4) Prove that if f = (a_0, a_1, a_2, a_3, ...) then
f = a_nx^n + ... + a_1x + a_0
where n = deg(f). (We also write f = f(x) but we DO NOT SEE f(x) as a
function; this is just a notation.)
Congruences
We discuss here polynomial congruences which lie at the heart of number theory.
The main results below are due to Fermat, Lagrange, Euler, and Gauss.
Definition 27.1. Let f(x) ∈ Z[x] be a polynomial and p a prime. An integer
c ∈ Z is called a root of f(x) mod p (or a solution to the congruence f(x) ≡
0 (mod p)) if f(c) ≡ 0 (mod p); in other words if p|f(c). Let f̄ ∈ F_p[x] be the
polynomial obtained from f ∈ Z[x] by replacing the coefficients of f with their
images in F_p. Then c is a root of f mod p if and only if the image c̄ of c in F_p is
a root of f̄. We denote by N_p(f) the number of roots of f(x) mod p contained in
{0, 1, ..., p − 1}; equivalently N_p(f) is the number of roots of f̄ in F_p. If f, g are
polynomials in Z[x] we write N_p(f = g) for N_p(f − g). If Z_p(f) is the set of roots
of f̄ in F_p then of course N_p(f) = |Z_p(f)|.
Exercise 27.2.
1) 3 is a root of x^3 + x − 13 mod 17.
2) Any integer a is a root of x^p − x mod p; this is Fermat’s Little Theorem. In
particular N_p(x^p − x) = p, N_p(x^{p−1} = 1) = p − 1.
3) N_p(ax − b) = 1 if p ∤ a.
4) N_p(x^2 = 1) = 2 if p ≠ 2.
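For small p the number N_p(f) can be found by simply trying every c in {0, 1, ..., p − 1}. The sketch below assumes f is given by its coefficient list [a_0, a_1, ..., a_n] and evaluates it with Horner's rule; the function names are hypothetical. It is a numerical check, not a substitute for the proofs asked for above.

```python
def poly_value_mod(coeffs, c, p):
    """Value of a_0 + a_1*c + ... + a_n*c^n modulo p, by Horner's rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * c + a) % p
    return acc

def count_roots_mod_p(coeffs, p):
    """N_p(f): the number of c in {0, ..., p-1} with f(c) ≡ 0 (mod p)."""
    return sum(1 for c in range(p) if poly_value_mod(coeffs, c, p) == 0)
```

For example, x^3 + x − 13 corresponds to the list [-13, 1, 0, 1], and x^7 − x to [0, -1, 0, 0, 0, 0, 0, 1].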
Proposition 27.3. For any two polynomials f, g ∈ Z[x] we have
Np (f g) ≤ Np (f ) + Np (g).
Proof. Clearly Zp (f g) ⊂ Zp (f ) ∪ Zp (g). Hence
|Zp (f g)| = |Zp (f ) ∪ Zp (g)| ≤ |Zp (f )| + |Zp (g)|.
Exercise 27.4. Consider the polynomials
f(x) = x^{p−1} − 1 and g(x) = (x − 1)(x − 2)...(x − p + 1) ∈ Z[x].
Prove that all the coefficients of the polynomial f(x) − g(x) are divisible by p.
Conclude that p divides the sums
\[ \sum_{a=1}^{p-1} a = 1 + 2 + 3 + \dots + (p-1) \]
and
\[ \sum_{1 \le a < b \le p-1} ab = 1 \times 2 + 1 \times 3 + \dots + 1 \times (p-1) + 2 \times 3 + \dots + 2 \times (p-1) + \dots + (p-2) \times (p-1). \]
1) Understand the set of primes p such that the congruence f (x) ≡ 0 (mod p)
has a solution or, equivalently, such that p|f (c) for some c ∈ Z.
Exercise 27.12. Prove that g is a primitive root mod p if and only if it is not
divisible by p and
g^{(p−1)/q} ≢ 1 (mod p)
for all primes q|(p − 1).
Exercise 27.13. Prove that 3 is a primitive root mod 7 but 2 is not a primitive
root mod 7.
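The criterion of Exercise 27.12 gives a quick computational test. The sketch below assumes p is prime (this is not checked) and finds the primes q dividing p − 1 by trial division; is_primitive_root is a hypothetical name.

```python
def is_primitive_root(g: int, p: int) -> bool:
    """Test g^((p-1)/q) != 1 (mod p) for every prime q dividing p - 1; p is assumed prime."""
    if g % p == 0:
        return False
    # Collect the distinct primes q dividing p - 1.
    n, q, primes = p - 1, 2, []
    while q * q <= n:
        if n % q == 0:
            primes.append(q)
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        primes.append(n)
    return all(pow(g, (p - 1) // q, p) != 1 for q in primes)
```

Running it with p = 7 on g = 3 and g = 2 gives a numerical check of Exercise 27.13, though of course not a proof.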
The following Theorem about the existence of primitive roots was proved by
Gauss:
Theorem 27.14. (Gauss) If p is a prime there exists a primitive root mod p.
Equivalently the group F_p^× is cyclic.
Geometry
CHAPTER 28
Lines
affine plane over some field if and only if the theorems of Desargues (Parts I and
II) and Pappus (stated below) hold. See below for the “only if direction.”
Exercise 28.5. Prove that any line in Fp × Fp has exactly p points.
Exercise 28.6. How many lines are there in the plane Fp × Fp ?
Exercise 28.7. (Desargues’ Theorem, Part I) Let A_1, A_2, A_3, A′_1, A′_2, A′_3 be
distinct points in the plane. Also for all i ≠ j assume A_iA_j and A′_iA′_j are not
parallel and let P_{ij} be their intersection. Assume the 3 lines A_1A′_1, A_2A′_2, A_3A′_3
have a point in common. Then prove that the points P_{12}, P_{13}, P_{23} are collinear
(i.e., lie on some line). Hint: Consider the “space” R × R × R and define planes and
lines in this space. Prove that if two planes meet and don’t coincide then they meet
in a line. Then prove that through any two points in space there is a unique line
and through any 3 non-collinear points there is a unique plane. Now consider the
projection R × R × R → R × R, (x, y, z) ↦ (x, y) and show that lines project onto
lines. Next show that the configuration of points A_i, A′_i ∈ R × R can be realized as the
projection of a similar configuration of points B_i, B′_i ∈ R × R × R not contained in a
plane. (Identifying R × R with the set of points in space with zero third coordinate
we take B_i = A_i, B′_i = A′_i for i = 1, 2, we let B_3 have a nonzero third coordinate,
and then we choose B′_3 such that the lines B_1B′_1, B_2B′_2, B_3B′_3 have a point in
common.) Then prove “Desargues’ Theorem in Space” (by noting that if Q_{ij} is the
intersection of B_iB_j with B′_iB′_j then Q_{ij} is in the plane containing B_1, B_2, B_3 and
also in the plane containing B′_1, B′_2, B′_3; hence Q_{ij} is in the intersection of these
planes which is a line). Finally deduce the original plane Desargues by projection.
Exercise 28.8. (Desargues’ Theorem, Part II) Let A_1, A_2, A_3, A′_1, A′_2, A′_3 be
distinct points in the plane. Assume the 3 lines A_1A′_1, A_2A′_2, A_3A′_3 have a point in
common or they are parallel. Assume A_1A_2 is parallel to A′_1A′_2 and A_1A_3 is parallel
to A′_1A′_3. Prove that A_2A_3 is parallel to A′_2A′_3. Hint: Compute coordinates. There
is an alternative proof that reduces Part II to Part I by using the “projective plane
over our field.”
Exercise 28.9. (Pappus’ Theorem) Let P_1, P_2, P_3 be points on a line L and
let Q_1, Q_2, Q_3 be points on a line M ≠ L. Assume the lines P_2Q_3 and P_3Q_2 are
not parallel and let A_1 be their intersection; define A_2, A_3 similarly. Then prove
that A_1, A_2, A_3 are collinear. Hint (for the case L and M meet): One can assume
L = {(x, 0); x ∈ R}, M = {(0, y); y ∈ R} (explain why). Let the points P_i = (x_i, 0)
and Q_i = (0, y_i) and compute the coordinates of A_i. Then check that the line
through A_1 and A_2 passes through A_3.
Remark 28.10. One can identify the projective plane (Ā^2, L̄) attached to the
affine plane (A^2, L) with the pair (P^2, P̌^2) defined as follows. Let P^2 = (R^3\{(0, 0, 0)})/∼ where
(x, y, z) ∼ (x′, y′, z′) if and only if there exists 0 ≠ λ ∈ R such that (x′, y′, z′) =
(λx, λy, λz). Denote the equivalence class of (x, y, z) by (x : y : z). Identify a point
(x, y) in the affine plane A^2 = R^2 = R × R with the point (x : y : 1) ∈ P^2. Identify
a point (x_0 : y_0 : 0) in the complement P^2\A^2 with the class of lines in A^2 parallel
to the line y_0x − x_0y = 0. This allows one to identify the complement P^2\A^2 with
the line at infinity L_∞ of A^2. Hence we get an identification of P^2 with Ā^2. Finally
define a line in P^2 as a set of the form
L = {(x : y : z); ax + by + cz = 0}.
Conics
So far we were concerned with lines in the plane. Let us discuss now “higher
degree curves.” We start with conics. Assume R is a field with 2 = 1 + 1 ≠ 0;
equivalently R does not contain the field F_2.
Definition 29.1. The circle of center (a, b) ∈ R × R and radius r is the set
C(R) = {(x, y) ∈ R × R; (x − a)^2 + (y − b)^2 = r^2}.
Definition 29.2. A line is tangent to a circle if it meets it in exactly one point.
(We say that the line is tangent to the circle at that point.) Two circles are tangent
if they meet in exactly one point.
Exercise 29.3. Prove that for any circle and any point on it there is exactly
one line tangent to the circle at that point.
Exercise 29.4. Prove that:
1) A circle and a line meet in at most 2 points.
2) Two circles meet in at most 2 points.
Exercise 29.5. How many points does a circle of radius 1 have if R = F13 ?
Same problem for F11 .
Exercise 29.6. Prove that the circle C(R) with center (0, 0) and radius 1 is
an Abelian group with e = (1, 0), (x, y)′ = (x, −y), and group operation
(x_1, y_1) ⋆ (x_2, y_2) = (x_1x_2 − y_1y_2, x_1y_2 + x_2y_1).
Prove that the map
\[ C(R) \to SO_2(R), \quad (a, b) \mapsto \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \]
is a bijective group homomorphism. (Cf. Exercise 21.14 for SO_2(R).)
Exercise 29.7. Consider the circle C(F17 ). Show that (3̄, 3̄), (1̄, 0̄) ∈ C(F17 )
and compute (3̄, 3̄) ? (1̄, 0̄) and 2(1̄, 0̄) (where the latter is, of course, (1̄, 0̄) ? (1̄, 0̄)).
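Over a finite field F_p the circle of center (0, 0) and radius 1 is a finite set, so its points and the operation ⋆ of Exercise 29.6 can be explored by enumeration. In this sketch residues are represented by the integers 0, ..., p − 1; the names circle_points and circle_op are hypothetical.

```python
def circle_points(p: int) -> list[tuple[int, int]]:
    """All (x, y) with x^2 + y^2 = 1 in F_p, for an odd prime p."""
    return [(x, y) for x in range(p) for y in range(p) if (x * x + y * y) % p == 1]

def circle_op(P, Q, p: int) -> tuple[int, int]:
    """(x1, y1) * (x2, y2) = (x1*x2 - y1*y2, x1*y2 + x2*y1), computed mod p."""
    (x1, y1), (x2, y2) = P, Q
    return ((x1 * x2 - y1 * y2) % p, (x1 * y2 + x2 * y1) % p)
```

One can check numerically that circle_op never leaves circle_points(p) and that (1, 0) acts as the identity; the length of circle_points(p) answers counting questions like Exercise 29.5 by brute force.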
Circles are special cases of conics:
Definition 29.8. A conic is a subset Q ⊂ R × R of the form
Q = Q(R) = {(x, y) ∈ R × R; ax^2 + bxy + cy^2 + dx + ey + f = 0}
for some (a, b, c, d, e, f) ∈ R × ... × R, where (a, b, c) ≠ (0, 0, 0).
We refer to (a, b, c, d, e, f) as the equation of the conic and if the corresponding
conic passes through a point we say that the equation of the conic passes through
the point. We sometimes say “conic” instead of “equation of the conic.”
Exercise 29.9. Prove that if 5 points are given in the plane such that no 4
of them are collinear then there exists a unique conic passing through these given
5 points. Hint: Consider the vector space of all (equations of) conics that pass
through a given set S of points. Next note that if one adds a point to S the
dimension of this space of conics either stays the same or drops by one. Since the
space of all conics has dimension 6 it is enough to show that for r ≤ 5 the conics
passing through r points are fewer than those passing through r − 1 of the r points.
For r = 4, for instance, this is done by taking a conic that is a union of 2 lines.
CHAPTER 30
Cubics
so we define
y3 = y1 + m(x3 − x1 ).
Summarizing, the definition of (x3 , y3 ) is
(x_3, y_3) = ((y_2 − y_1)^2(x_2 − x_1)^{−2} − x_1 − x_2, y_1 + (y_2 − y_1)(x_2 − x_1)^{−1}(x_3 − x_1))
if (x1 , y1 ) = (x2 , y2 ), y_1 ≠ 0.
Then E(R) with the above definitions is an Abelian group.
Exercise 30.2. Check the last statement. (N.B. Checking associativity is a
very laborious exercise.)
Exercise 30.3. Consider the group E(F_13) defined by the equation y^2 = x^3 + 8̄.
Show that (1̄, 3̄), (2̄, 4̄) ∈ E(F_13) and compute (1̄, 3̄) ⋆ (2̄, 4̄) and 2(2̄, 4̄) (where the
latter is, of course, (2̄, 4̄) ⋆ (2̄, 4̄)).
Affine elliptic curves are special examples of cubics:
Definition 30.4. A cubic is a subset X = X(R) ⊂ R × R of the form
X(R) = {(x, y) ∈ R × R; ax^3 + bx^2y + cxy^2 + dy^3 + ex^2 + fxy + gy^2 + hx + iy + j = 0}
where (a, b, c, ..., j) ∈ R × ... × R, (a, b, c, d) ≠ (0, ..., 0).
As usual we refer to the tuple (a, b, c, ..., j) as the equation of a cubic (or, by
abuse, simply a cubic).
Exercise 30.5. (Three Cubics Theorem) Prove that if two cubics meet in
exactly 9 points and if a third cubic passes through 8 of the 9 points then the
third cubic must pass through the 9th point. Hint: First show that if r ≤ 8 and
r points are given then the set of cubics passing through them is strictly larger
than the set of cubics passing through r − 1 of the r points. (To show this show
first that no 4 of the 9 points are on a line. Then in order to find, for instance,
a cubic passing through P1 , ..., P7 but not through P8 one considers the cubics
Ci = Q1234i + Ljk , {i, j, k} = {5, 6, 7}, where Q1234i is the unique conic passing
through P1 , P2 , P3 , P4 , Pi and Ljk is the unique line through Pj and Pk . Assume
C5 , C6 , C7 all pass through P8 and derive a contradiction as follows. Note that P8
cannot lie on 2 of the 3 lines Ljk because this would force us to have 4 collinear
points. So we may assume P8 does not lie on either of the lines L57 , L67 . Hence P8
lies on both Q12345 and Q12346 . So these conics have 5 points in common. From
here one immediately gets a contradiction.) Once this is proved let P1 , ..., P9 be
the points of intersection of the cubics with equations F and G. We know that
the space of cubics passing through P1 , ..., P8 has dimension 2 and contains F and
G. So any cubic in this space is a linear combination of F and G, hence will pass
through P9 .
Exercise 30.6. (Pascal’s Theorem) Let P1 , P2 , P3 , Q1 , Q2 , Q3 be points on a
conic C. Let A1 be the intersection of P2 Q3 with P3 Q2 , and define A2 , A3 similarly.
(Assume the lines in question are not parallel.) Then prove that A1 , A2 , A3 are
collinear. Hint: The cubics
Q1 P2 ∪ Q2 P3 ∪ Q3 P1 and P1 Q2 ∪ P2 Q3 ∪ P3 Q1
Analysis
CHAPTER 31
Limits
We discuss now some simple topics in analysis. Analysis is the study of “passing
to the limit.” The key words are sequences, convergence, limits, and later differential
and integral calculus. Here we will discuss limits. Analysis emerged through work
of Abel, Cauchy, Riemann, and Weierstrass, as a clarification of the early calculus
of Newton, Leibniz, Euler, and Lagrange.
Definition 31.1. A sequence in R is a map F : N → R; if F(n) = a_n we denote
the sequence by a_1, a_2, a_3, ... or by (a_n). We let F(N) be denoted by {a_n; n ≥ 1};
the latter is a subset of R.
Definition 31.2. A subsequence of a sequence F : N → R is a sequence of the
form F ◦ G where G : N → N is an increasing map. If a_1, a_2, a_3, ... is F then F ◦ G
is a_{k_1}, a_{k_2}, a_{k_3}, ... (or (a_{k_n})) where G(n) = k_n.
Definition 31.3. A sequence (a_n) is convergent to a_0 ∈ R if for any real
number ε > 0 there exists an integer N such that for all n ≥ N we have |a_n − a_0| < ε.
We write a_n → a_0 and we say a_0 is the limit of (a_n). A sequence is called convergent
if there exists a ∈ R such that the sequence converges to a. A sequence is called
divergent if it is not convergent.
Exercise 31.4. Prove that a_n = 1/n converges to 0.
Hint: Let ε > 0; we need to find N such that for all n ≥ N we have
|1/n − 0| < ε;
it is enough to take N to be any integer such that N > 1/ε.
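The hint can be tried out numerically: given ε, pick an integer N > 1/ε and test a stretch of indices n ≥ N. The sketch uses exact rational arithmetic to avoid rounding questions; n_for_epsilon is a hypothetical name, and testing finitely many n is of course only an illustration, not a proof.

```python
from fractions import Fraction
import math

def n_for_epsilon(eps: Fraction) -> int:
    """An integer N with N > 1/eps, namely floor(1/eps) + 1, for eps > 0."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps) + 1

eps = Fraction(1, 10)
N = n_for_epsilon(eps)
# |1/n - 0| < eps for a sample of indices n >= N.
sample_ok = all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 1000))
```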
Exercise 31.5. Prove that a_n = 1/√n converges to 0.
Exercise 31.6. Prove that a_n = 1/n^2 converges to 0.
Exercise 31.7. Prove that a_n = n is divergent.
Exercise 31.8. Prove that a_n = (−1)^n is divergent.
Exercise 31.9. Prove that if a_n → a_0 and b_n → b_0 then
1) a_n + b_n → a_0 + b_0
2) a_nb_n → a_0b_0.
If in addition b_0 ≠ 0 then there exists N such that for all n ≥ N we have b_n ≠ 0;
moreover if b_n ≠ 0 for all n then
3) a_n/b_n → a_0/b_0.
Hint for 1: Consider any ε > 0. Since a_n → a_0 there exists N_a such that for
all n ≥ N_a we have |a_n − a_0| < ε/2. Since b_n → b_0 there exists N_b such that for all
Show that there exist sequences (a_n) and (b_n), the first increasing, the second
decreasing, with a_n ≤ b_n and b_n − a_n → 0. (To check this use recursion to define
a_{n+1}, b_{n+1} in terms of a_n, b_n by the following rule: if c_n = (a_n + b_n)/2 then set
a_{n+1} = c_n and b_{n+1} = b_n in case c_n ∈ A; and set a_{n+1} = a_n and b_{n+1} = c_n in
case c_n ∈ B.)
Note that a_n → a_0 and b_n → b_0 and a_0 = b_0. Since A, B are open and disjoint they
are closed. So a_0 ∈ A and b_0 ∈ B. But this contradicts the fact that A and B are
disjoint.
Definition 31.22. For a, b ∈ R the closed interval [a, b] ⊂ R is defined as
[a, b] = {x ∈ R; a ≤ x ≤ b}.
Exercise 31.23. Prove that [a, b] is closed in the Euclidean topology.
Exercise 31.24. Prove that the open intervals (a, b) and the closed intervals
[a, b] are connected in R.
Exercise 31.25. (Heine-Borel Theorem) Prove that any closed interval in R
is compact. Hint: Assume [a, b] is not compact and derive a contradiction as
follows. We know [a, b] has an open covering (U_i)_{i∈I} that does not have a finite open
subcovering. Show that there exist sequences (a_n) and (b_n), the first increasing,
the second decreasing, with a_n ≤ b_n and b_n − a_n → 0, such that [a_n, b_n] cannot
be covered by finitely many U_i’s. (To check this use recursion to define a_{n+1}, b_{n+1}
in terms of a_n, b_n by the following rule: let c_n = (a_n + b_n)/2; then at least one of the
two intervals [a_n, c_n] or [c_n, b_n] cannot be covered by finitely many U_i’s; if this is
the case with the first interval then set a_{n+1} = a_n and b_{n+1} = c_n; in the other case
set a_{n+1} = c_n and b_{n+1} = b_n.) Note that a_n → a_0 and b_n → b_0 and a_0 = b_0. But
a_0 = b_0 is in one of the U_i’s; this U_i will completely contain one of the intervals
[a_n, b_n] which is a contradiction.
CHAPTER 32
Series
Definition 32.1. Let $(a_n)$ be a sequence and $s_n = \sum_{k=1}^{n} a_k$. The sequence
$(s_n)$ is called the sequence of partial sums. If $(s_n)$ is convergent to some $s$ we say
$\sum_{n=1}^{\infty} a_n$ is a convergent series and that this series converges to $s$; we write
\[ \sum_{n=1}^{\infty} a_n = s. \]
If the sequence $(s_n)$ is divergent we say that $\sum_{n=1}^{\infty} a_n$ is a divergent series.
Exercise 32.2. Prove that
\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1. \]
Hint: Start with the equality
\[ \frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1} \]
and compute
\[ \sum_{n=1}^{N} \frac{1}{n(n+1)} = 1 - \frac{1}{N+1}. \]
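As a quick sanity check of the telescoping computation, here is a small Python sketch (an illustration only, not part of the original notes; the name `partial_sum` is hypothetical). It verifies with exact rational arithmetic that the N-th partial sum equals 1 - 1/(N+1) for a few values of N.

```python
from fractions import Fraction

def partial_sum(N):
    """Exact value of sum_{n=1}^{N} 1/(n(n+1))."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, N + 1))

for N in (1, 2, 10, 100):
    # Telescoping: the sum collapses to 1 - 1/(N+1).
    assert partial_sum(N) == 1 - Fraction(1, N + 1)
```

A finite check like this is of course not a proof; the proof is the cancellation of consecutive terms in the hint.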
Exercise 32.3. Prove that the series
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} \]
is convergent.
Hint: Prove the sequence of partial sums is bounded using the inequality
\[ \frac{1}{(n+1)^2} \le \frac{1}{n(n+1)} \]
plus Exercise 32.2.
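The boundedness claimed in the hint can be observed numerically. The sketch below is an illustration under stated assumptions (the cutoff 200 and the bound 2 are choices made here, not taken from the notes): it computes the first 200 partial sums exactly and checks that they increase and stay below 2.

```python
from fractions import Fraction
from itertools import accumulate

# Exact partial sums s_1, ..., s_200 of the series sum 1/n^2.
sums = list(accumulate(Fraction(1, n * n) for n in range(1, 201)))

increasing = all(x < y for x, y in zip(sums, sums[1:]))
bounded = all(x < 2 for x in sums)
print(increasing, bounded)
```

An increasing bounded sequence is convergent, which is the fact the exercise relies on.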
Exercise 32.4. Prove that the series
\[ \sum_{n=1}^{\infty} \frac{1}{n^k} \]
is convergent for $k \ge 3$.
Exercise 32.5. Prove that the series
\[ \sum_{n=1}^{\infty} \frac{1}{n} \]
is divergent. This series is called the harmonic series.
Hint: Assume the series is convergent. Then the sequence of partial sums is
convergent hence Cauchy. Get a contradiction from the inequality:
\[ \frac{1}{2^n + 1} + \frac{1}{2^n + 2} + \frac{1}{2^n + 3} + ... + \frac{1}{2^n + 2^n} > 2^n \times \frac{1}{2^n + 2^n} = \frac{1}{2}. \]
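The inequality in the hint says that the block of terms from 1/(2^n + 1) to 1/(2^n + 2^n) always sums to more than 1/2. A minimal Python sketch (illustrative only; the name `block` is hypothetical) checks this exactly for small n, taking n at least 1:

```python
from fractions import Fraction

def block(n):
    """Exact sum 1/(2^n + 1) + ... + 1/(2^n + 2^n)."""
    start = 2 ** n
    return sum(Fraction(1, start + j) for j in range(1, start + 1))

for n in range(1, 9):
    assert block(n) > Fraction(1, 2)
```

Since `block(n)` is the difference of the partial sums with indices 2^(n+1) and 2^n, the partial sums cannot satisfy the Cauchy condition for any bound smaller than 1/2.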
Exercise 32.6. Prove that $a^n \to 0$ if $|a| < 1$.
Hint: We may assume $0 < a < 1$. Note that $(a^n)$ is decreasing. Since it is
bounded it is convergent. Let $\alpha$ be its limit. Assume $\alpha \neq 0$ and get a contradiction
by noting that
\[ \frac{1}{a} = \frac{a^n}{a^{n+1}} \to \frac{\alpha}{\alpha} = 1. \]
Exercise 32.7. Prove that
\[ \sum_{n=0}^{\infty} a^n = \frac{1}{1 - a} \]
if $|a| < 1$.
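One way to approach this exercise is through the closed form of the partial sums taken from n = 0, namely a^0 + a^1 + ... + a^N = (1 - a^(N+1))/(1 - a). The sketch below is an illustration, not the author's solution; the name `geometric_partial_sum` and the sample value a = -2/3 are assumptions. It checks this identity exactly and computes the distance to 1/(1 - a).

```python
from fractions import Fraction

def geometric_partial_sum(a, N):
    """Exact value of a^0 + a^1 + ... + a^N."""
    return sum(a ** n for n in range(N + 1))

a = Fraction(-2, 3)  # any rational with |a| < 1 would do
for N in (0, 1, 5, 30):
    closed_form = (1 - a ** (N + 1)) / (1 - a)
    assert geometric_partial_sum(a, N) == closed_form

# The gap to 1/(1 - a) is |a|^(N+1)/(1 - a), which shrinks since a^n -> 0.
gap = abs(geometric_partial_sum(a, 30) - 1 / (1 - a))
```

The last line ties the exercise to Exercise 32.6: convergence of the series reduces to a^n tending to 0.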
Exercise 32.8. Prove that the series
\[ \sum_{n=0}^{\infty} \frac{a^n}{n!} \]
is convergent for all $a \in \mathbb{R}$; its limit is denoted by $e^a = \exp(a)$; $e = e^1$ is called the
Euler number; the map
\[ \mathbb{R} \to \mathbb{R}, \quad a \mapsto \exp(a) \]
is called the exponential map. Prove that
\[ \exp(a + b) = \exp(a) \exp(b). \]
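The series and the identity exp(a + b) = exp(a) exp(b) can be explored in floating point. The following Python sketch is only an illustration: the name `exp_partial_sum`, the truncation at 40 terms and the sample values of a and b are assumptions made here, and rounding errors mean the comparisons hold only approximately.

```python
import math

def exp_partial_sum(a, n):
    """Partial sum a^0/0! + a^1/1! + ... + a^n/n! (floating point)."""
    total, term = 0.0, 1.0
    for k in range(n + 1):
        total += term
        term *= a / (k + 1)   # turns a^k/k! into a^(k+1)/(k+1)!
    return total

a, b = 1.5, -0.7
approx = exp_partial_sum(a + b, 40)
err_vs_library = abs(approx - math.exp(a + b))
err_vs_product = abs(approx - exp_partial_sum(a, 40) * exp_partial_sum(b, 40))
print(err_vs_library, err_vs_product)
```

Each term is obtained from the previous one by a single multiplication, which avoids computing large factorials and large powers separately.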
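Exercise 32.9. Prove that the function $\exp : \mathbb{R} \to \mathbb{R}$ is continuous.
Hint: Let $a \in \mathbb{R}$ and $\epsilon > 0$. It is enough to show that there exists $\delta > 0$ such
that if $|b - a| < \delta$ then $|\exp(b) - \exp(a)| < \epsilon$. Show that there is a $\delta$ such that for
any $b$ with $|b - a| < \delta$ there exists an $n$ such that for all $m \ge n$
\[ \left| \sum_{k=0}^{m} \frac{a^k}{k!} - \sum_{k=0}^{n} \frac{a^k}{k!} \right| < \frac{\epsilon}{3} \]
\[ \left| \sum_{k=0}^{m} \frac{b^k}{k!} - \sum_{k=0}^{n} \frac{b^k}{k!} \right| < \frac{\epsilon}{3} \]
\[ \left| \sum_{k=0}^{n} \frac{b^k}{k!} - \sum_{k=0}^{n} \frac{a^k}{k!} \right| < \frac{\epsilon}{3}. \]
From the first two inequalities we get
\[ \left| \exp(a) - \sum_{k=0}^{n} \frac{a^k}{k!} \right| \le \frac{\epsilon}{3} \]
\[ \left| \exp(b) - \sum_{k=0}^{n} \frac{b^k}{k!} \right| \le \frac{\epsilon}{3}. \]
Then
\[ |\exp(b) - \exp(a)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \]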
Exercise 32.10. Let $S \subset \{0, 1\}^{\mathbb{N}}$ be the set of all sequences $(a_n)$ such that
there exists $N$ with $a_n = 1$ for all $n \ge N$. Prove that the map
\[ \{0, 1\}^{\mathbb{N}} \setminus S \to \mathbb{R}, \quad (a_n) \mapsto \sum_{n=1}^{\infty} \frac{a_n}{2^n} \]
is (well defined and) injective. Conclude that R is not countable.
Exercise 32.11. Prove that there exist transcendental numbers in R. Hint:
R is uncountable whereas the set of algebraic numbers is countable; cf. Exercise
26.20. This is Cantor’s proof of existence of transcendental numbers.
Real analysis (analysis of sequences, continuity, and other concepts of calculus
like differentiation and integration of functions on R) can be extended to complex
analysis. Indeed we have:
Definition 32.12. A sequence $(z_n)$ in C is convergent to $z_0 \in \mathbb{C}$ if for any real
number $\epsilon > 0$ there exists an integer $N$ such that for all $n \ge N$ we have $|z_n - z_0| < \epsilon$.
We write $z_n \to z_0$ and we say $z_0$ is the limit of $(z_n)$. A sequence is called convergent
if there exists $z \in \mathbb{C}$ such that the sequence converges to $z$. A sequence is called
divergent if it is not convergent.
Exercise 32.13. Let $(z_n)$ be a sequence in C and let
\[ z_n = a_n + b_n i, \]
$a_n, b_n \in \mathbb{R}$. Let $z_0 = a_0 + b_0 i$. Prove that $z_n \to z_0$ if and only if $a_n \to a_0$ and
$b_n \to b_0$.
Definition 32.14. A sequence $(z_n)$ in C is Cauchy if for any real $\epsilon > 0$ there
exists an integer $N$ such that for all integers $m, n \ge N$ we have $|z_n - z_m| < \epsilon$.
Exercise 32.15. Prove that a sequence in C is convergent if and only if it is
Cauchy.
Exercise 32.16. Prove that:
1) The series
\[ \sum_{n=0}^{\infty} \frac{z^n}{n!} \]
is convergent for all $z \in \mathbb{C}$; its limit is denoted by $e^z = \exp(z)$.
2) $\exp(z + w) = \exp(z) \exp(w)$ for all $z, w \in \mathbb{C}$.
3) $\exp(\overline{z}) = \overline{\exp(z)}$ for all $z \in \mathbb{C}$.
4) $|\exp(it)| = 1$ for all $t \in \mathbb{R}$.
5) The map
\[ \mathbb{C} \to \mathbb{C}, \quad z \mapsto \exp(z), \]
is continuous. This map is called the (complex) exponential map.
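Properties 3) and 4) can be observed numerically on truncated series. The Python sketch below is an illustration under stated assumptions: the name `cexp_partial_sum`, the truncation at 60 terms and the sample values are choices made here, and the comparisons hold only up to rounding error.

```python
import cmath

def cexp_partial_sum(z, n=60):
    """Partial sum z^0/0! + z^1/1! + ... + z^n/n! for a complex number z."""
    total, term = 0j, 1 + 0j
    for k in range(n + 1):
        total += term
        term *= z / (k + 1)   # turns z^k/k! into z^(k+1)/(k+1)!
    return total

t = 2.0
z = 0.3 + 1.1j
modulus_error = abs(abs(cexp_partial_sum(1j * t)) - 1)          # property 4
conjugate_error = abs(cexp_partial_sum(z.conjugate())
                      - cexp_partial_sum(z).conjugate())         # property 3
library_error = abs(cexp_partial_sum(z) - cmath.exp(z))
print(modulus_error, conjugate_error, library_error)
```

The comparison with `cmath.exp` is included only as a cross-check against the standard library.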
There is a version of the above theory in p-adic analysis (which is crucial to
number theory). Recall the ring of p-adic numbers $\mathbb{Z}_p$ whose elements are denoted
by $[a_n]$.
Definition 32.17. Say that $p^e$ divides $\alpha = [a_n]$ if there exists $\beta = [b_n]$ such
that $[a_n] = [p^e b_n]$; write $p^e | \alpha$. For any $0 \neq \alpha = [a_n] \in \mathbb{Z}_p$ let $v = v(\alpha)$ be the
unique integer such that $p^n | a_n$ for $n \le v$ and $p^{v+1} \nmid a_{v+1}$. Then define the norm of
$\alpha$ by the formula $|\alpha| = p^{-v(\alpha)}$. We also set $|0| = 0$.
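To get a feeling for the norm, one can look at the special case of ordinary integers. The sketch below assumes, as is not spelled out in this passage, that an integer m is identified with the class of the constant sequence [m], so that v(m) is the largest v with p^v dividing m. The function names `valuation` and `padic_norm` are hypothetical.

```python
from fractions import Fraction

def valuation(p, m):
    """Largest v such that p^v divides the nonzero integer m."""
    if m == 0:
        raise ValueError("v(0) is not defined; the norm of 0 is set to 0 separately")
    v = 0
    while m % p == 0:
        m //= p
        v += 1
    return v

def padic_norm(p, m):
    """|m| = p^(-v(m)) for m != 0, and |0| = 0."""
    return Fraction(0) if m == 0 else Fraction(1, p ** valuation(p, m))

print(padic_norm(3, 54))  # 54 = 2 * 3^3
```

In this norm an integer is small when it is divisible by a high power of p, which is the opposite of the usual intuition about size.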
Trigonometry
CHAPTER 34
Calculus
Exercise 34.8. Prove that $F(x) = \sin(x)$ is differentiable and $F'(x) = \cos(x)$.
Prove that $G(x) = \cos(x)$ is differentiable and $G'(x) = -\sin(x)$. Hence F and G
are smooth.
Exercise 34.9. (Chain rule) Prove that if $F, G \in C^\infty(\mathbb{R})$ then $F \circ G \in C^\infty(\mathbb{R})$
and
\[ D(F \circ G) = (D(F) \circ G) \cdot D(G). \]
(Here, as usual, $\circ$ denotes composition.)
More generally one can define derivatives of functions of several variables as
follows:
Definition 34.10. Let $F : \mathbb{R}^n \to \mathbb{R}$ be a function. Let $a = (a_1, ..., a_n) \in \mathbb{R}^n$
and define $F_i : \mathbb{R} \to \mathbb{R}$ by
\[ F_i(x) = F(a_1, ..., a_{i-1}, x, a_{i+1}, ..., a_n) \]
(with the obvious adjustment if $i = 1$ or $i = n$). We say that $F$ is differentiable
with respect to $x_i$ at $a$ if $F_i$ is differentiable at $a_i$; in this case we define
\[ \frac{\partial F}{\partial x_i}(a) = F_i'(a_i). \]
We say that $F$ is differentiable with respect to $x_i$ if it is differentiable with respect to
$x_i$ at any $a \in \mathbb{R}^n$. For such a function we have a well defined function
$\frac{\partial F}{\partial x_i} : \mathbb{R}^n \to \mathbb{R}$
which is also denoted by $D_i F$. We say that $F$ is infinitely differentiable (or smooth)
if $F$ is differentiable, each $D_i F$ is differentiable, each $D_i D_j F$ is differentiable, each
$D_i D_j D_k F$ is differentiable, etc. We denote by $C^\infty(\mathbb{R}^n)$ the set of smooth functions;
it is a ring with respect to pointwise addition and multiplication.
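The definition reduces a partial derivative to an ordinary derivative of the one-variable function F_i obtained by freezing all coordinates but one. The following Python sketch mirrors this numerically. It is an approximation by central differences rather than an exact derivative, and the name `partial`, the step size h and the sample function are assumptions made for illustration.

```python
import math

def partial(F, a, i, h=1e-6):
    """Central-difference estimate of dF/dx_i at the point a (i is 0-based)."""
    # F_i freezes every coordinate of a except the i-th one.
    F_i = lambda x: F(*a[:i], x, *a[i + 1:])
    return (F_i(a[i] + h) - F_i(a[i] - h)) / (2 * h)

F = lambda x, y: math.sin(x) * y ** 2   # hypothetical example function on R^2
a = (0.5, 3.0)
print(partial(F, a, 0), math.cos(0.5) * 9.0)   # compare with cos(x) * y^2
print(partial(F, a, 1), math.sin(0.5) * 6.0)   # compare with 2 * sin(x) * y
```

Note that the code indexes coordinates from 0, whereas the definition indexes them from 1.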
Definition 34.11. Let $P \in C^\infty(\mathbb{R}^{r+2})$. An equation of the form
\[ P\left(x, F(x), \frac{dF}{dx}(x), \frac{d^2F}{dx^2}(x), ..., \frac{d^rF}{dx^r}(x)\right) = 0 \]
is called a differential equation. Here $F \in C^\infty(\mathbb{R})$ is an unknown function and one
defines $\frac{d^{i+1}F}{dx^{i+1}} = D^{i+1}(F) = D(D^i(F))$.
The study of differential equations has numerous applications within mathe-
matics (e.g., geometry) as well as natural sciences (e.g., physics).
Example 34.12. Here is a random example of a differential equation:
\[ \exp\left(\left(\frac{d^2F}{dx^2}\right)^5\right) - x^3 \frac{dF}{dx} \frac{d^3F}{dx^3} - x^5 F^6 = 0. \]
The additivity and the Leibniz rule have an algebraic flavor. This suggests the
following:
Definition 34.13. Let R be a commutative unital ring. A map D : R → R is
called a derivation if
1) D(a + b) = D(a) + D(b) (additivity);
2) D(a · b) = a · D(b) + b · D(a) (Leibniz rule).
Example 34.14. D : C ∞ (R) → C ∞ (R) is a derivation.
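A purely algebraic instance of a derivation is the formal derivative on polynomials, which form a subring of the smooth functions. The Python sketch below is an illustration, not part of the original notes: the coefficient-list representation and the helper names `add`, `mul` and `D` are assumptions. It checks additivity and the Leibniz rule on one pair of polynomials.

```python
from itertools import zip_longest

# A polynomial c0 + c1 x + c2 x^2 + ... is stored as the list [c0, c1, c2, ...].

def add(f, g):
    """Coefficientwise sum of two polynomials."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]

def mul(f, g):
    """Product of two polynomials."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def D(f):
    """Formal derivative: c_k x^k goes to k c_k x^(k-1)."""
    return [k * c for k, c in enumerate(f)][1:] or [0]

f = [1, 0, 3]       # 1 + 3x^2
g = [0, 2, 0, -1]   # 2x - x^3

additive = D(add(f, g)) == add(D(f), D(g))
leibniz = D(mul(f, g)) == add(mul(f, D(g)), mul(g, D(f)))
print(additive, leibniz)
```

The list comparison works here because both sides have the same length; for general inputs one would first strip trailing zero coefficients.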