Number Theory Text 2012

Elementary Number Theory

David Pierce

September , 

Mathematics Department
Mimar Sinan Fine Arts University
Istanbul
[email protected]
http://mat.msgsu.edu.tr/~dpierce/
This work is licensed under the
Creative Commons
Attribution-NonCommercial-ShareAlike 3.0
Unported License.
To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-sa/3.0/
or send a letter to
Creative Commons,
 Castro Street, Suite ,
Mountain View, California, , USA.


David Austin Pierce



Mathematics Department
Mimar Sinan Fine Arts University
Bomonti, Şişli, Istanbul, 

[email protected]
http://mat.msgsu.edu.tr/~dpierce/
Contents

Preface

1. Proving and seeing
1.1. The look of a number
1.2. Patterns that fail
1.3. Incommensurability

2. Numbers
2.1. The natural numbers
2.2. The integers
2.3. The rational numbers
2.4. Other numbers

3. Divisibility
3.1. Division
3.2. Congruence
3.3. Greatest common divisors
3.4. Least common multiples
3.5. The Euclidean algorithm
3.6. The Hundred Fowls Problem

4. Prime numbers
4.1. The Fundamental Theorem of Arithmetic
4.2. Irreducibility
4.3. The Sieve of Eratosthenes
4.4. The infinity of primes
4.5. Bertrand’s Postulate

5. Computations with congruences
5.1. Exponentiation
5.2. Inversion
5.3. Chinese remainder problems

6. Powers of two
6.1. Perfect numbers
6.2. Mersenne primes

7. Prime moduli
7.1. Fermat’s Theorem
7.2. Carmichael numbers
7.3. Wilson’s Theorem

8. Arithmetic functions
8.1. Multiplicative functions
8.2. The Möbius function
8.3. Convolution

9. Arbitrary moduli
9.1. The Chinese Remainder Theorem
9.2. Euler’s Theorem
9.3. Gauss’s Theorem

10. Primitive roots
10.1. Order
10.2. Groups
10.3. Primitive roots of primes
10.4. Discrete logarithms
10.5. Composite numbers with primitive roots

11. Quadratic reciprocity
11.1. Quadratic equations
11.2. Quadratic residues
11.3. The Legendre symbol
11.4. Gauss’s Lemma
11.5. The Law of Quadratic Reciprocity
11.6. Composite moduli

12. Sums of squares

A. Foundations
A.1. Construction of the natural numbers
A.2. Why it matters

B. Some theorems without their proofs

C. Exercises

D. – examinations
D.1. In-term examination
D.2. In-term examination
D.3. In-term examination
D.4. Final Examination

E. – examinations
E.1. In-term examination
E.2. In-term examination
E.3. Final examination

Bibliography

Index

List of Figures

Triangular numbers
A pair of equal triangular numbers
A pair of consecutive triangular numbers
Consecutive odd numbers
Consecutive odd numbers, without one
Consecutive even numbers
Partitions of circles by straight lines
Incommensurability of diagonal and side

Divisors of 60
Common divisors of 12 and 30
Divisors of 60, again
gcd and lcm
The Euclidean algorithm
Diagonal and side

The integers modulo 13, or $\mathbb{Z}_{13}$

Two ways of counting, for the Law of Quadratic Reciprocity
Example of the proof of quadratic reciprocity

List of Tables

The number 9 as the sum of odd numbers of summands
The number 11 as the sum of odd numbers of summands
Pascal’s Triangle

The Sieve of Eratosthenes
Composite numbers less than 1369 with least prime factor 17 or more

Mersenne primes and perfect numbers

Successive differences of powers
The inductive step for $\Delta^n f(x)$

Exponentiation modulo 1000

Numbers according to gcd with 16, 18, and 21

Orders modulo 19
Powers of 3 modulo 17

Computation of (365/941)

Powers of 3 modulo 257 (Appendix D)


Preface

This book started out as a record of my lectures in the course called Elementary
Number Theory I (Math ) at Middle East Technical University in Ankara in
–. When I was to teach the same course in –, I revised my lecture-
notes and made them the official text for the course. That text, dated September
, , was  pages long. After the course, filled with enthusiasm, I made
many revisions and additions. The result is this book.
The standard text for Math  at METU was Burton’s Elementary Number
Theory []. My lectures of – more or less followed this. The catalogue
description of the course was:

Divisibility, congruences, Euler, Chinese Remainder and Wilson’s Theorems.
Arithmetical functions. Primitive roots. Quadratic residues and quadratic
reciprocity. Diophantine equations.

In –, without realizing that I had written the course textbook, one student
complained that it was hard to read. I am glad he felt free to criticize. But I
had not aimed to create a textbook that could replace classroom lectures. I had
written summarily, without trying to give all of the explanations that anybody
could possibly want.
Among the many changes I have made since the – course, I have:
1) put proofs of theorems after their statements, and not before as is sometimes
natural in lectures (an omitted proof in the present text is left to the reader
as an exercise);
2) removed the Fermat factorization method [, §.] as being out of the main
stream of the course;
3) added Dirichlet convolution, which gives a streamlined way of understanding
Möbius inversion and of defining the phi-function;
4) added forward references, to show better how everything is interconnected;
5) added citations for the theorems, when I have been able to find them.
Precisely because these changes are significant, the book must still be considered
as a work in progress, a rough draft.
As I suggested, Burton’s text was the original model for this book, but not
in style, only in arrangement of topics. Models for style, as well as sources of
content, include the sparer texts of Landau [] and Hardy and Wright []. Much
of the mathematics in the present text can be found in Gauss’s Disquisitiones
Arithmeticae [] of , written when Gauss was the age of many undergraduate
students. Some of the mathematics is two thousand years older than Gauss.


I have made some attempt to trace theorems to their origins; but this work
is not complete. I prefer to see the primary source myself before attributing a
theorem. In this case, I cite the source near the theorem itself, possibly in a
footnote, and not in some extra section at the end of the chapter. Even when I
can find the primary source, usually a secondary source has led me there. The
secondary source helps to determine what the primary source is. The best history
would arise from reading all possible primary sources; but I have not done this.
Full names and dates of mathematicians named in the text are generally taken
from the MacTutor History of Mathematics archive, or from Wikipedia.
I ask students to learn something of the logical foundations of number theory.
Section . contains an account of these foundations, namely a derivation of
basic arithmetic from the so-called Peano Axioms. This section was originally
an appendix, but I have decided that it belongs in the main body of text, even
if most number theory texts do not have such a section. Chapter  is filled
out with a summary review of the constructions of the other standard number
systems, of integers, rationals, reals, and complex numbers. All of these systems
have their place in number theory. Their constructions alone could constitute a
course, and I do not expect number theory students as such to go through them
all; but students should be aware that the constructions can be done, and they
themselves can do them.
Readers will already know most of the results of Chapter . Assuming some of
these results, the preceding Chapter  is a general exploration of what can be
done with numbers and, in some cases, what has been done for over two thousand
years. The chapter begins with the visual display of certain numbers as triangles
or squares. Throughout the text, where it makes sense, I try to display the
mathematics in pictures or tables, as for example in the account of the Chinese
Remainder Theorem in §..
Appendix A begins with the construction of the natural numbers by von Neu-
mann’s method. This is a part of set theory and is beyond the scope of the
course as such, but it is good for everybody to know that the construction can be
done. The appendix continues with a discussion of common misunderstandings
of foundational matters.
I do not like to quote a theorem without either proving it or being able to
expect readers to prove it for themselves. In the original course, I did quote
theorems, some recent, without myself knowing the proofs; I have now relegated
these to Appendix B.
Appendix C consists of exercises, most of which were made available in
installments to the students in the / class. I have not incorporated the exercises
into the main text. One reason for this is to make it less obvious how the exercises
should be done. The position of an exercise in a text is often a hint as to how the
exercise should be done; and yet there are no such hints on examinations. The
exercises here are strung together in one numbered sequence. (So, by the way,
are the theorems in the main text.)

Footnote (MacTutor History of Mathematics archive): http://www-gap.dcs.st-and.ac.uk/~history/index.html
Appendices D and E contain the examinations given to the – and –
classes, along with my solutions and remarks on students’ solutions.
In –, I treated 0 as a natural number; in –, I did not. In the present
book, I intend to use the symbol N for the set {1, 2, 3, . . . }; if a symbol for the set
{0, 1, 2, . . . } is desired, this symbol can be ω. I have tried to update Appendix D
(as well as my original lecture-notes from –) accordingly.

1. Proving and seeing

1.1. The look of a number


What can we say about the following sequence of numbers?

1, 3, 6, 10, 15, 21, 28, . . .

The terms increase by 2, 3, 4, and so on. A related observation is that the numbers
in the sequence can be given an appearance, a look, as shown in Figure 1.1. In
particular, the numbers are the triangular numbers.

[Figure 1.1. Triangular numbers]

Let us designate the triangular numbers by $t_1$, $t_2$, $t_3$, and so on. Then they
can be given recursively by the equations

$$t_1 = 1, \qquad t_{n+1} = t_n + n + 1.$$

This definition can be abbreviated as

$$t_n = \sum_{k=1}^{n} k.$$

The triangular numbers can also be given non-recursively, in closed form (so
that tn can be calculated directly):

Theorem . For all numbers n,

n(n + 1)
tn = . (∗)
2

Proof. We prove the claim (∗) for all n by induction:


. The claim is true when n = 1.


. If the claim is true when n = k, so that tk = k(k + 1)/2, then

tk+1 = tk + k + 1
k(k + 1)
= +k+1
2
k(k + 1) 2(k + 1)
= +
2 2
(k + 2)(k + 1)
=
2
(k + 1)(k + 2)
= ,
2
so the claim is true when n = k + 1.
By induction then, (∗) is true for all n.
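
A numerical check is no substitute for the induction, but it can catch a slip in the algebra. The following Python sketch compares the recursive definition of the triangular numbers with the closed form for small $n$. The function names `t_recursive` and `t_closed` are illustrative only and do not come from the text.

```python
def t_recursive(n):
    """Triangular number from t_1 = 1 and t_{k+1} = t_k + k + 1."""
    t = 1
    for k in range(1, n):
        t = t + k + 1  # one step of the recursion
    return t


def t_closed(n):
    """Triangular number from the closed form n(n + 1)/2."""
    return n * (n + 1) // 2


# Compare the two definitions for n = 1, ..., 100.
agree = all(t_recursive(n) == t_closed(n) for n in range(1, 101))
first_seven = [t_closed(n) for n in range(1, 8)]
```

Here `first_seven` reproduces the opening terms of the sequence at the start of the section, and `agree` records whether the two definitions coincide on the tested range.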

So equation (∗) is true; but we might ask further: why is it true? One answer
can be seen in a picture. First rewrite (∗) as

$$2t_n = n(n+1).$$

Two copies of $t_n$ do indeed fit together to make an $n \times (n+1)$ array of dots, as
in Figure 1.2. One may establish other identities in the same way. For example,
Figure 1.3 suggests the next theorem.

[Figure 1.2. A pair of equal triangular numbers]

[Figure 1.3. A pair of consecutive triangular numbers]


 The theorem is mentioned by Nicomachus of Gerasa (c. –c. ) in his Introduction to
Arithmetic [, II.XII.–, p. ]. For him, the picture alone seems to have been sufficient
proof. (Gerasa is now Jerash, in Jordan.)



Theorem . For all numbers n,

tn+1 + tn = (n + 1)2 .

Proof. Just compute:

(n + 1)(n + 2) n(n + 1) n+1


tn+1 + tn = + = (n + 2 + n) = (n + 1)2 .
2 2 2
What can we say about the following sequence?

1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, . . .

It is the sequence of odd numbers. Also, the first $n$ terms seem to add up to $n^2$.
Indeed we do have:

Theorem 3. For all numbers $n$,

$$\sum_{k=1}^{n} (2k - 1) = n^2. \tag{†}$$

Proof. We use induction.

1. The claim is true when $n = 1$.
2. If the claim is true when $n = k$, then

$$\sum_{j=1}^{k+1} (2j - 1) = \sum_{j=1}^{k} (2j - 1) + 2k + 1 = k^2 + 2k + 1 = (k+1)^2,$$

so the claim is true when $n = k + 1$.

Therefore (†) is true for all $n$.
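
The identity for sums of odd numbers can also be tested directly. This is a minimal sketch; the function name `sum_of_first_odd_numbers` is a hypothetical helper introduced here, and the range of $n$ tested is arbitrary.

```python
def sum_of_first_odd_numbers(n):
    """Return 1 + 3 + ... + (2n - 1), added up term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))


# Check the sum of the first n odd numbers against n^2 for n = 1, ..., 200.
squares_match = all(sum_of_first_odd_numbers(n) == n ** 2 for n in range(1, 201))
```

Such a check only confirms finitely many cases; the induction above is what establishes the identity for all $n$.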
Figure . shows why the theorem is true. The point here is that, once a
b b b b b

bc bc bc bc b

b b b bc b

bc bc b bc b

b bc b bc b

Figure .. Consecutive odd numbers

numerical sequence is defined recursively, then identities involving the sequence


can be proved by induction; but the identities will probably be first discovered in
other ways, possibly through pictures.

.. The look of a number 


b b b b b b b b b b b

bc bc bc bc b bc bc bc bc bc b

b b b bc b b b b b bc b

bc bc b bc b bc bc bc b bc b

bc b bc b

Figure .. Consecutive odd numbers, without one

From figure ., we may derive two more observations. The rearrangement
shown in Figure . suggests the identity

n2 − 1 = (n + 1)(n − 1),

while Figure . suggests


b b b b b

bc bc bc bc b

b b b bc b

bc bc b bc b

Figure .. Consecutive even numbers

n
X
2k = n(n + 1).
k=1

Observe finally:

$$1,\ \underbrace{3, 5}_{8},\ \underbrace{7, 9, 11}_{27},\ \underbrace{13, 15, 17, 19}_{64},\ \underbrace{21, 23, 25, 27, 29}_{125},\ \dots$$

Does the pattern continue? As an exercise, write the suggested equation,

$$n^3 = \sum_{\dots}^{\dots} \dots,$$
 These observations are suggested by two possible interpretations of a passage in Aristotle’s
Physics. In A History of Greek Mathematics [, p. ], Thomas Heath asserts that Aris-
totle (–) alludes to Figure . in that passage. Here is Apostol’s translation of the
passage [, Γ ]: ‘Moreover, the Pythagoreans posit the infinite as being the Even; for they
say that it is this which, when cut off and limited by the Odd, provides [as matter] for the
infinity of things. A sign of this, they say, is what happens to numbers; for if gnomons are
placed around the one and apart, in the latter case the form produced is always distinct,
but in the former it is unique.’ Here a gnomon is apparently a figure in the shape of the
letter L (the word originally refers to the part of a sundial whose shadow shows the time).
So Figure . results from placing gnomons around one dot. If we then remove the dot, we
get Figure .; if we start with two dots rather than one, we get Figure ..
A few centuries later, Theon of Smyrna (c. –c. ) states Theorem  in his Mathematics
useful for understanding Plato [, pp. –]. (Smyrna is today’s İzmir.)



and prove it.
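
Before attempting the proof, one can test numerically whether the grouping continues to produce cubes. The sketch below splits the odd numbers into consecutive groups of sizes 1, 2, 3, and so on, and sums each group. The helper name `odd_groups` is an assumption made for illustration.

```python
def odd_groups(count):
    """Split 1, 3, 5, ... into consecutive groups of sizes 1, 2, ..., count."""
    groups, next_odd = [], 1
    for size in range(1, count + 1):
        group = [next_odd + 2 * i for i in range(size)]
        groups.append(group)
        next_odd += 2 * size  # skip past the odd numbers just used
    return groups


group_sums = [sum(group) for group in odd_groups(5)]
cubes_so_far = all(sum(g) == len(g) ** 3 for g in odd_groups(50))
```

The list `group_sums` reproduces the numbers under the braces in the display above; `cubes_so_far` tests the first fifty groups, which suggests but does not prove the general statement.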

1.2. Patterns that fail


The following passage from V. I. Arnol′d’s talk ‘On the teaching of mathematics’ []
seems to provide a reasonable description of how mathematics (and in
particular number theory) is done.
Mathematics is a part of physics. Physics is an experimental science, a part
of natural science. Mathematics is the part of physics where experiments are
cheap. . .
The scheme of construction of a mathematical theory is exactly the same as that
in any other natural science. First we consider some objects and make some
observations in special cases. Then we try and find the limits of application
of our observations, look for counter-examples which would prevent unjustified
extension of our observations onto a too wide range of events (example: the
number of partitions of consecutive odd numbers 1, 3, 5, 7, 9 into an odd
number of natural summands gives the sequence 1, 2, 4, 8, 16, but then comes
29).
As a result we formulate the empirical discovery that we made (for example,
the Fermat conjecture or Poincaré conjecture) as clearly as possible. After
this there comes the difficult period of checking as to how reliable are the
conclusions.
At this point a special technique has been developed in mathematics. This
technique, when applied to the real world, is sometimes useful, but can some-
times also lead to self-deception. This technique is called modelling. When
constructing a model, the following idealisation is made: certain facts which
are only known with a certain degree of probability or with a certain degree
of accuracy, are considered to be ‘absolutely’ correct and are accepted as ‘ax-
ioms’. The sense of this ‘absoluteness’ lies precisely in the fact that we allow
ourselves to use these ‘facts’ according to the rules of formal logic, in the process
declaring as ‘theorems’ all that we can derive from them.

Arnol′d’s parenthetical example is apparently the following. For each number $n$,
we consider the number of ways to write the odd number $2n - 1$ as a sum

$$t_1 + \cdots + t_{2k-1},$$

where $k$ is an arbitrary number (so that $2k - 1$ is an arbitrary odd number), but
$t_1 \geq \cdots \geq t_{2k-1}$. Let us call the number of such sums $a_n$. Immediately $a_1 = 1$;
 This theorem too was apparently known to Nicomachus [, II.XX., p. ].
A footnote explains the origin of the text: ‘This is an extended text of an address at a
discussion on the teaching of mathematics in Palais de Découverte in Paris on  March .’ The
text is online at http://pauli.uni-muenster.de/~munsteg/arnold.html (accessed
November , ). I do not actually agree that mathematics is a part of physics.

and since, besides the one-term sums $3 = 3$ and $5 = 5$,

$$3 = 1 + 1 + 1, \qquad 5 = 3 + 1 + 1 = 2 + 2 + 1 = 1 + 1 + 1 + 1 + 1,$$

we have $a_2 = 2$ and $a_3 = 4$. To find $a_4$, we note, besides the one-term sum $7 = 7$,

$$\begin{aligned}
7 &= 5 + 1 + 1 \\
&= 4 + 2 + 1 \\
&= 3 + 3 + 1 \\
&= 3 + 2 + 2 \\
&= 3 + 1 + 1 + 1 + 1 \\
&= 2 + 2 + 1 + 1 + 1 \\
&= 1 + 1 + 1 + 1 + 1 + 1 + 1,
\end{aligned}$$

so $a_4 = 8$; and $a_5 = 16$, by the computations in Table 1.1 below. Thus the
equation

$$a_n = 2^{n-1} \tag{‡}$$

is correct when $n$ is 1, 2, 3, 4, or 5. However, there is no obvious reason why it
should be true when $n > 5$. In fact it fails when $n = 6$. We have $a_6 = 29$, by
counting the sums listed in Table 1.2. If one is so inclined, one can find further
information on these numbers $a_n$ in The On-Line Encyclopedia of Integer
Sequences.
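
The counts $a_1, \dots, a_6$ can be reproduced by machine. The following Python sketch counts the ways to write $2n - 1$ as a sum of an odd number of non-increasing positive summands, using the standard recurrence for partitions into exactly $j$ parts. The function names `partitions_into_parts` and `a` are chosen here for illustration and are not from the text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_into_parts(m, j):
    """Number of ways to write m as a sum of exactly j positive summands,
    where the order of the summands is ignored."""
    if m == 0 and j == 0:
        return 1
    if m <= 0 or j <= 0:
        return 0
    # Either some summand is 1 (remove it), or every summand is at least 2
    # (subtract 1 from each of the j summands).
    return partitions_into_parts(m - 1, j - 1) + partitions_into_parts(m - j, j)


def a(n):
    """Number of ways to write 2n - 1 as a sum of an odd number of summands."""
    m = 2 * n - 1
    return sum(partitions_into_parts(m, j) for j in range(1, m + 1, 2))


first_six = [a(n) for n in range(1, 7)]
```

The list `first_six` begins with powers of two and then breaks the pattern at the sixth term, in agreement with the count of 29 above. The one-term sum is included in each count, since $j = 1$ is an odd number of summands.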
Another failed pattern is shown in Chapter , ‘Proofs’, of Timothy Gowers’s
Mathematics: A Very Short Introduction []. Suppose $n$ distinct points are
chosen on a circle, and each pair of the $n$ points is connected by a straight line,
and no three of those straight lines have a common point. Then the circle is
divided into a number of regions, say $a_n$ regions. Figure 1.7 shows that (‡) now
holds when $n$ is one of the numbers 1, 2, 3, 4, and 5; but when $n = 0$, then there
is 1 region, not 1/2; and when $n = 6$, there are 31 regions, not 32.
Is there a formula for the number $a_n$ here? When we add a new point, so
that there are $n + 1$ points in all, then the new point will be connected to $n$
other points. Suppose we number those $n$ points, in order around the circle, with
the numbers from 1 to $n$ inclusive. Then the line going to point $j$ has $j - 1$ points
on one side, and $n - j$ on the other, so it crosses $(j-1)(n-j)$ lines. So this new
line is divided into $(j-1)(n-j) + 1$ segments, and each of these corresponds to a
new region. Thus

$$a_1 = 1, \qquad a_{n+1} = a_n + \sum_{j=1}^{n} \bigl((j-1)(n-j) + 1\bigr);$$

this is a recursive definition of the numbers $a_n$, but it is perhaps not a very
attractive definition. We can rewrite the last equation as

$$a_{n+1} = a_n + n + \sum_{j=2}^{n-1} (j-1)(n-j).$$

 http://oeis.org/, accessed November , .


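9 = 7 + 1 + 1                 = 4 + 2 + 1 + 1 + 1
  = 6 + 2 + 1                 = 3 + 3 + 3
  = 5 + 3 + 1                 = 3 + 3 + 1 + 1 + 1
  = 5 + 2 + 2                 = 3 + 2 + 2 + 1 + 1
  = 5 + 1 + 1 + 1 + 1         = 3 + 1 + 1 + 1 + 1 + 1 + 1
  = 4 + 4 + 1                 = 2 + 2 + 2 + 2 + 1
  = 4 + 3 + 2                 = 2 + 2 + 1 + 1 + 1 + 1 + 1
                              = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1

Table 1.1. The number 9 as the sum of odd numbers of summands (the one-term sum 9 = 9 is not shown)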

[Figure 1.7. Partitions of circles by straight lines]

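11 = 9 + 1 + 1                 = 4 + 4 + 3
   = 8 + 2 + 1                 = 4 + 4 + 1 + 1 + 1
   = 7 + 3 + 1                 = 4 + 3 + 2 + 1 + 1
   = 7 + 2 + 2                 = 4 + 2 + 2 + 2 + 1
   = 7 + 1 + 1 + 1 + 1         = 4 + 2 + 1 + 1 + 1 + 1 + 1
   = 6 + 4 + 1                 = 3 + 3 + 3 + 1 + 1
   = 6 + 3 + 2                 = 3 + 3 + 2 + 2 + 1
   = 6 + 2 + 1 + 1 + 1         = 3 + 3 + 1 + 1 + 1 + 1 + 1
   = 5 + 5 + 1                 = 3 + 2 + 2 + 2 + 2
   = 5 + 4 + 2                 = 3 + 2 + 2 + 1 + 1 + 1 + 1
   = 5 + 3 + 3                 = 3 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1
   = 5 + 3 + 1 + 1 + 1         = 2 + 2 + 2 + 2 + 1 + 1 + 1
   = 5 + 2 + 2 + 1 + 1         = 2 + 2 + 1 + 1 + 1 + 1 + 1 + 1 + 1
   = 5 + 1 + 1 + 1 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1

Table 1.2. The number 11 as the sum of odd numbers of summands (the one-term sum 11 = 11 is not shown)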


The sum $\sum_{j=2}^{n-1} (j-1)(n-j)$ can be understood as the number of ways to choose
3 points out of $n$ points. Indeed, if the points are again numbered from 1 to $n$
inclusive, then for each $j$, there are $(j-1)(n-j)$ ways to choose $i$ and $k$ so that
$i < j < k \leq n$. Therefore we have

$$a_{n+1} = a_n + n + \binom{n}{3} = a_n + \binom{n}{1} + \binom{n}{3}.$$

Recall that in the so-called Pascal’s Triangle (Table 1.3), if we start counting with
0, then entry $i$ in row $j$ is $\binom{j}{i}$; in particular,
$\binom{j}{i} + \binom{j}{i+1} = \binom{j+1}{i+1}$. Hence we have

$$a_{n+1} = a_n + \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-1}{2} + \binom{n-1}{3}.$$

Then by induction,

$$a_n = \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-1}{2} + \binom{n-1}{3} + \binom{n-1}{4}.$$

Here we should understand $\binom{n-1}{j} = 0$ if $n - 1 < j$.
For an alternative derivation of the last formula for $a_n$, we can consider the
following.
1. Even if there are no points, there is 1 region.
2. When a new line is drawn, one new region is created near one endpoint of
the new line; and there are $\binom{n}{2}$ lines.
3. In addition, whenever the new line crosses an old line, a new region is
created; and there are $\binom{n}{4}$ crossings.
4. Every region can be understood as arising in exactly one of the foregoing
ways.

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
...

Table 1.3. Pascal’s Triangle



Therefore, again,

$$a_n = 1 + \binom{n}{2} + \binom{n}{4} = \binom{n}{0} + \binom{n}{2} + \binom{n}{4} = \sum_{j=0}^{4} \binom{n-1}{j}.$$
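
The three descriptions of the number of regions (the recursion, the count by lines and crossings, and the sum of entries of a row of Pascal’s Triangle) can be compared numerically. The following Python sketch does this for small $n$; the function names are hypothetical, and `math.comb` is assumed to be available (Python 3.8 or later).

```python
from math import comb


def regions_recursive(n):
    """a_n from a_1 = 1 and a_{m+1} = a_m + sum_{j=1}^{m} ((j - 1)(m - j) + 1)."""
    a = 1
    for m in range(1, n):
        a += sum((j - 1) * (m - j) + 1 for j in range(1, m + 1))
    return a


def regions_by_lines_and_crossings(n):
    """a_n = 1 + C(n, 2) + C(n, 4)."""
    return 1 + comb(n, 2) + comb(n, 4)


def regions_by_row_sum(n):
    """a_n = C(n-1, 0) + ... + C(n-1, 4); comb returns 0 when j > n - 1."""
    return sum(comb(n - 1, j) for j in range(5))


first_six_regions = [regions_recursive(n) for n in range(1, 7)]
all_agree = all(
    regions_recursive(n)
    == regions_by_lines_and_crossings(n)
    == regions_by_row_sum(n)
    for n in range(1, 41)
)
```

The list `first_six_regions` shows the powers of two followed by 31 rather than 32, as in the text.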

1.3. Incommensurability
A Diophantine equation is a polynomial equation with integral coefficients.
If such an equation has no integral solutions, one way to prove this is the method of
infinite descent, which is attributed to Pierre de Fermat (–). A simple
application of the method is the following.

Theorem . No integers solve the equation

x2 = 2y 2 .

Proof. Suppose a2 = 2b2 , and a and b are positive. Then a > b. Also, a must
be even. Say a = 2c. Consequently 4c2 = 2b2 , so b2 = 2c2 . Thus we obtain a
sequence
a, b, c, . . . , k, ℓ, . . . ,
where always k 2 = 2ℓ2 . But we have also a > b > c > · · · , which is absurd; there
is no infinite descending sequence of positive integers. Therefore no positive a
and b exist such that a2 = 2b2 .

In geometric form, the theorem is that the side and diagonal of a square are
incommensurable: there is no one line segment that measures, or evenly
 So called after Diophantus of Alexandria (c. –c. ), whose Arithmetica, comprising 
books, treated such problems as, ‘To divide a given square number into two squares’ [,
pp.–]. Diophantus works out an example when the given square number is 16. The
aim then is to find $x$ such that $16 - x^2$ is a square. We try letting this square have the form
$(mx - 4)^2$, presumably so that 16 will cancel from the resulting equation. In case $m = 2$,
we solve

$$16 - x^2 = (2x - 4)^2 = 4x^2 - 16x + 16, \qquad 16x = 5x^2, \qquad \frac{16}{5} = x,$$

so that $16 = (16/5)^2 + (12/5)^2$. Thus Diophantus is interested in rational solutions: in the
present example, solutions to the equation $x^2 + y^2 = z^2$. It was in the margin next to this
problem, in his own copy of the Arithmetica, that Fermat (see below) wrote the claim that
$x^n + y^n = z^n$ has no [rational] solution when $n > 2$. This claim is the so-called Fermat’s
Last Theorem, although Fermat did not publish a proof, and he almost certainly did not
know a correct proof.
 In his History of Mathematics [, §XVII., p. ], Boyer writes: ‘Some of his theorems

he [Fermat] proved by a method that he called his “infinite descent”—a sort of inverted
mathematical induction, a process that Fermat was among the first to use.’

divides, each of them. We can see this as follows, using propositions from Euclid’s
Elements []. In Figure 1.8, there is a square, $ABCD$ (constructed by I.).

[Figure 1.8. Incommensurability of diagonal and side]

On the diagonal $BD$, the distance $BE$ is marked equal to $AB$ (as by drawing a
circle with center $B$, passing through $A$). The perpendicular at $E$ (constructed
by I.) meets $AD$ at $F$. The straight line $BF$ is drawn. Then triangles $ABF$
and $EBF$ are congruent, and in particular $EF = AF$ (by I., I., and I.).
Also, triangle $DEF$ is similar to $DAB$ (by VI., since angle $DEF$ is equal to
angle $DAB$, and angle $EDB$ is common), so $DE = EF$. Suppose a straight line
$G$ measures both $AB$ and $BD$. Then it measures $ED$ and $DF$, since

$$ED = BD - AB, \qquad DF = AB - ED.$$

The same construction can be performed with triangle $DEF$ in place of $DAB$.
Since $DE < DF$ (by I. and I.), so that $2ED < AB$, there will eventually
be segments that are shorter than $G$ (by X.), but are measured by it, which is
absurd. So such $G$ cannot exist.
If we consider $DA$ as a unit, then we can write $DB$ as $\sqrt{2}$. In two ways then,
we have shown the irrationality of $\sqrt{2}$. For yet another proof, suppose $\sqrt{2}$
is rational. Then there are numbers $a_1$ and $a_2$ such that

$$\frac{a_1}{a_2} = \sqrt{2} + 1.$$

Consequently

$$\frac{a_2}{a_1} = \frac{1}{\sqrt{2} + 1} = \frac{\sqrt{2} - 1}{(\sqrt{2} + 1)(\sqrt{2} - 1)} = \sqrt{2} - 1 = \frac{a_1}{a_2} - 2 = \frac{a_1 - 2a_2}{a_2}.$$

Now let $a_3 = a_1 - 2a_2$, so that

$$\frac{a_2}{a_1} = \frac{a_3}{a_2}.$$
 The method is discussed in Heath’s edition of the Elements [, v. III, p. ].

Continue recursively by defining

$$a_{n+2} = a_n - 2a_{n+1}.$$

Then by induction

$$\frac{a_{n+1}}{a_{n+2}} = \frac{a_1}{a_2} = \sqrt{2} + 1.$$

But $a_n = 2a_{n+1} + a_{n+2}$, so $a_1 > a_2 > a_3 > \cdots$, which again is absurd.

The same argument, adjusted, gives us a way to approximate $\sqrt{2}$. Suppose
there are $b_1$ and $b_2$ such that

$$\frac{b_1}{b_2} = \sqrt{2} - 1.$$

Then

$$\frac{b_2}{b_1} = \sqrt{2} + 1 = \frac{b_1}{b_2} + 2 = \frac{b_1 + 2b_2}{b_2}.$$

If we define

$$b_{n+2} = b_n + 2b_{n+1}, \tag{§}$$

then by induction

$$\frac{b_{n+1}}{b_{n+2}} = \sqrt{2} - 1.$$

Now however the sequence $b_1, b_2, \dots$ increases, so there is no obvious
contradiction. But the definition (§) alone yields

$$\begin{aligned}
\frac{b_3}{b_2} &= 2 + \frac{b_1}{b_2}, \\
\frac{b_4}{b_3} &= 2 + \frac{b_2}{b_3} = 2 + \cfrac{1}{2 + \cfrac{b_1}{b_2}}, \\
\frac{b_5}{b_4} &= 2 + \frac{b_3}{b_4} = 2 + \cfrac{1}{2 + \cfrac{b_2}{b_3}} = 2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{b_1}{b_2}}},
\end{aligned}$$

and so on. If we just let $b_1 = 1$ and $b_2 = 2$, then by (§) the sequence of the $b_n$ is
the increasing sequence

1, 2, 5, 12, 29, 70, . . .

Then the sequence

$$\frac{2}{1}, \frac{5}{2}, \frac{12}{5}, \frac{29}{12}, \frac{70}{29}, \dots$$

of fractions converges to $\sqrt{2} + 1$. That is, we have the following.



Theorem . When the sequence b1 , b2 , . . . , is defined recursively by

b1 = 1, b2 = 2, bn+2 = bn + 2bn+1 ,

then
bn+1 √
lim = 2 + 1. (¶)
n→∞ bn

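Proof. Considering successive differences, we have

$$\frac{b_{n+2}}{b_{n+1}} - \frac{b_{n+1}}{b_n} = 2 + \frac{b_n}{b_{n+1}} - \frac{b_{n+1}}{b_n} = \frac{b_n^2 + 2b_nb_{n+1} - b_{n+1}^2}{b_nb_{n+1}}.$$

Replacing $n$ with $n + 1$ gives

$$\begin{aligned}
\frac{b_{n+3}}{b_{n+2}} - \frac{b_{n+2}}{b_{n+1}}
&= \frac{b_{n+1}^2 + 2b_{n+1}b_{n+2} - b_{n+2}^2}{b_{n+1}b_{n+2}} \\
&= \frac{b_{n+1}^2 + 2b_{n+1}(2b_{n+1} + b_n) - (2b_{n+1} + b_n)^2}{b_{n+1}b_{n+2}} \\
&= -\frac{b_n^2 + 2b_nb_{n+1} - b_{n+1}^2}{b_{n+1}b_{n+2}} \\
&= -\frac{b_n}{b_{n+2}}\left(\frac{b_{n+2}}{b_{n+1}} - \frac{b_{n+1}}{b_n}\right).
\end{aligned}$$

Thus the numerator $b_n^2 + 2b_nb_{n+1} - b_{n+1}^2$ changes its sign, but not its absolute
value, when $n$ is replaced with $n + 1$. By induction then,

$$\frac{b_{n+2}}{b_{n+1}} - \frac{b_{n+1}}{b_n} = \frac{(-1)^{n+1}}{b_nb_{n+1}}, \tag{‖}$$

since this holds when $n = 1$. The sequence of products $b_nb_{n+1}$ is positive and
strictly increasing; so we have

$$\begin{gathered}
\frac{b_2}{b_1} < \frac{b_3}{b_2}, \\
\frac{b_2}{b_1} < \frac{b_4}{b_3} < \frac{b_3}{b_2}, \\
\frac{b_2}{b_1} < \frac{b_4}{b_3} < \frac{b_5}{b_4} < \frac{b_3}{b_2}, \\
\frac{b_2}{b_1} < \frac{b_4}{b_3} < \frac{b_6}{b_5} < \frac{b_5}{b_4} < \frac{b_3}{b_2},
\end{gathered}$$

and in general

$$\frac{b_2}{b_1} < \frac{b_4}{b_3} < \frac{b_6}{b_5} < \cdots < \frac{b_7}{b_6} < \frac{b_5}{b_4} < \frac{b_3}{b_2}.$$

A consequence of this and (‖) is that the sequence of fractions $b_{n+1}/b_n$ must be
a Cauchy sequence. The limit is $\sqrt{2} + 1$, since

$$\begin{aligned}
\frac{b_{n+2}}{b_{n+1}} < \sqrt{2} + 1
&\iff \left(\frac{b_{n+2}}{b_{n+1}} - 1\right)^2 < 2 \\
&\iff \left(\frac{b_n}{b_{n+1}} + 1\right)^2 < 2 \\
&\iff \frac{b_n}{b_{n+1}} < \sqrt{2} - 1 \\
&\iff \frac{b_{n+1}}{b_n} > \sqrt{2} + 1.
\end{aligned}$$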

The limit equation (¶) is written more suggestively as
\[ \sqrt{2} + 1 = 2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \ddots}}}}. \]
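As a check on the continued fraction, one can compute its truncations exactly and compare them with the fractions $b_{n+1}/b_n$, and also test the difference formula from the proof for small n. This is a sketch only; the helper names are made up for the purpose, and the list `terms` is indexed from 0, so that `terms[n - 1]` stands for $b_n$.

```python
from fractions import Fraction


def b_terms(count):
    """First `count` terms of b_1 = 1, b_2 = 2, b_{n+2} = b_n + 2*b_{n+1}."""
    terms = [1, 2]
    while len(terms) < count:
        terms.append(terms[-2] + 2 * terms[-1])
    return terms[:count]


def truncated_fraction(depth):
    """Value of 2 + 1/(2 + 1/(... + 1/2)) with `depth` nested divisions."""
    value = Fraction(2)
    for _ in range(depth):
        value = 2 + 1 / value
    return value


def difference_identity_holds(n, terms):
    """Check b_{n+2}/b_{n+1} - b_{n+1}/b_n == (-1)^(n+1) / (b_n * b_{n+1})."""
    b_n, b_n1, b_n2 = terms[n - 1], terms[n], terms[n + 1]
    left = Fraction(b_n2, b_n1) - Fraction(b_n1, b_n)
    right = Fraction((-1) ** (n + 1), b_n * b_n1)
    return left == right
```

With these definitions, `truncated_fraction(k)` agrees with $b_{k+2}/b_{k+1}$ for each k that one cares to try.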



. Numbers

.. The natural numbers


Theorems about natural numbers have been known for thousands of years. Some
of these theorems come down to us in Euclid’s Elements [], for example, or
Nicomachus’s Introduction to Arithmetic [], which were referred to in the last
chapter. Certain underlying assumptions on which the proofs of these theorems
are based were apparently not worked out until more recent centuries.
It turns out that all theorems about the natural numbers are logical conse-
quences of the Axiom below. The Axiom lists five conditions that the natural
numbers meet. Richard Dedekind published these conditions in  [, II, §,
p. ]. In , Giuseppe Peano [, §, p. ] repeated them in a more symbolic
form, along with some logical conditions, making nine conditions in all, which
he called axioms. Of these, the five specifically number-theoretic conditions have
come to be known as the Peano Axioms.
The foundations of number-theory are often not well understood, even today.
Some books give the impression that all theorems about natural numbers follow
from the so-called ‘Well Ordering Principle’ (Theorem ). Others suggest that
the possibility of definition by recursion (Theorem ) can be proved by induction
(part (e) of the Axiom) alone. These are mistakes about the foundations of
number-theory. They are perhaps not really mistakes about number-theory itself;
still, they are mistakes, and it is better not to make them. This is a reason why
I have written this chapter.
An admirable development of the material in this chapter and more is found
in Edmund Landau’s book Foundations of Analysis: The Arithmetic of Whole,
Rational, Irrational, and Complex Numbers: A Supplement to Text-Books on the
Differential and Integral Calculus [].
In the present chapter, when proofs of lemmas and theorems are not supplied, I have left them to the reader as exercises.
An expression like ‘f : A → B’ is to be read as the statement ‘f is a function
from A to B.’ This means f is a certain kind of subset of the Cartesian product
A × B, namely a subset that, for each a in A, has exactly one element of the form
(a, b); then one writes f(a) = b. The function f can also be written as x ↦ f(x).
Axiom and definition. The set of natural numbers, denoted by
N,
meets the following five conditions.


a) There is a first natural number, called 1 (one).
b) Every n in N has a unique successor, denoted (for now) by s(n).
c) The first natural number is not a successor: if n ∈ N, then s(n) ≠ 1.
d) Distinct natural numbers have distinct successors: if n ∈ N and m ∈ N and n ≠ m, then s(n) ≠ s(m).
e) Proof by induction is possible: Suppose A ⊆ N, and two conditions are met, namely
   (i) the base condition: 1 ∈ A, and
   (ii) the inductive condition: if n ∈ A (the inductive hypothesis), then s(n) ∈ A.
   Then A = N.
The natural number s(1) is denoted by 2; the number s(2), by 3; &c.

Remark. Again, the five conditions satisfied by N are the Peano axioms. Parts (c),
(d) and (e) of the axiom are conditions concerning a set with a first element and
an operation of succession. For each of those conditions, there is an example
of such a set that meets that condition, but not the others. In short, the three
conditions are logically independent.

Lemma. Every natural number is either 1 or a successor.

Proof. Let A be the set comprising every natural number that is either 1 or a
successor. In particular, 1 ∈ A, and if n ∈ A, then (since it is a successor)
s(n) ∈ A. Therefore, by induction, A = N.

Theorem  (Recursion). Suppose a set A has an element b, and f : A → A.


Then there is a unique function g from N to A such that
a) g(1) = b, and
b) g(s(n)) = f (g(n)) for all n in N.

Proof. The following is only a sketch. One must prove existence and uniqueness
of g. Assuming existence, one can prove uniqueness by induction. To prove
existence, let S be the set of subsets R of N × A such that
a) if (1, c) ∈ R, then c = b;
b) if (s(n), c) ∈ R, then (n, d) ∈ R for some d such that f(d) = c.
Then ⋃S is the desired function g.

Remark. In its statement (though not the proof), the Recursion Theorem as-
sumes only parts (a) and (b) of the Axiom. The other parts can be proved as
consequences of the Theorem. Recursion is a method of definition; induction is
a method of proof. There are sets (with first elements and successor-operations)
that allow proof by induction, but not definition by recursion. In short, induction
is logically weaker than recursion.
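Read computationally, the Recursion Theorem says that an element b and a function f determine a function g on the natural numbers. A minimal sketch follows, in which natural numbers are modelled by Python integers n ≥ 1 (an assumption of the sketch, not part of the theorem), and `recursive` is a name invented here.

```python
def recursive(b, f):
    """Return the function g with g(1) = b and g(n + 1) = f(g(n))."""
    def g(n):
        if n < 1:
            raise ValueError("n must be a natural number (n >= 1)")
        value = b
        # Apply f exactly n - 1 times, starting from b.
        for _ in range(n - 1):
            value = f(value)
        return value
    return g


# Usage: the set A need not consist of numbers.
doubling = recursive(2, lambda x: 2 * x)       # g(n) = 2^n
strokes = recursive("|", lambda s: s + "|")    # g(n) = n strokes
```

The loop takes for granted exactly what the theorem establishes, namely that iterating f from b is a well-defined process on N.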

Definition (Addition). For each m in N, the operation x ↦ m + x on N is the function g guaranteed by the Recursion Theorem when A is N and b is s(m) and f is x ↦ s(x). That is,

m + 1 = s(m), m + s(n) = s(m + n).

Lemma. For all n and m in N,


a) 1 + n = s(n);
b) s(m) + n = s(m + n).
Theorem . For all n, m, and k in N,
a) n + m = m + n;
b) (n + m) + k = n + (m + k).
Remark. It is possible to prove by induction alone that there is a unique operation
of addition satisfying the definition and Theorem .
Definition (Multiplication). For each m in N, the operation x ↦ m · x on N is the function g guaranteed by the Recursion Theorem when A is N and b is m and f is x ↦ x + m. That is,

m · 1 = m, m · (n + 1) = m · n + m.

Lemma. For all n and m in N,


a) 1 · n = n;
b) (m + 1) · n = m · n + n.
Theorem . For all n, m, and k in N,
a) n · m = m · n;
b) n · (m + k) = n · m + n · k;
c) (n · m) · k = n · (m · k).
Remark. As with addition, so with multiplication, one can prove by induction
alone that there is a unique operation satisfying the definition and Theorem .
However, the next theorem requires also parts (c)–(d) of the Axiom.
Theorem  (Cancellation). For all n, m, and k in N,
a) if n + k = m + k, then n = m;
b) if n · k = m · k, then n = m.
Definition (Exponentiation). For each m in N, the operation $x \mapsto m^x$ on N is the function g guaranteed by the Recursion Theorem when A is N and b is m and f is x ↦ x · m. That is,
\[ m^1 = m, \qquad m^{n+1} = m^n \cdot m. \]

Theorem . For all n, m, and k in N,
a) $n^{m+k} = n^m \cdot n^k$;
b) $(n \cdot m)^k = n^k \cdot m^k$;
c) $(n^m)^k = n^{m \cdot k}$.
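The three recursive definitions can be imitated directly. In the sketch below, the successor s is modelled by adding 1 to a Python integer, which is an assumption of the illustration; the starting values are chosen so that g(1) agrees with the displayed equations m + 1 = s(m), m · 1 = m, and $m^1 = m$. The names `recursive`, `add`, `mul`, and `power` are hypothetical.

```python
def recursive(b, f):
    """Return g with g(1) = b and g(n + 1) = f(g(n)), for integers n >= 1."""
    def g(n):
        value = b
        for _ in range(n - 1):
            value = f(value)
        return value
    return g


def s(n):
    """Successor, modelled here by the built-in integers."""
    return n + 1


def add(m, n):
    # m + 1 = s(m),  m + s(n) = s(m + n)
    return recursive(s(m), s)(n)


def mul(m, n):
    # m * 1 = m,  m * (n + 1) = m * n + m
    return recursive(m, lambda x: add(x, m))(n)


def power(m, n):
    # m^1 = m,  m^(n + 1) = m^n * m
    return recursive(m, lambda x: mul(x, m))(n)
```

Each operation is built only from `s` and the previous operation, which mirrors the order of the definitions in the text; it is very slow, and is meant only to show the structure.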

Remark. In contrast with addition and multiplication, exponentiation requires more than induction for its existence.

Definition (Ordering). If n, m ∈ N, and m + k = n for some k in N, then this situation is denoted by m < n. That is,

m < n ⇐⇒ ∃x m + x = n.

If m < n, we say that m is a predecessor of n. If m < n or m = n, we write

m ≤ n.

Theorem . For all n, m, and k in N,


a) 1 6 n;
b) m 6 n if and only if m + k 6 n + k;
c) m 6 n if and only if m · k 6 n · k.

Theorem . For all m and n in N,


a) m < n if and only if m + 1 6 n;
b) m 6 n if and only if m < n + 1.

Theorem . The binary relation leq is a linear ordering: for all n, m, and
k in N,
a) n 6 n;
b) if m 6 n and n 6 m, then n = m;
c) if k 6 m and m 6 n, then k 6 n;
d) either m 6 n or n 6 m.

We may say then that < is a strict linear ordering, because

n 6< n,
k < m & m < n =⇒ k < n,
m 6< n & m 6= n =⇒ n < m.

Theorem  (Strong Induction). Suppose A ⊆ N, and one condition is met,


namely

• if all predecessors of n belong to A (the strong inductive hypothesis),


then n ∈ A.

 . Numbers
Then A = N.
Proof. Let B comprise the natural numbers whose predecessors belong to A.
As 1 has no predecessors, they belong to A, so 1 ∈ B. Suppose n ∈ B. Then all
predecessors of n belong to A, so by assumption, n ∈ A. Thus, by Theorem  (b),
all of the predecessors of n + 1 belong to A, so n + 1 ∈ B. By induction, B = N.
In particular, if n ∈ N, then n + 1 ∈ B, so n (being a predecessor of n + 1) belongs
to A. Thus A = N.
Remark. In general, strong induction is a proof-technique that can be used with
some ordered sets. By contrast, ‘ordinary’ induction involves sets with first ele-
ments and successor-operations, but possibly without orderings. Strong induction
does not follow from ordinary induction alone; neither does ordinary induction
follow from strong induction.
Theorem . The set of natural numbers is well ordered by <: that is, every
non-empty subset of N has a least element with respect to 6.
Proof. Use strong induction. Suppose A is a subset of N with no least element.
We shall show A is empty, that is, N r A = N. Let n ∈ N. Then n is not a least
element of A. This means one of two things: either n ∈ / A, or else n ∈ A, but also
m ∈ A for some predecessor of n. Equivalently, if no predecessor of n is in A,
then n ∈
/ A. In other words, if every predecessor of n is in N r A, then n ∈ N r A.
By strong induction, we are done.
Remark. We have now shown, in effect, that if a linear order (A, 6) admits proof
by strong recursion, then it is well-ordered. The converse is also true.
Theorem  (Recursion with Parameter). Suppose A is a set with an element
b, and F : N × A → A. Then there is a unique function G from N to A such that
a) G(1) = b, and
b) G(n + 1) = F (n, G(n)) for all n in N.
Proof. Let f : N × A → N × A, where f (n, x) = (n + 1, F (n, x)). By recursion,
there is a unique function g from N to N × A such that g(1) = (1, b) and g(n +
1) = f (g(n)). By induction, the first entry in g(n) is always n. The desired
function G is given by g(n) = (n, G(n)). Indeed, we now have G(1) = b; also,
g(n + 1) = f (n, G(n)) = (n + 1, F (n, G(n))), so G(n + 1) = F (n, G(n)). By
induction, G is unique.
Remark. Recursion with Parameter allows us to define the set of predecessors of n as pred(n), where x ↦ pred(x) is the function G guaranteed by the Theorem when A is the set of subsets of N, and b is the empty set, and F is (x, Y) ↦ {x} ∪ Y. Then we can write m < n if m ∈ pred(n) and prove the foregoing theorems about the ordering.



Definition (Factorial). The operation x ↦ x! on N is the function G guaranteed by the Theorem of Recursion with Parameter when A is N and b is 1 and F is (x, y) ↦ (x + 1) · y. That is,

1! = 1, (n + 1)! = (n + 1) · n!
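Recursion with Parameter can be sketched in the same computational spirit. Here natural numbers are again modelled by Python integers n ≥ 1, and the names `recursive_with_parameter`, `factorial`, and `pred` are chosen only for this illustration; the two usages follow the choices of b and F given in the text for the factorial and for the sets of predecessors.

```python
def recursive_with_parameter(b, F):
    """Return G with G(1) = b and G(n + 1) = F(n, G(n)), for integers n >= 1."""
    def G(n):
        value = b
        # Step k computes G(k + 1) = F(k, G(k)) for k = 1, ..., n - 1.
        for k in range(1, n):
            value = F(k, value)
        return value
    return G


# 1! = 1, (n + 1)! = (n + 1) * n!
factorial = recursive_with_parameter(1, lambda x, y: (x + 1) * y)

# pred(1) is empty, pred(n + 1) = {n} united with pred(n)
pred = recursive_with_parameter(frozenset(), lambda x, Y: Y | {x})
```

Unlike plain recursion, the step function F sees the index n as well as the previous value, which is exactly what the factorial needs.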

.. The integers


Number theory is fundamentally about the natural numbers, but it is sometimes
useful to consider natural numbers simply as integers. These compose the set

N ∪ {0} ∪ {−x : x ∈ N}, (∗)

which is denoted by
Z.
One may ask what these new elements 0 and −x are. In that case, one can define
Z as the quotient
N × N/∼,
where ∼ is the equivalence relation given by

(a, b) ∼ (x, y) ⇐⇒ a + y = b + x.

The equivalence class of (a, b) is denoted by

a − b.

There are three cases:

1. If a < b, then a + c = b for some unique c, and

a − b = 1 − (c + 1).

2. If a = b, then

a − b = 1 − 1.

3. If b < a, then b + c = a for some unique c, and

a − b = (c + 1) − 1.

Then N embeds in Z under the map x ↦ (x + 1) − 1, and one can define

0 = 1 − 1, −((x + 1) − 1) = 1 − (x + 1).

One can then identify N with its image in Z. Then again Z can be understood
as in (∗).
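The quotient construction can be made concrete by picking, in each equivalence class, the representative named in the three cases above. In this sketch a pair of Python integers (a, b) with a, b ≥ 1 stands for the class a − b; the function names are invented for the illustration.

```python
def equivalent(p, q):
    """(a, b) ~ (x, y) if and only if a + y = b + x."""
    (a, b), (x, y) = p, q
    return a + y == b + x


def normalize(a, b):
    """Canonical representative of the class a - b, following the three cases."""
    if a < b:
        c = b - a              # the unique c with a + c = b
        return (1, c + 1)      # a - b = 1 - (c + 1)
    if a == b:
        return (1, 1)          # a - b = 1 - 1, which is 0
    c = a - b                  # the unique c with b + c = a
    return (c + 1, 1)          # a - b = (c + 1) - 1


def embed(x):
    """Image of the natural number x in Z, as a canonical pair."""
    return (x + 1, 1)
```

Two pairs are equivalent precisely when `normalize` sends them to the same pair, so the canonical pairs can serve as the integers themselves.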

We extend multiplication to Z by defining

0 · x = 0, −x · y = −(x · y), −x · −y = x · y.

It is to be understood that multiplication is still to be commutative, so that also


x · 0 = 0 and y · −x = −(x · y).
We extend the ordering to Z by defining

−x < 0, 0 < y, −x < −y ⇐⇒ y < x.

Here of course x and y are elements of N, and the two inequalities −x < 0 and
0 < y are taken to imply −x < y.
Now we can extend addition by defining
\[ -x + -y = -(x + y), \qquad -x + y = \begin{cases} z, & \text{if } x < y \text{ and } x + z = y,\\ 0, & \text{if } x = y,\\ -z, & \text{if } y < x \text{ and } y + z = x. \end{cases} \]

Finally, we define
−−x = x.
Now one proves the following, where the letters range over Z. First,

a + (b + c) = (a + b) + c,
b + a = a + b,
a + 0 = a,
a + (−a) = 0,

so that Z is an abelian group with respect to addition. Next,

a · (b · c) = (a · b) · c,
a · 1 = a,
1 · a = a, (†)
a · (b + c) = a · b + a · c,
(a + b) · c = a · c + b · c, (‡)

so Z is a ring. But we need not show (†) and (‡) in particular, because we have
finally
a · b = b · a,
so Z is a commutative ring. Moreover,

a < b =⇒ a + c < b + c,
0 < a & 0 < b =⇒ 0 < a · b,



so Z is an ordered commutative ring. In particular, if a · b = 0, then one of a
and b is 0; so Z is an integral domain.
An integer a is called positive if a > 0, that is, if a ∈ N; but a is zero, if
a = 0, and a is negative, if a < 0.

.. The rational numbers


It is also useful in number theory to be aware that integers are rational num-
bers. In order to define these precisely, it is useful to begin (as one does in
school) with the positive rational numbers. These compose the quotient

N × N/≈,

where ≈ is the equivalence relation defined by

(a, b) ≈ (x, y) ⇐⇒ a · y = b · x.

The equivalence class of (a, b) is denoted by
\[ \frac{a}{b} \]
or a/b. Let us denote the set of positive rational numbers by

Q+ .

On this set, one shows that the following are valid definitions:
\[ \frac{a}{b} + \frac{x}{y} = \frac{ay + bx}{by}, \qquad \frac{a}{b} \cdot \frac{x}{y} = \frac{ax}{by}, \qquad \frac{a}{b} < \frac{x}{y} \iff ay < bx. \]
We can also define
\[ \Bigl(\frac{a}{b}\Bigr)^{-1} = \frac{b}{a}; \]
then Q+ is an abelian group with respect to multiplication. One shows that N embeds in Q+ under the map x ↦ x/1. Now we can identify N with its image in Q+ . Letting letters stand now for positive rationals, we have, just as in N,

r < s ⇐⇒ ∃x r + x = s.

Now we can obtain the set Q of rational numbers from Q+ just as we obtained
Z from N in the last section. In particular, Q is a commutative ring; it is moreover
a field, because
a ≠ 0 =⇒ ∃x ax = 1.
Since also Q is, like Z, an ordered commutative ring, Q is an ordered field.
Finally, Z is an ordered commutative sub-ring of this ordered field.

.. Other numbers
As a linear order, Q is dense, that is, between any two distinct elements lies a
third:
a < b =⇒ ∃x (a < x & x < b).
Moreover, Q has no endpoints, that is, no greatest or least element.
An order is called complete if every nonempty subset with an upper bound
has a supremum, namely a least upper bound. Then Q is not complete, since
the set {x : 0 < x & x² < 2} has no supremum.
If a dense linear order without endpoints is given, and a is an element, we can
define
pred(a) = {x : x < a}.
The union of any collection of such subsets is an open subset of the order. In
particular, the whole set and the empty set are open; all other open subsets are
called cuts of the order. The set of all cuts of the order is the completion of the
order. The completion is itself linearly ordered by inclusion (⊆), and the original
order embeds in its completion under the map x ↦ pred(x). In case the original
order is Q, the completion is denoted by

R.

This is the set of real numbers. The operations on Q extend to R in such a way that R is also an ordered field. Then R is a complete ordered field, and every complete ordered field is isomorphic to R.
However, all of this takes quite a bit of work to prove. One approach is to consider first the completion of Q+ . If X and Y are cuts of Q+ , one can define
\begin{align*}
X + Y &= \bigcup\{\operatorname{pred}(x + y) : \operatorname{pred}(x) \subseteq X \mathrel{\&} \operatorname{pred}(y) \subseteq Y\},\\
X \cdot Y &= \bigcup\{\operatorname{pred}(x \cdot y) : \operatorname{pred}(x) \subseteq X \mathrel{\&} \operatorname{pred}(y) \subseteq Y\}.
\end{align*}

Then one can obtain R from the completion of Q+ , just as one obtains Z from
N, and Q from Q+ .
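Given a commutative ring, we can form 2 × 2 matrices whose entries are from the ring. These are added and multiplied by the rules
\begin{align*}
\begin{pmatrix} a & b\\ c & d \end{pmatrix} + \begin{pmatrix} x & y\\ z & w \end{pmatrix} &= \begin{pmatrix} a + x & b + y\\ c + z & d + w \end{pmatrix},\\
\begin{pmatrix} a & b\\ c & d \end{pmatrix} \cdot \begin{pmatrix} x & y\\ z & w \end{pmatrix} &= \begin{pmatrix} ax + bz & ay + bw\\ cx + dz & cy + dw \end{pmatrix}.
\end{align*}
 The open sets, so defined, do indeed compose a topology for the order, but it is not the usual order topology. In the latter, the open sets are unions of sets {x : a < x & x < b}.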



Then the set of these matrices is a ring, but usually not a commutative ring. We define C as the set of 2 × 2 matrices
\[ \begin{pmatrix} x & y\\ -y & x \end{pmatrix}, \tag{§} \]
where x and y range over R. One shows that C is a field. We identify R with its image in C under the map
\[ x \mapsto \begin{pmatrix} x & 0\\ 0 & x \end{pmatrix}, \]
and we define
\[ i = \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}. \]
Then every element of C is uniquely x + yi for some x and y in R; moreover, $i^2 = -1$.
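The matrix model of C is easy to experiment with. The following sketch stores a 2 × 2 matrix as a pair of rows and uses integer entries so that equality is exact; the names `cmat`, `mat_add`, and `mat_mul` are hypothetical, and the arithmetic is just the two rules displayed above.

```python
def cmat(x, y):
    """The matrix with rows (x, y) and (-y, x), standing for x + yi."""
    return ((x, y), (-y, x))


def mat_add(p, q):
    """Entrywise sum of two 2 x 2 matrices."""
    return tuple(tuple(p[r][c] + q[r][c] for c in range(2)) for r in range(2))


def mat_mul(p, q):
    """Row-by-column product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


i = cmat(0, 1)
minus_one = cmat(-1, 0)
```

For instance, `mat_mul(i, i)` is the matrix identified with −1, and products of matrices of the special form (§) again have that form.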
One shows that every positive real number x has a square root, namely the positive number $\sqrt{x}$ such that $(\sqrt{x})^2 = x$. Then we define
\[ |x + iy| = \sqrt{x^2 + y^2}. \]

The field C is complete in a new sense: every Cauchy sequence of complex numbers converges. Recall that a sequence $(a_n : n \in \mathbb{N})$ of complex numbers is a Cauchy sequence if for every positive real number $\varepsilon$, there is a positive integer k such that, if n > k and m > k, then
\[ |a_n - a_m| < \varepsilon. \]
Then R itself is also complete in this sense.


The field of complex numbers also has the convenient property of being algebraically closed: it contains a solution of every polynomial equation
\[ a_0 + a_1x + \cdots + a_{n-1}x^{n-1} + x^n = 0, \tag{¶} \]
for every n in N, where of course the coefficients $a_k$ range over C. But there are other algebraically closed fields.
The field Q is countable, that is, there is a bijection between Q and N. The
same is not true for R or C: they are uncountable. If we select from C the
solutions of the equations (¶) such that the coefficients are rational, the result
is the set of algebraic numbers. This set is a countable algebraically closed
subfield of C.
Every equation a + bx = 0, where a and b are integers and b ≠ 0, has a solution in Q, namely −a/b (that is, $-ab^{-1}$). In particular, there is a solution when b = 1; but then the solution is just −a, an integer. More generally, if the coefficients in (¶) are integers, then a solution to the equation is called an algebraic integer. In particular, $\sqrt{2}$ is an algebraic integer, being a solution of $x^2 - 2 = 0$. The algebraic integers are the subject of algebraic number theory; so we have had a taste of this in §.. The only algebraic integers in Q are the usual integers, which in this context may be called rational integers.
The study of R and C is analysis. There is a part of number theory that makes use of analysis; this is analytic number theory. We shall not try to do it here, but if one does prove the Prime Number Theorem (Theorem ) for example, then the Gamma function may be useful: this is the function Γ given by
\[ \Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt \]
when x > 0. You can show that Γ(n + 1) = nΓ(n), and Γ(1) = 1, so that Γ(n + 1) = n!.
Our subject is mainly elementary number theory. This means not that the
subject is easy, but that our integers are just the rational integers, and we shall
not use analysis. However, the proof of Bertrand’s Postulate in §. gives a taste
of analysis.

 For an overview of algebraic numbers, analytic number theory, and other areas of mathematics, an excellent print reference is The Princeton Companion to Mathematics, edited by Timothy Gowers with June Barrow-Green and Imre Leader [].



. Divisibility

.. Division
Henceforth minuscule letters will usually denote integers. If n is such, let the set
{nx : x ∈ Z} be denoted by Zn or nZ or

(n).

To give it a name, we may call (n) the ideal of Z generated by n. Note that

(−n) = (n).

Moreover,
a ∈ (n) ⇐⇒ (a) ⊆ (n).

It is not strictly necessary to introduce ideals, but they may clarify some argu-
ments. By definition, if a ∈ (n), that is, if a = nx for some integer x, then n
divides a, or n is a divisor of a; this situation is denoted by

n | a.

Then the following holds, simply because Z is a commutative ring in the sense of
§..

Theorem . In Z:

a | 0,
0 | a ⇐⇒ a = 0,
1 | a,
a | a,
a | b & b | c =⇒ a | c,
a | b & c | d =⇒ ac | bd,
a | b =⇒ a | bx, (∗)
a | b & a | c =⇒ a | b + c. (†)
 In the original terminology, (n) was an ideal number.


In particular, if a | b, then both a and −a divide both b and −b. Every divisor
of an integer b is a proper divisor if it is not ±b (this notion will be useful when
we discuss prime numbers in Chapter ).
We have an additional property because Z is an ordered commutative ring in
which every positive element is 1 or greater; the following does not hold in Q or
R.

Theorem . In Z,
a | b & b 6= 0 =⇒ |a| 6 |b|.
In particular,
a | b & b | a =⇒ a = ±b.

Proof. If a | b, and b 6= 0, then n · |a| = |b| for some positive n, so 1 6 n and


hence |a| 6 n · |a| = |b|.

We have now shown, in effect:

Theorem . The relation | of divisibility is an ordering of N that is refined


by the linear ordering 6, that is, if k, m, and n are in N, then

n | n,
m | n & n | m =⇒ m = n,
k | m & m | n =⇒ k | n,
m | n =⇒ m 6 n.

Ordered sets can be depicted in so-called Hasse diagrams. Consider for


example the positive divisors of 60, namely 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and
60: these twelve numbers can be arranged as in Figure .. Here a line is drawn
from a number a up to a number b if a | b, but there is no c distinct from a and
b such that a | c and c | b. In general, a | b if and only if there is a path upwards
from a to b.
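The lines of such a diagram can be computed from the rule just stated: a is joined to b when a | b and no third divisor lies strictly between them. The sketch below does this by brute force for the divisors of a positive integer; the names `divisors` and `hasse_edges` are made up for the illustration.

```python
def divisors(n):
    """Positive divisors of the positive integer n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def hasse_edges(n):
    """Pairs (a, b) of divisors of n joined by a line from a up to b."""
    ds = divisors(n)
    edges = []
    for a in ds:
        for b in ds:
            if a != b and b % a == 0:
                # Keep (a, b) only if no divisor c lies strictly between a and b.
                between = any(
                    c not in (a, b) and c % a == 0 and b % c == 0 for c in ds
                )
                if not between:
                    edges.append((a, b))
    return edges
```

For n = 60 this yields the twelve divisors listed above, and every line found joins a number to a prime multiple of itself.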

.. Congruence
If a − b ∈ (n), then we may also write

a≡b (mod n) (‡)


 It does hold in other ordered commutative rings, such as Z[X], the ring of polynomials in a
single variable X with integer coefficients, ordered so that X is greater than every constant
polynomial.

Figure .. Divisors of 60 (Hasse diagram: 1 at the bottom; 2, 3, 5 above it; then 4, 6, 10, 15; then 12, 20, 30; and 60 at the top)

or a ≡ b (n), saying a and b are congruent with respect to the modulus n, or a and b are congruent modulo n; also b is a residue of a, and a is a residue of b, modulo n. If the modulus n is understood, we might write simply

a ≡ b.

Congruence with respect to a given modulus is an equivalence-relation. The


congruence-class of a modulo n is

{x ∈ Z : a − x ∈ (n)}.

If n = 0, then congruence modulo n is equality. In any case, congruence modulo


n is the same as congruence modulo −n. So we usually need only be concerned
with positive moduli.
Lemma. For every positive modulus n, for every integer a, distinct elements of the n-element set {a, a + 1, . . . , a + n − 1} are incongruent.
 The notation of (‡) is introduced by Johann Carl Friedrich Gauss (–) in ¶ of his
Disquisitiones Arithmeticae [], first published in . Gauss notes that Legendre uses
the same sign for both equality and congruence, because they are analogous concepts. Gauss
writes in Latin, and Latin nouns, like Turkish nouns, have cases. In particular, the Latin
noun modulus, meaning literally ‘small measure’, has the cases modulum, moduli, modulo,
modulo, corresponding respectively (albeit roughly) to the Turkish modülü, modülün, mod-
üle, modülden. However, Gauss does not use a form like ‘modulo 5’, at least not in the first
two paragraphs of the Disquisitiones; he says instead ‘secundum modulum ’, that is, with
respect to the modulus , or in Turkish 5 modülüne göre. (I took Gauss’s Latin text from
http://resolver.sub.uni-goettingen.de/purl?PPN235993352, December , ; the link
was in the Wikipedia article on the Disquisitiones.)
 Gauss writes in a footnote to his ¶, ‘The modulus must obviously be taken absolutely, i.e.

without sign.’ This suggests to me the picture in which −5 is ‘really’ 5, from a special point
of view.

Proof. If i and j are distinct elements of the set, then 0 < |i − j| < n, so n ∤ i − j
by Theorem .
We want now to show that every integer is congruent to some element of
{a, a + 1, . . . , a + n − 1}. To do so, we shall use the greatest integer in a rational
number. This notion applies to arbitrary real numbers as well, through the
following:
Theorem . For every real number x, there is a unique integer k such that
k 6 x < k + 1.
Proof. Assume first x > 0. By the construction in §., there is a rational
number a/b such that x < a/b; and then x < a. By the Well Ordering Principle
(Theorem ), there is a least integer m such that x < m. Then m − 1 is the
desired integer k. If x < 0, we let m be the least integer such that −x 6 m, and
then −m is the desired integer k.
In either case, the integer k is unique by Theorem  (though again, cases must
be considered).
In the theorem, the integer k is the greatest integer in x and can be denoted
by
[x].
Its existence for all x in R is expressed by saying R is archimedean (as an
ordered commutative ring).
Lemma. For every positive modulus n, every integer has a unique residue in
{0, 1, . . . , n − 1}.
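Proof. For any integer a, we just compute
\begin{gather*}
\Bigl[\frac{a}{n}\Bigr] \leq \frac{a}{n} < \Bigl[\frac{a}{n}\Bigr] + 1,\\
\frac{a}{n} - 1 < \Bigl[\frac{a}{n}\Bigr] \leq \frac{a}{n},\\
1 > \frac{a}{n} - \Bigl[\frac{a}{n}\Bigr] \geq 0,\\
n > a - n\Bigl[\frac{a}{n}\Bigr] \geq 0.
\end{gather*}
So a − n[a/n] belongs to the desired set; and it is an integer congruent to a. It is the only residue of a in the set, by the previous lemma.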
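The residue a − n[a/n] can be computed as written. In Python, the operator `//` on integers rounds towards negative infinity, so for positive n it gives the greatest integer in a/n; the function names below are chosen for this sketch.

```python
def greatest_integer(a, n):
    """The greatest integer [a/n] in the rational number a/n, for n > 0."""
    return a // n


def residue(a, n):
    """The residue a - n*[a/n] of a in {0, 1, ..., n - 1}, for n > 0."""
    return a - n * greatest_integer(a, n)
```

For example, [−7/3] is −3, not −2, so the residue of −7 modulo 3 is −7 − 3 · (−3) = 2.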
 Another way to say R is archimedean is that if a and b are positive real numbers, then for
some positive integer n, na > b. This principle is used by Archimedes (c. – bce) to
show, for example, that the surface of a sphere is equal to a circle of twice the radius [].
An example of a nonarchimedean ordered commutative ring is Z[X], defined in note  on
page  above. We can characterize Z as the unique archimedean ordered commutative ring
with no positive elements less than 1.

The following theorem is basically a restatement of the last lemma. It is called
the Division Algorithm, though it is not really an algorithm; it is the observation
that finding a quotient (with remainder) of one integer after division by a nonzero
integer is always possible. So-called long division is an algorithm for doing this
that is learned in school.
Theorem  (Division Algorithm). For every positive integer q, for every integer
a, there are unique integers k and r such that

a = kq + r, 0 ≤ r < q.

As a consequence of the last two lemmas, we have:


Theorem . For every positive modulus n, for every integer a, every integer
has a unique residue in the set {a, a + 1, . . . , a + n − 1}.
Proof. Every integer x has a unique residue f (x) in {0, 1, . . . , n − 1}. Let g be
the restriction of f to the set {a, a + 1, . . . , a + n − 1}. Then g is injective, and
its domain and codomain are finite sets of the same size; therefore g is surjective
onto {0, 1, . . . , n − 1}. Then g −1 (f (x)) belongs to {a, a + 1, . . . , a + n − 1} and is
a residue of x; moreover, it is unique.
In the theorem, {a, . . . , a + n − 1} is called a complete set of residues modulo n. We shall be interested mainly in the cases
\[ \{0, \dots, n - 1\}, \qquad \Bigl\{-\Bigl[\frac{n-1}{2}\Bigr], \dots, \Bigl[\frac{n}{2}\Bigr]\Bigr\}, \]
the latter set being {−m + 1, . . . , m}, if n = 2m, and {−m, . . . , m}, if n = 2m + 1.
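Residues in an arbitrary complete set {a, . . . , a + n − 1} can be found by shifting, reducing, and shifting back. The sketch below relies on Python's `%` operator, which for a positive modulus returns a value in {0, . . . , n − 1}; the function names are hypothetical.

```python
def residue_in_window(x, a, n):
    """The element of {a, ..., a + n - 1} congruent to x modulo n (n > 0)."""
    return a + (x - a) % n


def least_absolute_residue(x, n):
    """The residue of x in {-[(n - 1)/2], ..., [n/2]}."""
    return residue_in_window(x, -((n - 1) // 2), n)
```

With n = 6 the second function takes values in {−2, . . . , 3}, and with n = 7 in {−3, . . . , 3}, matching the two cases n = 2m and n = 2m + 1.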
Theorem . If a ≡ b and c ≡ d, then

a + c ≡ b + d, ac ≡ bd.

Proof. If n | b − a and n | d − c, then, by Theorem , we have n | b − a + d − c,


that is,
n | b + d − (a + c),
and also n | (b − a)c + (d − c)b, that is,

n | bd − ac.

A first application of this is an ancient theorem, found in the work of Theon


of Smyrna [, pp. –].
Theorem . Every square is congruent to 0 or 1 modulo 3 and 4.

 . Divisibility
Proof. By the last theorem, if two integers are congruent, then their squares
are congruent. So it is enough to observe the following: The set {−1, 0, 1} is a
complete set of residues modulo 3, and the square of each element is congruent
to 0 or 1. The set {−1, 0, 1, 2} is a complete set of residues modulo 3, and the
square of each element is congruent to 0 or 1.
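The proof reduces an infinite claim to a finite check, and the finite check can be carried out mechanically. The sketch below uses the complete set {0, . . . , n − 1} rather than the sets in the proof, which makes no difference to the squares modulo n; the name `square_residues` is invented here.

```python
def square_residues(n):
    """Sorted residues in {0, ..., n - 1} of the squares modulo n (n > 0)."""
    return sorted({(x * x) % n for x in range(n)})
```

The same function shows that the pattern is special: modulo 5, for instance, the squares have three residues, 0, 1, and 4.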

The set of congruence-classes of integers modulo n is denoted by Z/nZ or Z/(n)


or simply
Zn .
Then Theorem  is that addition and multiplication are well-defined on Zn ; so
this becomes a commutative ring.

.. Greatest common divisors


A common divisor of a and b is any j such that j | a and j | b. If one of a and b is not 0, then |j| ≤ min(|a|, |b|) by Theorem . In this case, a and b have a common divisor that is greatest with respect to the linear ordering ≤. This common divisor is called simply the greatest common divisor of a and b and is denoted by
gcd(a, b).
If c is a common divisor of a and b, then c ≤ gcd(a, b).
We immediately have an algorithm for finding gcd(a, b). If one of a and b is 0,
then the absolute value of the other is the greatest common divisor. Otherwise:
a) List the elements of {1, . . . , |a|} that divide a.
b) List the elements of {1, . . . , |b|} that divide b.
c) Find the greatest number that is common to both lists.
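The three steps above can be sketched as follows. The function names are invented for the illustration, and the comparison with Python's `math.gcd` in the usage note is only a sanity check, not part of the algorithm.

```python
def positive_divisors(a):
    """The elements of {1, ..., |a|} that divide the nonzero integer a."""
    return [d for d in range(1, abs(a) + 1) if a % d == 0]


def naive_gcd(a, b):
    """Greatest common divisor by listing divisors, as in steps (a) to (c)."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is not defined here")
    if a == 0:
        return abs(b)
    if b == 0:
        return abs(a)
    common = set(positive_divisors(a)) & set(positive_divisors(b))
    return max(common)
```

The running time grows with |a| + |b|, which is why the method is hopeless for large numbers; for small ones it agrees with `math.gcd`.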
For example, we can read gcd(12, 30) = 6 off the Hasse diagram in Figure .. See Figure .. For large numbers, this algorithm is impractical; we shall develop the Euclidean Algorithm, which is far superior, in §. below. Meanwhile, note that every common divisor of 12 and 30 divides 6. We shall show that this is always true: gcd(a, b) is a common divisor of a and b that is greatest with respect to the ordering | of divisibility; that is, if c is a common divisor of a and b, then c | gcd(a, b).

Figure .. Common divisors of 12 and 30 (Hasse diagrams of the divisors of 12, of the common divisors 1, 2, 3, 6, and of the divisors of 30)
To prove this result, we may note that, by (∗) and (†) in Theorem , if j | a and j | b, then j divides every linear combination,

ax + by,

of a and b. Let the set {ax + by : x, y ∈ Z} of these linear combinations be denoted by
(a, b);
this is the ideal of Z generated by a and b. Then

(a) ⊆ (j) & (b) ⊆ (j) ⇐⇒ (a, b) ⊆ (j).

That is, the common divisors of a and b are those j such that (a, b) ⊆ (j). In fact
we have not introduced any new ideals, by the following:

Lemma. For all integers a and b, for some unique non-negative integer k,

(a, b) = (k).

Proof. Immediately (0, 0) = (0). Now suppose one of a and b is not 0. Then (a, b)
has positive elements, and we may let k be the least of these. Then (k) ⊆ (a, b).
We establish the reverse inclusion by showing k divides a and b. By Theorem 
(the Division Algorithm), we have a = kq + r and 0 ≤ r < k for some q and r.
Then
r = a − kq = a − (ax + by)q = a(1 − qx) + b(−qy)
for some x and y, so r ∈ (a, b), and hence r = 0 by minimality of k. So k | a. By
symmetry, k | b.

Theorem . If a and b are integers, not both 0, then



(a, b) = gcd(a, b) ,

that is, gcd(a, b) is the unique positive integer k such that (a, b) = (k). Hence
every common divisor of a and b divides gcd(a, b).

Proof. We know (a, b) ⊆ (j) if and only if j is a common divisor of a and b. In


particular, if (a, b) = (k), then k is a common divisor of a and b, and if j is also
a common divisor, then (k) ⊆ (j), so j | k, and therefore |j| 6 |k|.

The theorem is the reason why the notation (a, b) is sometimes used in place
of gcd(a, b). The following is immediate.
Corollary (Bézout’s Lemma ). If a and b are not both 0, the diophantine equa-
tion
ax + by = gcd(a, b)
is soluble.
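Solubility can be confirmed for small a and b by plain search, without the Euclidean Algorithm. In the sketch below, the search box |x|, |y| ≤ max(|a|, |b|) is an assumption of the illustration (large enough in every case tried, but not something the text states), and the function names are hypothetical. The second function reflects the earlier lemma: the least positive linear combination is the greatest common divisor.

```python
from math import gcd


def bezout_search(a, b):
    """Search a small box for (x, y) with a*x + b*y == gcd(a, b)."""
    bound = max(abs(a), abs(b))
    target = gcd(a, b)
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if a * x + b * y == target:
                return x, y
    return None  # nothing found inside the box


def least_positive_combination(a, b):
    """Least positive value of a*x + b*y over the same box (a, b not both 0)."""
    bound = max(abs(a), abs(b))
    values = (
        a * x + b * y
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
    )
    return min(v for v in values if v > 0)
```

The search is quadratic in the size of the box, so it is a check on the statement and not a practical method.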
The following is sometimes useful:
Theorem . For all integers a, b, and c, if one of a and b is not 0, then

gcd(ac, bc) = gcd(a, b) · c.

In particular, if gcd(a, b) = ℓ, and k | ℓ, then


a b  ℓ
gcd , = .
k k k
If gcd(a, b) = 1, then a and b together are called either relatively prime or
co-prime; also, each of a and b is prime to the other. This is the case if the
equation
ax + by = 1 (§)
is soluble. Conversely, if a and b are co-prime, then (§) must have a solution, by
Bézout’s Lemma. If gcd(a, b) = k, then a/k and b/k are co-prime, by the last
theorem.
Gauss proves the following in ¶ of the Disquisitiones Arithmeticae [], but
he uses the Fundamental Theorem of Arithmetic (Theorem  below) in his proof.
Theorem . If a and b are co-prime, and each divides c, then ab | c.
Proof. Under the hypothesis, c = bs = ar for some s and r, and then the following
equations are soluble:

ax + by = 1,
acx + bcy = c,
absx + bary = c,
ab(sx + ry) = c.

Euclid proves the following in Proposition VII. of the Elements [, ],
though his statement of the theorem assumes a is prime (see p. ).
Theorem . If a | bc and gcd(a, b) = 1, then a | c.
 http://en.wikipedia.org/wiki/Bezout’s_identity (accessed December , ).



Proof. Again, as in the proof of the last theorem, the following have solutions:

ax + by = 1,
acx + bcy = c.

Since a | ac and a | bc, we are done by Theorem .

.. Least common multiples


The Hasse diagram of divisors of 60 in Figure . is symmetrical: if we interchange
n and 60/n, the result is the same diagram, reflected, as on the right of Figure ..
The general result is the following.
[Figure .. Divisors of 60, again: the reflected Hasse diagram, with rows 1; 5, 3, 2; 15, 10, 6, 4; 30, 20, 12; 60]

Theorem . If d and e are divisors of some nonzero integer n, then

\[ d \mid e \iff \frac{n}{e} \mathrel{\Big|} \frac{n}{d}. \]

Proof. We have d | e if and only if dx = e for some x; but

\[ dx = e \iff ndx = ne \iff \frac{nx}{e} = \frac{n}{d}. \]
The theorem leads to a notion that is ‘dual’ to the greatest common divisor.
A common multiple of a and b is any j such that a | j and b | j, that is,
(j) ⊆ (a) ∩ (b). If ab ≠ 0, then (a) ∩ (b) has a positive element (either ab or −ab),
so it has a least positive element; this is the least common multiple of a and
b, denoted by
lcm(a, b).

The greatest common divisor of a and b is the common divisor of a and b that is
greatest among all common divisors: greatest with respect to the linear ordering
≤, but also with respect to divisibility. The least common multiple of a and b
has the corresponding property:
Theorem . If ab ≠ 0, then

(lcm(a, b)) = (a) ∩ (b). (¶)

In particular, lcm(a, b) divides all common multiples of a and b. Moreover,

\[ \operatorname{lcm}(a, b) = \frac{|ab|}{\gcd(a, b)}. \tag{‖} \]
Proof. Let c and d be common multiples of a and b. Then gcd(c, d) must also be
a common multiple of a and b. That is, under the assumption (c) ⊆ (a) ∩ (b) and
(d) ⊆ (a) ∩ (b), we have (c, d) ⊆ (a) ∩ (b), and therefore (gcd(c, d)) ⊆ (a) ∩ (b).
In particular, if d ∉ (c), then

(c) ⊂ (c, d) = (gcd(c, d)) ⊆ (a) ∩ (b),

so |c| ≠ lcm(a, b). This establishes (¶) and the conclusion that lcm(a, b) divides
all common multiples of a and b.
As a special case, lcm(a, b) divides ab. By Theorem , if x is an arbitrary
divisor of ab, then x is a common multiple of a and b if and only if ab/x is a
common divisor of ab/a and ab/b, which are just b and a. Hence |ab|/ gcd(a, b)
must be the least common multiple of a and b among the divisors of ab. But
we already know that the least of all common multiples of a and b is among the
divisors of ab. Therefore we have (‖).
Corollary. If ab ≠ 0, and c is a common multiple of a and b, then

\[ \operatorname{lcm}(a, b) = \frac{|c|}{\gcd(c/a,\, c/b)}. \]

Proof. Theorem .
For example, since gcd(12, 30) = 6, we have that the least common multiple of
60/12 and 60/30 is 60/6, that is,
lcm(5, 2) = 10.
In general, we have a Hasse diagram as in Figure ..
Another corollary of the theorem is the following:
Corollary. If ab ≠ 0, and x ≡ y modulo both a and b, then
x ≡ y (mod lcm(a, b)).
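
The formula (‖) and its corollary can be tried out numerically. The following is only a sketch: the function names are mine, and in the second function I assume that the common multiple c is nonzero, so that the division makes sense.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple as |ab| / gcd(a, b); requires ab != 0."""
    if a == 0 or b == 0:
        raise ValueError("lcm is only considered here when ab != 0")
    return abs(a * b) // gcd(a, b)


def lcm_via_common_multiple(a, b, c):
    """Corollary form |c| / gcd(c/a, c/b), for a nonzero common multiple c.

    The restriction to nonzero c is an assumption of this sketch.
    """
    if c == 0 or c % a != 0 or c % b != 0:
        raise ValueError("c must be a nonzero common multiple of a and b")
    return abs(c) // gcd(c // a, c // b)


print(lcm(5, 2))                          # 10
print(lcm_via_common_multiple(5, 2, 60))  # 10, since gcd(12, 30) = 6
```

The second call is the example of the text: the least common multiple of 60/12 and 60/30 is 60/6.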

[Figure .. gcd and lcm: a Hasse diagram showing, from top to bottom, ab; lcm(a, b); a and b side by side; gcd(a, b)]

.. The Euclidean algorithm


We have observed that every common divisor of a and b divides every linear
combination of a and b. In particular, it divides the remainder of dividing a by
b. For example, let d = gcd(63, 23). Then d divides 63 − 23 · 2, which is 17.
But then 23 − 17 or 6 is another linear combination of 63 and 23, so d divides
this. Similarly d divides 17 − 6 · 2 or 5. Finally, d divides 6 − 5 or 1. Then d
must be 1; that is, gcd(63, 23) = 1, and so 63 and 23 are relatively prime. The
computations are shown in Figure ..

63 = 23 · 2 + 17,
23 = 17 · 1 + 6,
17 = 6 · 2 + 5,
6 = 5 · 1 + 1.

Figure .. The Euclidean algorithm

The general method for finding greatest common divisors is given by Euclid in
Propositions VII. and  of the Elements. In modern notation, we have the following.
Theorem  (Euclidean Algorithm). Suppose $a_1 > a_2 > 0$. There are unique
sequences $(a_n : n \in \mathbb{N})$ and $(q_n : n \in \mathbb{N})$ such that, if $a_{n+1} \ne 0$, then

\[ a_n = a_{n+1} \cdot q_n + a_{n+2}, \qquad 0 \le a_{n+2} < a_{n+1}, \tag{∗∗} \]

but if $a_{n+1} = 0$, then $a_{n+2} = 0 = q_n$. Then the sequence $(a_n : n \in \mathbb{N})$ is eventually
0, and if $a_m$ is the last nonzero entry, then

\[ \gcd(a_1, a_2) = a_m. \]

Proof. The given conditions amount to a definition by recursion of the function
$n \mapsto (a_n, a_{n+1})$. In the notation of Theorem , the set A is Z × Z, and $b = (a_1, a_2)$,
while f is given by f(x, y) = (y, z), where z is the least nonnegative residue of
x modulo y, if y ≠ 0, but z = 0 if y = 0. (The function f is well defined by
Theorem .)
We now have that, if $a_{n+1} \ne 0$, then $a_{n+2} < a_{n+1}$; also, the common divisors
of $a_n$ and $a_{n+1}$ are just the common divisors of $a_{n+1}$ and $a_{n+2}$, so that

\[ \gcd(a_n, a_{n+1}) = \gcd(a_{n+1}, a_{n+2}). \]

In particular, if $a_m$ is the least of the positive numbers $a_n$, then $a_{m+1} = 0$, so

\[ \gcd(a_1, a_2) = \gcd(a_m, 0) = a_m. \]
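
The recursion in the theorem translates directly into a loop. The following sketch (the name euclid_steps and the shape of its return value are my own choices) records each line of the computation as a tuple (dividend, divisor, quotient, remainder) and returns the last nonzero entry as the greatest common divisor.

```python
def euclid_steps(a1, a2):
    """Run the Euclidean algorithm on a1 > 0 and a2 >= 0.

    Returns (rows, g), where each row (a, b, q, r) satisfies
    a = b*q + r with 0 <= r < b, and g is the last nonzero entry.
    """
    if a1 <= 0 or a2 < 0:
        raise ValueError("expected a1 > 0 and a2 >= 0")
    rows = []
    a, b = a1, a2
    while b != 0:
        q, r = divmod(a, b)   # a = b*q + r, 0 <= r < b
        rows.append((a, b, q, r))
        a, b = b, r
    return rows, a


rows, g = euclid_steps(63, 23)
for a, b, q, r in rows:
    print(f"{a} = {b} * {q} + {r}")
print("gcd =", g)
```

For 63 and 23 the first four rows are the computations shown in the figure; a fifth row, 5 = 1 · 5 + 0, brings the sequence to 0.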
In §., to establish the incommensurability of the diagonal and side of a square,
we used the variant of the Euclidean Algorithm used by Euclid himself to prove
his Proposition X..
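In the notation of Theorem , two consecutive lines of computations as in
Figure . can be written as

\[ a_n = a_{n+1} \cdot q_n + a_{n+2}, \qquad a_{n+1} = a_{n+2} \cdot q_{n+1} + a_{n+3}; \]

but we can rewrite these as

\[ \frac{a_n}{a_{n+1}} = q_n + \frac{a_{n+2}}{a_{n+1}}, \qquad \frac{a_{n+1}}{a_{n+2}} = q_{n+1} + \frac{a_{n+3}}{a_{n+2}}. \]

With the notation $\xi_n$ for $a_{n+1}/a_n$, we now have

\[ 0 \le \xi_n < 1, \qquad \frac{1}{\xi_n} = q_n + \xi_{n+1} \]

(assuming $\xi_{n+1} \ne 0$), so

\[ q_n = \left[\frac{1}{\xi_n}\right], \qquad \xi_{n+1} = \frac{1}{\xi_n} - q_n. \tag{††} \]

Then we have

\[ \frac{1}{\xi_1} = q_1 + \xi_2 = q_1 + \cfrac{1}{q_2 + \xi_3} = q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \xi_4}} = \cdots \]

For example, if we rewrite the computations of Figure . as above, we get

\[ \frac{63}{23} = 2 + \frac{17}{23}, \qquad \frac{23}{17} = 1 + \frac{6}{17}, \qquad \frac{17}{6} = 2 + \frac{5}{6}, \qquad \frac{6}{5} = 1 + \frac{1}{5}, \]

and therefore

\[ \frac{63}{23} = 2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{5}}}}. \]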
But the definition (††) can be applied to any real number chosen as $\xi_1$. If $\xi_n$
is never 0 for any n, or equivalently if $(q_1, q_2, \dots)$ never ends, then by Euclid’s
Proposition X., the number $\xi_1$ must be irrational.
In §., we worked out the example where $\xi_1 = 1/\sqrt{2}$. Indeed, let d and s be
the diagonal and side of a square, respectively, as in Figure ..

[Figure .. Diagonal and side: a square with side s and diagonal d]

Since $d^2 - s^2 = s^2$, we have

\[ \frac{d - s}{s} = \frac{s}{d + s}. \]

From this equation, since s < d + s, we have d − s < s. Letting $\xi_1 = s/d$, we have

\[ \frac{1}{\xi_1} = \frac{d}{s}, \qquad q_1 = 1, \qquad \xi_2 = \frac{d - s}{s}, \]
\[ \frac{1}{\xi_2} = \frac{s}{d - s} = \frac{d + s}{s}, \qquad q_2 = 2, \qquad \xi_3 = \frac{d - s}{s}, \]

so the sequence of $q_n$ is (1, 2, 2, . . . ).
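
The rule (††) is easy to iterate by machine. In the sketch below (the function name quotients is mine), exact rational arithmetic is used for 23/63, so that the computation stops when a remainder is 0; for 1/√2 only a floating-point approximation is available, so I ask for just the first few quotients and make no claim about later ones.

```python
from fractions import Fraction
from math import floor, sqrt


def quotients(xi, max_terms):
    """Apply q_n = [1/xi_n], xi_{n+1} = 1/xi_n - q_n, starting from xi_1 = xi.

    Stops when xi_n becomes 0 or after max_terms quotients.
    """
    qs = []
    while xi != 0 and len(qs) < max_terms:
        inverse = 1 / xi
        q = floor(inverse)
        qs.append(q)
        xi = inverse - q
    return qs


print(quotients(Fraction(23, 63), 10))  # [2, 1, 2, 1, 5]
print(quotients(1 / sqrt(2), 6))        # floating point: only the first terms are trusted
```

The first list agrees with the expansion of 63/23 found above.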

.. The Hundred Fowls Problem


Problem . in the Mathematical Classic of Zhang Qiujian  reads thus:
Now one cock is worth 5 qian, one hen 3 qian, and 3 chicks 1 qian. It is required
to buy 100 fowls with 100 qian. In each case, find the number of cocks, hens,
and chicks bought. Answer says: 4 cocks worth 20 qian, 18 hens worth 54 qian,
78 chicks worth 26 qian. Another answer: 8 cocks worth 40 qian, 11 hens worth
33 qian, 81 chicks worth 27 qian. Another answer: 12 cocks worth 60 qian, 4
hens worth 12 qian, 84 chicks worth 28 qian.
Method says: Add 4 to the number of cocks, subtract 7 from the number of
hens and add 3 to the number of chicks to obtain the answer.

The given ‘answers’ are correct; and according to the ‘method’, the given answers
are the only ones possible (assuming at least one cock, one hen, and one chick
must be bought). But why is the method correct? Let

x = # cocks, y = # hens, z = # chicks.

The problem is to solve

\[ x + y + z = 100, \qquad 5x + 3y + \tfrac{1}{3}z = 100. \]

Multiplying the second equation by 3 and subtracting the first equation yields
14x + 8y = 200 and then
7x + 4y = 100.
Since 4 | 100, one solution is (0, 25), that is, x = 0 and y = 25, and then z = 75.
Moreover, since 7 and 4 are co-prime, any increase in x must be a multiple of 4,
and then y must decrease by the same multiple of 7, so z must increase by the
same multiple of 3 (according to the first equation). So we get the three solutions
given, and no others (assuming at least one cock must be bought):

x y z
4 18 78
8 11 81
12 4 84

 Burton [, pp. –] discusses the problem, but my source for the text is the anthology edited
by Katz [, pp. –], where it is said that the Classic was probably compiled between
the years  and .
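
Since the numbers are small, the conclusion can also be confirmed by exhaustive search. The following sketch (the function name is mine) multiplies the price equation by 3 to stay within the integers.

```python
def hundred_fowls():
    """All (cocks, hens, chicks) with 100 fowls costing 100 qian in total."""
    solutions = []
    for x in range(0, 101):
        for y in range(0, 101 - x):
            z = 100 - x - y
            # 5x + 3y + z/3 = 100, multiplied through by 3
            if 15 * x + 9 * y + z == 300:
                solutions.append((x, y, z))
    return solutions


for solution in hundred_fowls():
    print(solution)
```

The search finds (0, 25, 75) together with the three solutions of the table; the first of these is excluded if at least one cock must be bought.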

Joseph W. Dauben [, p.] writes of the Hundred Fowls Problem:


Outside China, versions of the problem appear in the works of, among others,
Alcuin of York in the eighth century, Mahavira in the ninth century, Abu Kamil
in the tenth century, Bhaskara in the twelfth century, Leonardo of Pisa in the
thirteenth century, and al-Kashi in the fifteenth century.

. Prime numbers

.. The Fundamental Theorem of Arithmetic


In the th definition in Book VII of the Elements, Euclid defines a prime
number (πρῶτος ἀριθμός) as a number ‘that is measured by a unit alone.’ But
a number (ἀριθμός) here is ‘a multitude composed of units.’ A multitude is more
than one. Thus a unit is not a number for Euclid; it is just a unit, out of which
numbers can be created.
If, according to Euclid, a prime number is measured (or we might say
divided) only by a unit, then it seems that no number measures itself. However,
in Proposition  (mentioned above on page  in §.), Euclid mentions that a
number does measure itself. So there seems to be some confusion in Euclid’s text
as we have it today.
Our formulation of Euclid’s definition is that a positive integer is prime if it
has exactly one proper positive divisor, which must then be 1. Having no proper
divisors, 1 is not prime; but 2 is prime. More generally, b is prime if and only if
b > 1 and
a > 0 & a | b =⇒ a ∈ {1, b}.
By Theorem , an alternative formulation of this last condition is

1 < a < b =⇒ a ∤ b.

Throughout this book, p and q will always stand for primes. Then

gcd(a, p) ∈ {1, p},

so either a and p are co-prime, or else p | a.


Theorem . Every integer greater than 1 has a prime divisor.
Proof. If n > 1, then the least of the divisors of n that are greater than 1 must
be prime by Theorem .
A positive integer with a proper divisor that is greater than 1 is composite.
So 1 is neither prime nor composite, but every integer that is greater than 1 is
prime or composite, but not both.
Theorem  (Euclid, VII.). If p | ab, then either p | a or p | b.
Proof. If p ∤ a, then gcd(a, p) = 1, so p | b by Theorem .


Corollary. If $p \mid a_1 \cdots a_n$, where $n \ge 1$, then $p \mid a_k$ for some k.
Proof. Use induction. The claim is trivially true when n = 1. Suppose it is true
when n = m. Say $p \mid a_1 \cdots a_{m+1}$. By the theorem, we have that $p \mid a_1 \cdots a_m$ or
$p \mid a_{m+1}$. In the former situation, by the inductive hypothesis, $p \mid a_k$ for some k.
So the claim holds when n = m + 1, assuming it holds when n = m. Therefore
the claim does indeed hold for all n.
The following appears in Gauss’s Disquisitiones Arithmeticae as ¶; Hardy
and Wright [, p. ] judge that to be the first explicit statement of the theorem.
Theorem  (Fundamental Theorem of Arithmetic). Every positive integer is
uniquely a product

\[ p_1 \cdots p_n \]

of primes, where

\[ p_1 \le \cdots \le p_n. \]

Proof. Trivially, $1 = p_1 \cdots p_n$, where n = 0. Suppose m > 1, and let $p_1$ be its least
prime divisor (which exists by Theorem ). If $m = p_1$, we are done; otherwise,
the least divisor of $m/p_1$ that is greater than 1 is a prime, $p_2$. If $m = p_1 p_2$, we
are done; otherwise, the least divisor of $m/(p_1 p_2)$ that is greater than 1 is a prime
$p_3$. Continuing thus, we get an increasing sequence $p_1, p_2, p_3, \dots$ of primes, where
$p_1 \cdots p_k \mid m$. Since

\[ m > \frac{m}{p_1} > \frac{m}{p_1 p_2} > \cdots, \]

the sequence of primes must terminate by the Well Ordering Principle, and for
some n we have $m = p_1 \cdots p_n$.
For uniqueness, suppose also $m = q_1 \cdots q_\ell$. Then $q_1 \mid m$, so $q_1 \mid p_i$ for some i
by the corollary to Theorem , and therefore $q_1 = p_i$. Hence

\[ p_1 \le p_i = q_1. \]

By the symmetry of the argument, $q_1 \le p_1$, so $p_1 = q_1$. Similarly, $p_2 = q_2$, &c.,
and n = ℓ.
Alternatively, every positive integer is uniquely a product

\[ p_1^{a_1} \cdots p_n^{a_n}, \]

that is,

\[ \prod_{k=1}^{n} p_k^{a_k}, \]

where $p_1 < \cdots < p_n$ and the exponents $a_k$ are all positive integers. Here of course
the $p_k$ (as well as the $a_k$) depend on the integer. To incorporate this dependence
into the notation, we may say that, for every positive integer a, there is a unique
function $p \mapsto a(p)$ on the set of primes such that $a(p) \ge 0$ for all p, and a(p) = 0
for all but finitely many p, and

\[ a = \prod_p p^{a(p)}. \tag{∗} \]

Now the Fundamental Theorem of Arithmetic allows alternative proofs of theorems
like  and , since we have

\[ \gcd(a, b) = \prod_p p^{c(p)}, \qquad \operatorname{lcm}(a, b) = \prod_p p^{d(p)}, \]

or simply gcd(a, b) = c and lcm(a, b) = d, where

\[ c(p) = \min(a(p), b(p)), \qquad d(p) = \max(a(p), b(p)). \]
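
The description of gcd and lcm by exponents can be illustrated as follows. This is a sketch only: the factorization is by naive trial division, which is practical only for small numbers, and the names exponents, from_exponents, and gcd_lcm are mine.

```python
def exponents(a):
    """Return the function p -> a(p) as a dict {p: a(p)}, omitting zeros."""
    if a < 1:
        raise ValueError("expected a positive integer")
    result = {}
    p = 2
    while p * p <= a:
        while a % p == 0:
            result[p] = result.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:                      # what remains is a prime
        result[a] = result.get(a, 0) + 1
    return result


def from_exponents(e):
    """Multiply out the product of p**e[p]."""
    product = 1
    for p, k in e.items():
        product *= p ** k
    return product


def gcd_lcm(a, b):
    """gcd and lcm from c(p) = min(a(p), b(p)) and d(p) = max(a(p), b(p))."""
    ea, eb = exponents(a), exponents(b)
    primes = set(ea) | set(eb)
    c = {p: min(ea.get(p, 0), eb.get(p, 0)) for p in primes}
    d = {p: max(ea.get(p, 0), eb.get(p, 0)) for p in primes}
    return from_exponents(c), from_exponents(d)


print(exponents(60))    # {2: 2, 3: 1, 5: 1}
print(gcd_lcm(12, 30))  # (6, 60)
```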

.. Irreducibility
What is there about N that makes the Fundamental Theorem of Arithmetic
possible?
In an arbitrary commutative ring, the elements analogous to the prime numbers
are called irreducible, and the elements that respect the analogue of Theorem 
are called prime. To be precise, a nonzero element of an arbitrary commutative
ring is a unit if it has a multiplicative inverse. A nonzero element a of the ring
is irreducible if a is not a unit, but whenever a = bc, one of b and c must be a
unit. In this sense, the prime integers are just the positive irreducibles in Z. In
an arbitrary commutative ring, a nonzero nonunit π is called prime if

π | ab & π ∤ a =⇒ π | b.

In an arbitrary commutative ring, irreducibles need not be prime. For example,
let

\[ \mathbb{Z}[\sqrt{10}] = \{x + y\sqrt{10} : x, y \in \mathbb{Z}\}, \]

which is a sub-ring of R. In this sub-ring, we have

\[ (4 + \sqrt{10})(4 - \sqrt{10}) = 6 = 2 \cdot 3. \]

In particular,

\[ 4 + \sqrt{10} \mid 2 \cdot 3. \]

Also, $4 + \sqrt{10}$ is irreducible, but it divides neither 2 nor 3. To show this, we use
the operation σ on $\mathbb{Z}[\sqrt{10}]$ given by

\[ \sigma(a + b\sqrt{10}) = a - b\sqrt{10}. \]

(Compare this with complex conjugation.) Since

\[ (a \pm b\sqrt{10})(c \pm d\sqrt{10}) = ac + 10bd \pm (ad + bc)\sqrt{10}, \]

we have σ(xy) = σ(x) · σ(y). Now define

N(x) = x · σ(x),

so that $N(a + b\sqrt{10}) = a^2 - 10b^2$, which is always an integer. Then

N(xy) = N(x) · N(y).

The units of $\mathbb{Z}[\sqrt{10}]$ are just those elements x such that N(x) = ±1. Indeed,
if x is a unit, then xy = 1 for some y, and then N(x) · N(y) = N(xy) = 1, so
N(x) = ±1; conversely, if N(x) = ±1, this means x · (±σ(x)) = 1, so x is a unit.
For example, $3 + \sqrt{10}$ is a unit; but $4 + \sqrt{10}$ is not a unit, since $N(4 + \sqrt{10}) = 6$.
We always have that N(x) is congruent to a square modulo 10; so it is congruent
to one of 0, ±1, ±4, and 5. If $xy = 4 + \sqrt{10}$, then N(x) · N(y) = 6, but N(x)
cannot be ±2 or ±3, so one of N(x) and N(y) must be ±1. Thus $4 + \sqrt{10}$ is
irreducible.
Finally, since 6 divides neither 4 nor 9, that is, $N(4 + \sqrt{10})$ divides neither
N(2) nor N(3), we have that $4 + \sqrt{10}$ divides neither 2 nor 3.
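
The computations in this ring are easily mechanized if an element a + b√10 is stored as the pair (a, b) of integers. The representation and the function names below are my own; the sketch only checks the identities used above on a few small elements.

```python
def mul(x, y):
    """(a + b*sqrt(10)) * (c + d*sqrt(10)), with elements stored as pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 10 * b * d, a * d + b * c)


def sigma(x):
    """The conjugation a + b*sqrt(10) -> a - b*sqrt(10)."""
    a, b = x
    return (a, -b)


def norm(x):
    """N(x) = x * sigma(x) = a^2 - 10 b^2, an ordinary integer."""
    a, b = x
    return a * a - 10 * b * b


print(mul((4, 1), (4, -1)))  # (6, 0), that is, 6 = 2 * 3
print(norm((4, 1)))          # 6, so 4 + sqrt(10) is not a unit
print(norm((3, 1)))          # -1, so 3 + sqrt(10) is a unit
print(mul((3, 1), (-3, 1)))  # (1, 0): the inverse is -sigma(3 + sqrt(10))
```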

.. The Sieve of Eratosthenes


According to Nicomachus [, pp. –], who appears to be our earliest source
on the matter, the following method of finding prime numbers was referred to by
Eratosthenes as a sieve (κόσκινον).
Perhaps everybody knows this method. We know 2 is prime, but the other
positive even numbers are composite. We list the positive odd integers, starting
with 3, continuing as far as we like. We note 3 as prime, but strike out its proper
multiples from the list. The next unstricken number is 5. We note this as prime,
but strike out its proper multiples, and so on, as in Table .. Those numbers
not stricken are prime.
At each step, once a number k is noted as prime, then only $k^2$ and greater
multiples of k need be stricken; lesser multiples of k have already been stricken.
Hence, if it is the odd numbers less than $n^2$ that are listed, and the proper
multiples of the primes that are less than n are stricken, then the remaining
numbers are prime. In Table ., as it is the odd numbers less than $11^2$ that are
listed, so, once the proper multiples of 3, 5, and 7 are stricken, the remaining
numbers are all prime.

 Eratosthenes of Cyrene (– bce) also measured the circumference of the earth, by
measuring the shadows cast by posts a certain distance apart in Egypt. Measuring this
distance must have needed teams of surveyors and a government to fund them. Christopher
Columbus was not in a position to make the measurement again, so he had to rely on ancient
measurements [].

Proper multiples of 3 stricken:
3 5 7 /9 11 13 /15 17 19 /21 23 25 /27 29 31 /33 35 37 /39 41
43 /45 47 49 /51 53 55 /57 59 61 /63 65 67 /69 71 73 /75 77 79 /81
83 85 /87 89 91 /93 95 97 /99 101 103 /105 107 109 /111 113 115 /117 119

Proper multiples of 3 and 5 stricken:
3 5 7 /9 11 13 /15 17 19 /21 23 /25 /27 29 31 /33 /35 37 /39 41
43 /45 47 49 /51 53 /55 /57 59 61 /63 /65 67 /69 71 73 /75 77 79 /81
83 /85 /87 89 91 /93 /95 97 /99 101 103 /105 107 109 /111 113 /115 /117 119

Proper multiples of 3, 5, and 7 stricken:
3 5 7 /9 11 13 /15 17 19 /21 23 /25 /27 29 31 /33 /35 37 /39 41
43 /45 47 /49 /51 53 /55 /57 59 61 /63 /65 67 /69 71 73 /75 /77 79 /81
83 /85 /87 89 /91 /93 /95 97 /99 101 103 /105 107 109 /111 113 /115 /117 /119

Table .. The Sieve of Eratosthenes (a stricken number is marked with a preceding slash)

Formulating this as a test for individual primes, we have the following.
Theorem . If $1 < m < n^2$, and p ∤ m whenever p < n, then m must be prime.
Proof. Suppose $1 < m < n^2$, but m is not prime. Then m = ab for some a and
b, where $1 < a \le b < m$, so $a^2 \le ab = m < n^2$ and hence a < n. But then a has
a prime factor p by Theorem , so p < n and p | m.
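
The sieve on odd numbers can be sketched as follows. The function name and the use of a set for the stricken numbers are my own choices; as in the text, striking for an unstricken number k starts at its square, and only odd multiples need be visited, since the even numbers were never listed.

```python
def odd_sieve(n):
    """Odd primes less than n*n, found by striking multiples of odd k < n."""
    limit = n * n
    stricken = set()
    for k in range(3, n, 2):
        if k in stricken:        # k is composite; its multiples are already gone
            continue
        for m in range(k * k, limit, 2 * k):
            stricken.add(m)
    return [m for m in range(3, limit, 2) if m not in stricken]


print(odd_sieve(11))
```

With n = 11 this lists the odd primes below 121, that is, the numbers left unstricken once the proper multiples of 3, 5, and 7 are removed.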
We normally write numbers in decimal notation, which means for example that
365 is a code for the sum $5 + 6t + 3t^2$, where t is the fourth triangular number,

   •
  • •
 • • •
• • • •

(called decem in Latin, but in English ten). There is no obvious reason, other
than our having ten fingers, why t should be ten and not be some other number.
Nonetheless, given the decimal system, we have some standard tests for divisibility
by small primes:
Theorem . Let t = 2 · 5. Every positive integer $a_0 + a_1 t + \cdots + a_n t^n$ is
congruent, modulo
a) 2 and 5, to $a_0$,
b) 3 (and 9), to $a_0 + a_1 + \cdots + a_n$,
c) 7, to $a_0 + 3a_1 + \cdots + 3^n a_n$,
d) 11, to $a_0 - a_1 + \cdots + (-1)^n a_n$,
e) 13, to $a_0 - 3a_1 + \cdots + (-3)^n a_n$.
Every positive integer $b_0 + b_1 t^3 + \cdots + b_n t^{3n}$ is congruent, modulo 1001 (that is,
$1 + t^3$, or 7 · 11 · 13), to $b_0 - b_1 + \cdots + (-1)^n b_n$.
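
Each part of the theorem amounts to replacing t by a small number congruent to it. The following sketch (with names of my own choosing) tests the congruences on a range of integers; it is a numerical check, not a proof.

```python
def digits(a, base=10):
    """Digits of a positive integer in the given base, least significant first."""
    out = []
    while a > 0:
        a, d = divmod(a, base)
        out.append(d)
    return out


def weighted_sum(a, weight, base=10):
    """a_0 + w*a_1 + ... + w^n * a_n, where the a_i are the digits of a."""
    return sum(d * weight ** i for i, d in enumerate(digits(a, base)))


# modulus -> the number w that replaces t = 10 in the theorem
RULES = {2: 0, 5: 0, 3: 1, 9: 1, 7: 3, 11: -1, 13: -3}


def check(a):
    """True if a satisfies every congruence of the theorem."""
    ok = all((a - weighted_sum(a, w)) % m == 0 for m, w in RULES.items())
    # blocks of three digits with alternating signs, modulo 1001
    return ok and (a - weighted_sum(a, -1, base=1000)) % 1001 == 0


print(weighted_sum(365, 1))   # 14, the digit sum
print(weighted_sum(365, 3))   # 50, congruent to 365 modulo 7
print(all(check(a) for a in range(1, 100000)))
```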
Suppose n is a composite number less than $37^2$ (that is, 1369). Then n is
divisible by one of the eleven primes
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31.
We can easily check for divisibility by 2, 3, and 5. If n = a + 10b + 100c + 1000d,
we can consider n − 1001d, that is, a + 10b + 100c − d: this is divisible by 7, 11,
or 13 if and only if n is. If a prime factor of n has not been detected so far, then
$n \ge 17^2$, and n is divisible by one of 17, 19, 23, 29, and 31. In particular, n is
one of the numbers listed in Table ..
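
The table can also be regenerated by machine, as an independent check. In this sketch the function names and the default arguments are mine; the least prime factor is found by trial division.

```python
def least_prime_factor(n):
    """Least prime factor of n > 1 (which is n itself when n is prime)."""
    k = 2
    while k * k <= n:
        if n % k == 0:
            return k
        k += 1
    return n


def composites_with_large_factors(bound=37 * 37, least=17):
    """Composite n < bound whose least prime factor is at least `least`."""
    return [n for n in range(2, bound) if least <= least_prime_factor(n) < n]


numbers = composites_with_large_factors()
print(len(numbers))             # 48
print(numbers[0], numbers[-1])  # 289 1363
```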
289 = 17 · 17 779 = 19 · 41 1121 = 19 · 59
323 = 17 · 19 799 = 17 · 47 1139 = 17 · 67
361 = 19 · 19 817 = 19 · 43 1147 = 31 · 37
391 = 17 · 23 841 = 29 · 29 1159 = 19 · 61
437 = 19 · 23 851 = 23 · 37 1189 = 29 · 41
493 = 17 · 29 893 = 19 · 47 1207 = 17 · 71
527 = 17 · 31 899 = 29 · 31 1219 = 23 · 53
529 = 23 · 23 901 = 17 · 53 1241 = 17 · 73
551 = 19 · 29 943 = 23 · 41 1247 = 29 · 43
589 = 19 · 31 961 = 31 · 31 1271 = 31 · 41
629 = 17 · 37 989 = 23 · 43 1273 = 19 · 67
667 = 23 · 29 1003 = 17 · 59 1333 = 31 · 43
697 = 17 · 41 1007 = 19 · 53 1343 = 17 · 79
703 = 19 · 37 1037 = 17 · 61 1349 = 19 · 71
713 = 23 · 31 1073 = 29 · 37 1357 = 23 · 59
731 = 17 · 43 1081 = 23 · 47 1363 = 29 · 47

Table .. Composite numbers less than 1369 with least prime factor 17 or more

 To create this table, I used a table of Burton [, Table , pp. –], which lists all odd
positive integers that are less than 5000 and are indivisible by 5, along with their least prime
factors. As a check, I noted that my table should contain 48 numbers, namely
• 17 times one of 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79;
• 19 times one of 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71;
• 23 times one of 23, 29, 31, 37, 41, 43, 47, 53, 59;
• 29 times one of 29, 31, 37, 41, 43, 47;
• 31 times one of 31, 37, 41, 43.
Having copied what should be these products from Burton’s table, along with their smaller
prime factors, I used a pocket calculator to find the other factors and thus verify the numbers.

.. The infinity of primes

The following has been known for well over two thousand years.

Theorem  (Euclid, IX.). There are more than any number of primes.

Proof. Suppose we have n primes, say $p_1, \dots, p_n$. Then $p_1 \cdots p_n + 1$ has a prime
factor by Theorem , and this factor is not one of the $p_k$.

There are many proofs of this ancient theorem. A recent proof by Filip
Saidak [] is as follows. Define $a_0 = 2$ and $a_{n+1} = a_n(1 + a_n)$. Suppose k < n.
Then $a_k \mid a_{k+1}$, and $a_{k+1} \mid a_{k+2}$, and so on, up to $a_{n-1} \mid a_n$, so $a_k \mid a_n$. Similarly,
since $1 + a_k \mid a_{k+1}$, we have $1 + a_k \mid a_n$. Therefore $\gcd(1 + a_k, 1 + a_n) = 1$. Thus
any two elements of the infinite set $\{1 + a_n : n \in \mathbb{N}\}$ are co-prime.

 I learned of the proof from Matematik Dünyası (-II [no. ], p. ). I write this book
for myself and my students; but it is on the web. A colleague of Dr Saidak’s found it and
informed Dr Saidak, who kindly sent me a copy of his original paper.
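
The sequence in Saidak's proof grows very fast, but its first few terms can be inspected directly. The sketch below (function names mine) computes them and checks that the numbers 1 + a_n are pairwise co-prime; of course this verifies only finitely many cases.

```python
from itertools import combinations
from math import gcd


def saidak_terms(count):
    """The first `count` terms of a_0 = 2, a_{n+1} = a_n * (1 + a_n)."""
    terms = []
    a = 2
    for _ in range(count):
        terms.append(a)
        a = a * (1 + a)
    return terms


def pairwise_coprime(values):
    """True if every two of the given integers have gcd 1."""
    return all(gcd(x, y) == 1 for x, y in combinations(values, 2))


terms = saidak_terms(4)
print(terms)                                     # [2, 6, 42, 1806]
print(pairwise_coprime([1 + a for a in terms]))  # True: 3, 7, 43, 1807
```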
For yet another proof, using the full Fundamental Theorem of Arithmetic
(Theorem ), we consider the product

\[ \prod_p \frac{1}{1 - 1/p}, \]

which is certainly well defined if there are only finitely many primes. Each factor
in the product is the sum of a geometric series:

\[ \frac{1}{1 - 1/p} = 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots = \sum_{k=0}^{\infty} \frac{1}{p^k}. \]

We have now

\[ \prod_p \frac{1}{1 - 1/p} = \prod_p \sum_{k=0}^{\infty} \frac{1}{p^k}, \]

a product of sums (of infinitely many addends). By distributivity of multiplication
over addition, this product of sums is also a sum of (infinitely many) products,
and each of these products has, as a factor, an addend from each of the original
sums. Such a product then is of the form

\[ \prod_p \frac{1}{p^{k(p)}}, \tag{†} \]

where $k(p) \ge 0$. This product is 1/k for some positive integer k. Moreover, by
the Fundamental Theorem of Arithmetic as expressed in (∗), for each positive
integer k, the reciprocal 1/k arises as a product as in (†) in exactly one way.
Therefore, under the assumption that there are finitely many primes, we have

\[ \prod_p \frac{1}{1 - 1/p} = \sum_{n=1}^{\infty} \frac{1}{n}. \tag{‡} \]

But this is the harmonic series, which diverges:

\[ \begin{aligned}
\sum_{n=1}^{\infty} \frac{1}{n} &= 1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \frac17 + \frac18 + \cdots \\
&> 1 + \frac12 + \left(\frac14 + \frac14\right) + \left(\frac18 + \frac18 + \frac18 + \frac18\right) + \cdots \\
&= 1 + \frac12 + \frac12 + \frac12 + \cdots
\end{aligned} \]

Therefore there are infinitely many primes.
The same computations that give (‡) yield also

\[ \prod_p \frac{1}{1 - 1/p^s} = \sum_{n=1}^{\infty} \frac{1}{n^s}. \tag{§} \]

The sum converges, when s > 1, to the value denoted by ζ(s); this is the
Riemann zeta function of s. Then the product also converges, in the sense that

\[ \lim_{n \to \infty} \prod_{p \le n} \frac{1}{1 - 1/p^s} = \zeta(s). \]

Hardy and Wright [, p. ] describe (§) as ‘an analytical expression of the
fundamental theorem of arithmetic.’

.. Bertrand’s Postulate


We shall prove one result on the distribution of primes, namely the so-called
Bertrand’s Conjecture, Theorem  below. The proof does use a bit of analysis,
though this could be eliminated. Given an arbitrary positive real number x, we
define

\[ \vartheta(x) = \sum_{p \le x} \log p. \]

Here of course log x is the natural logarithm of x, that is,

\[ \log x = \int_1^x \frac{dt}{t}. \]

So $\vartheta(x) = \log \prod_{p \le x} p$. For example, ϑ(π) = ϑ(3) = log 2 + log 3 = log 6. If x < 2,
then ϑ(x) = 0. We could work with $e^{\vartheta(x)}$ instead, which is an integer if x is; this
is the only case we need consider; but there is no harm in giving a more general
treatment.

Lemma. For all positive real numbers x,

ϑ(x) < 2x log 2.


 The proof here is based on that of Hardy and Wright [, §.], who attribute it to Paul
Erdős []. Note that Erdős’s paper appeared in , and Erdős was born in . An
earlier proof, from , is due to Srinivasa Ramanujan []; this proof is very short (
pages), but makes use of the Gamma function (defined in §.) and the so-called Stirling’s
approximation to it. Hardy and Wright attribute the earliest proof of Bertrand’s Postulate
to Tchebyshef in .



Proof. It is enough to prove the claim when x is a positive integer n. We shall
use strong induction. We have

ϑ(2m) = ϑ(2m − 1),

so the claim holds when n is an even positive integer, provided it holds for lesser
positive integers n.
We now show that the claim holds when n is an odd positive integer, provided
it holds for lesser positive integers n. We have

\[ \begin{aligned}
\vartheta(2m + 1) &= \vartheta(2m + 1) - \vartheta(m + 1) + \vartheta(m + 1) \\
&= \sum_{m+1 < p \le 2m+1} \log p + \vartheta(m + 1).
\end{aligned} \]

Now, each p such that $m + 1 < p \le 2m + 1$ is a factor of (2m + 1)! that is not
also a factor of (m + 1)!. We also have

\[ \frac{(2m + 1)!}{(m + 1)! \cdot m!} = \binom{2m + 1}{m + 1}, \]

which is an integer, and each p such that $m + 1 < p \le 2m + 1$ must be a factor
of this too. Therefore

\[ \vartheta(2m + 1) \le \log \binom{2m + 1}{m + 1} + \vartheta(m + 1). \]

Now, we have also

\[ \binom{2m + 1}{m + 1} = \binom{2m + 1}{m}, \]

and these are terms in the expansion of $(1 + 1)^{2m+1}$; so

\[ 2 \binom{2m + 1}{m + 1} \le 2^{2m+1}. \]

Therefore

\[ \begin{aligned}
\vartheta(2m + 1) &\le \log(2^{2m}) + \vartheta(m + 1) \\
&= 2m \log 2 + \vartheta(m + 1).
\end{aligned} \]

In particular, if ϑ(m + 1) < 2(m + 1) log 2, then ϑ(2m + 1) < 2(2m + 1) log 2.
Thus the claim holds when n is odd if it holds for lesser n.
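
The lemma can be tested numerically for small arguments. The sketch below uses floating-point logarithms and a simple sieve, with function names of my own; it checks the inequality for the first two thousand integers, which is evidence and not a proof.

```python
from math import log


def primes_up_to(n):
    """All primes p <= n, by a simple sieve."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for k in range(2, int(n ** 0.5) + 1):
        if flags[k]:
            for m in range(k * k, n + 1, k):
                flags[m] = False
    return [k for k, is_prime in enumerate(flags) if is_prime]


def theta(x):
    """The sum of log p over primes p <= x, for a positive real x."""
    return sum(log(p) for p in primes_up_to(int(x)))


print(theta(3.14159), log(6))   # the two values agree up to rounding
print(all(theta(n) < 2 * n * log(2) for n in range(1, 2001)))
```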
We should also observe:
Theorem . For all positive integers n,

\[ \log n! = \sum_{p \le n} \log p \sum_{j=1}^{\infty} \left[\frac{n}{p^j}\right], \]

that is, the number of times that p divides n! (that is, the greatest k such that
$p^k \mid n!$) is $\sum_{j=1}^{\infty} [n/p^j]$.

 Erdős attributes the result to Legendre. For another proof, see Exercise .
Proof. The number of times that p divides n! is the sum of:
• the number of multiples ℓp such that ℓp ≤ n,
• the number of multiples $\ell p^2$ such that $\ell p^2 \le n$,
• and so on.
That is, it is the sum, over all j, of the number of multiples $\ell p^j$ such that
$\ell p^j \le n$; but the number of such multiples is $[n/p^j]$. In other words, p divides
n! once for each entry in each of the lists

\[ p,\ 2p,\ \dots,\ \left[\frac{n}{p}\right] p; \qquad p^2,\ 2p^2,\ \dots,\ \left[\frac{n}{p^2}\right] p^2; \qquad \dots \]
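
The theorem gives a way to find the exponent of p in n! without computing n! itself. As a sketch (function names mine), the formula is compared below with direct division of the factorial.

```python
from math import factorial


def exponent_in_factorial(n, p):
    """Sum of [n / p^j] over j >= 1; only finitely many terms are nonzero."""
    total = 0
    power = p
    while power <= n:
        total += n // power
        power *= p
    return total


def exponent_by_division(m, p):
    """Greatest k such that p^k divides the positive integer m."""
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k


print(exponent_in_factorial(10, 2))               # 5 + 2 + 1 = 8
print(exponent_by_division(factorial(10), 2))     # 8 as well
```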
Theorem  (Bertrand’s Postulate). For every positive integer n there is a
prime p such that
n < p ≤ 2n.
Proof. Note that the claim is equivalent to the claim that the sequence 2, 3, 5,
7, 13, 23, 43, 83, 163, 317, 631 of primes, where each successive term is less than
twice the previous term, can be continued indefinitely. Suppose the claim fails for
some n. Then we must have n ≥ 631: in particular, $n > 2^9$. There are exponents
k(p) such that

\[ \binom{2n}{n} = \prod_{p \le n} p^{k(p)}. \]

By the last theorem, since $\log \binom{2n}{n} = \log (2n)! - 2 \log(n!)$, we have

\[ k(p) = \sum_{j=1}^{\infty} \left( \left[\frac{2n}{p^j}\right] - 2 \left[\frac{n}{p^j}\right] \right). \tag{¶} \]

Suppose 2n/3 < p ≤ n. Then 2p ≤ 2n < 3p, so that [2n/p] = 2 and [n/p] = 1. Also

\[ p^2 > \frac{4n^2}{9} = \frac{2n}{9} \cdot 2n > 2n \]

and hence $[2n/p^2] = 0$. Therefore k(p) = 0. We have now

\[ \binom{2n}{n} = \prod_{p \le 2n/3} p^{k(p)}. \]

Also, by the earlier lemma,

\[ \sum_{p \mid \binom{2n}{n}} \log p \le \sum_{p \le 2n/3} \log p = \vartheta(2n/3) \le \frac{4n}{3} \log 2. \]

In the series expansion (¶) for k(p), each term $[2n/p^j] - 2[n/p^j]$ is 0 if $[2n/p^j]$
is even, and 1 if $[2n/p^j]$ is odd. Also the term is 0 if $p^j > 2n$, that is,
$j > \log(2n)/\log p$. This then is a bound for k(p), that is,

\[ k(p) \le \frac{\log(2n)}{\log p}. \tag{‖} \]

Therefore, if k(p) ≥ 2, then

\[ 2 \log p \le k(p) \log p \le \log(2n), \]

so in particular $p \le \sqrt{2n}$. That is, $\sqrt{2n}$ is a bound on the number of p such
that k(p) ≥ 2. By the bound (‖) on k(p) itself, we have now

\[ \sum_{k(p) \ge 2} k(p) \log p \le \sum_{k(p) \ge 2} \log(2n) \le \sqrt{2n}\, \log(2n). \]

Therefore

\[ \begin{aligned}
\log \binom{2n}{n} &= \sum_{k(p) = 1} \log p + \sum_{k(p) \ge 2} k(p) \log p \\
&\le \sum_{p \mid \binom{2n}{n}} \log p + \sqrt{2n}\, \log(2n) \\
&\le \frac{4n}{3} \log 2 + \sqrt{2n}\, \log(2n).
\end{aligned} \]
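Also,

\[ 2^{2n} = \sum_{j=0}^{2n} \binom{2n}{j} = 2 + \sum_{j=1}^{2n-1} \binom{2n}{j} \le 2n \binom{2n}{n}. \]

Taking logarithms yields

\[ \begin{aligned}
2n \log 2 &\le \log(2n) + \log \binom{2n}{n} \\
&\le \log(2n) + \frac{4n}{3} \log 2 + \sqrt{2n}\, \log(2n),
\end{aligned} \]

which gives

\[ \frac{2n}{3} \log 2 \le \left(1 + \sqrt{2n}\right) \log(2n). \tag{∗∗} \]
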

Now, log x grows more slowly than any power of x; so the last inequality should
fail if n is large enough. We complete the proof by showing that the inequality
fails when, as we have assumed, $n > 2^9$. To this end, we define

\[ \zeta = \frac{\log n - \log 2^9}{\log 2^{10}} = \frac{\log n - 9 \log 2}{10 \log 2}, \]

so that $1 + \zeta = \log(2n)/(10 \log 2)$ and $2n = 2^{10(1+\zeta)}$. From the inequality (∗∗), we
have now

\[ \begin{aligned}
2n \log 2 &\le 3\left(1 + \sqrt{2n}\right) \log(2n), \\
2^{10(1+\zeta)} \log 2 &\le 3\left(1 + 2^{5(1+\zeta)}\right) \cdot 10(1 + \zeta) \log 2, \\
2^{10(1+\zeta)} &\le 30\left(1 + 2^{5(1+\zeta)}\right)(1 + \zeta), \\
2^{5(1+\zeta)} &\le 30\left(2^{-5(1+\zeta)} + 1\right)(1 + \zeta).
\end{aligned} \]

Since ζ > 0, we have

\[ \begin{aligned}
2^{5\zeta} &\le \frac{2^5 - 2}{2^5} \left(2^{-5(1+\zeta)} + 1\right)(1 + \zeta) \\
&\le (1 - 2^{-4})(1 + 2^{-5})(1 + \zeta) \\
&\le 1 + \zeta.
\end{aligned} \]

But this cannot be, since

\[ 2^{5\zeta} = e^{5\zeta \log 2} > 1 + 5\zeta \log 2 > 1 + \zeta \]

because of the series expansion $e^x = \sum_{j=0}^{\infty} x^j/j!$.
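
Two ingredients of the proof lend themselves to a numerical check: the chain of primes that settles the small cases, and the failure of the inequality (∗∗) for large n. The following sketch, with function names of my own, checks the first exhaustively and the second only on a finite range, using floating-point logarithms.

```python
from math import log, sqrt


def is_prime(n):
    """Primality by trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    k = 2
    while k * k <= n:
        if n % k == 0:
            return False
        k += 1
    return True


CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631]


def chain_is_valid(chain):
    """Every term is prime, and each term is less than twice its predecessor."""
    return all(is_prime(p) for p in chain) and all(
        p < q < 2 * p for p, q in zip(chain, chain[1:])
    )


def bertrand_holds(n):
    """True if some prime p satisfies n < p <= 2n."""
    return any(is_prime(p) for p in range(n + 1, 2 * n + 1))


def inequality_holds(n):
    """The inequality (2n/3) log 2 <= (1 + sqrt(2n)) log(2n)."""
    return (2 * n / 3) * log(2) <= (1 + sqrt(2 * n)) * log(2 * n)


print(chain_is_valid(CHAIN))
print(all(bertrand_holds(n) for n in range(1, 631)))
print(any(inequality_holds(n) for n in range(513, 5000)))  # False on this range
```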

Some further theorems about the distribution of primes are stated without
their proofs in Appendix B.



. Computations with congruences

.. Exponentiation
Computing powers with respect to a modulus can be achieved by successively
squaring and taking residues. This is justified by Theorem  on page . For
example, with respect to the modulus 43, to compute $35^{14}$, we can first note
35 ≡ −8, so

\[ 35^{14} \equiv (-8)^{14} \equiv (-1)^{14} \cdot 8^{14} \equiv 8^{14}. \]

Also, $14 = 8 + 4 + 2 = 2^3 + 2^2 + 2^1$, so $8^{14} = 8^8 \cdot 8^4 \cdot 8^2$; and

\[ 8^2 = 64 \equiv 21, \qquad 21^2 = 441 \equiv 11, \qquad 11^2 = 121 \equiv -8, \]

so that

\[ 35^{14} \equiv -8 \cdot 11 \cdot 21 \equiv -88 \cdot 21 \equiv -2 \cdot 21 \equiv -42 \equiv 1. \]
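
The method of successive squaring can be sketched as a short loop over the binary digits of the exponent. The function name power_mod is mine; Python's built-in pow(base, exponent, modulus) computes the same residue and is used here only for comparison.

```python
def power_mod(base, exponent, modulus):
    """Residue of base**exponent modulo modulus, by repeated squaring."""
    result = 1
    square = base % modulus          # base^(2^0)
    while exponent > 0:
        if exponent & 1:             # this power of 2 occurs in the exponent
            result = result * square % modulus
        square = square * square % modulus
        exponent >>= 1
    return result


print(power_mod(35, 14, 43))   # 1
print(pow(35, 14, 43))         # 1, from the built-in for comparison
```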

.. Inversion
A special case of Theorem  is the implication

a≡b (mod n) =⇒ ac ≡ bc (mod n). (∗)

The converse fails, because, for example, possibly c ≡ 0 (n). Even if this case is
excluded, the converse still fails:

1 · 4 ≡ 10 · 4 (mod 6), 1 ≢ 10 (mod 6). (†)

The reason why we cannot cancel 4 here is that 4 and 6 have a nontrivial common
divisor, in this case 2. The converse of (∗) does hold if c and n are co-prime:
Theorem . If gcd(c, n) = 1, then

ac ≡ bc mod n =⇒ a ≡ b mod n.

Proof. The claim is a restatement of Theorem  on page .


Hence, considering (†) again, since 1 · 4 ≡ 10 · 4 (3), we have

1 ≡ 10 (mod 3).

The general result is the following:


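Theorem . For all positive moduli n, for all integers a, b, and c,

\[ ac \equiv bc \bmod n \iff a \equiv b \bmod \frac{n}{\gcd(c, n)}. \]

Proof. Let d = gcd(c, n). Then gcd(c/d, n/d) = 1 by Theorem . Hence

\[ \begin{aligned}
ac \equiv bc \bmod n &\implies \frac{ac}{d} \equiv \frac{bc}{d} \bmod \frac{n}{d} \\
&\implies a \equiv b \bmod \frac{n}{d}
\end{aligned} \]

by the last theorem. Conversely,

\[ \begin{aligned}
a \equiv b \bmod \frac{n}{d} &\implies \frac{n}{d} \mathrel{\Big|} b - a \\
&\implies \frac{cn}{d} \mathrel{\Big|} bc - ac \\
&\implies n \mid bc - ac \\
&\implies ac \equiv bc \bmod n.
\end{aligned} \]

For example, 6x ≡ 6 (9) ⇐⇒ x ≡ 1 (3).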
A longer problem is to solve
70x ≡ 18 (mod 134).
This reduces to
35x ≡ 9 (mod 67), (‡)
and solutions of this correspond to solutions to the Diophantine equation
35x + 67y = 9. (§)
By Bézout’s Lemma (the corollary to Theorem  on page ), this is soluble if
and only if gcd(35, 67) | 9. We find gcd(35, 67) by the Euclidean algorithm:
67 = 35 · 1 + 32,
35 = 32 · 1 + 3,
32 = 3 · 10 + 2,
3 = 2 · 1 + 1,
so gcd(35, 67) = 1. To find the solutions to (§), or rather to 35x + 67y = 1, we
rearrange the computations, getting
32 = 67 − 35,
3 = 35 − 32 = 35 − (67 − 35) = 35 · 2 − 67,
2 = 32 − 3 · 10 = 67 − 35 − (35 · 2 − 67) · 10 = 67 · 11 − 35 · 21,
1 = 3 − 2 = 35 · 2 − 67 − 67 · 11 + 35 · 21 = 35 · 23 − 67 · 12.

In particular,
35 · 23 ≡ 1 (mod 67), (¶)

so gcd(23, 67) = 1, and (‡) is equivalent to

x ≡ 23 · 9 ≡ 207 ≡ 6 (mod 67),

that is,

x ≡ 6, 73 (mod 134).
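
The whole procedure (divide out the common factor, invert by the Euclidean algorithm, multiply) can be sketched as follows. The function names are mine, and I assume a positive modulus n.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g, where g = gcd(a, b) for a, b >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


def solve_congruence(a, b, n):
    """All x with 0 <= x < n and a*x = b (mod n); assumes n > 0."""
    a, b = a % n, b % n
    g, x, _ = extended_gcd(a, n)
    if b % g != 0:
        return []                      # insoluble
    reduced = n // g                   # the modulus n / gcd(a, n)
    x0 = (x * (b // g)) % reduced      # x inverts a/g modulo n/g
    return [x0 + k * reduced for k in range(g)]


print(extended_gcd(35, 67))            # (1, 23, -12)
print(solve_congruence(70, 18, 134))   # [6, 73]
```

The first line of output records the equation 35 · 23 − 67 · 12 = 1 found above.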

A way to read (¶) is that 23 is an inverse of 35 with respect to the modulus
67. We can express this by

\[ 23 \equiv \frac{1}{35} \pmod{67}. \]

In particular, 35 is invertible as an element of $\mathbb{Z}_{67}$. We have in general

Theorem . With respect to a modulus, a number is invertible if and only if it


is prime to the modulus.

Proof. The following are equivalent by Bézout’s Lemma (the corollary to Theo-
rem ):
a) a is invertible modulo n,
b) the congruence ax ≡ 1 (mod n) is soluble,
c) the diophantine equation ax + ny = 1 is soluble,
d) gcd(a, n) = 1.

.. Chinese remainder problems


The first known example of a so-called Chinese remainder problem is .
in the Sunzi suan jing (Mathematical Classic of Master Sun), which is ‘most
probably a work of the fourth or early fifth century ce’ [, p. ]. The problem
and its supplied ‘solution’ read thus:

Now there are an unknown number of things. If we count by threes, there is a remainder 2; if we count by fives, there is a remainder 3; if we count by sevens, there is a remainder 2. Find the number of things. Answer: 23.
Method: If we count by threes and there is a remainder 2, put down 140. If we
count by fives and there is a remainder 3, put down 63. If we count by sevens
and there is a remainder 2, put down 30. Add them to obtain 233 and subtract
210 to get the answer. If we count by threes and there is a remainder 1, put
down 70. If we count by fives and there is a remainder 1, put down 21. If we
count by sevens and there is a remainder 1, put down 15. When [a number]
exceeds 106, the result is obtained by subtracting 105. [, p. ]



In our terms, the problem is to solve three congruences simultaneously:

x≡2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7).

Note that 3 · 5 · 7 = 105. The given solution is

x ≡ 2 · 70 + 3 · 21 + 2 · 15 (mod 105).

This is a solution, because

70 ≡ 1 (mod 3), 21 ≡ 0 (mod 3), 15 ≡ 0 (mod 3),


70 ≡ 0 (mod 5), 21 ≡ 1 (mod 5), 15 ≡ 0 (mod 5),
70 ≡ 0 (mod 7), 21 ≡ 0 (mod 7), 15 ≡ 1 (mod 7).

This is the only solution, by the corollary to Theorem  on page . The key to
the solution is finding the numbers 70, 21, and 15. Note that

70 = (5 · 7) · 2, 21 = (3 · 7) · 1, 15 = (3 · 5) · 1.

So the real problem is to find the coefficients 2, 1, and 1, which are, respectively,
inverses of 5 · 7, 3 · 7, and 3 · 5, with respect to 3, 5, and 7. When they exist, such
inverses can be found by means of the Euclidean algorithm, as in the previous
section.
The general problem is now solved as follows:

Theorem . If moduli $n_1, \dots, n_k$ are given, each being prime to the rest, then every system of congruences
\[ x \equiv a_1 \pmod{n_1}, \quad x \equiv a_2 \pmod{n_2}, \quad \dots, \quad x \equiv a_k \pmod{n_k} \tag{$\|$} \]
has a solution, which is unique modulo the product N of the moduli. This solution is given by
\[ x \equiv a_1 \cdot \frac{N}{n_1} \cdot m_1 + \dots + a_k \cdot \frac{N}{n_k} \cdot m_k \pmod{N}, \]
where $m_i$ is an inverse of $N/n_i$ with respect to $n_i$.

This theorem will be discussed more theoretically in §.. Meanwhile, we have given the theorem in a ‘self-proving’ formulation: the proposed solution is easily seen to be a solution, and as noted above, there can be no others.
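The formula of the theorem translates directly into a short routine. What follows is a sketch under the theorem's hypothesis that the moduli are pairwise co-prime; the function name `crt` is an assumption of this illustration, and the inverses are obtained from the built-in `pow` (Python 3.8 or later) rather than from an explicit run of the Euclidean algorithm.

```python
from math import prod


def crt(residues, moduli):
    """Solve x ≡ residues[i] (mod moduli[i]); return (x mod N, N)."""
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = N // n_i                 # the product of the other moduli
        m_i = pow(N_i, -1, n_i)        # inverse of N/n_i with respect to n_i
        x += a_i * N_i * m_i
    return x % N, N


sunzi = crt([2, 3, 2], [3, 5, 7])
```

For the problem of Sunzi, the coefficients computed inside the loop are 70, 21, and 15, and the sum 233 reduces to 23 modulo 105.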
It may be useful to consider the case of two congruences,

x≡a (mod n), x≡b (mod m), (∗∗)


 The notion of a self-proving theorem is introduced and discussed by Barry Mazur [].



where gcd(n, m) = 1. For some r and s, we have
nr ≡ 1 (mod m), ms ≡ 1 (mod n), (††)
so that the solution to (∗∗) is
x ≡ ams + bnr (mod nm).
In finding this solution, we could choose r and s by means of the Euclidean
algorithm, so that
nr + ms = 1;
but all we really need is (††). Moreover, we need not actually calculate both r
and s. Indeed, the solutions of (∗∗) are just those sums a + nt in which t is such
that m | a − b + nt, that is,
b − a ≡ nt (mod m); (‡‡)
so t ≡ r(b − a) (mod m). We need not even calculate r; we can just hunt through
a complete set of residues with respect to m for a value of t as in (‡‡).
For example, the following problem is attributed to Brahmagupta:
An old woman goes to market and a horse steps on her basket and crushes
the eggs. The rider offers to pay for the damages and asks her how many eggs
she had brought. She does not remember the exact number, but when she had
taken them out two at a time, there was one egg left. The same happened
when she picked them out three, four, five, and six at a time, but when she
took them seven at a time they came out even. What is the smallest number
of eggs she could have had?
If x is that number, then
x ≡ 1 (mod 2, 3, 4, 5, 6), x ≡ 0 (mod 7).
Since lcm(2, 3, 4, 5, 6) = 60, the problem is to find the least positive solution to
x ≡ 1 (mod 60), x ≡ 0 (mod 7),
so x = 1 + 60t, where t is least such that 7 | 1 + 60t, that is,
−1 ≡ 60t ≡ 4t (mod 7).
By trial, t = 5, and therefore x = 301.
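The hunt for t through a complete set of residues is easy to automate. This is a minimal sketch, assuming gcd(n, m) = 1; the helper names `lcm` and `least_solution` are introduced here for illustration only.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def least_solution(a, n, b, m):
    """Least non-negative x with x ≡ a (mod n) and x ≡ b (mod m), for 0 <= a < n."""
    for t in range(m):                 # t runs through a complete set of residues mod m
        x = a + n * t
        if (x - b) % m == 0:
            return x
    raise ValueError("no solution: n and m are not co-prime")


modulus = reduce(lcm, [2, 3, 4, 5, 6])
eggs = least_solution(1, modulus, 0, 7)
```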
 My source for the problem is http://www.chinapage.com/math/crt.html (accessed December
, ), where the problem is prefaced with the remark, ‘Oystein Ore mentions another
puzzle with a dramatic element from Brahma-Sphuta-Siddhanta (Brahma’s Correct System)
by Brahmagupta (born  AD)’. The page also gives the problem of Sunzi [Master Sun]
quoted on page  above. The Brahmagupta problem is the basis of an exercise in Burton [,
Prob. ..–, p. ]. But the problem is not among the works of Brahmagupta given in the
Katz volume [].



. Powers of two

.. Perfect numbers


Of the  books of Euclid’s Elements, Books VII, VIII and IX concern numbers.
The last proposition in these books is about perfect numbers, namely those
numbers that are the sums of their (positive) proper divisors. For example, 6
and 28 are perfect since

6 = 1 + 2 + 3, 28 = 1 + 2 + 4 + 7 + 14.

Euclid gives a sufficient condition for being perfect. The proof uses that

\[ 1 + 2 + 4 + \dots + 2^{k-1} = 2^k - 1. \]

Theorem  (Euclid, IX.). If $2^k - 1$ is prime, then $2^{k-1} \cdot (2^k - 1)$ is perfect.

Proof. If $2^k - 1$ is prime, then the positive divisors of $2^{k-1} \cdot (2^k - 1)$ are the divisors of $2^{k-1}$, perhaps multiplied by $2^k - 1$; namely, they are:
\begin{gather*}
1, \quad 2, \quad 4, \quad \dots, \quad 2^{k-1}, \\
2^k - 1, \quad 2 \cdot (2^k - 1), \quad 4 \cdot (2^k - 1), \quad \dots, \quad 2^{k-1} \cdot (2^k - 1).
\end{gather*}
The sum of these is $(1 + 2 + 4 + \dots + 2^{k-1}) \cdot 2^k$, which is $(2^k - 1) \cdot 2^k$. Subtracting the improper divisor $2^{k-1} \cdot (2^k - 1)$ leaves the same.
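Euclid's condition can be tried on small exponents. The sketch below uses naive trial division, which is adequate only for numbers of this size; `is_prime` and `sum_of_proper_divisors` are hypothetical helper names.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sum_of_proper_divisors(n):
    return sum(d for d in range(1, n) if n % d == 0)


# Numbers 2^(k-1) * (2^k - 1) for those k in 2..7 with 2^k - 1 prime.
euclid_numbers = [2 ** (k - 1) * (2 ** k - 1) for k in range(2, 8) if is_prime(2 ** k - 1)]
all_perfect = all(sum_of_proper_divisors(n) == n for n in euclid_numbers)
```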
Theorem  has a partial converse:
Theorem . Every even perfect number is $2^{k-1} \cdot (2^k - 1)$ for some k such that $2^k - 1$ is prime.
Proof. Let us write σ(n) for the sum of the positive divisors of n. Suppose n is an even perfect number. Then $n = 2^{k-1} m$ for some k and m, where k > 1 and m is odd. Every factor of n is uniquely the product of a factor of $2^{k-1}$ and a factor of m, so
\[ \sigma(n) = \sigma(2^{k-1}) \cdot \sigma(m) = (1 + 2 + \dots + 2^{k-1}) \cdot \sigma(m) = (2^k - 1) \cdot \sigma(m). \]
Since we assume σ(n) = 2n, we have now
\[ 2^k m = (2^k - 1) \cdot \sigma(m). \]
 According to Dickson [, p. ], Euler’s proof of this was published posthumously in .


In particular, $2^k \mid \sigma(m)$, so $\sigma(m) = 2^k \cdot \ell$ for some ℓ. Then
\[ m = (2^k - 1) \cdot \ell = \sigma(m) - \ell, \qquad \sigma(m) = m + \ell. \]
Since m and ℓ are two distinct factors of m, they must be the only positive factors. In particular, ℓ = 1, and m is prime, so n is as desired.
In his excellent textbook Elementary Number Theory [] (first published in
German in ), Edmund Landau (–) writes, before proving the fore-
going theorems:
This old-fashioned concept of perfect number, and the questions associated
with it, are not especially important; we consider them only because, in so
doing, we will encounter two questions that remain unanswered to this day:
Are there infinitely many perfect numbers? Is there an odd perfect number?
Modern mathematics has solved many (apparently) difficult problems, even in
number theory; but we stand powerless in the face of such (apparently) simple
problems as these. Of course, the fact that they have never been solved is
irrelevant to the rest of this work. We will leave no gaps; when we come to a
bypath which leads to an insurmountable barrier, we will turn around, rather
than—as is so often done—continue on beyond the barrier.
The questions that Landau cites are still unanswered. It is also the aim of the
present book to leave no gaps (except for the unproved theorems in Appendix B,
which however we shall never use).

.. Mersenne primes


The number $2^n - 1$ is called a Mersenne number, after Marin Mersenne, –; if the number is prime, it is a Mersenne prime. Since we do not know whether there are infinitely many even perfect numbers, we do not know whether there are infinitely many Mersenne primes. However, we do have the following necessary condition for being a Mersenne prime:
Theorem . If $2^n - 1$ is prime, then so must n be.
Proof. We have $2^k - 1 \mid 2^{k\ell} - 1$ from the identity
\[ x^m - 1 = (x - 1)(x^{m-1} + x^{m-2} + \dots + x + 1), \]
applied with $x = 2^k$ and $m = \ell$.
So every Mersenne prime is $2^p - 1$ for some p; but the converse fails, as shown in Table .. For every p in the table such that $2^p - 1$ is not prime, we have
 Exercise  asks how we have used that n is even.
 Stated by Fermat in a letter of  to Mersenne, according to Dickson [, p. ].
 The counterexample $2^{11} - 1 = 2047 = 23 \cdot 89$ was apparently known to Ulrich Regius in  [,

pp. III & ]. However, see Theorem  on page .


 I have not personally verified that $2^p - 1$ is prime when p is 13, 17, or 19; nor have I verified

that 178481 is prime.

p     $2^p - 1$    factorization    $2^{p-1}(2^p - 1)$
2     3            −                6
3     7            −                28
5     31           −                496
7     127          −                8128
11    2047         23 · 89
13    8191         −                33550336
17    131071       −                8589869056
19    524287       −                137438691328
23    8388607      47 · 178481

Table .. Mersenne primes and perfect numbers

2p + 1 is prime, and 2p + 1 ≡ ±1 (8). An odd prime p such that 2p + 1 is also prime is called a Germain prime. Later, with Theorem  on page , we shall have that if p > 3 and 2p + 1 is a prime q, and q ≡ ±1 (8), then $2^p - 1$ is not prime, because q is a factor, that is,
\[ 2^p \equiv 1 \pmod{q}. \]
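The observation about the composite entries of the table can be confirmed numerically. This is only a check of the listed cases, not a proof of the later theorem; `is_prime` is a hypothetical trial-division helper.

```python
def is_prime(n):
    """Trial division; adequate for the numbers in the table."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# For each p in the table with 2^p - 1 composite, record q = 2p + 1 and
# whether q is prime, whether q ≡ ±1 (mod 8), and the residue of 2^p modulo q.
composite_cases = []
for p in [2, 3, 5, 7, 11, 13, 17, 19, 23]:
    if not is_prime(2 ** p - 1):
        q = 2 * p + 1
        composite_cases.append((p, q, is_prime(q), q % 8 in (1, 7), pow(2, p, q)))
```

In both composite cases the residue of $2^p$ modulo q comes out as 1, so q divides $2^p - 1$.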

 Named for Sophie Germain, –.



. Prime moduli

.. Fermat’s Theorem


On October , , Pierre de Fermat (–) wrote the following in a letter
to Bernard Frénicle de Bessy (–):

Every prime number is always a factor of one of the powers of any [geometric]
progression minus 1, and the exponent of this power is a divisor of the prime
number minus 1. After one has found the first power that satisfies the propo-
sition, all those powers of which the exponents are multiples of the exponent
of the first power also satisfy the proposition.
Example: Let the given progression be

1 2 3 4 5 6
3 9 27 81 243 729 etc.

with its exponents written on top.


Now take, for instance, the prime number 13. It is a factor of the third power
minus 1, of which 3 is the exponent and a divisor of 12, which is one less than
the number 13, and because the exponent of 729, which is 6, is a multiple of
the first exponent, which is 3, it follows that 13 is also a factor of this power
729 minus 1.
And this proposition is generally true for all progressions and for all prime
numbers, of which I would send you the proof if I were not afraid to be too
long.

More symbolically, the claim is:


1. For all p, for all a [such that p ∤ a; Fermat does not appear to make this condition explicit], there is some positive n such that $p \mid a^n - 1$.
2. If k is the least such n, then k | p − 1.
3. In this case, if k | m, then $p \mid a^m - 1$.
A consequence of the claim is that, if p ∤ a, then $p \mid a^{p-1} - 1$. This is called Fermat’s Theorem.
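Fermat's example with the progression of powers of 3 and the prime 13 can be reproduced as follows. The function name `least_exponent` is hypothetical; it simply searches for the first power congruent to 1.

```python
def least_exponent(a, p):
    """Least positive k with a**k ≡ 1 (mod p); requires that p not divide a."""
    if a % p == 0:
        raise ValueError("p divides a, so no power of a is congruent to 1")
    k, power = 1, a % p
    while power != 1:
        power = power * a % p
        k += 1
    return k


k = least_exponent(3, 13)

# The second part of the claim, checked for a few small primes.
divides_p_minus_1 = all(
    (p - 1) % least_exponent(a, p) == 0
    for p in [3, 5, 7, 11, 13, 17, 19]
    for a in range(1, p)
)
```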
 The letter was in French; I take this selection, in translation, from Struik’s anthology [,
p. ]. The translator of Gauss assigns to what must be the same letter the date of October
,  [, p. , n. ].
 The theorem is sometimes called Fermat’s Little Theorem, as opposed to the so-called Fer-

mat’s Last Theorem (see page , note ).


A proof of this theorem was found among the writings of Leibniz (–
) [, p. ]. The first published proof was by Euler, in . This proof
uses the following:
Lemma. If 0 < k < p, then
\[ p \mid \binom{p}{k}. \]
Proof. If 0 < k < p, then p divides p!, but not k! or (p − k)!. Since
\[ p! = \binom{p}{k} \cdot k!\,(p - k)!, \]
the claim follows from Theorem  on page .
Theorem  (Fermat). For all a,
\[ a^p \equiv a \pmod{p}. \tag{$*$} \]
Consequently, for all positive m and n,
\[ m \equiv n \pmod{p - 1} \implies a^m \equiv a^n \pmod{p}. \]
If p ∤ a, that is, gcd(p, a) = 1, then
\[ a^{p-1} \equiv 1 \pmod{p}. \tag{$\dagger$} \]
Proof (Euler). We use induction. The claim (∗) holds trivially when a = 1. If it holds when a = b, then by the lemma,
\[ (b + 1)^p \equiv b^p + 1^p \equiv b + 1 \pmod{p}, \]
so the claim holds when a = b + 1. Therefore (∗) holds for all a. We now have (†) by Theorem  on page .
Induction normally proves something true for all positive integers. But (∗) holds for all integers a, and Euler’s proof establishes this, since every integer is congruent modulo p to a positive integer, and if a ≡ b (p), then $a^p \equiv b^p$ (p) by Theorem . Alternatively, we can understand the proof as establishing $a^p = a$ for all a in $\mathbb{Z}_p$. Induction still works here; it just takes us around in a circle, from 1, to 2, to 3, and so on up to p, and then back to 1. (See Figure ..) In particular, $\mathbb{Z}_p$ is one of the sets mentioned after the Axiom in §., in which only part of the Axiom is satisfied. Indeed, $\mathbb{Z}_p$ allows induction, but here 1 is the successor of p.
 This is stated by Gauss in the Disquisitiones Arithmeticae [, ¶] and confirmed by Dickson [, p. ] and Struik [, p. , n. ].



[Figure: the numbers 1, 2, . . . , 13 arranged clockwise around a circle, with 13 at the top.]

Figure .. The integers modulo 13, or Z13

Euler later proved the more general claims of Fermat in the quotation above. In particular, he showed that, if p ∤ a, then there is some λ such that λ ⩾ 1 and $p \mid a^{\lambda} - 1$. The least such λ is what we shall call the order of a modulo p in §.. If λ is this order, then Euler showed λ | p − 1, and then $p \mid a^{p-1} - 1$. He later generalized this result, establishing what is called Euler’s Theorem (Theorem  on page ).
There is yet another proof of Fermat’s Theorem, published by James Ivory in
 []. Perhaps it is the best. If gcd(a, p) = 1, then the products a, 2a, . . . ,
(p − 1)a are all incongruent modulo p, since

ia ≡ ja mod p =⇒ i ≡ j mod p

by Theorem . But 1, 2, . . . , p − 1 are also incongruent. By Theorem , there


are only p − 1 numbers that are incongruent with each other and 0 modulo p; so
the numbers a, 2a, . . . , (p − 1)a are congruent respectively to 1, 2, . . . , p − 1 in
some order. Now multiply the numbers on each side together:

\[ (p - 1)! \cdot a^{p-1} \equiv (p - 1)! \pmod{p}. \]

Since (p − 1)! and p are co-prime, we can conclude (†). This implies (∗) in case
p ∤ a; but if p | a, then (∗) is obvious.
With Fermat’s Theorem, we can compute residues of large powers easily. For
example,
\[ 6^{58} \equiv 6^{48+10} \equiv (6^{16})^3 \cdot 6^{10} \equiv 6^{10} \pmod{17}. \]
 Euler’s treatment can be read in Struik [, pp. –].
 An account of this is in Dickson [, p. ].
 According to Dickson [, p. ], this proof was later rediscovered and published by Dirichlet

in . Landau [, p. ] uses the proof. Hardy and Wright [, p. ] also use it, but
the historical information that they supply about Fermat’s and Euler’s theorems does not
address this proof.

We can continue the computation as in §., by analyzing the exponent 10 as a sum of powers of 2. Since 10 = 8 + 2, we have $6^{10} = 6^8 \cdot 6^2$; but $6^2 \equiv 36 \equiv 2$ (17), so $6^8 \equiv (6^2)^4 \equiv 2^4 \equiv 16 \equiv -1$ (17), and hence
\[ 6^{58} \equiv -2 \pmod{17}. \]
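The method of writing the exponent as a sum of powers of 2 is what is usually called square-and-multiply. A minimal sketch follows; `power_mod` is a hypothetical name, and Python's built-in three-argument `pow` already does the same job.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring (exponent >= 0)."""
    result = 1 % modulus
    square = base % modulus            # holds base**(2**i) mod modulus at step i
    while exponent > 0:
        if exponent & 1:               # this power of 2 occurs in the exponent
            result = result * square % modulus
        square = square * square % modulus
        exponent >>= 1
    return result


direct = power_mod(6, 58, 17)
reduced = power_mod(6, 58 % 16, 17)    # Fermat: exponents may be reduced modulo p - 1
```

Both values are 15, which is the residue −2 found above.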

A contrapositive formulation of Fermat’s Theorem is that, if $a^n \not\equiv a \pmod{n}$, then n must not be prime. For example, to see whether 133 is prime, we may note that $133 = 128 + 4 + 1 = 2^7 + 2^2 + 1$, so $2^{133} = 2^{2^7} \cdot 2^{2^2} \cdot 2$. Also,
\begin{align*}
2^{2} &= 4; \\
2^{2^2} &= 4^2 = 16; \\
2^{2^3} &= 16^2 = 256 \equiv 123 \equiv -10 \pmod{133}; \\
2^{2^4} &\equiv (-10)^2 = 100 \equiv -33; \\
2^{2^5} &\equiv (-33)^2 = 1089 \equiv 25; \\
2^{2^6} &\equiv 25^2 = 625 \equiv -40; \\
2^{2^7} &\equiv (-40)^2 = 1600 \equiv 4. \tag{$\ddagger$}
\end{align*}
Therefore $2^{133} \equiv 4 \cdot 16 \cdot 2 \equiv -5 \pmod{133}$, so 133 must not be prime. Note an alternative computation after (‡): We have $2^{128} = 2^{2^7} \equiv 4 \equiv 2^2$, so $2^{126} \equiv 1$, hence $2^{133} = 2^{126+7} \equiv 2^7 = 128 \equiv -5$.
Now, if we just want to know whether 133 is prime, it is probably easier to use the theorems in §.. Indeed, $[\sqrt{133}] = 11$, so it is enough to test for divisibility by 2, 3, 5, 7, and 11. We find then 133 = 7 · 19.
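Both approaches to 133 can be compared in a few lines. This is a sketch only: `fermat_says_composite` and `smallest_factor` are hypothetical names, and the first function can only ever prove compositeness, never primality.

```python
def fermat_says_composite(n, a=2):
    """True if a**n is not congruent to a modulo n, so that n cannot be prime."""
    return pow(a, n, n) != a % n


def smallest_factor(n):
    """Smallest divisor of n greater than 1, by trial division up to the square root."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


residue = pow(2, 133, 133)             # 128, that is, -5 modulo 133
detected = fermat_says_composite(133)
factor = smallest_factor(133)
```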
Still, we may raise the theoretical question: Does Fermat’s Theorem give us
an infallible method for testing for primes? Can every composite number be
detected by means of the theorem? The answer turns out to be no.

.. Carmichael numbers


The converse of Fermat’s Theorem fails. It may be that $a^n \equiv a \pmod{n}$ for all a, although n is not prime. To see this, we first define n to be a pseudo-prime if n is composite, but
\[ 2^n \equiv 2 \pmod{n}. \]
To establish an example, we shall use:

Theorem . If p ≠ q, and $a^p \equiv a$ (q) and $a^q \equiv a$ (p), then $a^{pq} \equiv a$ (pq).



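Proof. Under the hypothesis, we have
\begin{align*}
a^{pq} &= (a^p)^q \equiv a^q \equiv a \pmod{q}, \\
a^{pq} &= (a^q)^p \equiv a^p \equiv a \pmod{p},
\end{align*}
and hence $a^{pq} \equiv a \pmod{\operatorname{lcm}(p, q)}$ by Theorem  and corollary.

Then 341 is a pseudo-prime, since 341 = 11 · 31, and
\begin{align*}
2^{11} &= 2048 = 31 \cdot 66 + 2 \equiv 2 \pmod{31}, \\
2^{31} &= (2^{10})^3 \cdot 2 \equiv 2 \pmod{11}.
\end{align*}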

We can now state and prove what resembles a converse to Theorem :
Theorem . If n is a pseudo-prime, then so is $2^n - 1$.
Proof. If n is a pseudo-prime, then it is not prime, so by Theorem , neither is $2^n - 1$. We also have $2^n \equiv 2 \pmod{n}$ by the definition of a pseudo-prime; say $2^n - 2 = kn$. Then
\[ 2^{2^n - 1} - 2 = 2 \cdot (2^{2^n - 2} - 1) = 2 \cdot (2^{kn} - 1), \]
which has the factor $2^n - 1$; so $2^{2^n - 1} \equiv 2 \pmod{2^n - 1}$.
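The definition and the last theorem can both be tested by machine, since modular exponentiation stays fast even for a modulus as large as $2^{341} - 1$. In this sketch the names `is_composite` and `is_pseudoprime` are hypothetical, and compositeness of the large number is shown by exhibiting the divisor $2^{11} - 1$ rather than by trial division.

```python
from math import isqrt


def is_composite(n):
    return n > 3 and any(n % d == 0 for d in range(2, isqrt(n) + 1))


def is_pseudoprime(n):
    """Composite n with 2**n ≡ 2 (mod n), as in the definition above."""
    return is_composite(n) and pow(2, n, n) == 2


small_pseudoprimes = [n for n in range(2, 1000) if is_pseudoprime(n)]

# The theorem applied to n = 341: the number N = 2^341 - 1 should again be a pseudo-prime.
N = 2 ** 341 - 1
N_has_proper_factor = N % (2 ** 11 - 1) == 0     # 11 divides 341
N_passes_test = pow(2, N, N) == 2
```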
Pseudo-primes as we defined them can be called more precisely pseudo-primes of base 2. Then a pseudo-prime of base a is a composite number n such that $a^n \equiv a \pmod{n}$. A composite number that is a pseudo-prime of every base can be called an absolute pseudo-prime. It is also called a Carmichael number after Robert Daniel Carmichael (–), who published the first examples of such numbers in  []. If n is a Carmichael number, then
\[ a^{n-1} \equiv 1 \pmod{n} \]
whenever gcd(a, n) = 1. We shall establish the converse of this in Theorem  on page .
Meanwhile, 561 is a Carmichael number. To see this, we first factorize 561 as 3 · 11 · 17 and note

3 − 1 | 561 − 1, 11 − 1 | 561 − 1, 17 − 1 | 561 − 1,

that is, 2 | 560, 10 | 560, and 16 | 560. We now make the following observations.
a) If 3 ∤ a, then $a^2 \equiv 1 \pmod{3}$, so $a^{560} \equiv 1 \pmod{3}$.
b) If 11 ∤ a, then $a^{10} \equiv 1 \pmod{11}$, so $a^{560} \equiv 1 \pmod{11}$.
c) If 17 ∤ a, then $a^{16} \equiv 1 \pmod{17}$, so $a^{560} \equiv 1 \pmod{17}$.
Hence if p is one of 3, 11, and 17, and p ∤ a, then $a^{561} \equiv a \pmod{p}$; but if p | a, then again $a^{561} \equiv 0 \equiv a \pmod{p}$. Since 3, 11, and 17 are distinct primes whose product is 561, we have in every case
\[ a^{561} \equiv a \pmod{561}. \tag{§} \]
A positive integer is squarefree if it has no divisor $p^2$. The proof that 561 is an absolute pseudo-prime generalizes to establish the following:
Theorem . A number n greater than 1 is a prime or absolute pseudo-prime
if it is squarefree and p − 1 | n − 1 whenever p | n.
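The criterion in the theorem is cheap to test, and for small n it can be compared with a brute-force check of the definition. A sketch, with hypothetical helper names throughout:

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_composite(n):
    return n > 1 and sum(prime_factorization(n).values()) > 1


def satisfies_criterion(n):
    """Squarefree, and p - 1 divides n - 1 for every prime p dividing n."""
    factors = prime_factorization(n)
    squarefree = all(e == 1 for e in factors.values())
    return squarefree and all((n - 1) % (p - 1) == 0 for p in factors)


def is_absolute_pseudoprime(n):
    """Composite n with a**n ≡ a (mod n) for every residue a."""
    return is_composite(n) and all(pow(a, n, n) == a for a in range(n))


by_criterion = [n for n in range(2, 2000) if is_composite(n) and satisfies_criterion(n)]
by_brute_force = [n for n in range(2, 2000) if is_absolute_pseudoprime(n)]
```

Below 2000 the two lists coincide. The theorem itself guarantees only that the first list is contained in the second; the agreement in the other direction illustrates the necessity discussed next.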
The sufficient condition given by the theorem for being an absolute pseudo-
prime is Korselt’s Criterion, so called after Alwin Reinhold Korselt (–
), who proved its sufficiency and necessity in , apparently without ac-
tually finding any absolute pseudo-primes. The term Korselt’s Criterion is used
by Alford et al. in their  paper [], where they prove that there are infinitely
many absolute pseudo-primes.
We can prove the necessity of part of Korselt’s Criterion now; the rest will have
to wait until Theorem  (p. ), when we have primitive roots of primes.
Theorem . Every absolute pseudo-prime is squarefree.
Proof. Suppose n is an absolute pseudo-prime. If $p^2 \mid n$, then, since $p^n \equiv p \pmod{n}$,
\[ p^n \equiv p \pmod{p^2}. \]
But n > 1 (since it is composite), so $p^n \equiv 0 \pmod{p^2}$, and therefore $p \equiv 0 \pmod{p^2}$, which is absurd.

.. Wilson’s Theorem


Evidently $(p - 1)! \not\equiv 0 \pmod{p}$. By the next theorem, the congruence
(p − 1)! ≡ x (mod p) (¶)
has the same solution for all p, namely −1. This was known to Abu Ali al-Hasan
ibn al-Haytham (–) and probably also to Leibniz. The theorem was
published by Edward Waring (c. –) in  and attributed to his student
John Wilson (–), so it is called Wilson’s Theorem. However, the first
published proof was by Joseph-Louis Lagrange (–) in .
Lagrange’s proof makes use of a result that arises from considering successive
differences of powers as in Table . below. (However, Lagrange’s proof is not
 The proof is Exercise .
 According to http://www-history.mcs.st-andrews.ac.uk/Biographies/Al-Haytham.html
(accessed December , ).
 The bare facts are in Dickson [, p. ].



n = 0:
1  1
  0

n = 1:
0  1  2
  1  1
    0

n = 2:
0  1  4  9
  1  3  5
    2  2
      0

n = 3:
0  1  8  27  64
  1  7  19  37
    6  12  18
      6  6
        0

n = 4:
0  1  16  81  256  625
  1  15  65  175  369
    14  50  110  194
      36  60  84
        24  24
          0

n = 5:
0  1  32  243  1024  3125  7776
  1  31  211  781  2101  4651
    30  180  570  1320  2550
      150  390  750  1230
        240  360  480
          120  120

Table .. Successive differences of powers

the simplest; so the reader may wish to skip ahead.) In each triangular array in the table, the top row is the sequence $0^n, 1^n, 2^n, \dots$; then each successive row consists of the differences of consecutive entries in the previous row. Let us number the rows from the top, starting with 0. If row 0 consists of nth powers, it appears that the entries in row n are n!, so that the entries of all further rows are 0. The appearance is the reality, by induction: First of all it is true when n = 0. Suppose it is true when n ⩽ m. We consider the array whose top row consists of powers $x^{m+1}$. We compute
\[ (x + 1)^{m+1} - x^{m+1} = (m + 1)x^m + \binom{m+1}{2} x^{m-1} + \binom{m+1}{3} x^{m-2} + \cdots. \]
By inductive hypothesis, the only term that will have any effect, m rows later, is $(m + 1)x^m$. That is, as far as row m + 1 is concerned, row 1 might as well consist of the entries $(m + 1)x^m$. So each entry of row m + 1 is m + 1 times the corresponding entry of row m of the array whose top row consists of the powers $x^m$. By inductive hypothesis, every entry of this row m is m!. This completes the induction.
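The triangular arrays of the table can be regenerated, and the claim about row n checked, with a few lines. This is an illustration for small n only; `difference_rows` is a hypothetical name.

```python
from math import factorial


def difference_rows(top):
    """Return the triangular array whose row 0 is `top`."""
    rows = [list(top)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


cubes = difference_rows([x ** 3 for x in range(5)])

# Row n of the array of nth powers should be constantly n!, and row n + 1 constantly 0.
claim_holds = all(
    set(difference_rows([x ** n for x in range(n + 3)])[n]) == {factorial(n)}
    and set(difference_rows([x ** n for x in range(n + 3)])[n + 1]) == {0}
    for n in range(7)
)
```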
This result gives us the (p − 1)! in Wilson’s Theorem; the −1 that solves (¶)
comes from a more general expression for successive differences:
Lemma. For all non-negative integers n, for all x in R,
\[ n! = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} (x + k)^n. \]

Proof. Given a function f on R, we define the function ∆f by
\[ \Delta f(x) = f(x + 1) - f(x). \]
Then by recursion we define
\[ \Delta^0 f = f, \qquad \Delta^{n+1} f = \Delta^n \Delta f. \]
By induction,
\[ \Delta^n f(x) = \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} f(x + k). \]
Indeed, the claim holds easily when n = 0, and if it holds when n = m, then by the computations in Table . (page ), it holds when n = m + 1.
Now we consider the special case when $f(x) = x^m$. We shall be done if we show that, if 0 ⩽ m ⩽ n, then
\[ \Delta^n (x^m) = \begin{cases} 0, & \text{if } m < n, \\ n!, & \text{if } m = n. \end{cases} \]
(Here of course $\Delta^n (x^m)$ stands for $\Delta^n f(x)$, where f is $x \mapsto x^m$.) The claim is easily true when n = 0. Suppose it is true when n = s. If m ⩽ s, then $\Delta^s (x^m)$ is a constant function of x, so $\Delta^{s+1} (x^m) = 0$. Considering the case m = s + 1, we have
\begin{align*}
\Delta^{s+1} (x^{s+1}) &= \Delta^s \bigl( (x + 1)^{s+1} - x^{s+1} \bigr) \\
&= \Delta^s \sum_{k=0}^{s} \binom{s+1}{k} x^k \\
&= \sum_{k=0}^{s} \binom{s+1}{k} \Delta^s x^k \\
&= (s + 1) \cdot s! \\
&= (s + 1)!.
\end{align*}
 The lemma appears to be due to Euler [, p. ].



So the claim is true when n = s + 1.
Theorem  (Wilson). Suppose n > 1. Then $(n - 1)! \equiv -1 \pmod{n}$ if and only if n is prime.
Proof (Lagrange). Suppose n is not prime, so that n = ab, where 1 < a < n. Then a ⩽ n − 1, so a | (n − 1)!, so a ∤ (n − 1)! + 1, so n ∤ (n − 1)! + 1.
For the converse, from the lemma in case n = p − 1 and x = 0 we have
\[ (p - 1)! = \sum_{k=0}^{p-1} (-1)^{p-1-k} \binom{p-1}{k} k^{p-1}. \]
By Fermat’s Theorem then,
\begin{align*}
(p - 1)! &\equiv \sum_{k=1}^{p-1} (-1)^{p-1-k} \binom{p-1}{k} \\
&\equiv \sum_{k=0}^{p-1} (-1)^{p-1-k} \binom{p-1}{k} - 1 \\
&\equiv (1 - 1)^{p-1} - 1 \\
&\equiv -1 \pmod{p}.
\end{align*}

Wilson’s Theorem gives a theoretical test for primality, though not a practical
one.
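To see both the test and its impracticality, one can compare it with trial division. In this sketch `is_prime_by_wilson` and `is_prime_by_trial` are hypothetical names; the factorial grows so quickly that the first function is useful only for very small n.

```python
from math import factorial


def is_prime_by_wilson(n):
    """Wilson's criterion: n > 1 and (n - 1)! ≡ -1 (mod n)."""
    return n > 1 and factorial(n - 1) % n == n - 1


def is_prime_by_trial(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


tests_agree = all(is_prime_by_wilson(n) == is_prime_by_trial(n) for n in range(2, 150))
```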
For an alternative proof of the hard direction of Wilson’s Theorem, we may
note that, by Theorem , each number on the list 1, 2, 3, . . . , p − 1 has an inverse
modulo p. Also, $x^2 \equiv 1 \pmod{p}$ has only the solutions ±1, that is, 1 and p − 1, since if $p \mid x^2 - 1$, then p | x ± 1. So each number on the list 2, 3, . . . , p − 2 has
an inverse that is also on the list and is distinct from itself. Also the inverse of
the inverse is the original number. Therefore the product of the numbers on the
list is 1 modulo p. Consequently

(p − 1)! ≡ p − 1 ≡ −1 (mod p).

For example, modulo 11, we have

1 ≡ 2 · 6 ≡ 3 · 4 ≡ 5 · 9 ≡ 7 · 8,

and therefore

10! ≡ (2 · 6)(3 · 4)(5 · 9)(7 · 8) · 10 ≡ 10 ≡ −1.


 The necessity that n be prime was apparently not part of the original statement of Wilson’s
Theorem. Lagrange proved it [, p. ].

Since the modulus was small, the inverses here could be found by trial. With a
larger modulus, the Euclidean Algorithm can be used as in §..
We may also note that 2 has the following powers with respect to the modulus 11:

k                 1   2   3   4   5    6   7   8   9   10
$2^k$ (mod 11)    2   4   8   5   10   9   7   3   6   1

So every number that is prime to 11 is congruent to a power of 2. In particular, the invertible integers modulo 11 compose a multiplicative group generated by 2; we express this by saying 2 is a primitive root of 11. We shall investigate primitive roots in Chapter . Meanwhile, if in the last table, we write the residues that are least in absolute value, we get

k                 1   2   3    4   5    6    7    8   9    10
$2^k$ (mod 11)    2   4   −3   5   −1   −2   −4   3   −5   1

In particular,
\[ -1 \equiv 2^5 \pmod{11}. \]
Then the congruence $-1 \equiv x^2$ (11) is insoluble. Indeed, any solution would be congruent to a power $2^k$, and then $2^5 \equiv 2^{2k}$, so $2^{2k-5} \equiv 1$; but this is impossible, since all residues of 2k − 5 with respect to 10 are odd, and powers of 2 with odd exponents 1, 3, 5, 7, or 9 are never 1. We say therefore that −1 is a quadratic nonresidue of 11.
By contrast, from the table

k                 1   2   3    4   5   6    7    8    9   10   11   12
$2^k$ (mod 13)    2   4   −5   3   6   −1   −2   −4   5   −3   −6   1

we have
\[ -1 \equiv 2^6 \equiv (\pm 5)^2 \pmod{13}, \]
so −1 is a quadratic residue of 13.
In general, if p is an odd prime not dividing a, then a is a quadratic residue of p if the congruence $a \equiv x^2$ (p) is soluble; otherwise, a is a quadratic nonresidue of p. We shall develop the theory of quadratic residues and nonresidues in Chapter . Meanwhile, a preliminary result follows from Wilson’s Theorem. For convenience in stating and proving it, we use the notation
\[ \varpi = \varpi(p) = \frac{p - 1}{2}, \tag{$\|$} \]
where p is an odd prime.
 The symbol ϖ is a variant of π; in using it here I follow Hardy and Wright [, p. ].



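Theorem . Suppose p is an odd prime. Then
\[ (\varpi!)^2 \equiv (-1)^{\varpi - 1} \pmod{p}, \tag{$**$} \]
and the following are equivalent.

1. $p \equiv 1 \pmod{4}$.

2. $(\varpi!)^2 \equiv -1 \pmod{p}$.

3. −1 is a quadratic residue of p.

Proof. By Wilson’s Theorem, modulo p,
\begin{align*}
-1 \equiv (p - 1)! &\equiv 1 \cdot 2 \cdots \varpi \cdot (\varpi + 1) \cdots (p - 1) \\
&\equiv 1 \cdot (p - 1) \cdot 2 \cdot (p - 2) \cdots \varpi \cdot (\varpi + 1) \\
&\equiv 1 \cdot (-1) \cdot 2 \cdot (-2) \cdots \varpi \cdot (-\varpi) \\
&\equiv (-1)^{\varpi} (\varpi!)^2,
\end{align*}
that is,
\[ -1 \equiv \prod_{k=1}^{\varpi} k \cdot \prod_{k=\varpi+1}^{p-1} k \equiv \prod_{k=1}^{\varpi} k \cdot (p - k) \equiv (-1)^{\varpi} \cdot \prod_{k=1}^{\varpi} k^2 \equiv (-1)^{\varpi} \cdot (\varpi!)^2, \]
which yields (∗∗). If p ≡ 1 (mod 4), then ϖ is even, so $(\varpi!)^2 \equiv -1$, and therefore −1 is a quadratic residue of p.
Conversely, if $a^2 \equiv -1 \pmod{p}$, then by Fermat’s Theorem,
\[ 1 \equiv a^{p-1} \equiv (a^2)^{\varpi} \equiv (-1)^{\varpi} \pmod{p}, \]
so ϖ must be even, and therefore p ≡ 1 (mod 4).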
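The theorem can be confirmed for the odd primes below 100 by direct computation. In this sketch the variable `w` stands for ϖ, and the function names are hypothetical.

```python
from math import factorial


def is_prime(n):
    return n > 1 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def half_factorial_squared(p):
    """Residue of ((p - 1)/2)! squared, modulo the odd prime p."""
    w = (p - 1) // 2
    return factorial(w) ** 2 % p


def minus_one_is_residue(p):
    """True if x*x ≡ -1 (mod p) is soluble."""
    return any(x * x % p == p - 1 for x in range(1, p))


odd_primes = [p for p in range(3, 100) if is_prime(p)]

formula_holds = all(
    half_factorial_squared(p) == (-1) ** ((p - 1) // 2 - 1) % p for p in odd_primes
)
conditions_equivalent = all(
    (p % 4 == 1) == (half_factorial_squared(p) == p - 1) == minus_one_is_residue(p)
    for p in odd_primes
)
```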

A related argument using quadratic residues in §. will provide yet another
proof of Wilson’s Theorem.

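\begin{align*}
\Delta^{m+1} f(x)
&= \Delta^m \Delta f(x) \\
&= \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} \Delta f(x + k) \\
&= \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} \bigl( f(x + k + 1) - f(x + k) \bigr) \\
&= \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} f(x + k + 1) - \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} f(x + k) \\
&= f(x + m + 1) + \sum_{k=0}^{m-1} (-1)^{m-k} \binom{m}{k} f(x + k + 1) \\
&\qquad - \sum_{k=1}^{m} (-1)^{m-k} \binom{m}{k} f(x + k) - (-1)^m f(x) \\
&= f(x + m + 1) + \sum_{k=1}^{m} (-1)^{m+1-k} \binom{m}{k-1} f(x + k) \\
&\qquad + \sum_{k=1}^{m} (-1)^{m+1-k} \binom{m}{k} f(x + k) + (-1)^{m+1} f(x) \\
&= f(x + m + 1) + \sum_{k=1}^{m} (-1)^{m+1-k} \binom{m+1}{k} f(x + k) + (-1)^{m+1} f(x) \\
&= \sum_{k=0}^{m+1} (-1)^{m+1-k} \binom{m+1}{k} f(x + k).
\end{align*}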

Table .. The inductive step for ∆n f (x) (see page )



. Arithmetic functions

.. Multiplicative functions


We work now with positive integers—natural numbers—only. A function on N
is an arithmetic function. One such function is σ as defined in the proof of
Theorem , so that σ(n) is the sum of the (positive) divisors of n. For the
number of positive divisors of n, we write τ(n). For example,
σ(12) = 1 + 2 + 3 + 4 + 6 + 12 = 28,
τ(12) = 1 + 1 + 1 + 1 + 1 + 1 = 6.

Indeed, $12 = 2^2 \cdot 3$, so the divisors of 12 are
\begin{gather*}
2^0 \cdot 3^0, \quad 2^1 \cdot 3^0, \quad 2^2 \cdot 3^0, \\
2^0 \cdot 3^1, \quad 2^1 \cdot 3^1, \quad 2^2 \cdot 3^1.
\end{gather*}
Then the factors of 12 are determined by a choice from {0, 1, 2} for the exponent of 2, and from {0, 1} for the exponent of 3. Hence
τ(12) = (2 + 1) · (1 + 1).
Similarly, each factor of 12 itself has two factors: one from {1, 2, 4}, and the other from {1, 3}; so
\begin{align*}
\sigma(12) &= (1 + 2 + 4) \cdot (1 + 3) \\
&= (1 + 2 + 2^2) \cdot (1 + 3) \\
&= \frac{2^3 - 1}{2 - 1} \cdot \frac{3^2 - 1}{3 - 1}.
\end{align*}
These ideas work in general. Here we use the notation introduced in §.:
Theorem . If $n = \prod_p p^{n(p)}$, then
\[ \tau(n) = \prod_p (n(p) + 1), \qquad \sigma(n) = \prod_p \frac{p^{n(p)+1} - 1}{p - 1}. \]

We can abbreviate the definitions of σ and τ as follows:
\[ \sigma(n) = \sum_{d \mid n} d, \qquad \tau(n) = \sum_{d \mid n} 1. \tag{$*$} \]
Implicitly here, d ranges over the positive divisors of n. In the theorem, the indices p range over all primes; but they need only range over the primes dividing n (since n(p) = 0 when p ∤ n). That is, we can write n as $\prod_{p \mid n} p^{n(p)}$, and then
\[ \tau(n) = \prod_{p \mid n} (n(p) + 1), \qquad \sigma(n) = \prod_{p \mid n} \frac{p^{n(p)+1} - 1}{p - 1}. \]
In short, each of σ(n) and τ(n) is of the form $\prod_{p \mid n} f(p)$ for some function f on the set of primes.
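The product formulas can be checked against the defining sums (∗). A sketch, in which `factorization`, `tau`, and `sigma` are hypothetical names and factorization is done by trial division (`math.prod` needs Python 3.8 or later):

```python
from math import prod


def factorization(n):
    """Return {p: n(p)} for the primes p dividing n."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau(n):
    return prod(e + 1 for e in factorization(n).values())


def sigma(n):
    return prod((p ** (e + 1) - 1) // (p - 1) for p, e in factorization(n).items())


# Compare with the sums over all positive divisors d of n.
formulas_agree = all(
    tau(n) == sum(1 for d in range(1, n + 1) if n % d == 0)
    and sigma(n) == sum(d for d in range(1, n + 1) if n % d == 0)
    for n in range(1, 300)
)
```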
Theorem . If gcd(m, n) = 1, then for any function f on the set of primes,
\[ \prod_{p \mid mn} f(p) = \prod_{p \mid m} f(p) \cdot \prod_{q \mid n} f(q). \]
Proof. If gcd(m, n) = 1 and p | mn, then by Theorem , p | m ⇐⇒ p ∤ n.


Consequently, if gcd(m, n) = 1, then

σ(mn) = σ(m) · σ(n), τ(mn) = τ(m) · τ(n).

We say therefore that σ and τ are multiplicative. In general, an arithmetic function f is multiplicative if

f (nm) = f (n) · f (m)

whenever n and m are co-prime. We do not require the identity to hold for
arbitrary m and n. For example,

σ(2 · 2) = σ(4) = 1 + 2 + 4 = 7, σ(2) · σ(2) = (1 + 2) · (1 + 2) = 9.

The identity function $n \mapsto n$ and the constant function $n \mapsto 1$ are multiplicative. We can denote these functions by

id, 1,

respectively. Since $\sigma(n) = \sum_{d \mid n} d = \sum_{d \mid n} \mathrm{id}(d)$ and $\tau(n) = \sum_{d \mid n} 1 = \sum_{d \mid n} 1(d)$, the multiplicativity of σ and τ is also a special case of the following.
Theorem . If f is multiplicative, and F is given by
\[ F(n) = \sum_{d \mid n} f(d), \tag{$\dagger$} \]
then F is multiplicative.



Before working out a formal proof, we can see why the theorem ought to be true from an example. Note first that, if f is multiplicative and non-trivial, so that f (n) ≠ 0 for some n, then

0 ≠ f (n) = f (n · 1) = f (n) · f (1),

so f (1) = 1. If also f and F are related by (†), then

F (36)
= F (22 · 32 )
= f (1) + f (2) + f (4) + f (3) + f (6) + f (12) + f (9) + f (18) + f (36)
= f (1) · f (1) + f (2) · f (1) + f (4) · f (1) +
+ f (1) · f (3) + f (2) · f (3) + f (4) · f (3) +
+ f (1) · f (9) + f (2) · f (9) + f (4) · f (9)
= (f (1) + f (2) + f (4)) · (f (1) + f (3) + f (9))
= F (4) · F (9).

Proof of theorem. Assuming gcd(m, n) = 1, we show first
\[ F(mn) = \sum_{c \mid mn} f(c) = \sum_{d \mid m} \sum_{e \mid n} f(de). \tag{$\ddagger$} \]
Suppose c | mn. Then every prime power that divides c divides exactly one of m and n. Hence c and gcd(c, m) gcd(c, n) have the same prime power divisors, so they are equal. Moreover, if c = de, where d | m and e | n, then c | mn, d = gcd(c, m), and e = gcd(c, n). So we have (‡). Continuing, we have
\begin{align*}
F(mn) &= \sum_{d \mid m} \sum_{e \mid n} f(de) \\
&= \sum_{d \mid m} \sum_{e \mid n} f(d) \cdot f(e) \\
&= \sum_{d \mid m} f(d) \cdot \sum_{e \mid n} f(e) \tag{§} \\
&= F(m) \cdot F(n).
\end{align*}
In the proof, note that the expression in (§) should be understood first as $\sum_{d \mid m} \bigl( f(d) \cdot \sum_{e \mid n} f(e) \bigr)$, and second as its equal, $\bigl( \sum_{d \mid m} f(d) \bigr) \cdot \bigl( \sum_{e \mid n} f(e) \bigr)$.
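The theorem can be illustrated with a concrete multiplicative f. In the sketch below f is taken to be $n \mapsto n^2$, an arbitrary choice made for this example; the helper names `divisors` and `summatory` are hypothetical.

```python
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def summatory(f):
    """Return the function F with F(n) equal to the sum of f(d) over d | n, as in (†)."""
    return lambda n: sum(f(d) for d in divisors(n))


def square(n):
    return n * n


F = summatory(square)

multiplicative_on_coprime_pairs = all(
    F(m * n) == F(m) * F(n)
    for m in range(1, 40)
    for n in range(1, 40)
    if gcd(m, n) == 1
)
```

As with σ, the identity need not hold when m and n share a factor: here F(4) is 21, while F(2) · F(2) is 25.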

.. The Möbius function
Suppose again F is defined from f as in (†), so that

F (1) = f (1)
F (2) = f (1) + f (2)
F (3) = f (1) + f (3)
F (4) = f (1) + f (2) + f (4)
F (6) = f (1) + f (2) + f (3) + f (6)
F (8) = f (1) + f (2) + f (4) + f (8)
F (9) = f (1) + f (3) + f (9)
F (12) = f (1) + f (2) + f (3) + f (4) + f (6) + f (12)
F (18) = f (1) + f (2) + f (3) + f (6) + f (9) + f (18)
F (24) = f (1) + f (2) + f (3) + f (4) + f (6) + f (8) + f (12) + f (24)

Then we can solve successively for f (1), f (2), and so on:

f (1) = F (1)
f (2) = −F (1) + F (2)
f (3) = −F (1) + F (3)
f (4) = − F (2) + F (4)
f (6) = F (1) − F (2) − F (3) + F (6)
f (8) = − F (4) + F (8)
f (9) = − F (3) + F (9)
f (12) = F (2) − F (4) − F (6) + F (12)
f (18) = F (3) − F (6) − F (9) + F (18)
f (24) = F (4) − F (8) − F (12) + F (24)
There is some function ξ, taking integral values, such that
\[ f(n) = \sum_{d \mid n} F(d) \cdot \xi(n, d). \]

A candidate for ξ that works in our examples is $(n, d) \mapsto \mu(n/d)$, where µ is given by
\[ \mu(n) = \begin{cases} 0, & \text{if } p^2 \mid n \text{ for some prime } p; \\ (-1)^r, & \text{if } n = p_1 \cdots p_r, \text{ where } p_1 < \cdots < p_r. \end{cases} \]
In particular, $\mu(1) = 1$. The function µ is called the Möbius function (after August Ferdinand Möbius). In an alternative (but equivalent) definition, $\mu(n) = 0$ unless $n$ is squarefree, but in this case
\[ \mu(n) = \prod_{p \mid n} (-1). \tag{¶} \]
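The definition of µ translates directly into a short computation by trial division. The sketch below is illustrative only; the function names `mobius` and `inversion_coefficients` are made up here. For a given $n$ it lists the nonzero coefficients $\mu(n/d)$, which can be compared with the rows solved by hand above.

```python
def mobius(n):
    """Moebius function: 0 if the square of a prime divides n, else (-1)^r."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one prime factor is left over
        result = -result
    return result

def inversion_coefficients(n):
    """Map each divisor d of n to mu(n/d), omitting zero coefficients."""
    return {d: mobius(n // d) for d in range(1, n + 1)
            if n % d == 0 and mobius(n // d) != 0}

print(inversion_coefficients(12))   # {2: 1, 4: -1, 6: -1, 12: 1}
```

The printed coefficients agree with the row $f(12) = F(2) - F(4) - F(6) + F(12)$.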

Theorem. The Möbius function µ is multiplicative.
Proof. Suppose $\gcd(m, n) = 1$. If $p^2 \mid mn$, then we may assume $p^2 \mid m$, so $\mu(mn) = 0 = \mu(m) = \mu(m) \cdot \mu(n)$. If $mn$ is squarefree, then (¶) and the proof of the preceding theorem show $\mu(mn) = \mu(m) \cdot \mu(n)$.
It will be useful to define the unit function, namely the function ε given by
\[ \varepsilon(n) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n > 1. \end{cases} \]

This is easily a multiplicative function. Both the statement and the proof of the
following theorem are important.
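Theorem. For all $n$,
\[ \sum_{d \mid n} \mu(d) = \varepsilon(n). \]

Proof. Both sides of the desired equation are multiplicative functions of $n$. Therefore it is sufficient to prove the equation when $n$ is a prime power. This is easy:
\begin{align*}
\sum_{d \mid p^s} \mu(d) &= \sum_{k=0}^{s} \mu(p^k) \\
&= \mu(1) + \mu(p) + \cdots + \mu(p^s) \\
&= \begin{cases} \mu(1), & \text{if } s = 0, \\ \mu(1) + \mu(p), & \text{if } s > 0 \end{cases} \\
&= \begin{cases} 1, & \text{if } s = 0, \\ 1 - 1, & \text{if } s > 0 \end{cases} \\
&= \varepsilon(p^s).
\end{align*}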

Another important, albeit easy, observation is:


Theorem. For all arithmetic functions $f$,
\[ \sum_{d \mid n} f(d) \cdot \varepsilon\Bigl(\frac{n}{d}\Bigr) = f(n). \]

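Now we can prove that the function ξ above is indeed $(n, d) \mapsto \mu(n/d)$:
Theorem (Möbius Inversion). If $f$ determines $F$ by the rule (†), namely
\[ F(n) = \sum_{d \mid n} f(d), \]
then $F$ determines $f$ by the rule
\[ f(n) = \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \cdot F(d), \]
and conversely.
Proof. We just start calculating:
\begin{align*}
\sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \cdot F(d) &= \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \cdot \sum_{e \mid d} f(e) \\
&= \sum_{d \mid n} \sum_{e \mid d} \mu\Bigl(\frac{n}{d}\Bigr) \cdot f(e).
\end{align*}

Now we want to rearrange indices. For all factors $d$ and $e$ of $n$, we have
\[ e \mid d \iff \frac{n}{d} \Bigm| \frac{n}{e} \]
by an earlier theorem. So there is a bijection between $\{(d, e) : d \mid n \mathrel{\&} e \mid d\}$ and $\{(e, c) : e \mid n \mathrel{\&} c \mid n/e\}$, namely $(d, e) \mapsto (e, n/d)$; the inverse is $(e, c) \mapsto (n/c, e)$. Therefore
\begin{align*}
\sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \cdot F(d) &= \sum_{e \mid n} \sum_{c \mid (n/e)} \mu(c) \cdot f(e) \\
&= \sum_{e \mid n} f(e) \cdot \sum_{c \mid (n/e)} \mu(c) \\
&= \sum_{e \mid n} f(e) \cdot \varepsilon\Bigl(\frac{n}{e}\Bigr) \\
&= f(n)
\end{align*}
by the last two theorems. The converse is similar.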

.. Convolution
We can streamline some of the foregoing results. If $f$ and $g$ are arithmetic functions, their convolution is the function $f * g$, given by
\[ (f * g)(n) = \sum_{d \mid n} f(d) \cdot g\Bigl(\frac{n}{d}\Bigr). \]

Now, we have the following general principle:


 This is Exercise .

Theorem. For every arithmetic function $f$,
\[ \sum_{d \mid n} f(d) = \sum_{d \mid n} f\Bigl(\frac{n}{d}\Bigr). \]

We shall use this below for an alternative proof of Theorem  (p. ) and for
Theorem  (p. ). Meanwhile, we have
\[ (f * g)(n) = \sum_{d \mid n} f\Bigl(\frac{n}{d}\Bigr) \cdot g(d), \]

or more simply
\[ f * g = g * f. \tag{k} \]
The definition (∗) of σ and τ can be written as

\[ \sigma = \mathrm{id} * 1, \qquad \tau = 1 * 1. \]

Theorem  is that if f is multiplicative and F = f ∗ 1, then F is multiplicative.


The proof can be adapted to show that, if f and g are multiplicative, then so is
f ∗ g. Theorems  and  are

µ ∗ 1 = ε, f ∗ ε = f. (∗∗)

Then Theorem , Möbius Inversion, is

F = f ∗ 1 ⇐⇒ f = F ∗ µ.

We proved this by manipulating indices of summation. Using such manipulations,


we can show instead
f ∗ (g ∗ h) = (f ∗ g) ∗ h.
By this and (k), Theorem  is equivalent to

f ∗ 1 ∗ µ = f;

but we can now understand this as an immediate consequence of Theorems 


and , as expressed in (∗∗).
By repeated convolution, we have the following equations:

\begin{align*}
\mu * 1 &= \varepsilon, & \varepsilon * \mu &= \mu, \\
\varepsilon * 1 &= 1, & 1 * \mu &= \varepsilon, \\
1 * 1 &= \tau, & \tau * \mu &= 1.
\end{align*}

You can read down the first column, and up the second; each row is an instance
of Möbius inversion. In short, we have a sequence

. . . , µ, ε, 1, τ, . . .

where passage to the right is by convolving with 1; and to the left, µ. Since
id ∗ 1 = σ, the corresponding sequence with σ is

. . . , id, σ, . . .

We now define the entry to the left of id as ϕ. That is,

ϕ = id ∗ µ. (††)

Then ϕ is multiplicative, and
\[ \varphi(p^s) = \begin{cases} 1, & \text{if } s = 0, \\ p^s - p^{s-1}, & \text{if } s > 0. \end{cases} \]

This is precisely the size of the set $\{x : 0 \le x < n \mathrel{\&} \gcd(x, n) = 1\}$ when $n = p^s$. In general, this set can be understood as the set of invertible congruence-classes modulo $n$. Recall that the set of all congruence-classes modulo $n$ can be denoted by $\mathbb{Z}_n$. Then the set of invertible elements is denoted by

\[ \mathbb{Z}_n^{\times}. \]

So in case $n = p^s$, we have
\[ \varphi(n) = |\mathbb{Z}_n^{\times}|. \]
We shall show in the next chapter that this holds generally.
Meanwhile, it may be of interest to note that convolution is called in particular Dirichlet convolution (after Johann Peter Gustav Lejeune Dirichlet), because analogous operations, also called convolutions, arise in other contexts. For example, the reader may be in a position to recall that in analysis one defines
\[ (f * g)(t) = \int_0^t f(x) g(t - x) \, dx. \]

This is related to the Laplace transform, which converts a suitable function $f$ into the function $\mathcal{L}\{f\}$, namely
\[ s \mapsto \int_0^{\infty} e^{-st} f(t) \, dt. \]
0
 The pronunciation is dirikle, not dirişle.

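Then
\[ \mathcal{L}\{f * g\} = \mathcal{L}\{f\} \cdot \mathcal{L}\{g\}. \]
Also, the transform is linear, and
\begin{align*}
\mathcal{L}\{f'\} &= \mathrm{id} \cdot \mathcal{L}\{f\} - f(0), \\
\mathcal{L}\{f''\} &= \mathrm{id}^2 \cdot \mathcal{L}\{f\} - \mathrm{id} \cdot f(0) - f'(0),
\end{align*}
so that, if
\[ f'' + af' + bf = g, \]
then
\begin{align*}
\mathcal{L}\{f\} &= \frac{f(0) \cdot \mathrm{id} + af(0) + f'(0)}{\mathrm{id}^2 + a \cdot \mathrm{id} + b} + \frac{\mathcal{L}\{g\}}{\mathrm{id}^2 + a \cdot \mathrm{id} + b} \\
&= \mathcal{L}\{\varphi\} + \mathcal{L}\{g\} \cdot \mathcal{L}\{h\} \\
&= \mathcal{L}\{\varphi\} + \mathcal{L}\{g * h\}
\end{align*}
for some ϕ and $h$ that are independent of $g$.
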
. Arbitrary moduli

.. The Chinese Remainder Theorem


The possibility of solving Chinese remainder problems can be understood through
tables. Since gcd(4, 9) = 1, for every choice of a and b, Theorem  (p. ) gives
us a solution to

\[ x \equiv a \pmod{4}, \qquad x \equiv b \pmod{9}, \tag{∗} \]

and the solution is unique modulo 36. We can find this solution by filling out a table diagonally, in four passes, as follows. Rows are labelled with residues modulo 4, columns with residues modulo 9, and a dot marks a cell that is not yet filled:

      0   1   2   3   4   5   6   7   8
 0    0   .   .   .   4   .   .   .   8
 1    .   1   .   .   .   5   .   .   .
 2    .   .   2   .   .   .   6   .   .
 3    .   .   .   3   .   .   .   7   .

      0   1   2   3   4   5   6   7   8
 0    0   .   .  12   4   .   .  16   8
 1    9   1   .   .  13   5   .   .  17
 2    .  10   2   .   .  14   6   .   .
 3    .   .  11   3   .   .  15   7   .

      0   1   2   3   4   5   6   7   8
 0    0   .  20  12   4   .  24  16   8
 1    9   1   .  21  13   5   .  25  17
 2   18  10   2   .  22  14   6   .  26
 3    .  19  11   3   .  23  15   7   .

      0   1   2   3   4   5   6   7   8
 0    0  28  20  12   4  32  24  16   8
 1    9   1  29  21  13   5  33  25  17
 2   18  10   2  30  22  14   6  34  26
 3   27  19  11   3  31  23  15   7  35

The solution to (∗) is the entry in row $a$, column $b$. For example, 14 solves the congruences $x \equiv 2 \pmod{4}$ and $x \equiv 5 \pmod{9}$. Making such a table is not always practical. Still, the general procedure has the following theoretical formulation.


Theorem (Chinese Remainder Theorem). If $\gcd(m, n) = 1$, then the function $x \mapsto (x, x)$ is a well-defined bijection from $\mathbb{Z}_{mn}$ to $\mathbb{Z}_m \times \mathbb{Z}_n$.

Proof. The given function is well defined, since if $a \equiv b \pmod{mn}$, then $a \equiv b$ modulo $m$ and $n$. The converse of this holds too, by an earlier corollary, since $mn = \operatorname{lcm}(m, n)$; so the function is injective. Since the domain and codomain are finite sets of the same size (namely $mn$), the function is a bijection.
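The diagonal filling of the table can be mimicked in a few lines. This is a sketch only; the function name `crt_table` is invented for the illustration, and the moduli are assumed to be co-prime, so that every cell is filled exactly once.

```python
def crt_table(m, n):
    """Table whose entry in row a, column b is the x in range(m * n)
    with x ≡ a (mod m) and x ≡ b (mod n); assumes gcd(m, n) = 1."""
    table = [[None] * n for _ in range(m)]
    for x in range(m * n):          # walk down the diagonal, wrapping around
        table[x % m][x % n] = x
    return table

table = crt_table(4, 9)
for row in table:
    print(row)
print(table[2][5])                  # 14
```

That no cell is left as `None` is exactly the surjectivity asserted by the theorem.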

For all $m$ and $n$, we have

\[ \gcd(x, mn) = 1 \iff \gcd(x, m) = 1 \mathrel{\&} \gcd(x, n) = 1. \tag{†} \]

This means, in the table above, if we delete row $i$ whenever $\gcd(4, i) \ne 1$, and column $j$ whenever $\gcd(9, j) \ne 1$, then the remaining numbers are precisely those that are prime to 36:

      1   2   4   5   7   8
 1    1  29  13   5  25  17
 3   19  11  31  23   7  35

Recall that we defined $\mathbb{Z}_n^{\times}$ as the set of invertible elements of $\mathbb{Z}_n$. Then we have the following general result.

Theorem. If $\gcd(m, n) = 1$, then the function $x \mapsto (x, x)$ is a well-defined bijection from $\mathbb{Z}_{mn}^{\times}$ to $\mathbb{Z}_m^{\times} \times \mathbb{Z}_n^{\times}$.

Proof. By (†), for all $m$ and $n$, the function $x \mapsto (x, x)$ maps $\mathbb{Z}_{mn}^{\times}$ into $\mathbb{Z}_m^{\times} \times \mathbb{Z}_n^{\times}$. If $\gcd(m, n) = 1$, then by the Chinese Remainder Theorem, the function is injective, and every element of $\mathbb{Z}_m^{\times} \times \mathbb{Z}_n^{\times}$ is $(x, x)$ for some $x$, which must be in $\mathbb{Z}_{mn}^{\times}$.

Recall that ϕ was defined as $\mathrm{id} * \mu$ in (††). As promised, we now have:

Theorem. For all $n$,

\[ \varphi(n) = |\mathbb{Z}_n^{\times}|. \]

Proof. We follow the principle used in proving that $\mu * 1 = \varepsilon$. Being the convolution of multiplicative functions, ϕ is multiplicative. By the last theorem, the function $n \mapsto |\mathbb{Z}_n^{\times}|$ is multiplicative. Finally, the given equation holds when $n$ is a prime power, as shown when ϕ was defined.

This will enable us to establish a generalization of Fermat’s Theorem.

.. Euler’s Theorem
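Since $\varphi(p) = p - 1$, Fermat’s Theorem is that, if $n$ is prime, and $\gcd(a, n) = 1$, then
\[ a^{\varphi(n)} \equiv 1 \pmod{n}. \]
We shall show that this holds for all $n$.
The multiplicative function ϕ is called the Euler phi-function after Leonhard Euler. Euler’s original definition apparently corresponds to the theorem that $\varphi(n) = |\mathbb{Z}_n^{\times}|$: that is, $\varphi(n)$ is the number of $x$ such that $0 \le x < n$ and $x$ is prime to $n$. For calculating this, we now have

Theorem. For all $n$,

\[ \varphi(n) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr). \]

Proof. If $n = \prod_{p \mid n} p^{n(p)}$, then
\begin{align*}
\varphi(n) &= \prod_{p \mid n} \varphi(p^{n(p)}) = \prod_{p \mid n} (p^{n(p)} - p^{n(p)-1}) \\
&= \prod_{p \mid n} p^{n(p)} \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr).
\end{align*}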

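For example,
\[ \varphi(30) = 30 \cdot \Bigl(1 - \frac{1}{2}\Bigr) \cdot \Bigl(1 - \frac{1}{3}\Bigr) \cdot \Bigl(1 - \frac{1}{5}\Bigr) = 30 \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{4}{5} = 8. \]
Since 180 has the same prime divisors as 30, we have

\[ \frac{\varphi(180)}{\varphi(30)} = \frac{180}{30} = 6, \]

so $\varphi(180) = 6\varphi(30) = 48$. But 15 and 30 do not have the same prime divisors, and we cannot expect $\varphi(15)/\varphi(30)$ to be $15/30$, or $1/2$; indeed, $\varphi(15) = \varphi(3) \cdot \varphi(5) = 2 \cdot 4 = 8 = \varphi(30)$.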
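The product formula gives a practical way to compute ϕ once the prime divisors are known. Here is a minimal sketch by trial division; the names `prime_divisors` and `phi` are chosen for this illustration, and trial division is only suitable for small arguments.

```python
def prime_divisors(n):
    """Distinct prime divisors of n, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi(n):
    """Euler's phi-function as n times the product of (1 - 1/p) over p | n."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)   # exact: p divides result here
    return result

print(phi(30), phi(180), phi(15))   # 8 48 8
```

Dividing by $p$ before multiplying by $p - 1$ keeps all arithmetic in the integers.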

Theorem (Euler). If $\gcd(a, n) = 1$, then

\[ a^{\varphi(n)} \equiv 1 \pmod{n}. \]

Gauss proves the theorem in the Disquisitiones Arithmeticae, attributing it to Euler.

Proof. Assume $\gcd(a, n) = 1$. Then the function $x \mapsto ax$ is a bijection from $\mathbb{Z}_n^{\times}$ to itself. Hence
\[ \prod_{x \in \mathbb{Z}_n^{\times}} x \equiv \prod_{x \in \mathbb{Z}_n^{\times}} (ax) \equiv a^{\varphi(n)} \prod_{x \in \mathbb{Z}_n^{\times}} x \pmod{n}. \]
Since the product $\prod_{x \in \mathbb{Z}_n^{\times}} x$ is invertible (since its factors are), we obtain the result.
Again, Fermat’s Theorem is the special case when $n = p$. But we do not generally have $a^{\varphi(n)+1} \equiv a \pmod{n}$ for arbitrary $a$. For example, $\varphi(12) = 4$, but $2^5 = 32 \equiv 8 \pmod{12}$.
Euler’s Theorem gives us a procedure for solving certain congruences. For example, to solve
\[ 369^{19587} x \equiv 1 \pmod{1000}, \]
we compute
\[ \varphi(1000) = \varphi(10^3) = \varphi(2^3 \cdot 5^3) = \varphi(2^3) \cdot \varphi(5^3) = 4 \cdot 100 = 400. \]
Now reduce the exponent:
\[ \frac{19587}{400} = 48 + \frac{387}{400}. \]
So we want to solve
\[ 369^{387} x \equiv 1 \pmod{1000}, \]
and since $369^{400} \equiv 1$, this means
\[ x \equiv 369^{13} \pmod{1000}. \]
Now proceed, using that $13 = 8 + 4 + 1 = 2^3 + 2^2 + 1$. Multiplication modulo 1000 requires only three columns, so the computations in the table of exponentiation modulo 1000 give us the solution $x \equiv 609 \pmod{1000}$.
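The repeated squaring used in this example can be sketched as follows. The function name `power_mod` is an invention for the illustration; Python's built-in three-argument `pow` does the same job and is what one would normally use.

```python
def power_mod(base, exponent, modulus):
    """Compute base^exponent modulo modulus by repeated squaring."""
    result = 1
    square = base % modulus
    while exponent > 0:
        if exponent & 1:                      # this binary digit is 1
            result = result * square % modulus
        square = square * square % modulus    # base^(2^i) -> base^(2^(i+1))
        exponent >>= 1
    return result

phi_1000 = 400
x = power_mod(369, phi_1000 - 19587 % phi_1000, 1000)    # exponent 13
print(x)                                                 # 609
print(power_mod(369, 19587, 1000) * x % 1000)            # 1
```

The second line of output confirms that 609 does solve the original congruence.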
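Euler’s Theorem gives a neat theoretical solution to Chinese remainder problems:
Theorem. Suppose the positive integers $n_1, \dots, n_s$ are pairwise co-prime, and the integers $a_1, \dots, a_s$ are arbitrary. Define
\[ n = n_1 \cdots n_s, \qquad N_i = \frac{n}{n_i}. \]
Then we have
\[ x \equiv a_1 \pmod{n_1}, \quad \dots, \quad x \equiv a_s \pmod{n_s} \]
if and only if
\[ x \equiv a_1 \cdot N_1^{\varphi(n_1)} + \cdots + a_s \cdot N_s^{\varphi(n_s)} \pmod{n}. \]
Proof. If $i \ne j$, then $n_j \mid N_i$, so $N_i^{\varphi(n_i)} \equiv 0 \pmod{n_j}$. Also $\gcd(N_i, n_i) = 1$, so $N_i^{\varphi(n_i)} \equiv 1 \pmod{n_i}$ by Euler’s Theorem. Hence the given sum is congruent to $a_i$ modulo each $n_i$; and by the Chinese Remainder Theorem, the solution is unique modulo $n$.

The long multiplications, keeping only the last three digits of each product, give the following, all modulo 1000:
\begin{align*}
369^2 &\equiv 161; \\
369^4 &\equiv 161^2 \equiv 921; \\
369^8 &\equiv 921^2 \equiv 241; \\
369^{13} &\equiv 369^8 \cdot 369^4 \cdot 369 \equiv 241 \cdot 921 \cdot 369; \\
241 \cdot 921 &\equiv 961; \\
961 \cdot 369 &\equiv 609, \text{ so } 369^{13} \equiv 609.
\end{align*}

Table. Exponentiation modulo 1000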
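The formula in the theorem on Chinese remainder problems can be tried out directly. The sketch below is illustrative: `crt_euler` is a made-up name, `phi` is computed by trial division, and the moduli are assumed to be pairwise co-prime, as in the theorem.

```python
def phi(n):
    """Euler's phi-function via the product over prime divisors."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p - 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n - 1)
    return result

def crt_euler(residues, moduli):
    """Solve x ≡ a_i (mod n_i) as the sum of a_i * N_i^phi(n_i), modulo n."""
    n = 1
    for n_i in moduli:
        n *= n_i
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = n // n_i
        x += a_i * pow(N_i, phi(n_i), n)   # N_i^phi(n_i) is 1 mod n_i, 0 mod n_j
    return x % n

print(crt_euler([2, 5], [4, 9]))         # 14
print(crt_euler([2, 3, 2], [3, 5, 7]))   # 23
```

The first call recovers the entry in row 2, column 5 of the table for the moduli 4 and 9.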

.. Gauss’s Theorem


Given the theoretical developments of the previous chapter, we can immediately
prove:
Theorem (Gauss). For all positive integers $n$,
\[ \sum_{d \mid n} \varphi(d) = n. \tag{‡} \]

 The three theorems of the present section are versions of the three theorems in Burton’s
section, ‘Some properties of the phi-function’ [, §., pp. –]. I have tried to suggest
a connection between the first two theorems. In Burton, the last theorem is just what we
have expressed as ϕ = µ ∗ id; but this is also derivable from Gauss’s Theorem. Hence I have
named the section for Gauss.
 Gauss proves this in the Disquisitiones Arithmeticae [, ¶], but he does not have all of

our theory at his disposal.

Proof. The claim is $\varphi * 1 = \mathrm{id}$, which is the result of applying Möbius Inversion (in reverse) to the original definition of ϕ.

Without relying on Möbius inversion, we can prove Gauss’s theorem by the technique used above for $\mu * 1 = \varepsilon$. Both sides of the equation are multiplicative functions of $n$, and
\begin{align*}
\sum_{d \mid p^s} \varphi(d) &= \sum_{k=0}^{s} \varphi(p^k) = 1 + \sum_{k=1}^{s} (p^k - p^{k-1}) \\
&= 1 + (p - 1) + (p^2 - p) + \cdots + (p^s - p^{s-1}) = p^s.
\end{align*}

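Yet another proof of Gauss’s theorem makes use of the principle that $\sum_{d \mid n} f(d) = \sum_{d \mid n} f(n/d)$. Partition the set $\{0, 1, \dots, n - 1\}$ according to greatest common divisor with $n$. For example, suppose $n = 12$. We can construct a table as follows, where the rows are labelled with the divisors of 12. Each number $x$ from 0 to 11 inclusive is assigned to row $d$, if $\gcd(x, 12) = d$. (A dot marks an empty cell.)

  d |  0  1  2  3  4  5  6  7  8  9 10 11
 12 |  0  .  .  .  .  .  .  .  .  .  .  .
  6 |  .  .  .  .  .  .  6  .  .  .  .  .
  4 |  .  .  .  .  4  .  .  .  8  .  .  .
  3 |  .  .  .  3  .  .  .  .  .  9  .  .
  2 |  .  .  2  .  .  .  .  .  .  . 10  .
  1 |  .  1  .  .  .  5  .  7  .  .  . 11

But when $d \mid 12$, we have

\[ 0 \le x < 12 \mathrel{\&} \gcd(x, 12) = d \]

if and only if we have

\[ d \mid x \mathrel{\&} \gcd\Bigl(\frac{x}{d}, \frac{12}{d}\Bigr) = 1 \mathrel{\&} 0 \le \frac{x}{d} < \frac{12}{d}. \]
So the number of entries in row $d$ of the table is just $\varphi(12/d)$. The number of entries in all rows together is 12, so
\[ 12 = \sum_{d \mid 12} \varphi\Bigl(\frac{12}{d}\Bigr); \]
but this is just $\sum_{d \mid 12} \varphi(d)$, by the principle just mentioned. This argument is not specific to 12; it can be generalized to establish Gauss’s theorem. Is there anything noticeable about the table for $n = 12$? Try some other values of $n$, as in the following table.

  d |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 16 |  0  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  8 |  .  .  .  .  .  .  .  .  8  .  .  .  .  .  .  .
  4 |  .  .  .  .  4  .  .  .  .  .  .  . 12  .  .  .
  2 |  .  .  2  .  .  .  6  .  .  . 10  .  .  . 14  .
  1 |  .  1  .  3  .  5  .  7  .  9  . 11  . 13  . 15

  d |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
 18 |  0  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  9 |  .  .  .  .  .  .  .  .  .  9  .  .  .  .  .  .  .  .
  6 |  .  .  .  .  .  .  6  .  .  .  .  . 12  .  .  .  .  .
  3 |  .  .  .  3  .  .  .  .  .  .  .  .  .  .  . 15  .  .
  2 |  .  .  2  .  4  .  .  .  8  . 10  .  .  . 14  . 16  .
  1 |  .  1  .  .  .  5  .  7  .  .  . 11  . 13  .  .  . 17

  d |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 21 |  0  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  7 |  .  .  .  .  .  .  .  7  .  .  .  .  .  . 14  .  .  .  .  .  .
  3 |  .  .  .  3  .  .  6  .  .  9  .  . 12  .  . 15  .  . 18  .  .
  1 |  .  1  2  .  4  5  .  .  8  . 10 11  . 13  .  . 16 17  . 19 20

Table. Numbers according to gcd with 16, 18, and 21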
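Tables of this kind are easy to generate for any $n$. The following sketch groups the numbers from 0 to $n - 1$ by their greatest common divisor with $n$ and checks that row $d$ has $\varphi(n/d)$ entries; the names `rows_by_gcd` and `phi` are chosen only for this illustration, and `phi` is computed by counting.

```python
from math import gcd

def rows_by_gcd(n):
    """Map each divisor d of n to the list of x in range(n) with gcd(x, n) = d."""
    rows = {d: [] for d in range(n, 0, -1) if n % d == 0}
    for x in range(n):
        rows[gcd(x, n)].append(x)    # note that gcd(0, n) = n
    return rows

def phi(n):
    """Number of x with 0 <= x < n that are prime to n."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)

rows = rows_by_gcd(12)
for d, xs in rows.items():
    assert len(xs) == phi(12 // d)   # row d has phi(12/d) entries
    print(d, xs)
```

Since every number lands in exactly one row, the row sizes add up to $n$, which is the content of Gauss's theorem.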



The entries are symmetric about a vertical axis, except for 0. More precisely, if $d$ is a proper divisor of $n$, then the function $x \mapsto n - x$ is a permutation of $\{x : 0 \le x < n \mathrel{\&} \gcd(x, n) = d\}$. In other words, the average value of an element $x$ of $\{1, \dots, n - 1\}$ such that $\gcd(x, n) = d$ is $n/2$. We can write out the case where $d = 1$ as follows.

Theorem. For all $n > 1$, if we understand

\[ \mathbb{Z}_n^{\times} = \{k : 0 < k < n \mathrel{\&} \gcd(k, n) = 1\} \]

then
\[ \varphi(n) = \frac{2}{n} \sum_{k \in \mathbb{Z}_n^{\times}} k. \]

Proof. Since the function $x \mapsto n - x$ permutes the indices of the given summation, and $|\mathbb{Z}_n^{\times}| = \varphi(n)$, we have
\[ \sum_{k \in \mathbb{Z}_n^{\times}} k = \sum_{k \in \mathbb{Z}_n^{\times}} (n - k) = \varphi(n) \cdot n - \sum_{k \in \mathbb{Z}_n^{\times}} k, \]

which yields the claim.

The following relates a function of all of the divisors of n with a function of its
prime divisors.

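Theorem. For all $n$,

\[ \sum_{d \mid n} \frac{\mu(d)}{d} = \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr). \]

Proof. From the original definition (††) of ϕ as $\mathrm{id} * \mu$, or by applying Möbius Inversion to Gauss’s Theorem, and then by the commutativity of convolution, as well as by the product formula for $\varphi(n)$, we have
\[ \sum_{d \mid n} \frac{n}{d} \cdot \mu(d) = \varphi(n) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr). \]

Now divide by $n$.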

 This is basically Gauss’s proof.


 Information as in this table will be of use in the next section, §..

 . Arbitrary moduli


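For example,
\begin{align*}
\sum_{d \mid 12} \frac{\mu(d)}{d} &= \frac{\mu(1)}{1} + \frac{\mu(2)}{2} + \frac{\mu(3)}{3} + \frac{\mu(4)}{4} + \frac{\mu(6)}{6} + \frac{\mu(12)}{12} \\
&= 1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{6} \\
&= 1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{2 \cdot 3} \\
&= \Bigl(1 - \frac{1}{2}\Bigr) \Bigl(1 - \frac{1}{3}\Bigr) = \prod_{p \mid 12} \Bigl(1 - \frac{1}{p}\Bigr).
\end{align*}

This may suggest a proof of the last theorem by direct computation. Indeed, suppose the distinct prime factors of $n$ are $p_1, \dots, p_r$. Then
\[ \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr) = \sum_{j=0}^{r} \; \sum_{1 \le k(1) < \cdots < k(j) \le r} \frac{(-1)^j}{p_{k(1)} \cdots p_{k(j)}} = \sum_{d \mid n} \frac{\mu(d)}{d}. \]
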


. Primitive roots

.. Order
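Euler’s Theorem can be improved in some cases. For example, $255 = 3 \cdot 5 \cdot 17$, so $\varphi(255) = \varphi(3) \cdot \varphi(5) \cdot \varphi(17) = 2 \cdot 4 \cdot 16 = 128$, and hence, by Euler’s Theorem,

\[ \gcd(a, 255) = 1 \implies a^{128} \equiv 1 \pmod{255}. \]

But by Fermat’s Theorem,

\begin{align*}
3 \nmid a &\implies a^2 \equiv 1 \pmod{3} \implies a^{16} \equiv 1 \pmod{3}; \\
5 \nmid a &\implies a^4 \equiv 1 \pmod{5} \implies a^{16} \equiv 1 \pmod{5}; \\
17 \nmid a &\implies a^{16} \equiv 1 \pmod{17}.
\end{align*}

Therefore $\gcd(a, 255) = 1 \implies a^{16} \equiv 1$ modulo each of 3, 5, and 17, that is,

\[ \gcd(a, 255) = 1 \implies a^{16} \equiv 1 \pmod{255}. \]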

If it exists, the order of $a$ modulo $n$ is the least positive $k$ such that

\[ a^k \equiv 1 \pmod{n}. \]

Theorem. A number $a$ has an order modulo $n$ if and only if

\[ \gcd(a, n) = 1. \]

Proof. If $a$ has the order $k$ modulo $n$, then $a^k - 1 = n \cdot \ell$ for some $\ell$, so

\[ a \cdot a^{k-1} - n \cdot \ell = 1, \]

and therefore $\gcd(a, n) = 1$. Conversely, if $\gcd(a, n) = 1$, then $a^{\varphi(n)} \equiv 1 \pmod{n}$, so $a$ has an order modulo $n$.
Assuming $\gcd(a, n) = 1$, let us denote the order of $a$ modulo $n$ by

\[ \operatorname{ord}_n(a). \]

For example, what is $\operatorname{ord}_{17}(2)$? Just compute powers of 2 modulo 17:

 k                  1   2   3   4   5   6   7   8
 2^k (mod 17)       2   4   8  −1  −2  −4  −8   1

Then $\operatorname{ord}_{17}(2) = 8$. Likewise, $\operatorname{ord}_{17}(3) = 16$:

 k                  1   2   3   4   5   6   7   8
 3^k (mod 17)       3  −8  −7  −4   5  −2  −6  −1
 k                  9  10  11  12  13  14  15  16
 3^k (mod 17)      −3   8   7   4  −5   2   6   1

Note how, in each computation, halfway through, we just change signs. From the last table, taking every other entry, we can extract

 k                  1   2   3   4   5   6   7   8
 (−8)^k (mod 17)   −8  −4  −2  −1   8   4   2   1

which means $\operatorname{ord}_{17}(-8) = 8$. Likewise, $\operatorname{ord}_{17}(-4) = 4$, and $\operatorname{ord}_{17}(-1) = 2$. So we have (a dot marking an order not yet found)

 a                  1   2   3   4   5   6   7   8
 ord_17(a)          1   8  16   .   .   .   .   .
 ord_17(−a)         2   .   .   4   .   .   .   8

How can we complete the table? For example, what is $\operatorname{ord}_{17}(-7)$? Since $-7 \equiv 3^3 \pmod{17}$, and $\gcd(3, 16) = 1$, we shall be able to conclude $\operatorname{ord}_{17}(-7) = 16$. Likewise, $\operatorname{ord}_{17}(5) = 16$. But $\operatorname{ord}_{17}(-2) = 16/\gcd(6, 16) = 8$, since $-2 \equiv 3^6 \pmod{17}$. This is by a general theorem to be proved presently. We complete the last table thus:

 a                  1   2   3   4   5   6   7   8
 ord_17(a)          1   8  16   4  16  16  16   8
 ord_17(−a)         2   8  16   4  16  16  16   8

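Theorem. Suppose $\gcd(a, n) = 1$. Then

a) $a^k \equiv 1 \pmod{n}$ if and only if $\operatorname{ord}_n(a) \mid k$;

b) $\operatorname{ord}_n(a^s) = \operatorname{ord}_n(a)/\gcd(s, \operatorname{ord}_n(a))$;

c) $a^k \equiv a^{\ell} \pmod{n}$ if and only if $k \equiv \ell \pmod{\operatorname{ord}_n(a)}$.

Proof. For (a), the reverse direction is easy. For the forward direction, suppose $a^k \equiv 1 \pmod{n}$. Now use division:

\[ k = \operatorname{ord}_n(a) \cdot s + r \]

for some $s$ and $r$, where $0 \le r < \operatorname{ord}_n(a)$. Then

\[ 1 \equiv a^k \equiv a^{\operatorname{ord}_n(a) \cdot s + r} \equiv (a^{\operatorname{ord}_n(a)})^s \cdot a^r \equiv a^r \pmod{n}. \]

By minimality of $\operatorname{ord}_n(a)$ as a positive integer $k$ such that $a^k \equiv 1 \pmod{n}$, we conclude $r = 0$. This means $\operatorname{ord}_n(a) \mid k$.
To prove (b), by (a) we have, modulo $n$,
\[ (a^s)^k \equiv 1 \iff a^{sk} \equiv 1 \iff \operatorname{ord}_n(a) \mid sk \iff \frac{\operatorname{ord}_n(a)}{\gcd(s, \operatorname{ord}_n(a))} \Bigm| k, \]
but also $(a^s)^k \equiv 1 \iff \operatorname{ord}_n(a^s) \mid k$, hence
\[ \frac{\operatorname{ord}_n(a)}{\gcd(s, \operatorname{ord}_n(a))} \Bigm| k \iff \operatorname{ord}_n(a^s) \mid k. \]
This is true for all $k$. Since orders are positive, we conclude (b).
Finally, (c) follows from (a), since

\begin{align*}
a^k \equiv a^{\ell} \pmod{n} &\iff a^{k-\ell} \equiv 1 \pmod{n} \\
&\iff \operatorname{ord}_n(a) \mid k - \ell \\
&\iff k \equiv \ell \pmod{\operatorname{ord}_n(a)}.
\end{align*}

(We have used that $\gcd(a, n) = 1$, so that $a^{-\ell}$ exists.)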
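Orders can be found by direct search, which gives a way to check both the table for the modulus 17 and part (b) of the theorem. This is a minimal sketch; the function name `order` is chosen for the illustration.

```python
from math import gcd

def order(a, n):
    """Least positive k with a^k ≡ 1 (mod n); requires gcd(a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a has no order modulo n unless gcd(a, n) = 1")
    k, power = 1, a % n
    while power != 1 % n:
        power = power * a % n
        k += 1
    return k

print([order(a, 17) for a in range(1, 9)])    # [1, 8, 16, 4, 16, 16, 16, 8]
print([order(-a, 17) for a in range(1, 9)])   # [2, 8, 16, 4, 16, 16, 16, 8]

# Part (b) of the theorem, checked for a = 3 and n = 17.
d = order(3, 17)
assert all(order(pow(3, s, 17), 17) == d // gcd(s, d) for s in range(1, 40))
```

The search takes at most $\varphi(n)$ steps, so it is practical only for small moduli.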


Hence, from

 k                   1   2   3   4   5   6   7   8   9
 2^k (mod 19)        2   4   8  −3  −6   7  −5   9  −1
 2^(k+9) (mod 19)   −2  −4  −8   3   6  −7   5  −9   1

we obtain

 a                   1   2   3   4   5   6   7   8   9
 ord_19(a)           1  18  18   9   9   9   3   6   9
 ord_19(−a)          2   9   9  18  18  18   6   3  18

by the computations in the table of orders modulo 19 below (which make use of information in the table of numbers according to gcd with 18 above). If $d \mid 18$, let $\psi_{19}(d)$ be the number of incongruent residues modulo 19 that have order $d$. Then we have

 d     ψ_19(d)
 18    6
 9     6
 6     2
 3     2
 2     1
 1     1

In the Disquisitiones Arithmeticae, Gauss introduces the notation ψd for this number.

\begin{align*}
\operatorname{ord}_{19}(2^k) = 18 &\iff \gcd(k, 18) = 1 \\
&\iff k \equiv 1, 5, 7, 11, 13, 17 \pmod{18} \\
&\iff 2^k \equiv 2, -6, -5, -4, 3, -9 \pmod{19}; \\
\operatorname{ord}_{19}(2^k) = 9 &\iff \gcd(k, 18) = 2 \\
&\iff k \equiv 2, 4, 8, 10, 14, 16 \pmod{18} \\
&\iff 2^k \equiv 4, -3, 9, -2, 6, 5 \pmod{19}; \\
\operatorname{ord}_{19}(2^k) = 6 &\iff \gcd(k, 18) = 3 \\
&\iff k \equiv 3, 15 \pmod{18} \\
&\iff 2^k \equiv 8, -7 \pmod{19}; \\
\operatorname{ord}_{19}(2^k) = 3 &\iff \gcd(k, 18) = 6 \\
&\iff k \equiv 6, 12 \pmod{18} \\
&\iff 2^k \equiv 7, -8 \pmod{19}; \\
\operatorname{ord}_{19}(2^k) = 2 &\iff \gcd(k, 18) = 9 \\
&\iff k \equiv 9 \pmod{18} \\
&\iff 2^k \equiv -1 \pmod{19}.
\end{align*}

Table. Orders modulo 19

Note that $\psi_{19}(d) = \varphi(d)$ here. This is no accident. Indeed, if $d \mid 18$, so $18 = d\ell$ for some $\ell$, we have

\begin{align*}
\operatorname{ord}_{19}(2^k) = d &\iff \gcd(k, 18) = \ell \\
&\iff \ell \mid k \mathrel{\&} \gcd\Bigl(\frac{k}{\ell}, d\Bigr) = 1.
\end{align*}

Thus, modulo 18, the number of $k$ such that $\operatorname{ord}_{19}(2^k) = d$ is just $\varphi(d)$. But every number that is prime to 19 is congruent modulo 19 to $2^k$ for some such $k$. Therefore $\psi_{19}(d) = \varphi(d)$.
If $\gcd(a, n) = 1$, and $\operatorname{ord}_n(a) = \varphi(n)$, then $a$ is called a primitive root of $n$. So we have shown that 3, but not 2, is a primitive root of 17.
Also, 2 is a primitive root of 19, and we have used this to show $\psi_{19}(d) = \varphi(d)$ if $d \mid 18$. The same argument shows $\psi_n(d) = \varphi(d)$ whenever $d \mid \varphi(n)$, if $n$ has a primitive root. We shall show that every prime $p$ has a primitive root; but this will be a corollary to the theorem that $\psi_p(d) = \varphi(d)$.
There will be no formula for determining primitive roots: we just have to look for them. But once we know that 2 is a primitive root of 19, then we know that $2^5$, $2^7$, $2^{11}$, $2^{13}$, and $2^{17}$ are primitive roots, or rather, $-6$, $-5$, $-4$, $3$, and $-9$ are primitive roots. In particular, the number of primitive roots of 19 is $\varphi(18)$.
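Since primitive roots have to be looked for, a brute-force search is a natural illustration. The sketch below uses invented names (`order`, `primitive_roots`) and assumes a modulus $n > 1$; it also tabulates $\psi_{19}$.

```python
from math import gcd
from collections import Counter

def order(a, n):
    """Least positive k with a^k ≡ 1 (mod n); assumes gcd(a, n) = 1."""
    k, power = 1, a % n
    while power != 1 % n:
        power = power * a % n
        k += 1
    return k

def primitive_roots(n):
    """All a with 0 < a < n, gcd(a, n) = 1, and order phi(n) modulo n."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return [a for a in units if order(a, n) == len(units)]

print(primitive_roots(19))       # [2, 3, 10, 13, 14, 15]
print(primitive_roots(8))        # []

# psi_19(d): how many residues modulo 19 have order d.
psi_19 = Counter(order(a, 19) for a in range(1, 19))
print(sorted(psi_19.items()))    # [(1, 1), (2, 1), (3, 2), (6, 2), (9, 6), (18, 6)]
```

The six primitive roots of 19 are 2, 3, and the residues of $-4$, $-5$, $-6$, and $-9$, in agreement with the text.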

.. Groups
We can understand what we are doing algebraically as follows. On the set $\mathbb{Z}_n$ of congruence-classes modulo $n$, addition and multiplication are well defined by an earlier theorem, and so the set, considered with these operations, is a ring. The multiplicatively invertible elements of this ring compose the set $\mathbb{Z}_n^{\times}$. This set is closed under multiplication and inversion: it is a (multiplicative) group. Suppose $k \in \mathbb{Z}_n^{\times}$. (More precisely one might write the element as $k + (n)$ or $\bar{k}$. On the other hand, we are free to treat $\mathbb{Z}_n^{\times}$ as being literally a subset of $\mathbb{Z}$: we did this in the theorem expressing $\varphi(n)$ through the sum of the elements of $\mathbb{Z}_n^{\times}$. In this case, one must just remember that multiplication and addition are not the usual operations on $\mathbb{Z}$.) Then we have the function

\[ x \mapsto k^x \]

from $\mathbb{Z}$ to $\mathbb{Z}_n^{\times}$. Since $k^{x+y} = k^x \cdot k^y$, this function is a homomorphism from the additive group $\mathbb{Z}$ to the multiplicative group $\mathbb{Z}_n^{\times}$.
We have shown that the function $x \mapsto 2^x$ is surjective onto $\mathbb{Z}_{19}^{\times}$, and its kernel is $(18)$. Call this function $f_2$. Then (by the First Isomorphism Theorem for Groups) $f_2$ induces an isomorphism from $\mathbb{Z}_{18}$ onto $\mathbb{Z}_{19}^{\times}$:

\begin{gather*}
\mathbb{Z}_{18} \cong \mathbb{Z}_{19}^{\times}, \\
(\{0, 1, 2, \dots, 17\}, +) \cong (\{1, 2, 3, \dots, 18\}, \cdot\,).
\end{gather*}

From analysis, we have the exponential function $x \mapsto e^x$, or $\exp$, from $\mathbb{R}$ to $\mathbb{R}^{\times}$, where $\mathbb{R}^{\times} = \mathbb{R} \setminus \{0\}$ (the set of multiplicatively invertible real numbers). We have
\[ \exp(x + y) = \exp(x) \cdot \exp(y). \]
The range of $\exp$ is the interval $(0, \infty)$, which is closed under multiplication and inversion. Also $\exp$ is injective. So $\exp$ is an isomorphism from $(\mathbb{R}, +)$ onto $((0, \infty), \cdot\,)$.
We are looking at a similar isomorphism in discrete mathematics. If $a$ is a primitive root of $n$, then $x \mapsto a^x$ is an isomorphism from $\mathbb{Z}_{\varphi(n)}$ to $\mathbb{Z}_n^{\times}$. In particular, $\mathbb{Z}_n^{\times}$ is cyclic. Conversely, if $\mathbb{Z}_n^{\times}$ is cyclic, then a generator is a primitive root of $n$. For example:
a) $\mathbb{Z}_2^{\times} = \{1\}$, so 1 is a primitive root of 2.
b) $\mathbb{Z}_3^{\times} = \{1, 2\}$, and $2^2 \equiv 1 \pmod{3}$, so 2 is a primitive root of 3.
c) $\mathbb{Z}_4^{\times} = \{1, 3\}$, and $3^2 \equiv 1 \pmod{4}$, so 3 is a primitive root of 4.
d) $\mathbb{Z}_5^{\times} = \{1, 2, 3, 4\}$, and $2^2 \equiv 4$, $2^3 \equiv 3$, and $2^4 \equiv 1 \pmod{5}$, so 2 is a primitive root of 5.
e) $\mathbb{Z}_6^{\times} = \{1, 5\}$, and $5^2 \equiv 1 \pmod{6}$, so 5 is a primitive root of 6.
f) $\mathbb{Z}_7^{\times} = \{1, 2, 3, 4, 5, 6\}$, and we have

 k      1  2  3  4  5  6
 2^k    2  4  1
 3^k    3  2  6  4  5  1

so 3 (but not 2) is a primitive root of 7.
g) $\mathbb{Z}_8^{\times} = \{1, 3, 5, 7\}$, but $3^2 \equiv 1$, $5^2 \equiv 1$, and $7^2 \equiv 1 \pmod{8}$, so 8 has no primitive root.
We shall show later that the following numbers, and only these, have primitive roots:
a) powers of odd primes;
b) 2 and 4;
c) doubles of powers of odd primes.

.. Primitive roots of primes


To prove generally that the number of primitive roots of $p$ is $\varphi(p - 1)$, we shall need the following (attributed to Joseph-Louis Lagrange).
Theorem (Lagrange). Every congruence of the form
\[ x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \equiv 0 \pmod{p} \]
has $n$ solutions or fewer (modulo $p$).
Proof. Use induction. The claim is easily true when $n = 1$. Suppose it is true when $n = k$. Say the congruence
\[ x^{k+1} + a_1 x^k + \cdots + a_k x + a_{k+1} \equiv 0 \pmod{p} \tag{∗} \]
has a solution $b$. Then we can factorize the left member, and rewrite the congruence as
\[ (x - b) \cdot (x^k + c_1 x^{k-1} + \cdots + c_{k-1} x + c_k) \equiv 0 \pmod{p}. \]
Any solution to this that is different from $b$ is a solution of
\[ x^k + c_1 x^{k-1} + \cdots + c_{k-1} x + c_k \equiv 0 \pmod{p}. \]
But by inductive hypothesis, there are at most $k$ such solutions. Therefore (∗) has at most $k + 1$ solutions. This completes the induction and the proof.

In the Disquisitiones Arithmeticae, Gauss proves this theorem and traces its original proof to Lagrange, while mentioning also later proofs by Legendre and Euler. He says Euler had proved an (unspecified) special case.

How did we use that $p$ is prime? We needed to know that, if $f(x)$ and $g(x)$ are polynomials, and $f(a) \cdot g(a) \equiv 0 \pmod{p}$, then either $f(a) \equiv 0 \pmod{p}$, or else $g(a) \equiv 0 \pmod{p}$. That is, if $mn \equiv 0 \pmod{p}$, then either $m \equiv 0 \pmod{p}$ or $n \equiv 0 \pmod{p}$. That is, if $p \mid mn$, then $p \mid m$ or $p \mid n$. This fails if $p$ is replaced by a composite number.
Indeed, the congruence $x^2 - 1 \equiv 0 \pmod{8}$ has the solutions 1, 3, 5, and 7 (as shown earlier). Also $x^2 - 5x \equiv 0 \pmod{6}$ has solutions 0 and 5, but also 2 and 3, since
\[ x^2 - 5x \equiv x^2 - 5x + 6 \equiv (x - 2)(x - 3) \pmod{6}. \]
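The failure of Lagrange's bound for composite moduli is easy to observe by exhaustive search. In this sketch, the function name `roots` and the convention of listing coefficients from the highest degree down are choices made for the illustration.

```python
def roots(coefficients, n):
    """Solutions x in range(n) of a polynomial congruence modulo n.
    Coefficients are listed from the highest degree down."""
    def value(x):
        v = 0
        for c in coefficients:      # Horner's rule, reducing modulo n
            v = (v * x + c) % n
        return v
    return [x for x in range(n) if value(x) == 0]

print(roots([1, 0, -1], 8))    # x^2 - 1  modulo 8: [1, 3, 5, 7]
print(roots([1, -5, 0], 6))    # x^2 - 5x modulo 6: [0, 2, 3, 5]
print(roots([1, 0, -1], 7))    # x^2 - 1  modulo 7: [1, 6]
```

Both quadratic congruences with composite modulus have four solutions, while modulo the prime 7 there are only two.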
Theorem. If $d \mid p - 1$, let $\psi_p(d)$ be the number of incongruent residues modulo $p$ that have order $d$. Then

\[ \psi_p(d) = \varphi(d). \]

Proof. Every number prime to $p$ has an order modulo $p$, and this order divides $\varphi(p)$, which is $p - 1$; so
\[ \sum_{d \mid p-1} \psi_p(d) = p - 1. \]
By Gauss’s Theorem, we have $\sum_{d \mid p-1} \varphi(d) = p - 1$; therefore
\[ \sum_{d \mid p-1} \psi_p(d) = \sum_{d \mid p-1} \varphi(d). \tag{†} \]

Hence, to establish $\psi_p(d) = \varphi(d)$, it is enough to show that $\psi_p(d) \le \varphi(d)$ whenever $d \mid p - 1$. Indeed, if we show this, but $\psi_p(e) < \varphi(e)$ for some divisor $e$ of $p - 1$, then
\[ \sum_{d \mid p-1} \psi_p(d) = \sum_{\substack{d \mid p-1 \\ d \ne e}} \psi_p(d) + \psi_p(e) < \sum_{\substack{d \mid p-1 \\ d \ne e}} \varphi(d) + \varphi(e) = \sum_{d \mid p-1} \varphi(d), \]

contradicting (†).
If $\psi_p(d) = 0$, then certainly $\psi_p(d) \le \varphi(d)$. So suppose $\psi_p(d) \ne 0$. Then $\operatorname{ord}_p(a) = d$ for some $a$. In particular, $a$ is a solution of the congruence

\[ x^d - 1 \equiv 0 \pmod{p}. \tag{‡} \]

But then every power of $a$ is a solution, since $(a^d)^n = (a^n)^d$. Moreover, if $0 < k < \ell \le d$, then
\[ a^k \not\equiv a^{\ell} \pmod{p} \]
by the theorem on orders. Hence the numbers $a, a^2, \dots, a^d$ are incongruent solutions to the congruence (‡). Moreover, by Lagrange’s Theorem, every solution is congruent to one of these solutions. Among these solutions, those that have order $d$ modulo $p$ are just those powers $a^k$ such that $\gcd(k, d) = 1$, again by the theorem on orders. The number of such powers is just $\varphi(d)$. Therefore $\psi_p(d) = \varphi(d)$, under the assumption $\psi_p(d) > 0$; in any case, $\psi_p(d) \le \varphi(d)$.
Corollary. Every prime number has a primitive root.
Proof. $\psi_p(p - 1) = \varphi(p - 1) \ge 1$.
Now we can prove the necessity of (all of) Korselt’s Criterion for being a Carmichael number:
Theorem. If $n$ is a Carmichael number, and $p \mid n$, then $p - 1 \mid n - 1$.
Proof. Given that $n$ is a Carmichael number and $p \mid n$, we let $a$ be a primitive root of $p$. Since $a^n \equiv a \pmod{n}$, we have $a^n \equiv a \pmod{p}$, and therefore $a^{n-1} \equiv 1 \pmod{p}$. But $\operatorname{ord}_p(a) = p - 1$, so $p - 1 \mid n - 1$.
So now we know that the Carmichael numbers are precisely those squarefree composite numbers $n$ such that $p \mid n \implies p - 1 \mid n - 1$. We shall be able to give another characterization later, once we know that squares of primes have primitive roots.

.. Discrete logarithms


The inverse of the function $\exp$ from $\mathbb{R}$ onto $(0, \infty)$ is the logarithm function $\log$, where, as noted earlier, $\log x = \int_1^x dt/t$.

We can use similar terminology for the inverse of an isomorphism $x \mapsto b^x$ from $\mathbb{Z}_{p-1}$ to $\mathbb{Z}_p^{\times}$. Here $b$ must be a primitive root of $p$, and if $b^x \equiv y \pmod{p}$, we can write
\[ x \equiv \log_b y \pmod{p - 1}. \]
For example, modulo 17, we have the table of powers of 3. If $3^k = \ell$, then we can denote $k$ by $\log_3 \ell$. But we can think of these numbers as congruence classes:

\[ 3^k \equiv \ell \pmod{17} \iff k \equiv \log_3 \ell \pmod{16}. \]

The usual properties hold:

\[ \log_3(xy) \equiv \log_3 x + \log_3 y \pmod{16}; \qquad \log_3 x^n \equiv n \log_3 x \pmod{16}. \]

For example,

\[ \log_3(11 \cdot 14) \equiv \log_3 11 + \log_3 14 \equiv 7 + 9 \equiv 16 \equiv 0 \pmod{16}, \]


 Gaussgives just this proof in the Disquisitiones Arithmeticae [, ¶¶–].
 This
function can be denoted by ln, for logarithmus naturalis, in case one happens to
want to use log to denote the inverse of the function x 7→ 10x .

and therefore $11 \cdot 14 \equiv 3^0 \equiv 1 \pmod{17}$.

 k     0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15   (mod 16)
 3^k   1   3   9  10  13   5  15  11  16  14   8   7   4  12   2   6   (mod 17)

Rearranged:

 3^k   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16   (mod 17)
 k     0  14   1  12   5  15  11  10   2   3   7  13   4   9   6   8   (mod 16)

Table. Powers of 3 modulo 17

We can define logarithms for any modulus that has a primitive root; then the base of the logarithms will be a primitive root. If $b$ is a primitive root of a modulus $n$, and $\gcd(a, n) = 1$, then there is some $s$ such that

\[ b^s \equiv a \pmod{n}. \]

Then $s$ is unique modulo $\varphi(n)$. Indeed, by the theorem on orders,

\[ b^x \equiv b^y \pmod{n} \iff x \equiv y \pmod{\varphi(n)}. \]

Then $\log_b a$ can be defined as the least non-negative such $s$.
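For a small prime modulus, a table of discrete logarithms can be built simply by listing the powers of the base. The sketch below is illustrative only: `log_table` is a made-up name, and the modulus is assumed to be a prime $p$, so that $\varphi(p) = p - 1$.

```python
def log_table(b, p):
    """Map each b^k mod p to k, for 0 <= k < p - 1.
    Raises ValueError unless b is a primitive root of the prime p."""
    table, power = {}, 1
    for k in range(p - 1):
        table[power] = k
        power = power * b % p
    if len(table) != p - 1:
        raise ValueError("b is not a primitive root of p")
    return table

log3 = log_table(3, 17)
print(log3[11], log3[14])            # 7 9
k = (log3[11] + log3[14]) % 16       # logarithm of the product 11 * 14
print(pow(3, k, 17), 11 * 14 % 17)   # 1 1
```

Adding logarithms modulo 16 and exponentiating gives the same residue as multiplying modulo 17, as in the worked example.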


Another application of logarithms, besides multiplication problems, is congruences of the form
\[ x^d \equiv a \pmod{n}, \]
again where $n$ has a primitive root $b$. The last congruence is then equivalent to

\begin{gather*}
\log_b(x^d) \equiv \log_b a \pmod{\varphi(n)}, \\
d \log_b x \equiv \log_b a \pmod{\varphi(n)}.
\end{gather*}

If this is to have a solution, then we must have

\[ \gcd(d, \varphi(n)) \mid \log_b a. \]

For example, let’s work modulo 7:

 k      0  1  2  3  4  5          ℓ         1  2  3  4  5  6
 3^k    1  3  2  6  4  5          log_3 ℓ   0  2  1  4  5  3

Then we have, for example,

\[ x^3 \equiv 2 \pmod{7} \iff 3 \log_3 x \equiv 2 \pmod{6}, \]

so there is no solution, since $\gcd(3, 6) = 3$, and $3 \nmid 2$. But we also have

\begin{align*}
x^3 \equiv 6 \pmod{7} &\iff 3 \log_3 x \equiv 3 \pmod{6} \\
&\iff \log_3 x \equiv 1 \pmod{2} \\
&\iff \log_3 x \equiv 1, 3, 5 \pmod{6} \\
&\iff x \equiv 3^1, 3^3, 3^5 \pmod{7} \\
&\iff x \equiv 3, 6, 5 \pmod{7}.
\end{align*}

We expect no more than 3 solutions, by Lagrange’s Theorem. Is there an alternative to using logarithms? As $6 \equiv 3^3 \pmod{7}$, we have

\[ x^3 \equiv 6 \pmod{7} \iff x^3 \equiv 3^3 \pmod{7}; \]

but we cannot conclude from this $x \equiv 3 \pmod{7}$.
For congruences modulo 11, we can use the following table, read downwards for powers and upwards for logarithms:

 k, or log_2 ℓ (mod 10)    0  1  2   3  4   5   6   7  8   9
 2^k, or ℓ (mod 11)        1  2  4  −3  5  −1  −2  −4  3  −5

We have then

\begin{align*}
4x^{15} \equiv 7 \pmod{11} &\iff 4x^5 \equiv 7 \pmod{11} \\
&\iff \log_2(4x^5) \equiv \log_2 7 \pmod{10} \\
&\iff \log_2 4 + 5 \log_2 x \equiv \log_2 7 \pmod{10} \\
&\iff 2 + 5 \log_2 x \equiv 7 \pmod{10} \\
&\iff 5 \log_2 x \equiv 5 \pmod{10} \\
&\iff \log_2 x \equiv 1 \pmod{2} \\
&\iff \log_2 x \equiv 1, 3, 5, 7, 9 \pmod{10} \\
&\iff x \equiv 2^1, 2^3, 2^5, 2^7, 2^9 \pmod{11} \\
&\iff x \equiv 2, 8, 10, 7, 6 \pmod{11}.
\end{align*}

Why are there five solutions?

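Theorem. Suppose $n$ has a primitive root, $\gcd(a, n) = 1$, and

\[ d = \gcd(k, \varphi(n)). \]

The following are equivalent:

a) The congruence
\[ x^k \equiv a \pmod{n} \tag{§} \]
is soluble.

b) The congruence (§) has $d$ solutions.

c) $a^{\varphi(n)/d} \equiv 1 \pmod{n}$.

Proof. The following are equivalent:

\begin{gather*}
x^k \equiv a \text{ is soluble} \pmod{n}; \\
k \log x \equiv \log a \text{ is soluble} \pmod{\varphi(n)}; \\
d \mid \log a; \\
\varphi(n) \Bigm| \frac{\varphi(n)}{d} \cdot \log a; \\
\frac{\varphi(n)}{d} \cdot \log a \equiv 0 \pmod{\varphi(n)}; \\
\log(a^{\varphi(n)/d}) \equiv 0 \pmod{\varphi(n)}; \\
a^{\varphi(n)/d} \equiv 1 \pmod{n}.
\end{gather*}

Thus (a)⇔(c). Trivially, (b)⇒(a). Finally, assume (a), so that $d \mid \log a$, as above. Let $r$ be the base of the logarithms, and write $(\log a)/k$ for the product of $(\log a)/d$ with an inverse of $k/d$ modulo $\varphi(n)/d$; such an inverse exists, since $k/d$ is prime to $\varphi(n)/d$. Then we have

\begin{align*}
x^k \equiv a \pmod{n} &\iff k \log x \equiv \log a \pmod{\varphi(n)} \\
&\iff \frac{k}{d} \cdot \log x \equiv \frac{\log a}{d} \pmod{\frac{\varphi(n)}{d}} \\
&\iff \log x \equiv \frac{\log a}{k} \pmod{\frac{\varphi(n)}{d}} \\
&\iff \log x \equiv \frac{\log a}{k} + \frac{\varphi(n)}{d} \cdot j \pmod{\varphi(n)}, \text{ where } j \in \{0, 1, \dots, d - 1\} \\
&\iff x \equiv r^{(\log a)/k} \cdot (r^{\varphi(n)/d})^j \pmod{n}, \text{ where } j \in \{0, 1, \dots, d - 1\}.
\end{align*}

Finally, these $d$ solutions are incongruent. Indeed, since $\operatorname{ord}_n(r) = \varphi(n)$, the powers $(r^{\varphi(n)/d})^j$ are incongruent; and $r^{(\log a)/k}$ is invertible.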
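The three equivalent conditions can be compared on the examples above by exhaustive search. This sketch is restricted, as an assumption, to a prime modulus $p$, so that $\varphi(p) = p - 1$; the function names `solve_power` and `soluble` are invented for the illustration.

```python
from math import gcd

def solve_power(k, a, p, coefficient=1):
    """All x in range(p) with coefficient * x^k ≡ a (mod p), by brute force."""
    return [x for x in range(p) if coefficient * pow(x, k, p) % p == a % p]

def soluble(k, a, p):
    """Criterion (c) for a prime modulus p: a^((p-1)/d) ≡ 1, d = gcd(k, p-1)."""
    d = gcd(k, p - 1)
    return pow(a, (p - 1) // d, p) == 1

print(solve_power(3, 6, 7), soluble(3, 6, 7))     # [3, 5, 6] True
print(solve_power(3, 2, 7), soluble(3, 2, 7))     # [] False
print(solve_power(15, 7, 11, coefficient=4))      # [2, 6, 7, 8, 10]
```

In the last example $d = \gcd(15, 10) = 5$, which accounts for the five solutions.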

.. Composite numbers with primitive roots


We know that all primes have primitive roots. Now we show that the numbers with primitive roots are precisely:

\[ 2, \quad 4, \quad p^s, \quad 2 \cdot p^s, \]

where $p$ is an odd prime, and $s \ge 1$. We shall first show that the numbers not on this list do not have primitive roots:

Lemma. If $k > 2$, then $2 \mid \varphi(k)$.

Proof. Suppose $k > 2$. Then either $k = 2^s$, where $s > 1$, or else $k = p^s \cdot m$ for some odd prime $p$, where $s > 0$ and $\gcd(p, m) = 1$. In the first case, $\varphi(k) = 2^s - 2^{s-1} = 2^{s-1}$, which is even. In the second case, $\varphi(k) = \varphi(p^s) \cdot \varphi(m)$, which is even, since $\varphi(p^s) = p^s - p^{s-1}$, the difference of two odd numbers.

Theorem. If $m$ and $n$ are co-prime, both greater than 2, then $mn$ has no primitive root.

Proof. Suppose $\gcd(a, mn) = 1$. (This is the only possibility for a primitive root.) Then $a$ is prime to $m$ and $n$, so

\[ a^{\varphi(m)} \equiv 1 \pmod{m}, \qquad a^{\varphi(n)} \equiv 1 \pmod{n}. \]

Therefore $a^{\operatorname{lcm}(\varphi(m), \varphi(n))} \equiv 1$ modulo $m$ and $n$, and hence modulo $\operatorname{lcm}(m, n)$, which is $mn$. By the lemma, 2 divides both $\varphi(m)$ and $\varphi(n)$, so

\[ \operatorname{lcm}(\varphi(m), \varphi(n)) \Bigm| \frac{\varphi(m)\varphi(n)}{2}, \]
that is, $\operatorname{lcm}(\varphi(m), \varphi(n)) \mid \varphi(mn)/2$. Therefore

\[ \operatorname{ord}_{mn}(a) \le \frac{\varphi(mn)}{2}, \]
so $a$ is not a primitive root of $mn$.

Theorem. If $k \ge 1$, then $2^{2+k}$ has no primitive root.

Proof. Any primitive root of $2^{2+k}$ must be odd. Let $a$ be odd. We shall show by induction that
\[ a^{\varphi(2^{2+k})/2} \equiv 1 \pmod{2^{2+k}}. \]
Since $\varphi(2^{2+k}) = 2^{2+k} - 2^{1+k} = 2^{1+k}$, it is enough to show
\[ a^{2^k} \equiv 1 \pmod{2^{2+k}}. \]

The claim is true when $k = 1$, since $a^2 \equiv 1 \pmod{8}$ for all odd numbers $a$. Suppose the claim is true when $k$ is some positive integer $\ell$, that is,

\[ a^{2^{\ell}} \equiv 1 \pmod{2^{2+\ell}}. \]

This means

\[ a^{2^{\ell}} = 1 + 2^{2+\ell} \cdot m \]
for some $m$. Now square:
\begin{align*}
a^{2^{1+\ell}} = (a^{2^{\ell}})^2 = (1 + 2^{2+\ell} \cdot m)^2 &= 1 + 2^{3+\ell} \cdot m + 2^{4+2\ell} \cdot m^2 \\
&= 1 + 2^{3+\ell} \cdot m \cdot (1 + 2^{1+\ell} \cdot m).
\end{align*}
Hence $a^{2^{1+\ell}} \equiv 1 \pmod{2^{3+\ell}}$, so our claim is true when $k = \ell + 1$.

Now for the positive results. These will use the following.



Lemma. Let $r$ be a primitive root of $p$, and $k > 0$. Then
\[ \operatorname{ord}_{p^k}(r) = (p - 1) p^\ell \]
for some $\ell$, where $0 \leq \ell < k$.
Proof. Let $\operatorname{ord}_{p^k}(r) = n$. Then $n \mid \varphi(p^k)$. But $\varphi(p^k) = p^k - p^{k-1} = (p - 1) \cdot p^{k-1}$.
Thus,
\[ n \mid (p - 1) \cdot p^{k-1}. \]
Also, $r^n \equiv 1 \pmod{p^k}$, so $r^n \equiv 1 \pmod p$, which means $\operatorname{ord}_p(r) \mid n$. But $r$ is a
primitive root of $p$, so $\operatorname{ord}_p(r) = \varphi(p) = p - 1$. Therefore
\[ p - 1 \mid n. \]
The claim now follows.
Theorem . $p^2$ has a primitive root. In fact, if $r$ is a primitive root of $p$, then
either $r$ or $r + p$ is a primitive root of $p^2$.
Proof. Let $r$ be a primitive root of $p$. If $r$ is a primitive root of $p^2$, then we are
done. Suppose $r$ is not a primitive root of $p^2$. Then $\operatorname{ord}_{p^2}(r) = p - 1$, by the last
lemma. Hence, modulo $p^2$, we have
\begin{align*}
(r + p)^{p-1} &\equiv r^{p-1} + (p - 1) \cdot r^{p-2} \cdot p + \binom{p-1}{2} \cdot r^{p-3} \cdot p^2 + \cdots \\
 &\equiv r^{p-1} + (p - 1) \cdot r^{p-2} \cdot p \\
 &\equiv 1 + (p - 1) \cdot r^{p-2} \cdot p \\
 &\equiv 1 - r^{p-2} \cdot p \\
 &\not\equiv 1,
\end{align*}
since $p \nmid r$. (Note that this argument holds even if $p = 2$.) Hence $\operatorname{ord}_{p^2}(r + p) \neq
p - 1$, so by the lemma, the order must be $(p - 1) \cdot p$, that is, $\varphi(p^2)$. This means
$r + p$ is a primitive root of $p^2$.
Alternatively, if $r$ is a primitive root of $p$, then either $r$ or $r + rp$ is a primitive
root of $p^2$. For, $\operatorname{ord}_{p^2}(1 + p) = p$, simply because the order is not 1, but
\[ (1 + p)^p = \sum_{j=0}^{p} \binom{p}{j} p^j = 1 + p^2 + \sum_{j=2}^{p} \binom{p}{j} p^j \equiv 1 \pmod{p^2}. \]
Then, in the case where $r$ is not itself a primitive root of $p^2$, the numbers $r$ and $1 + p$
have orders $p - 1$ and $p$ respectively, modulo $p^2$, so their product $r + rp$
must have order $p(p - 1)$ (see Exercise ).
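A brute-force check of the theorem is easy for small primes. This is a sketch only; the helper names `order` and `is_primitive_root` are hypothetical. The prime 29 is included because 14 is a primitive root of 29 that may fail to be one of $29^2$, which exercises the $r + p$ branch.

```python
def order(a, n):
    """Multiplicative order of a modulo n; assumes gcd(a, n) == 1 and n > 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def is_primitive_root(a, n, phi):
    """True when a has order phi modulo n, where phi is supplied as phi(n)."""
    return order(a, n) == phi

for p in (3, 5, 7, 11, 13, 29):
    phi2 = p * (p - 1)                       # phi(p^2)
    for r in range(1, p):
        if is_primitive_root(r, p, p - 1):
            # Either r or r + p must be a primitive root of p^2.
            assert (is_primitive_root(r, p * p, phi2)
                    or is_primitive_root(r + p, p * p, phi2))
```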
Now we can give another characterization of Carmichael numbers (which were
defined on page  as those composite numbers $n$ such that $a^n \equiv a \pmod n$ for all $a$):



Theorem . A composite number $n$ is a Carmichael number if and only if,
whenever $\gcd(a, n) = 1$, we have

\[ a^{n-1} \equiv 1 \pmod n. \tag{¶} \]

Proof. Suppose $n$ is a Carmichael number and $\gcd(a, n) = 1$. If $p \mid n$, then $a^n \equiv a
\pmod p$, so $a^{n-1} \equiv 1 \pmod p$. Since $n$ is squarefree by Theorem  (p. ), we have
that $n$ is the least common multiple of its prime divisors, and therefore (¶) holds.
Conversely, suppose (¶) holds whenever $\gcd(a, n) = 1$. The proof of Theorem  (p. )
still works to show $p \mid n \implies p - 1 \mid n - 1$. Also, $n$ is squarefree.
Indeed, suppose $p^2 \mid n$. But $p^2$ has a primitive root $a$, and by the Chinese Remainder
Theorem, we may assume $\gcd(a, n) = 1$. Then $a^{n-1} \equiv 1$ modulo $n$ and
therefore modulo $p^2$, so $\varphi(p^2) \mid n - 1$. But $p \mid \varphi(p^2)$, so $p \mid n - 1$, which is
absurd. Therefore $n$ must be squarefree, so by Theorem , it is a Carmichael
number.
Theorem . All odd prime powers (that is, all powers of odd primes) have
primitive roots. In fact, a primitive root of $p^2$ is a primitive root of every power
$p^{1+k}$, where $p$ is odd.
Proof. Assume $p$ is an odd prime. We know $p$ and $p^2$ have primitive roots. Let
$r$ be a primitive root of $p^2$. We prove by induction that $r$ is a primitive root of
$p^{1+k}$. The claim is trivially true when $k = 1$. Suppose it is true when $k$ is some
positive integer $\ell$. This means

\[ \operatorname{ord}_{p^{1+\ell}}(r) = (p - 1) \cdot p^\ell. \]

In particular,
\[ r^{(p-1) \cdot p^{\ell-1}} \not\equiv 1 \pmod{p^{1+\ell}}. \]
However, since $\varphi(p^\ell) = (p - 1) \cdot p^{\ell-1}$, we have
\[ r^{(p-1) \cdot p^{\ell-1}} \equiv 1 \pmod{p^\ell}. \]

We can now conclude
\[ r^{(p-1) \cdot p^{\ell-1}} = 1 + p^\ell \cdot m \]
for some $m$ that is indivisible by $p$. Now raise both sides of this equation to the
power $p$:
\[ r^{(p-1) \cdot p^\ell} = 1 + p^{1+\ell} \cdot m + \binom{p}{2} \cdot p^{2\ell} \cdot m^2 + \binom{p}{3} \cdot p^{3\ell} \cdot m^3 + \cdots \]

Since $p > 2$ and $\ell \geq 1$, so that $p \mid \binom{p}{2}$ and $2\ell \geq 1 + \ell$, we have

\[ r^{(p-1) \cdot p^\ell} \equiv 1 + p^{1+\ell} \cdot m \not\equiv 1 \pmod{p^{2+\ell}}. \]

Therefore we must have

\[ \operatorname{ord}_{p^{2+\ell}}(r) = (p - 1) \cdot p^{1+\ell} = \varphi(p^{2+\ell}), \]

which means $r$ is a primitive root of $p^{2+\ell}$.


For example, 3 has the primitive root 2, since $2 \not\equiv 1 \pmod 3$, but $2^2 \equiv 1
\pmod 3$. Hence, either 2 or 5 is a primitive root of 9, by Theorem . In fact,
both are. Using $5 \equiv -4 \pmod 9$, we have:

    k                  2    3
    2^k (mod 9)        4   −1
    (−4)^k (mod 9)    −2   −1

so the order of 2 and −4 is not 2 or 3 modulo 9; hence it must be 6, since this
is $\varphi(9)$. By Theorem  then, 27 has 6 non-congruent primitive roots, each
congruent modulo 9 to one of 2 and −4; those roots then are −13, −7, −4, 2, 5,
and 11. Indeed, $\varphi(27) = 18$ and we have

    k                    2    3    4    5    6    7    8    9
    (−13)^k (mod 27)     7  −10   −5   11   −8   −4   −2   −1
    (−4)^k (mod 27)    −11  −10   13    2   −8    5    7   −1
    5^k (mod 27)        −2  −10    4   −7   −8  −13  −11   −1
    (−7)^k (mod 27)     −5    8   −2   13   10  −11    4   −1
    2^k (mod 27)         4    8  −11    5   10   −7   13   −1
    11^k (mod 27)       13    8    7   −4   10    2   −5   −1

But does 18 have a primitive root? The numbers 2 and −4 cannot be primitive
roots of 18, since they are not prime to it; but $\varphi(18) = 6$ and we have

    k                   2    3
    (−7)^k (mod 18)    −5   −1
    5^k (mod 18)        7   −1

so −7 and 5 are primitive roots of 18.


Theorem . If $p$ is an odd prime, and $r$ is a primitive root of $p^s$, then either
$r$ or $r + p^s$ is a primitive root of $2p^s$, whichever one is odd.
Proof. Let $r$ be an odd primitive root of $p^s$. Then $\gcd(r, 2p^s) = 1$, so $r$ has an
order modulo $2p^s$. Since also $\operatorname{ord}_{p^s}(r) = \varphi(p^s)$, we have

\[ \varphi(p^s) \mid \operatorname{ord}_{2p^s}(r). \]

But also $\operatorname{ord}_{2p^s}(r) \mid \varphi(2p^s)$; and $\varphi(p^s) = \varphi(2p^s)$. Hence $\operatorname{ord}_{2p^s}(r) = \varphi(2p^s)$.
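The classification obtained in this section (the moduli with primitive roots are 2, 4, $p^s$, and $2p^s$) can be compared against a direct search for small moduli. This is only a sketch; both function names are hypothetical, and the search is far too slow for large moduli.

```python
from math import gcd

def has_primitive_root(n):
    """Brute force: does some unit modulo n have order phi(n)?"""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    phi = len(units)
    for a in units:
        x, k = a, 1
        while x != 1:                 # compute the order of a modulo n
            x = x * a % n
            k += 1
        if k == phi:
            return True
    return False

def on_the_list(n):
    """Is n one of 2, 4, p**s, 2*p**s with p an odd prime and s >= 1?"""
    if n in (2, 4):
        return True
    if n % 2 == 0:
        n //= 2                       # strip at most one factor of 2
    if n % 2 == 0 or n < 3:
        return False
    p = next(q for q in range(3, n + 1) if n % q == 0)  # least prime factor of odd n
    while n % p == 0:
        n //= p
    return n == 1                     # n was a power of the single prime p

assert all(has_primitive_root(n) == on_the_list(n) for n in range(2, 200))
```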



. Quadratic reciprocity

.. Quadratic equations


If $p \nmid a$, then the linear congruence

\[ ax + b \equiv 0 \pmod p \]

is always soluble. The next step is to consider quadratic congruences,

\[ ax^2 + bx + c \equiv 0 \pmod p, \tag{∗} \]

where still $p \nmid a$. For example, let us try to solve

\[ 2x^2 - 8x + 9 \equiv 0 \pmod{11}. \tag{†} \]

We cannot factorize the polynomial $2x^2 - 8x + 9$ over $\mathbb{Z}$ (or even $\mathbb{R}$), since $8^2 -
4 \cdot 2 \cdot 9 = -8$, which is not a square (or even positive). However, after replacing
coefficients with residues modulo 11, we may be able to factorize. Still, a better
method of solution is completing the square. We have, modulo 11,
\begin{align*}
2x^2 - 8x + 9 \equiv 0 &\iff x^2 - 4x \equiv -\frac{9}{2} \\
 &\iff x^2 - 4x + 4 \equiv 4 - \frac{9}{2} \\
 &\iff (x - 2)^2 \equiv -\frac{1}{2} \equiv \frac{10}{2} \equiv 5.
\end{align*}
(We did not need to compute the inverse of 2 modulo 11, although we may see
easily enough that it is 6.) If 5 is a square modulo 11, then (†) has a solution; if
not, not. One way to settle the question is by hunting: we have $5 \equiv 16 \equiv 4^2$, so

\begin{align*}
2x^2 - 8x + 9 \equiv 0 &\iff (x - 2)^2 \equiv 4^2 \\
 &\iff x - 2 \equiv \pm 4 \\
 &\iff x \equiv 2 \pm 4 \equiv 6 \text{ or } 9.
\end{align*}

Note that we have used Lagrange’s Theorem (Theorem ) to conclude that the
congruence has exactly two solutions. We now know

\[ 2x^2 - 8x + 9 \equiv 2(x - 6)(x + 2) \equiv 2(x - 6)(x - 9). \]


Possibly, with some cleverness, we might have been able to see this from the
beginning. But suppose we want to solve

\[ x^2 - 4x - 3 \equiv 0 \pmod{11}. \tag{‡} \]

We find

\[ x^2 - 4x - 3 \equiv 0 \iff x^2 - 4x + 4 \equiv 7 \iff (x - 2)^2 \equiv 7. \]

Now, if $7 \equiv k^2$, then we may assume $-5 \leq k \leq 5$. The positive integers that
are congruent to 7 and are less than or equal to $5^2$ are 7 and 18, and neither of
them is a square. Therefore the congruence (‡) is insoluble. In particular, the
polynomial $x^2 - 4x - 3$ has no factorization over $\mathbb{Z}_{11}$; so it would have been futile
to hunt for a factorization. Completing the square is the way to go.
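The two worked examples can be reproduced mechanically. The sketch below completes the square and then hunts for square roots of the discriminant by trying every residue; the function name `solve_quadratic` is hypothetical, and the three-argument `pow` with exponent −1 assumes Python 3.8 or later.

```python
def solve_quadratic(a, b, c, p):
    """Solve a*x**2 + b*x + c ≡ 0 (mod p) for an odd prime p not dividing a."""
    inv_2a = pow(2 * a, -1, p)             # inverse of 2a modulo p
    disc = (b * b - 4 * a * c) % p         # discriminant b^2 - 4ac, reduced
    # Hunt for the square roots of the discriminant among 0, 1, ..., p - 1.
    roots = [k for k in range(p) if k * k % p == disc]
    # From (x + b/2a)^2 ≡ disc/(2a)^2 we get x ≡ (-b + k)/(2a).
    return sorted({(-b + k) * inv_2a % p for k in roots})

print(solve_quadratic(2, -8, 9, 11))    # the congruence (†)
print(solve_quadratic(1, -4, -3, 11))   # the congruence (‡)
```

The first call returns the two solutions 6 and 9 found above; the second returns an empty list, since 7 is not a square modulo 11.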
Another way to see that 7 has no square root modulo 11 is to note first that 2
is a primitive root of 11. Since $11 \nmid 7$, but $7 \equiv -4$, the following table shows that
7 is not a square modulo 11, because −4 does not appear as an even power of 2
(that is, a power of 2 with even exponent):

    k         0    1    2    3    4    mod 5
    2^(2k)    1    4    5   −2    3    mod 11

Indeed, $2^m \equiv 2^n \pmod{11}$ if and only if $m \equiv n \pmod{10}$, by Theorem . Since 10 is even,
the only numbers prime to 11 that are squares modulo 11 are the even powers of
2.
Considering the general quadratic congruence (∗), and assuming $p$ is odd (so
that 2 is invertible modulo $p$), we have
\begin{align*}
ax^2 + bx + c \equiv 0 &\iff x^2 + \frac{b}{a} x \equiv -\frac{c}{a} \\
 &\iff x^2 + \frac{b}{a} x + \frac{b^2}{4a^2} \equiv \frac{b^2}{4a^2} - \frac{c}{a} \\
 &\iff \Bigl(x + \frac{b}{2a}\Bigr)^2 \equiv \frac{b^2 - 4ac}{(2a)^2},
\end{align*}
just as when one derives the usual quadratic formula. Working over $\mathbb{R}$, one
knows that the equation $ax^2 + bx + c = 0$ (where $a \neq 0$) is soluble if and only
if $b^2 - 4ac \geq 0$. Another way to express this condition is that the discriminant
$b^2 - 4ac$ must be a square in $\mathbb{R}$. It is the same modulo $p$: the congruence (∗) is
soluble if and only if $b^2 - 4ac$ is a square modulo $p$. In the terminology introduced
in §., this condition is that $b^2 - 4ac$ either is divisible by $p$ or is a quadratic
residue of $p$.
As we have just observed, assuming $p \nmid b^2 - 4ac$, one way to tell whether $b^2 - 4ac$
is a quadratic residue is first to find its least positive residue, say $m$, and then
to check whether any of the residues $m + kp$ is a square, where $0 \leq k$ and also
$m + kp \leq ((p - 1)/2)^2$, that is,

\[ m + kp \leq \varpi^2, \]

in the notation of (‖) in §. (p. ). So it is sufficient to check when $0 \leq k < \varpi/2$.
This could still be a lot of work if $p$ is large.
We shall develop a way to test for quadratic residues that is more practical as
well as theoretically interesting.

.. Quadratic residues


We have just seen that the quadratic residues of 11 are the even powers of 2,
namely 1, 4, 5, −2, and 3, or rather 1, 4, 5, 9, and 3. The quadratic non-
residues are the odd powers: 2, −3, −1, −4, and −5, that is, 2, 8, 10, 7, and 6.
So there are five residues, and five non-residues. (The general formulation of this
equality will be Theorem .)

Theorem  (Euler’s Criterion). Let $p$ be an odd prime, and $\gcd(a, p) = 1$.
Then $a$ is a quadratic residue of $p$ if and only if

\[ a^{(p-1)/2} \equiv a^{\varpi} \equiv 1 \pmod p, \tag{§} \]

and $a$ is a quadratic non-residue of $p$ if and only if

\[ a^{\varpi} \equiv -1 \pmod p. \tag{¶} \]

Proof. Let $r$ be a primitive root of $p$. Any solution of $x^2 \equiv a \pmod p$ is $r^k$ for
some $k$, and then

\[ a^{\varpi} \equiv (r^{2k})^{\varpi} \equiv (r^k)^{p-1} \equiv 1 \pmod p \]

by Fermat’s Theorem (Theorem ).

In any case, $a \equiv r^\ell \pmod p$ for some $\ell$. Suppose $a^{\varpi} \equiv 1 \pmod p$. Then

\[ 1 \equiv (r^\ell)^{\varpi} \equiv r^{\ell \cdot \varpi} \pmod p, \]

so $\operatorname{ord}_p(r) \mid \ell \cdot \varpi$, that is,

\[ p - 1 \mid \ell \cdot \varpi. \]

Therefore $\ell/2$ is an integer, that is, $\ell$ is even. Say $\ell = 2m$. Then $a \equiv r^{2m} \equiv (r^m)^2
\pmod p$.
Since $a^{p-1} \equiv 1 \pmod p$, by Fermat’s Theorem, we have $a^{\varpi} \equiv \pm 1 \pmod p$, so
the second part of the claim follows.
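Euler’s Criterion turns the test for a quadratic residue into a single modular exponentiation. The sketch below (the function name is hypothetical) compares the criterion with direct squaring for $p = 11$, whose residues were listed at the start of this section.

```python
def is_quadratic_residue(a, p):
    """Euler's Criterion, for an odd prime p not dividing a."""
    return pow(a, (p - 1) // 2, p) == 1

p = 11
by_criterion = {a for a in range(1, p) if is_quadratic_residue(a, p)}
by_squaring = {x * x % p for x in range(1, p)}
assert by_criterion == by_squaring == {1, 3, 4, 5, 9}
```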



Another way to prove the theorem arises from the following considerations,
which also lead to the alternative proof of Wilson’s Theorem promised at the end
of §. (p. ). Suppose $a$ is a quadratic non-residue of $p$. If $b \in \{1, \dots, p - 1\}$,
then the congruence
\[ bx \equiv a \pmod p \]
has a unique solution in $\{1, \dots, p - 1\}$, and we denote the solution by $a/b$. Then
$b \neq a/b$, since $a$ is not a quadratic residue of $p$. Now we define a sequence
$(b_1, \dots, b_{\varpi})$ recursively. If $b_k$ has been chosen when $k < \ell < \varpi$, then let $b_\ell$ be the
least element of $\{1, \dots, p - 1\} \smallsetminus \{b_1, a/b_1, \dots, b_{\ell-1}, a/b_{\ell-1}\}$. Note then that $a/b_\ell$
must be in this set too, since otherwise $a/b_\ell = b_k$ for some $k$ such that $k < \ell$,
and then $b_\ell = a/b_k$. We now have
\[ \Bigl\{ b_1, \frac{a}{b_1}, \dots, b_{\varpi}, \frac{a}{b_{\varpi}} \Bigr\} = \{1, \dots, p - 1\}. \]

Now multiply everything together:

\[ a^{\varpi} \equiv (p - 1)! \pmod p. \tag{‖} \]

If we have Wilson’s Theorem (Theorem , p. ), we can conclude (¶). Conversely,
this and (‖) give us Wilson’s Theorem.
Now suppose $a$ is a quadratic residue of $p$. We choose the $b_k$ as before, except
this time let $b_1$ be the least positive solution of $x^2 \equiv a \pmod p$, and replace $a/b_1$
with the next least positive solution, which is $p - b_1$. We have then
\[ \Bigl\{ b_1, p - b_1, b_2, \frac{a}{b_2}, \dots, b_{\varpi}, \frac{a}{b_{\varpi}} \Bigr\} = \{1, \dots, p - 1\}, \]

and multiplication now gives us

\[ -a^{\varpi} \equiv (p - 1)! \pmod p. \]

Now (§) is equivalent to Wilson’s Theorem. Since we do have (§) when $a = 1$,
Wilson’s Theorem holds.

.. The Legendre symbol


Again, $p$ is an odd prime, and $p \nmid a$. Euler’s Criterion can be abbreviated by
\[ a^{\varpi} \equiv \left(\frac{a}{p}\right) \pmod p, \tag{∗∗} \]
 This is the first proof of Wilson’s Theorem given by Hardy and Wright [, p. ].



where $(a/p)$ is called the Legendre symbol. More precisely, we have
\[ \left(\frac{a}{p}\right) = \begin{cases} 1, & \text{if $a$ is a quadratic residue of $p$;} \\ -1, & \text{if $a$ is a quadratic non-residue of $p$.} \end{cases} \]

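Theorem . If $p$ is an odd prime not dividing $a$ or $b$, then:

\begin{align*}
\left(\frac{a \pm kp}{p}\right) &= \left(\frac{a}{p}\right), \\
\left(\frac{a^2}{p}\right) &= 1, \\
\left(\frac{1}{p}\right) &= 1, \\
\left(\frac{-1}{p}\right) &= \begin{cases} 1, & \text{if } p \equiv 1 \pmod 4, \\ -1, & \text{if } p \equiv 3 \pmod 4, \end{cases} \tag{††} \\
\left(\frac{ab}{p}\right) &= \left(\frac{a}{p}\right) \left(\frac{b}{p}\right).
\end{align*}
Proof. The first three equations follow immediately from the definitions; the others,
from Euler’s Criterion as summarized by (∗∗). (Also (††) is equivalent to
Theorem  on page .)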
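With these properties, we can calculate many Legendre symbols. For example,
\[ \left(\frac{50}{19}\right) = \left(\frac{12}{19}\right) = \left(\frac{2}{19}\right)^2 \left(\frac{3}{19}\right) = \left(\frac{3}{19}\right), \]
\[ 3^{\varpi} \equiv 3^9 \equiv 3^8 \cdot 3 \equiv 9^4 \cdot 3 \equiv 81^2 \cdot 3 \equiv 5^2 \cdot 3 \equiv 6 \cdot 3 \equiv 18 \equiv -1 \pmod{19}, \]

so $(50/19) = -1$, which means the congruence $x^2 \equiv 50 \pmod{19}$ has no solution.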
We may ask whether (††) has a simpler form, owing to the existence of only
finitely many p satisfying one of the cases. This possibility fails.
Theorem . There are infinitely many primes p such that p ≡ 3 (mod 4).
Proof. Suppose $(q_1, q_2, \dots, q_n)$ is a list of primes. We shall prove that there is a
prime $p$, not on this list, such that $p \equiv 3 \pmod 4$. Let

\[ s = 4 q_1 \cdot q_2 \cdots q_n - 1. \]

Then $s \equiv 3 \pmod 4$. Then $s$ must have a prime factor $p$ such that $p \equiv 3 \pmod 4$.
Indeed, if all prime factors of $s$ are congruent to 1, then so must $s$ be. But $p$ is
not any of the $q_k$.
 Named for Adrien-Marie Legendre, –.



A similar argument fails to show that there are infinitely many primes $p$ such
that $p \equiv 1 \pmod 4$. For, even though $4 q_1 \cdot q_2 \cdots q_n - 3 \equiv 1 \pmod 4$, possibly all prime
factors of $4 q_1 \cdot q_2 \cdots q_n - 3$ are congruent to 3. (This is the case when $n = 1$ and
$q_1 = 3$, for example.) Nonetheless, we still have:
Theorem . There are infinitely many primes $p$ such that $p \equiv 1 \pmod 4$.
Proof. Suppose $(q_1, q_2, \dots, q_n)$ is a list of primes. We shall prove that there is a
prime $p$, not on this list, such that $p \equiv 1 \pmod 4$. Let
\[ s = 2 q_1 \cdot q_2 \cdots q_n. \]
Then $s^2 + 1$ is odd, so it is divisible by some odd prime $p$, which is distinct from
each of the $q_k$. This means $s^2 + 1 \equiv 0 \pmod p$, so $s$ is a solution of the congruence
$x^2 \equiv -1 \pmod p$. Then $(-1/p) = 1$, so $p \equiv 1 \pmod 4$, by (††) above.
From the rules so far, we obtain the following table:

    a         1   2   3   4   5   6   7   8   9  10  11  12
    (a/13)    1       1   1                   1   1       1

Indeed, under the squares 1, 4, and 9, we put 1. Also $4^2 = 16 \equiv 3$, so $(3/13) = 1$.
Finally, by (††), we have $(-1/13) = 1$; or we can just compute this: $(-1)^{\varpi} =
(-1)^6 = 1$. Hence the table will be symmetric; that is, $((13 - a)/13) = (-a/13) =
(-1/13) \cdot (a/13) = (a/13)$. In particular, $(10/13) = 1$ and $(12/13) = 1$. So half of
the slots have been filled with 1. The other half must take −1, by the following.
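Theorem . For all odd primes $p$,
\[ \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) = 0. \]

Proof. Let $r$ be a primitive root of $p$. Then

\[ \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) = \sum_{k=1}^{p-1} \left(\frac{r^k}{p}\right) = \sum_{k=1}^{p-1} \left(\frac{r}{p}\right)^k. \]

But $(r/p) = -1$, because $r$ is a primitive root and therefore $r^{\varpi} \equiv -1 \pmod p$.
Hence
\[ \sum_{k=1}^{p-1} \left(\frac{k}{p}\right) = \sum_{k=1}^{p-1} (-1)^k = 0. \]

So now we can complete the table above:

    a         1   2   3   4   5   6   7   8   9  10  11  12
    (a/13)    1  −1   1   1  −1  −1  −1  −1   1   1  −1   1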



.. Gauss’s Lemma
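Again, $p$ is an odd prime. Given an integer $k$, we have
\begin{gather*}
\Bigl[\frac{k}{p}\Bigr] \leq \frac{k}{p} < \Bigl[\frac{k}{p}\Bigr] + 1, \\
p \cdot \Bigl[\frac{k}{p}\Bigr] \leq k < p \cdot \Bigl[\frac{k}{p}\Bigr] + p, \\
0 \leq k - p \cdot \Bigl[\frac{k}{p}\Bigr] < p.
\end{gather*}
Thus the least positive residue of $k$ modulo $p$ is $k - p \cdot [k/p]$. For use in some
proofs, let us define
\[ |k|_p = \begin{cases} k - p \cdot [k/p], & \text{if this is less than } p/2, \\ p - (k - p \cdot [k/p]), & \text{otherwise.} \end{cases} \tag{‡‡} \]

Then $0 \leq |k|_p < p/2$, and $|k|_p$ is the least distance between $k$ and a multiple of $p$.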
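Theorem  (Gauss’s Lemma). Let $p$ be an odd prime, and $\gcd(a, p) = 1$. Then
\[ \left(\frac{a}{p}\right) = (-1)^n, \]
where $n$ is the number of elements of the set

\[ \{a, 2a, 3a, \dots, \varpi a\} \]

whose least positive residues exceed $p/2$.

Proof. If $|ka|_p = |\ell a|_p$, then $ka \equiv \pm\ell a \pmod p$, so $k \equiv \pm\ell \pmod p$; when $k$ and
$\ell$ both belong to $\{1, 2, \dots, \varpi\}$, this forces $k = \ell$. Therefore

\[ \{1, 2, \dots, \varpi\} = \{|a|_p, |2a|_p, \dots, |\varpi a|_p\}, \]

so
\[ \prod_{k=1}^{\varpi} k = \prod_{k=1}^{\varpi} |ka|_p. \]

Also $|ka|_p \equiv \pm ka \pmod p$, and $|ka|_p \equiv -ka \pmod p$ if and only if $ka$ has least positive
residue exceeding $p/2$. Therefore, with $n$ as in the statement, we have
\[ \varpi! \cdot a^{\varpi} \equiv \prod_{k=1}^{\varpi} (ka) \equiv (-1)^n \cdot \prod_{k=1}^{\varpi} |ka|_p \equiv (-1)^n \cdot \prod_{k=1}^{\varpi} k \equiv (-1)^n \cdot \varpi! \pmod p, \]

which yields the claim by Euler’s Criterion.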



For example, to find (3/19), we can look at

3, 6, 9, 12, 15, 18, 21, 24, 27,

whose remainders on division by 19 are, respectively,

3, 6, 9, 12, 15, 18, 2, 5, 8.

Of these, only 12, 15, and 18 exceed 19/2, and they are three; so
\[ \left(\frac{3}{19}\right) = (-1)^3 = -1. \]
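The count in Gauss’s Lemma is simple to automate. In the sketch below, `legendre_gauss` and `legendre_euler` are hypothetical names; the second function uses Euler’s Criterion as an independent check.

```python
def legendre_gauss(a, p):
    """(a/p) by Gauss's Lemma, for an odd prime p not dividing a."""
    half = (p - 1) // 2
    # Count the multiples a, 2a, ..., half*a whose residues exceed p/2.
    n = sum(1 for k in range(1, half + 1) if (k * a) % p > p // 2)
    return (-1) ** n

def legendre_euler(a, p):
    """(a/p) by Euler's Criterion, for comparison."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

assert legendre_gauss(3, 19) == -1
assert all(legendre_gauss(a, 19) == legendre_euler(a, 19) for a in range(1, 19))
```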
We shall use Gauss’s Lemma to prove the Law of Quadratic Reciprocity, by
which we shall be able to relate $(p/q)$ and $(q/p)$ when both $p$ and $q$ are odd
primes. Meanwhile, besides the direct application of Gauss’s Lemma to computing
Legendre symbols, we have the following, which we shall also need in order
to take full advantage of the Law of Quadratic Reciprocity:
Theorem . If $p$ is an odd prime, then
\[ \left(\frac{2}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod 8; \\ -1, & \text{if } p \equiv \pm 3 \pmod 8. \end{cases} \]

Proof. To apply Gauss’s Lemma, we look at the numbers $2 \cdot 1, 2 \cdot 2, \dots, 2 \cdot \varpi$.
Each is its own remainder on division by $p$. Hence $(2/p) = (-1)^n$, where $n$ is the
number of integers $k$ such that
\[ \frac{p}{2} < 2k \leq p - 1, \]
or rather $p/4 < k \leq \varpi$. This means
\[ n = \varpi - \Bigl[\frac{p}{4}\Bigr]. \]
Now consider the possibilities:
\begin{align*}
p = 8k + 1 &\implies n = 4k - \Bigl[2k + \frac{1}{4}\Bigr] = 2k, \\
p = 8k + 3 &\implies n = 4k + 1 - \Bigl[2k + \frac{3}{4}\Bigr] = 2k + 1, \\
p = 8k + 5 &\implies n = 4k + 2 - \Bigl[2k + \frac{5}{4}\Bigr] = 2k + 1, \\
p = 8k + 7 &\implies n = 4k + 3 - \Bigl[2k + \frac{7}{4}\Bigr] = 2k + 2.
\end{align*}
In each case then, $(2/p)$ is as claimed.



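As $13 \equiv -3 \pmod 8$, we have $(2/13) = -1$, which we found by other methods
above. An alternative formulation of the theorem is
\[ \left(\frac{2}{p}\right) = (-1)^{(p^2 - 1)/8}, \]
since

\begin{align*}
p \equiv \pm 1 \pmod 8 &\implies p \equiv \pm 1, \pm 7 \pmod{16} \implies p^2 \equiv 1 \pmod{16}, \\
p \equiv \pm 3 \pmod 8 &\implies p \equiv \pm 3, \pm 5 \pmod{16} \implies p^2 \equiv 9 \pmod{16}.
\end{align*}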

We can also use the theorem to find some primitive roots. Given a prime $q$ and an
integer $a$ that $q$ does not divide, we know that $a$ is a primitive root of $q$, provided
that
\[ a^d \not\equiv 1 \pmod q \]
whenever $d$ is a proper divisor of $q - 1$. Verifying this condition is easier, the fewer
proper divisors $q - 1$ has. If $q$ is odd, then $q - 1$ has the fewest possible divisors when
it is $2p$ for some prime $p$. Recall from page  that in this case $p$ is called a Germain
prime, assuming $p$ itself is odd. That is, an odd prime $p$ is a Germain prime if
and only if $2p + 1$ is also prime.
Theorem . Suppose $p$ is a Germain prime, and let $\varpi = (p - 1)/2$. Then $2p + 1$
has the primitive root $(-1)^{\varpi} \cdot 2$, which is 2 if $p \equiv 1 \pmod 4$, and is otherwise
−2.
Proof. Let $r = (-1)^{\varpi} \cdot 2$, and denote $2p + 1$ by $q$. We want to show $\operatorname{ord}_q(r)$ is
not 1, 2, or $p$. But $p \geq 3$, so $q \geq 7$, and hence $r^1, r^2 \not\equiv 1 \pmod q$. Hence $\operatorname{ord}_q(r)$
is not 1 or 2. Also, from Euler’s Criterion,
\[ r^p \equiv r^{(q-1)/2} \equiv \left(\frac{r}{q}\right) \pmod q. \]
So it is enough to show $(r/q) = -1$. We consider two cases.
. If $p \equiv 1 \pmod 4$, then $r = 2$, but also $q \equiv 3 \pmod 8$, so
\[ \left(\frac{r}{q}\right) = \left(\frac{2}{q}\right) = -1 \]
by the last theorem.
. If $p \equiv 3 \pmod 4$, then $r = -2$, but also $q \equiv 7 \pmod 8$, and
\[ \left(\frac{-1}{q}\right) = (-1)^{(q-1)/2} = (-1)^p = -1, \]
so $(r/q) = (-2/q) = (-1/q)(2/q) = -1$.



Hence, for example, we have the following Germain primes and their primitive
roots:
p 3 5 11 23 29 41 53 83 89 113 131 173 179
2p + 1 7 11 23 47 59 83 107 167 179 227 263 347 359
p.r. of 2p + 1 −2 2 −2 −2 2 2 2 −2 2 2 −2 2 −2

It is not known whether there are infinitely many Germain primes. However, some
of them give examples of Mersenne numbers that are not primes, as noted on
page :
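Theorem . If $p$ is a Germain prime greater than 3, and $2p + 1 \equiv \pm 1 \pmod 8$, then $2^p - 1$
is not prime, because
\[ 2^p \equiv 1 \pmod{2p + 1}. \]
Proof. Let $q = 2p + 1$. Under the given conditions, we have $(2/q) = 1$ by Theorem ,
so $2^p \equiv 2^{(q-1)/2} \equiv 1 \pmod q$ by Euler’s Criterion. Thus $q$ divides $2^p - 1$; and $q < 2^p - 1$
when $p > 3$. (When $p = 3$, the divisor $q = 7$ is $2^3 - 1$ itself.)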
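The theorem can be observed numerically. The sketch below lists the Germain primes $p$ below 200 for which $2p + 1 \equiv \pm 1 \pmod 8$ and confirms that $2p + 1$ divides $2^p - 1$; the helper `is_prime` is a hypothetical trial-division test, adequate only for small inputs.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

for p in range(5, 200):
    q = 2 * p + 1
    if is_prime(p) and is_prime(q) and q % 8 in (1, 7):
        assert pow(2, p, q) == 1        # q divides the Mersenne number 2^p - 1
        print(p, q)
```

For instance, $p = 11$ gives $q = 23$, and indeed $2^{11} - 1 = 2047 = 23 \cdot 89$.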
Another consequence of Theorem  is:
Theorem . There are infinitely many primes congruent to −1 modulo 8.
Proof. Let $q_1, \dots, q_n$ be a finite list of primes. We show that there is $p$ not on
the list such that $p \equiv -1 \pmod 8$. Let

\[ M = (4 q_1 \cdots q_n)^2 - 2. \]

Then $M \equiv -2 \pmod{16}$, so $M$ is not a power of 2; in particular, $M$ has odd
prime divisors. Also, for every odd prime divisor $p$ of $M$, we have

\[ (4 q_1 \cdots q_n)^2 \equiv 2 \pmod p, \]

so $(2/p) = 1$, and therefore $p \equiv \pm 1 \pmod 8$. Since $M/2 \equiv -1 \pmod 8$, we
conclude that not every odd prime divisor of $M$ can be congruent to 1 modulo 8.
So some odd prime divisor $p$ of $M$ is congruent to −1 modulo 8; and $p$ is not any of
the $q_k$, since otherwise $p$ would divide 2.

.. The Law of Quadratic Reciprocity


We now aim to establish the Law of Quadratic Reciprocity, Theorem  below.
To prove the Law, we shall use the following consequence of Gauss’s Lemma
(Theorem ):
Lemma. If $p$ is an odd prime, $p \nmid a$, and $a$ is odd, then
\[ \left(\frac{a}{p}\right) = (-1)^m, \]
where
\[ m = \sum_{k=1}^{\varpi} \Bigl[\frac{ka}{p}\Bigr]. \]

Proof. With $n$ as in Gauss’s Lemma, we need only show $m \equiv n \pmod 2$. As in the
proof of Gauss’s Lemma, we have

\[ \{1, 2, \dots, \varpi\} = \{|a|_p, |2a|_p, \dots, |\varpi a|_p\}. \]

We now work with modulus 2, so that $-1 \equiv 1$, and $a + 1 \equiv 0$. Then

\[ 0 \equiv (a + 1) \cdot \sum_{k=1}^{\varpi} k \equiv \sum_{k=1}^{\varpi} (ka - k) \equiv \sum_{k=1}^{\varpi} (ka + |ka|_p). \]

From the original definition (‡‡) of $|k|_p$ on page , and because $-1 \equiv 1$, we
have
\[ ka + |ka|_p \equiv \begin{cases} p \cdot [ka/p], & \text{if (residue of $ka$ modulo $p$)} < p/2, \\ p \cdot [ka/p] + p, & \text{otherwise.} \end{cases} \]

Therefore
\[ 0 \equiv \sum_{k=1}^{\varpi} p \cdot \Bigl[\frac{ka}{p}\Bigr] + np \equiv m + n. \]

The following was:


• conjectured by Euler, ;
• imperfectly proved by Legendre, , ;
• discovered and proved independently by Gauss, , at age .
The following proof is due to Gauss’s student Eisenstein. We have so far denoted
$(p - 1)/2$ by $\varpi$; but now, going back to the original definition (‖) on page , we
must use $\varpi(p)$:
Theorem  (Law of Quadratic Reciprocity). If $p$ and $q$ are distinct odd primes,
then
\[ \left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\varpi(p) \cdot \varpi(q)}. \]
Proof. By the lemma, it is enough to show
\[ \frac{p - 1}{2} \cdot \frac{q - 1}{2} = \varpi(p) \cdot \varpi(q) = \sum_{k=1}^{\varpi(p)} \Bigl[\frac{kq}{p}\Bigr] + \sum_{\ell=1}^{\varpi(q)} \Bigl[\frac{\ell p}{q}\Bigr]. \]

We do this by considering a rectangle $ABCD$ in the Cartesian plane as in Figure ..
The number of points in the interior of $ABCD$ with integral coordinates
is $[p/2] \cdot [q/2]$, that is, $\varpi(p) \cdot \varpi(q)$. None of these points lie on the diagonal $AC$.
The number of points in the interior of triangle $ABC$ with first coordinate $k$ and
second coordinate integral is $[kq/p]$. Therefore the number of points in the interior
of $ABC$ with integral coordinates is $\sum_{k=1}^{\varpi(p)} [kq/p]$. A similar consideration of
triangle $ACD$ yields the claim.

[Figure .. Two ways of counting, for the Law of Quadratic Reciprocity: the rectangle $ABCD$ with $A$ at the origin, $B = (p/2, 0)$, $C = (p/2, q/2)$, and $D = (0, q/2)$; the diagonal $AC$ has height $kq/p$ above the point $(k, 0)$.]
For example, suppose $p = 13$ and $q = 7$. The points that we count in the proof
are shown in Figure ..

[Figure .. Example of the proof of quadratic reciprocity: the $6 \times 3$ grid of lattice points inside the rectangle with corners $A = (0, 0)$, $B = (13/2, 0)$, $C = (13/2, 7/2)$, and $D = (0, 7/2)$.]

Counted in columns, the number of points inside $ABC$
is $0 + 1 + 1 + 2 + 2 + 3$, which is
\[ \Bigl[\frac{7}{13}\Bigr] + \Bigl[\frac{14}{13}\Bigr] + \Bigl[\frac{21}{13}\Bigr] + \Bigl[\frac{28}{13}\Bigr] + \Bigl[\frac{35}{13}\Bigr] + \Bigl[\frac{42}{13}\Bigr]. \]

Counted in rows, the number of points inside $ACD$ is $1 + 3 + 5$, which is
\[ \Bigl[\frac{13}{7}\Bigr] + \Bigl[\frac{26}{7}\Bigr] + \Bigl[\frac{39}{7}\Bigr]. \]
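The two ways of counting are easy to replay in code. In this sketch (the function name is hypothetical), integer division `//` plays the role of the bracket $[\,\cdot\,]$.

```python
def eisenstein_counts(p, q):
    """Lattice points below and above the diagonal, for distinct odd primes p, q."""
    below = sum(k * q // p for k in range(1, (p - 1) // 2 + 1))   # triangle ABC, by columns
    above = sum(l * p // q for l in range(1, (q - 1) // 2 + 1))   # triangle ACD, by rows
    return below, above

below, above = eisenstein_counts(13, 7)
assert (below, above) == (9, 9)          # 0+1+1+2+2+3 and 1+3+5
assert below + above == 6 * 3            # all interior points of the rectangle
```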
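The more useful form of the Law of Quadratic Reciprocity is:
\[ \left(\frac{q}{p}\right) = \begin{cases} (p/q), & \text{if } p \equiv 1 \text{ or } q \equiv 1 \pmod 4; \\ -(p/q), & \text{if } q \equiv 3 \equiv p \pmod 4. \end{cases} \]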
It is important to remember here that both p and q are odd primes. We have not
defined the symbol (a/n) except when n is an odd prime not dividing a. In this
case, we can reduce the computation to computation of symbols (p/q) by means
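of Theorems  and . For example, we can compute one Legendre symbol as
in Table .. 

\begin{align*}
\left(\frac{365}{941}\right)
 &= \left(\frac{5}{941}\right) \left(\frac{73}{941}\right) && \text{[factorizing]} \\
 &= \left(\frac{941}{5}\right) \left(\frac{941}{73}\right) && [5, 73 \equiv 1 \pmod 4] \\
 &= \left(\frac{1}{5}\right) \left(\frac{65}{73}\right) && \text{[dividing]} \\
 &= \left(\frac{5}{73}\right) \left(\frac{13}{73}\right) && \text{[factorizing]} \\
 &= \left(\frac{73}{5}\right) \left(\frac{73}{13}\right) && [5, 13 \equiv 1 \pmod 4] \\
 &= \left(\frac{3}{5}\right) \left(\frac{8}{13}\right) && \text{[dividing]} \\
 &= \left(\frac{5}{3}\right) \left(\frac{2}{13}\right)^3 && [5 \equiv 1 \pmod 4; \text{ factorizing}] \\
 &= \left(\frac{2}{3}\right) \left(\frac{2}{13}\right) && [(p/q)^2 = 1] \\
 &= (-1)(-1) = 1 && [3 \equiv 3 \text{ and } 13 \equiv -3 \pmod 8].
\end{align*}

Table .. Computation of $(365/941)$

Similarly, we have

\[ \left(\frac{47}{199}\right) = -\left(\frac{199}{47}\right) = -\left(\frac{11}{47}\right) = \left(\frac{47}{11}\right) = \left(\frac{3}{11}\right) = -\left(\frac{11}{3}\right) = -\left(\frac{2}{3}\right) = 1. \]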
Thus we can compute any Legendre symbol (a/p), as long as we can recognize
which numbers less than p are prime.
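The procedure just illustrated (reduce, factorize, flip by reciprocity, repeat) can be sketched as a recursive function. This is only an illustration under stated assumptions: the name `legendre` is hypothetical, factorization is by naive trial division, and the caller must supply an odd prime $p$ not dividing $a$.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a."""
    a %= p                                   # (a/p) depends only on a mod p
    result = 1
    q = 2
    while a > 1:
        if q * q > a:
            q = a                            # what remains of a is prime
        if a % q:
            q += 1
            continue
        a //= q                              # q is a prime factor of a
        if q == 2:
            if p % 8 in (3, 5):              # (2/p) = -1 when p ≡ ±3 (mod 8)
                result = -result
        else:
            if p % 4 == 3 and q % 4 == 3:    # sign from the reciprocity law
                result = -result
            result *= legendre(p, q)         # flip (q/p) to (p/q), with q < p
    return result

assert legendre(365, 941) == 1
assert legendre(47, 199) == 1
```

The recursion terminates because each flip replaces the modulus $p$ by a strictly smaller prime $q$.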



The value of (2/p) cannot be computed by the Law of Quadratic Reciprocity;
we need Theorem . We can use the Law to compute (3/p) when we need it;
but we can also compute it once for all as follows.
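Theorem . For all primes $p$ greater than 3,
\[ \left(\frac{3}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod{12}, \\ -1, & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases} \]

Proof. We have
\[ \left(\frac{3}{p}\right) = \begin{cases} \left(\dfrac{p}{3}\right), & \text{if } p \equiv 1 \pmod 4, \\[1ex] -\left(\dfrac{p}{3}\right), & \text{if } p \equiv 3 \pmod 4, \end{cases}
\qquad
\left(\frac{p}{3}\right) = \begin{cases} 1, & \text{if } p \equiv 1 \pmod 3, \\ -1, & \text{if } p \equiv 2 \pmod 3. \end{cases} \]

It is a Chinese remainder problem to compute

\begin{align*}
p \equiv 1 \pmod 4 \text{ and } p \equiv 1 \pmod 3 &\iff p \equiv 1 \pmod{12}, \\
p \equiv 1 \pmod 4 \text{ and } p \equiv 2 \pmod 3 &\iff p \equiv 5 \pmod{12}, \\
p \equiv 3 \pmod 4 \text{ and } p \equiv 1 \pmod 3 &\iff p \equiv 7 \pmod{12}, \\
p \equiv 3 \pmod 4 \text{ and } p \equiv 2 \pmod 3 &\iff p \equiv 11 \pmod{12}.
\end{align*}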

One could find a similar rule for (q/p) for any fixed q.

.. Composite moduli


Assuming $\gcd(a, n) = 1$, we know when the congruence $x^2 \equiv a \pmod n$ has
solutions, provided $n$ is an odd prime; but what about the other cases? When
$n = 2$, then the congruence always has the solution 1. If $\gcd(m, n) = 1$, and
$\gcd(a, mn) = 1$, then the congruence $x^2 \equiv a \pmod{mn}$ is soluble if and only if
the system

\[ x^2 \equiv a \pmod m, \qquad x^2 \equiv a \pmod n \]

is soluble. By the Chinese Remainder Theorem, the system is soluble if and
only if the individual congruences are separately soluble. Indeed, suppose $b^2 \equiv a
\pmod m$, and $c^2 \equiv a \pmod n$. By the Chinese Remainder Theorem, there is
some $d$ such that $d \equiv b \pmod m$ and $d \equiv c \pmod n$. Then $d^2 \equiv b^2 \equiv a
\pmod m$, and $d^2 \equiv c^2 \equiv a \pmod n$, so $d^2 \equiv a \pmod{mn}$.
For example, suppose we want to solve

\[ x^2 \equiv 365 \pmod{667}. \]

Factorize 667 as $23 \cdot 29$. Then we first want to solve

\[ x^2 \equiv 365 \pmod{23}, \qquad x^2 \equiv 365 \pmod{29}. \]

But we have $(365/23) = (20/23) = (5/23) = (23/5) = (3/5) = -1$ by the
formula for $(3/p)$, so the first of the two congruences is insoluble, and therefore
the original congruence is insoluble. It doesn’t matter whether the second of the
two congruences is soluble.
Contrast with the following: $(2/11) = -1$, and $(7/11) = -(11/7) = -(4/7) =
-1$; so the congruences

\[ x^2 \equiv 2 \pmod{11}, \qquad x^2 \equiv 7 \pmod{11} \]

are insoluble; but $x^2 \equiv 14 \pmod{11}$ is soluble.


Now consider
\[ x^2 \equiv 361 \pmod{667}. \]
One may notice that this has the solutions $x \equiv \pm 19$; but there are others, and
we can find them as follows. We first solve

\[ x^2 \equiv 16 \pmod{23}, \qquad x^2 \equiv 13 \pmod{29}. \]

The first of these is solved by $x \equiv \pm 4 \pmod{23}$ (and nothing else, since 23 is
prime). For the second, note $13 \equiv 42, 71, 100 \pmod{29}$, so $x \equiv \pm 10 \pmod{29}$.
So the solutions of the original congruence are the solutions of one of the following
systems:
\begin{align*}
&\begin{cases} x \equiv 4 \pmod{23}, \\ x \equiv 10 \pmod{29}, \end{cases}
&&\begin{cases} x \equiv 4 \pmod{23}, \\ x \equiv -10 \pmod{29}, \end{cases} \\
&\begin{cases} x \equiv -4 \pmod{23}, \\ x \equiv 10 \pmod{29}, \end{cases}
&&\begin{cases} x \equiv -4 \pmod{23}, \\ x \equiv -10 \pmod{29}. \end{cases}
\end{align*}

One finds $x \equiv \pm 19, \pm 280 \pmod{667}$, or alternatively

\[ x \equiv 648, 280, 387, 19 \pmod{667}. \]
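The four systems can be solved mechanically with the Chinese Remainder Theorem. A minimal sketch follows; `crt_pair` is a hypothetical helper for two coprime moduli, and the modular inverse via `pow(m1, -1, m2)` assumes Python 3.8 or later.

```python
def crt_pair(r1, m1, r2, m2):
    """The x modulo m1*m2 with x ≡ r1 (mod m1), x ≡ r2 (mod m2); m1, m2 coprime."""
    t = (r2 - r1) * pow(m1, -1, m2) % m2     # choose t so that r1 + m1*t ≡ r2 (mod m2)
    return (r1 + m1 * t) % (m1 * m2)

# Combine each root modulo 23 with each root modulo 29.
roots = sorted(crt_pair(s, 23, t, 29) for s in (4, -4) for t in (10, -10))
assert roots == [19, 280, 387, 648]
assert all(x * x % 667 == 361 for x in roots)
```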



So now $x^2 \equiv a \pmod n$ is soluble if and only if the congruences

\[ x^2 \equiv a \pmod{p^{n(p)}} \]

are soluble, where $n = \prod_{p \mid n} p^{n(p)}$.

Theorem . If $p$ is odd, $p \nmid a$, and $(a/p) = 1$, then the congruence

\[ x^2 \equiv a \pmod{p^k} \tag{∗} \]

has two solutions for each positive $k$.

Proof. The set $\{x^2 : x \in \mathbb{Z}_{p^k}^{\times}\}$ consists of those $a$ in $\mathbb{Z}_{p^k}^{\times}$ such that (∗) is soluble.
For such $a$, we have $(a/p) = 1$. Thus

\[ \{x^2 : x \in \mathbb{Z}_{p^k}^{\times}\} \subseteq \{a \in \mathbb{Z}_{p^k}^{\times} : (a/p) = 1\}. \]

But we have also

\[ |\{a \in \mathbb{Z}_{p^k}^{\times} : (a/p) = 1\}| = \frac{\varphi(p^k)}{2}. \]

Indeed, this formula is correct when $k = 1$, by Theorem  on page . Moreover,
for every element $a$ of $\mathbb{Z}_p^{\times}$, exactly $p^{k-1}$ elements of $\mathbb{Z}_{p^k}^{\times}$ have the residue $a$ modulo
$p$: those elements are $a, a + p, a + 2p, \dots, a + (p^{k-1} - 1) \cdot p$. This yields the claim
for arbitrary positive $k$, since the value of $(a/p)$ depends only on the residue of $a$
modulo $p$.
Each congruence (∗) has at most 2 solutions, and therefore

\[ |\{x^2 : x \in \mathbb{Z}_{p^k}^{\times}\}| \geq \frac{\varphi(p^k)}{2}. \]

For, if $x^2 \equiv y^2 \pmod{p^k}$, then $p^k \mid (x + y)(x - y)$, but if $p$ divides both $x + y$ and
$x - y$, then $p$ divides $2x$ and therefore $x$, and similarly $p \mid y$. Assuming we have
neither of these conclusions, we have $p^k \mid x \pm y$, that is, $x \equiv \pm y \pmod{p^k}$.
Combining what we have so far yields

\[ |\{x^2 : x \in \mathbb{Z}_{p^k}^{\times}\}| = |\{a \in \mathbb{Z}_{p^k}^{\times} : (a/p) = 1\}| = \frac{\varphi(p^k)}{2}. \]

But we have also shown that the function $x \mapsto x^2$ from $\mathbb{Z}_{p^k}^{\times}$ to itself sends at
most two elements to the same element. Since $\mathbb{Z}_{p^k}^{\times}$ has just $\varphi(p^k)$ elements, the
squaring function must send exactly two elements to the same element. This just
means (∗) has exactly two solutions when $(a/p) = 1$.
In this proof, we have used a kind of pigeonhole principle: If the $\varphi(p^k)$-many
elements of $\mathbb{Z}_{p^k}^{\times}$ are pigeons, and the squares of those elements are pigeon-holes,
then there are at most two pigeons for each hole, so there are at least $\varphi(p^k)/2$-many
holes; but there are at most $\varphi(p^k)/2$-many holes, therefore there are exactly
that many, and there are two pigeons for each hole.
An alternative argument that (∗) is soluble is by induction. Suppose $b^2 \equiv a
\pmod{p^k}$ for some positive $k$. This means

\[ b^2 = a + c \cdot p^k \]

for some $c$. Then

\begin{align*}
(b + p^k \cdot y)^2 &= b^2 + 2b p^k \cdot y + p^{2k} \cdot y^2 \\
 &= a + (c + 2by) p^k + p^{2k} \cdot y^2.
\end{align*}

Therefore $(b + p^k \cdot y)^2 \equiv a \pmod{p^{k+1}} \iff c + 2by \equiv 0 \pmod p$. But the latter
congruence is soluble, since $p$ is odd.
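The inductive argument is effectively an algorithm for lifting a square root from modulus $p^k$ to modulus $p^{k+1}$. The following sketch follows the displayed computation step by step; both function names are hypothetical, and the starting root modulo $p$ is found by plain search, which assumes $(a/p) = 1$.

```python
def lift_square_root(b, a, p, k):
    """Given b*b ≡ a (mod p**k), with p odd and p ∤ a, return a root modulo p**(k+1)."""
    pk = p ** k
    c = (b * b - a) // pk                  # b^2 = a + c * p^k
    y = -c * pow(2 * b, -1, p) % p         # solves c + 2*b*y ≡ 0 (mod p)
    return (b + pk * y) % (pk * p)

def square_root_mod_prime_power(a, p, k):
    """A solution of x^2 ≡ a (mod p**k), assuming (a/p) = 1."""
    b = next(x for x in range(1, p) if (x * x - a) % p == 0)   # root modulo p, by search
    for j in range(1, k):
        b = lift_square_root(b, a, p, j)
    return b

x = square_root_mod_prime_power(2, 7, 3)
assert x * x % 7 ** 3 == 2
```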
We must finally consider powers of 2.

Theorem . Suppose $a$ is odd.

a) $x^2 \equiv a \pmod 2$ is soluble.

b) $x^2 \equiv a \pmod 4$ is soluble if and only if $a \equiv 1 \pmod 4$.

c) The following are equivalent:

(i) $x^2 \equiv a \pmod{2^{2+k}}$ is soluble for all positive $k$;
(ii) $x^2 \equiv a \pmod{2^{2+k}}$ is soluble for some positive $k$;
(iii) $x^2 \equiv a \pmod 8$ is soluble;
(iv) $a \equiv 1 \pmod 8$.

Proof. The only hard part is to show that, if $a \equiv 1 \pmod 8$, then for all positive $k$,
the congruence $x^2 \equiv a \pmod{2^{2+k}}$ is soluble. We prove this by induction. It is easily
true when $k = 1$. Suppose it is true when $k = \ell$, and in fact $b^2 \equiv a \pmod{2^{2+\ell}}$.
Then $b^2 = a + 2^{2+\ell} \cdot c$ for some $c$. Hence

\begin{align*}
(b + 2^{1+\ell} \cdot y)^2 &= b^2 + 2^{2+\ell} \cdot by + 2^{2+2\ell} \cdot y^2 \\
 &= a + 2^{2+\ell} \cdot c + 2^{2+\ell} \cdot by + 2^{2+2\ell} \cdot y^2 \\
 &= a + 2^{2+\ell} \cdot (c + by) + 2^{2+2\ell} \cdot y^2,
\end{align*}

and this is congruent to $a$ modulo $2^{3+\ell}$ if and only if $c + by \equiv 0 \pmod 2$. But
this congruence is soluble, since $b$ is odd (since $a$ is odd).



. Sums of squares

Now we shall show that, if $n$ is a natural number, then the Diophantine equation

\[ x^2 + y^2 + z^2 + w^2 = n \tag{∗} \]

is soluble. This is easy when $n$ is 1 or 2, since

\[ 1^2 + 0^2 + 0^2 + 0^2 = 1, \qquad 1^2 + 1^2 + 0^2 + 0^2 = 2. \]

We continue by showing:
) for each odd prime p, (∗) is soluble when n = mp for some m where m < p;
) for each odd prime p, (∗) is soluble when n = p;
) the set of n for which (∗) is soluble is closed under multiplication.
For the first step, the following lemma is more than enough. Note that the lemma
is nothing new when p is odd and (a/p) = 1.

Lemma. For every odd prime $p$, for every integer $a$, the congruence

\[ x^2 + y^2 \equiv a \pmod p \]

is soluble.

Proof. If $0 \leq s \leq t \leq \varpi$, then $0 \leq s + t < p$. If also $s^2 \equiv t^2 \pmod p$, then
$p \mid (t + s)(t - s)$ and hence $p \mid t - s$, so $s = t$. This shows that no two distinct
elements of the set
\[ \{x^2 : 0 \leq x \leq \varpi\} \]
are congruent to one another modulo $p$; and the same is true for the set

\[ \{a - y^2 : 0 \leq y \leq \varpi\}. \]

But each of these sets has $(p + 1)/2$ elements, so one element from one of the sets
must be congruent to an element of the other, by the pigeonhole principle.

Another way to express the lemma is that, for all odd primes $p$, there are $a$, $b$,
and $m$ such that
\[ a^2 + b^2 + 1^2 + 0^2 = a^2 + b^2 + 1 = mp. \]
We may assume $|a|$ and $|b|$ are less than $p/2$, so $a^2 + b^2 < p^2/2$, and hence $m < p$.
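The pigeonhole argument also tells us where to look for a solution: it suffices to compare the two sets of $(p + 1)/2$ values. A sketch of that search follows; the function name `two_squares_mod` is hypothetical.

```python
def two_squares_mod(a, p):
    """Find (x, y) with x^2 + y^2 ≡ a (mod p), for an odd prime p."""
    half = (p - 1) // 2
    squares = {x * x % p: x for x in range(half + 1)}    # the set {x^2 : 0 <= x <= half}
    for y in range(half + 1):
        target = (a - y * y) % p                         # an element of {a - y^2}
        if target in squares:
            return squares[target], y
    raise AssertionError("unreachable for an odd prime p, by the lemma")

for p in (3, 5, 7, 11, 13, 101):
    for a in range(p):
        x, y = two_squares_mod(a, p)
        assert (x * x + y * y - a) % p == 0
```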


Theorem  (Euler). The product of two sums of four squares is the sum of
four squares, and indeed
\begin{align*}
(a^2 + b^2 + c^2 + d^2)(x^2 + y^2 + u^2 + v^2) ={}& (ax + by + cu + dv)^2 \\
 &+ (ay - bx + cv - du)^2 \\
 &+ (au - bv - cx + dy)^2 \\
 &+ (av + bu - cy - dx)^2. \tag{†}
\end{align*}
 

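One can prove this by multiplying out either side; but there is a neater way to
proceed. In $\mathbb{C}$, if $z = x + yi$, we define
\[ \bar z = x - yi; \]
this is the conjugate of $z$. If we think of $z$ as the matrix in (§) on page  in §.,
then $\bar z$ is its transpose. Then $z \cdot \bar z = x^2 + y^2$, an element of $\mathbb{R}$. More generally,
\[ \overline{z \cdot w} = \bar w \cdot \bar z = \bar z \cdot \bar w. \]
Now we define the set $\mathbb{H}$ of quaternions as the set of matrices
\[ \begin{pmatrix} z & w \\ -\bar w & \bar z \end{pmatrix}, \tag{‡} \]
where $z$ and $w$ range over $\mathbb{C}$. Then $\mathbb{H}$ is still a ring, albeit not commutative.
Indeed, we identify $\mathbb{C}$ with its image in $\mathbb{H}$ under the map
\[ z \mapsto \begin{pmatrix} z & 0 \\ 0 & \bar z \end{pmatrix}, \]
and we define
\[ j = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
Then every element of $\mathbb{H}$ is uniquely $z + wj$ for some $z$ and $w$ in $\mathbb{C}$; moreover,
$j^2 = -1$. But $j \cdot i = -i \cdot j$, by the computation
\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \cdot \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} = -\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \cdot \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
We may write $k$ for $i \cdot j$; then every element of $\mathbb{H}$ is uniquely $x + yi + uj + vk$ for
some $x$, $y$, $u$, and $v$ in $\mathbb{R}$. If the matrix in (‡) is $\alpha$, then we define
\[ \bar\alpha = \begin{pmatrix} \bar z & -w \\ \bar w & z \end{pmatrix}, \]
which is the transpose of the matrix resulting from taking the conjugate of every
entry. Hence if also $\beta \in \mathbb{H}$, then $\overline{\beta \cdot \alpha} = \bar\alpha \cdot \bar\beta$. Moreover,
\[ \alpha \cdot \bar\alpha = z \cdot \bar z + w \cdot \bar w; \]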



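this is an element of R, so it commutes with all quaternions. If α = x + yi + uj + vk, then $\alpha \cdot \bar\alpha = x^2 + y^2 + u^2 + v^2$. We have also

\[ (\beta \cdot \alpha) \cdot \overline{(\beta \cdot \alpha)} = \beta \cdot \alpha \cdot \bar\alpha \cdot \bar\beta = \beta \cdot \bar\beta \cdot \alpha \cdot \bar\alpha, \]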

which is just Euler’s Theorem. Indeed, if β = a + bi + cj + dk, then
\begin{align*}
\beta \cdot \alpha &= \bigl((a + bi) + (c + di)j\bigr) \cdot \bigl((x + yi) + (u + vi)j\bigr) \\
&= (a + bi) \cdot (x + yi) - (c + di) \cdot (u - vi) \\
&\qquad + \bigl((a + bi) \cdot (u + vi) + (c + di) \cdot (x - yi)\bigr) j \\
&= ax - by - cu - dv \\
&\qquad + (ay + bx + cv - du) i \\
&\qquad + (au - bv + cx + dy) j \\
&\qquad + (av + bu - cy + dx) k,
\end{align*}

and therefore
\begin{align*}
(a^2 + b^2 + c^2 + d^2) \cdot (x^2 + y^2 + u^2 + v^2) ={}& (ax - by - cu - dv)^2 \\
&+ (ay + bx + cv - du)^2 \\
&+ (au - bv + cx + dy)^2 \\
&+ (av + bu - cy + dx)^2;
\end{align*}
this yields (†) when α is replaced with $\bar\alpha$ (that is, when y, u, and v are replaced with their negatives).
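As a numerical sanity check (not a proof), the product formula for β · α and the identity (†) can be tested on integer inputs. The sketch below represents a quaternion x + yi + uj + vk as the tuple (x, y, u, v); the function names are hypothetical.

```python
def qmul(beta, alpha):
    """Product β·α, with β = a+bi+cj+dk and α = x+yi+uj+vk as 4-tuples."""
    a, b, c, d = beta
    x, y, u, v = alpha
    return (a*x - b*y - c*u - d*v,
            a*y + b*x + c*v - d*u,
            a*u - b*v + c*x + d*y,
            a*v + b*u - c*y + d*x)

def norm(q):
    """Sum of the squares of the four coefficients."""
    return sum(t * t for t in q)

def dagger_terms(beta, alpha):
    """The four numbers whose squares appear on the right of (†)."""
    a, b, c, d = beta
    x, y, u, v = alpha
    return (a*x + b*y + c*u + d*v,
            a*y - b*x + c*v - d*u,
            a*u - b*v - c*x + d*y,
            a*v + b*u - c*y - d*x)
```

With i = (0, 1, 0, 0) and j = (0, 0, 1, 0), `qmul(i, j)` gives (0, 0, 0, 1), which is k, while `qmul(j, i)` gives (0, 0, 0, -1).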


Theorem  (Lagrange). Every positive integer is the sum of four squares.
Proof. By the lemma and Euler’s Theorem (Theorem ), it is now enough to show the following. Let p be a prime. Suppose m is a positive integer less than p such that
\[ a^2 + b^2 + c^2 + d^2 = mp \tag{§} \]
for some a, b, c, and d. We shall show that the same is true for some smaller positive m, unless m is already 1.
First we show that, if m is even, then we can replace it with m/2. Indeed, if $a^2 + b^2 = n$, then
\[ \Bigl(\frac{a + b}{2}\Bigr)^2 + \Bigl(\frac{a - b}{2}\Bigr)^2 = \frac{n}{2}, \]
and if n is even, then a and b have the same parity, so that (a ± b)/2 are integers. In (§) then, if m is even, then we may assume (after rearranging the four terms) that $a^2 + b^2$ and $c^2 + d^2$ are both even, so
\[ \Bigl(\frac{a + b}{2}\Bigr)^2 + \Bigl(\frac{a - b}{2}\Bigr)^2 + \Bigl(\frac{c + d}{2}\Bigr)^2 + \Bigl(\frac{c - d}{2}\Bigr)^2 = \frac{m}{2} \cdot p. \]


Henceforth we may assume m is odd. Then there are x, y, u, and v strictly between −m/2 and m/2 such that, modulo m,

\[ x \equiv a, \quad y \equiv b, \quad u \equiv c, \quad v \equiv d. \]

Then
\[ x^2 + y^2 + u^2 + v^2 \equiv 0 \pmod{m}, \]
but also $x^2 + y^2 + u^2 + v^2 < m^2$, so

\[ x^2 + y^2 + u^2 + v^2 = km \]

for some non-negative k less than m. If m > 1, then k is positive: if k were 0, then m would divide each of a, b, c, and d, so that $m^2 \mid mp$ and $m \mid p$, which is impossible when 1 < m < p. We now have

\[ (a^2 + b^2 + c^2 + d^2)(x^2 + y^2 + u^2 + v^2) = km^2 p. \]

By Euler’s Theorem, we know the left-hand side as a sum of four squares; moreover, each of the squared numbers in that sum is divisible by m:

\begin{align*}
ax + by + cu + dv &\equiv x^2 + y^2 + u^2 + v^2 \equiv 0 \pmod{m}, \\
ay - bx + cv - du &\equiv xy - yx + uv - vu = 0, \\
au - bv - cx + dy &\equiv xu - yv - ux + vy = 0, \\
av + bu - cy - dx &\equiv xv + yu - uy - vx = 0.
\end{align*}

Therefore we obtain kp as a sum of four squares. This yields the claim, as discussed above.
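The descent in this proof can be turned into a small program. The sketch below is an illustration under stated assumptions, not the text's own algorithm: the names `descend` and `four_squares_of_prime` are hypothetical, and the starting representation a² + b² + 1 = mp is found by brute-force search rather than by the pigeonhole argument of the lemma.

```python
def descend(a, b, c, d, m, p):
    """One descent step: given a^2+b^2+c^2+d^2 = m*p with 1 < m < p,
    return (a', b', c', d', m') with a'^2+...+d'^2 = m'*p and 0 < m' < m."""
    assert a*a + b*b + c*c + d*d == m * p and 1 < m < p
    if m % 2 == 0:
        # Sort by parity so that the first two and the last two entries
        # have equal parity; then a±b and c±d are even.
        a, b, c, d = sorted((a, b, c, d), key=lambda t: t % 2)
        return ((a + b) // 2, (a - b) // 2, (c + d) // 2, (c - d) // 2, m // 2)

    def centred(t):
        # Residue of t modulo the odd number m, strictly between -m/2 and m/2.
        r = t % m
        return r - m if r > m // 2 else r

    x, y, u, v = (centred(t) for t in (a, b, c, d))
    k = (x*x + y*y + u*u + v*v) // m
    # The four numbers from (†); each is divisible by m.
    terms = (a*x + b*y + c*u + d*v,
             a*y - b*x + c*v - d*u,
             a*u - b*v - c*x + d*y,
             a*v + b*u - c*y - d*x)
    return tuple(t // m for t in terms) + (k,)

def four_squares_of_prime(p):
    """Four integers whose squares sum to the prime p."""
    if p == 2:
        return (1, 1, 0, 0)
    half = (p - 1) // 2
    # Brute-force stand-in for the lemma: a^2 + b^2 + 1 ≡ 0 (mod p).
    a, b = next((a, b) for a in range(half + 1) for b in range(half + 1)
                if (a*a + b*b + 1) % p == 0)
    quad, m = [a, b, 1, 0], (a*a + b*b + 1) // p
    while m > 1:
        *quad, m = descend(*quad, m, p)
    return tuple(quad)
```

A representation of an arbitrary positive integer then follows by factorizing and combining the prime representations with (†).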



A. Foundations

A.. Construction of the natural numbers


In §. it is assumed that the set N of natural numbers exists with certain prop-
erties. We can prove this assertion by constructing N. The following is the most
direct formulation of the construction that I can come up with. The construction
is basically John von Neumann’s [] of .
Of course we shall still have to assume something. We start with the undefined
notion of a class. A class is a sort of thing that has members or elements.
Let us denote classes by boldface capital letters. If a class C has a member a,
we write
a ∈ C.
The members determine the class, in the sense that two classes with the same
members are identical. A class D includes a class C, so that C is a subclass
of D, if every member of C is a member of D. In this case, we write
C ⊆ D.
If C is a proper subclass of D (that is, a subclass that is not the whole class itself),
we write
C ⊂ D.
In ordinary language, one tends to confuse the notions of membership and inclu-
sion; but here we must keep them distinct. For our purposes, a set is a class with
two special properties:
a) it is a member of other classes;
b) its own members are sets.
Some classes are not sets; for example, the class of all sets that are not members
of themselves is not a set. This is the Russell Paradox []. We may denote
sets by plainface minuscule letters.
Let us restrict our attention to classes whose only members are sets. A class in
this sense is called transitive when it confuses the notions of membership and
inclusion to the point that it includes each of its members. Symbolically, a class
C is transitive if and only if
x ∈ y & y ∈ C =⇒ x ∈ C.
 This means our sets are hereditary sets; but we need not consider any other kind of set.
 The Burali-Forti Paradox, Theorem  below, was discovered earlier. The resolution of the
paradoxes, by distinguishing sets from classes, took some time.


A set is called an ordinal number, or just an ordinal, if it is transitive and
well-ordered by membership. The class of ordinals is denoted by
ON.
The Greek letters α, β, γ, . . . , will denote ordinals. A well-ordering is to be
understood in particular as a strict ordering, so that α ∉ α.
Lemma. ON is transitive, that is, every element of an ordinal is an ordinal.
Also every ordinal properly includes its elements.
Proof. Suppose α ∈ ON and b ∈ α. Then b ⊆ α by transitivity of α, so b, like
α, is well-ordered by membership. Suppose c ∈ b and d ∈ c. Then c ∈ α, so
c ⊆ α, and hence d ∈ α. Since d ∈ c and c ∈ b, and all are elements of α, where
membership is a transitive relation, we have d ∈ b. Thus b is transitive. Now we
know b is an ordinal. Therefore α ⊆ ON. So ON is transitive. Finally, b ⊂ α
simply because membership is a strict ordering of α.
Lemma. Every ordinal contains every ordinal that it properly includes.
Proof. Suppose β ⊂ α. Then α ∖ β contains some γ. Then β ⊆ γ; indeed, if
δ ∈ β, then, since γ ∉ β, we have γ ∉ δ and γ ≠ δ, so δ ∈ γ. We show that,
if γ is the least member of α ∖ β, then γ = β. Suppose β ⊂ γ. Then γ ∖ β
contains some δ. In particular, δ ∈ α ∖ β. By the last lemma, δ ⊂ γ, so γ ∉ δ.
In particular, γ was not the least element of α ∖ β.
Theorem  (Burali-Forti Paradox []). ON is transitive and well-ordered by
membership; so it is not a set.
Proof. By the next-to-last lemma, ON is transitive. Now let α and β be two
ordinals such that β ∉ α. We prove α ⊆ β, so that either α = β or α ∈ β by the
last lemma. If not, then α ∖ β has a least element, γ. This means every element
of γ is an element of β; that is, γ ⊆ β. But γ ≠ β (since β ∉ α), so γ ∈ β by the
lemma, contrary to assumption.
If a is a set of ordinals with an element β, then the least element of a is the
least element of a ∩ β, if this set is nonempty; otherwise it is β. Thus ON is
well-ordered by membership. In particular, it cannot contain itself; so it must
not be a set.
Since, on ON and hence on every ordinal, the relations of membership and
proper inclusion are the same, they can both be denoted by <. However, we have
not yet established that there are any ordinals, or even any sets at all.
We take it for granted that there is an empty set, which is generally denoted
by ∅, but which, in the present context, we denote by
0.

We also assume that if x and y are sets, then so is the class whose members are
just y and the members of x; this is the class—now a set—denoted by

x ∪ {y}.

We are interested mainly in the set x ∪ {x}, which we denote by

x′ .

The following is easy to show.

Theorem . ON contains 0 and is closed under the operation x ↦ x′.
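The first few ordinals obtained from 0 by the operation x ↦ x′ can be modelled with Python's immutable sets. This is only a finite illustration of the construction; the names `ZERO`, `succ`, and `is_transitive` are hypothetical.

```python
ZERO = frozenset()          # the empty set, playing the role of 0

def succ(x):
    """The successor x' = x ∪ {x}."""
    return x | frozenset([x])

def is_transitive(x):
    """A set is transitive when each of its members is also a subset."""
    return all(y <= x for y in x)

ONE = succ(ZERO)            # {0}
TWO = succ(ONE)             # {0, 1}
THREE = succ(TWO)           # {0, 1, 2}
```

In this model, membership and proper inclusion agree on the sets built so far: `TWO in THREE` and `TWO < THREE` both hold.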

An ordinal is called a limit if it is neither 0 nor α′ for any α. The class of
ordinals that neither are limits nor contain limits is denoted by

ω.

Theorem  (Dedekind ). The class ω satisfies the Peano Axioms when 0 is
considered as the first element of ω, and α′ is the successor of α.

Proof. We must show three things.

1. Since α ∈ α′, we have 0 ≠ α′.
2. If α and β are distinct ordinals, then we may assume α ∈ β, so that β ∉ α
and β ≠ α, and therefore β ∉ α′; but β ∈ β′, so α′ ≠ β′.
3. Suppose C ⊂ ω, that is, C is a proper subclass of ω. Then ω ∖ C has a least
element α. Either α = 0 or else α = β′ for some β, which must be in C. Hence C
either does not contain 0 or else is not closed under succession.
 The following is a remark on English grammar. One could say, ‘The class of ordinals that
neither are nor contain limits is denoted by ω’; but this would violate the principles laid
down by Fowler in his Modern English Usage [, Cases] of  and reaffirmed by Gowers
in the second edition [] of . In the original sentence, namely ‘The class of ordinals
that neither are limits nor contain limits is denoted by ω’, the second instance of limits is
the direct object of contain, so it is notionally in the ‘objective case’; but the first instance of
limits is not an object of are (which does not take objects), but is in the ‘subjective case’,
like the subject, that, of the relative clause. On similar grounds, the common expression ‘x
is less than or equal to y’ is objectionable, unless than, like to, is construed as a preposition.
However, allowing than to be used as a preposition can cause ambiguity: does ‘She likes tea
better than me’ mean ‘She likes tea better than she likes me’, or ‘She likes tea better than
I do’ ? Therefore it is recommended in [, Than ] and (less strongly) in [] that than not
be used as a preposition. On these grounds, ‘x ≤ y’ should be read as ‘x is less than y or [x
is] equal to y.’ But I don’t believe anybody does so, and this itself is grounds for rethinking
Fowler’s grammatical distinctions.
 Dedekind recognized that the natural numbers have the properties given by this theorem,
and that all structures with these properties are isomorphic [, II: §§ , ].

A.. Construction of the natural numbers 


We have so far not assumed that ω is a set. If it is a set, then it is in ON; if
it is not a set, then it must be ON. As far as number theory is concerned, there
does not appear to be any need to make a decision one way or other; but it seems
to be customary to consider ω and its subclasses as sets. If ω is a set, it is the
least of the transfinite ordinals.
We denote {0} by 1. Then ω ∖ {0} also satisfies the Peano Axioms, when
1 is considered as the first element. You just have to decide whether to begin
the natural numbers with 0 or 1; but if you start with 0, you should adjust the
definitions of addition, multiplication, exponentiation, and factorial accordingly,
so that

\[ m + 0 = m, \quad m \cdot 0 = 0, \quad m^0 = 1, \quad 0! = 1; \]

also you should note that the ordering of ω satisfies

\[ m \le n \iff \exists x \; m + x = n. \]

A.. Why it matters


Some teachers and texts give the impression that the properties of the natu-
ral numbers can be derived from a single principle, such as the so-called Well-
Ordering Principle. My excellent high school teacher Mr Brown did this. Burton
does this, writing for example, ‘With the Well-Ordering Principle available, it is
an easy matter to derive the First Principle of Finite Induction’ [, p. ]. The
principle of induction here is that a set containing the first natural number, and
containing the successor of each natural number that it contains, contains all of
the natural numbers. One needs more than well-ordering to prove this, since
every ordinal number is well-ordered, but the ordinals like ω′ that are greater
than ω do not admit induction in the sense referred to. Indeed, ω is distin-
guished among all of the transfinite ordinals as the one that admits induction in
the present sense.
Burton is saved from true inconsistency by having written on the previous page,
We shall make no attempt to construct the integers axiomatically, assuming
instead that they are already given and that any reader of this book is familiar
with many elementary facts about them.

Burton’s proof of induction relies on some of these unspecified ‘elementary facts’.


On the other hand, the needed ‘facts’ are so simple that it seems dishonest not
to name them. Burton could have axiomatized the set of natural numbers by the
requirements:
1) it is well ordered,
2) there is no greatest element,
3) every element after the first is a successor.
Like the Peano Axioms, these conditions determine the set up to isomorphism.
By referring, in the passage quoted above, to an ‘attempt to construct the
integers axiomatically’, Burton confuses two approaches to the natural numbers:
1) assuming they exist so as to satisfy the Peano Axioms, as we did in §.;
2) constructing them as ω, as we did in the last section.
The construction of ω is perhaps too specialized for a number theory course.
However, I will suggest that every mathematician should know the Peano Axioms
and know that they determine the natural numbers up to isomorphism. It might
prevent certain infelicities and mistakes, such as can be found, for example, in
Burton.
Before proving induction as Theorem ., Burton proves the ‘Archimedean
property’ as Theorem .. Before stating this theorem, he says,
Because this principle [of well-ordering] plays a crucial role in the proofs here
and in subsequent chapters, let us use it to show that the set of positive integers
has what is known as the Archimedean property.

This comment does not clarify why the Archimedean property should be proved.
Will it be needed later, or is it just a warming-up example of the use of well-ordering?
Burton’s ‘Second Principle of Finite Induction’ is that a set S contains all
positive integers if
a) S contains 1, and
b) S contains k + 1 when it contains 1, . . . , k.
This statement may be useful for the writer in a hurry. Such a writer may attempt
a proof by the ‘First Principle’ of induction, only to find that the weaker inductive
hypothesis there, namely k ∈ S, is not enough. Then the writer can just assume
that 1, . . . , k are all in S. But it would be better to go back and erase the proof
that 1 ∈ S, then prove k ∈ S on the assumption that 1, . . . , k − 1 are in S,
using what I have called ‘Strong Induction’ (Theorem ). In case k = 1, one has
proved 1 ∈ S; this need not be treated separately.
Burton says presently, ‘Mathematical induction is often used as a method of
definition as well as a method of proof.’ This is a misconception that Peano
shared, but that Landau identified in his Foundations of Analysis []. Definition
by induction should be called something else, like definition by recursion, because
it is logically stronger than proof by induction, as noted in §..

A.. Why it matters 


B. Some theorems without their proofs
I state some theorems, without giving proofs; some of them are recent and reflect
ongoing research:

Theorem  (Dirichlet). If gcd(a, b) = 1, and b > 0, then {a + bn : n ∈ N}
contains infinitely many primes.

That is, in an arithmetic progression whose initial term is prime to the common
difference, there are infinitely many primes. It is moreover possible to find
arbitrarily long arithmetic progressions consisting entirely of primes:

Theorem  (Ben Green and Terence Tao,  []). For every n, there are
a and b such that each of the numbers a, a + b, a + 2b, . . . , a + nb is prime (and
b > 0).
Is it possible that each of the numbers

a, a + b, a + 2b, a + 3b, . . .

is prime? Yes, if b = 0. What if b > 0? Then No, since a | a + ab. But what if
a = 1? Then replace a with a + b.
Two primes p and q are twin primes if |p − q| = 2. The list of all primes
begins:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, . . .
and there are several twins. Are there infinitely many? People think so, but
cannot prove it. We do have:

Theorem  (Goldston, Pintz, Yıldırım,  []). For every positive real
number ε, there are primes p and q such that 0 < q − p < ε · log p.

The logarithm function also appears in the much older

Theorem  (Prime Number Theorem). Let π(n) be the number of primes p
such that p ≤ n. Then
\[ \lim_{n \to \infty} \frac{\pi(n)}{n / \log n} = 1. \]
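The ratio in this theorem can be tabulated for moderate n. The sketch below counts primes with a sieve of Eratosthenes; the function names are hypothetical, and the computation only illustrates the statement, since the convergence to 1 is slow.

```python
from math import isqrt, log

def prime_count(n):
    """π(n): the number of primes p with p <= n, by the sieve of Eratosthenes."""
    if n < 2:
        return 0
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i*i::i] = bytearray(len(range(i*i, n + 1, i)))
    return sum(sieve)

def pnt_ratio(n):
    """The quotient π(n) / (n / log n), with log the natural logarithm."""
    return prime_count(n) / (n / log(n))
```

For example, `prime_count(1000)` is 168, so `pnt_ratio(1000)` is 168 divided by 1000/log 1000.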

 This theorem is not mentioned in Burton [].


C. Exercises
In the following exercises, if a statement is given that is not a definition, then the
exercise is to prove the statement. Minuscule letters range over Z, or sometimes
just over N; letters p, pi , and q range over the prime numbers.
Many of these exercises are inspired by exercises in [, Ch. ].
Exercise . Prove the unproved propositions in Chapter .
Exercise . An integer n is a triangular number if and only if 8n + 1 is a
square number. Solution: If n is triangular, then n = k(k + 1)/2 for some k,
and then $8n + 1 = 4k^2 + 4k + 1 = (2k + 1)^2$. Conversely, if 8n + 1 is square,
then, since this number is also odd, the square is $(2k + 1)^2$ for some k, and then
$n = \bigl((2k + 1)^2 - 1\bigr)/8 = k(k + 1)/2$, a triangular number.
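The criterion gives a direct test for triangularity. A minimal sketch follows, in which the function name is hypothetical and 0 is counted as triangular (the case k = 0), which is an assumption about the convention in use.

```python
from math import isqrt

def is_triangular(n):
    """True when 8n + 1 is a perfect square, that is, n = k(k + 1)/2 for some k >= 0."""
    if n < 0:
        return False
    s = isqrt(8 * n + 1)
    return s * s == 8 * n + 1
```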
Exercise .
a) If n is triangular, then so is 9n + 1.
b) Find infinitely many pairs (k, ℓ) such that, if n is triangular, then so is
kn + ℓ.
Exercise . If a = n(n + 3)/2, then $t_a + t_{n+1} = t_{a+1}$.
Exercise . The pentagonal numbers are 1, 5, 12, . . . : call these $p_1$, $p_2$, &c.
a) Give a recursive definition of these numbers.
b) Find a closed expression for $p_n$ (that is, an expression not involving $p_{n-1}$,
$p_{n-2}$, &c.).
c) Find such an expression involving triangular numbers and square numbers.
Exercise . Given a positive modulus n and an integer a, find a formula for the
unique residue in {a, . . . , a + n − 1} of an arbitrary integer x. (Gauss does this
in the Disquisitiones Arithmeticae.)
Exercise . Show that every cube is congruent to 0 or ±1 modulo 7.
Exercise .
a) $7 \mid 2^{3n} + 6$.
b) Given a in Z and k in N, find integers b and c such that $b \mid a^{kn} + c$ for all
n in N.
Exercise . gcd(a, a + 1) = 1.
Exercise . $(k!)^n \mid (kn)!$ for all k and n in N.


Exercise . If a and b are co-prime, and a and c are co-prime, then a and bc
are co-prime.

Exercise . Let gcd(204, 391) = n.


a) Compute n.
b) Find a solution of 204x + 391y = n.

Exercise . Let gcd(a, b) = n.


a) If k | ℓ and ℓ | 2k, then |ℓ| ∈ {|k|, |2k|}.
b) Show gcd(a + b, a − b) ∈ {n, 2n}.
c) Find an example for each possibility.
d) gcd(2a + 3b, 3a + 4b) = n.
e) Solve gcd(ax + by, az + bw) = n.

Exercise . gcd(a, b) | lcm(a, b).

Exercise . When are gcd(a, b) and lcm(a, b) the same?

Exercise . The binary operation (x, y) ↦ gcd(x, y) on N is commutative and
associative.

Exercise . The co-prime relation on N, namely

{(x, y) ∈ N × N : gcd(x, y) = 1}

—is it reflexive? irreflexive? symmetric? anti-symmetric? transitive?

Exercise . Give complete solutions, or show that they do not exist, for:
a) 14x − 56y = 34;
b) 10x + 11y = 12.

Exercise . I have some -TL pieces and some - and -Kr pieces:  coins
in all. They make  TL. How many coins of each denomination have I got?

Exercise . p ≡ ±1 (mod 6) if p > 3. (This exercise is used in Exercise .)

Exercise . If p ≡ 1 (mod 3) then p ≡ 1 (mod 6).

Exercise . If n ≡ 2 (mod 3), then n has a factor p such that p ≡ 2 (mod 3).

Exercise . Find all primes of the form $n^3 - 1$.

Exercise . Find all p such that 3p + 1 is square.

Exercise . Find all p such that $p^2 + 2$ is prime.

Exercise . $n^4 + 4$ is composite unless n = ±1.

Exercise . If n is positive, then $8^n + 1$ is composite.
Exercise . Find all integers n such that the equation

\[ x^2 = ny^2 \]

has only the zero solution. Prove your findings.


Exercise . If $p_1 < \cdots < p_n$, prove that the sum
\[ \frac{1}{p_1} + \cdots + \frac{1}{p_n} \]
is not an integer.
Exercise . Prove that the following are equivalent:
a) Every even integer greater than 2 is the sum of two primes.
b) Every integer greater than 5 is the sum of three primes.
Exercise . Infinitely many primes are congruent to −1 modulo 6.
Exercise . With $\vartheta(x) = \sum_{p \le x} \log p$ as in §., and defining

\[ \psi(x) = \sum_{k=1}^{\infty} \vartheta(\sqrt[k]{x}), \]

show
\[ \log [x]! = \sum_{j=1}^{\infty} \psi\Bigl(\frac{x}{j}\Bigr). \]

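Exercise . Define the Mangoldt function, Λ, by

\[ \Lambda(n) = \begin{cases} \log p, & \text{if } n = p^m \text{ for some positive } m; \\ 0, & \text{otherwise.} \end{cases} \]

a) $\log k = \sum_{d \mid k} \Lambda(d)$ (as in Exercise ).
b) $\log(n!) = \sum_{j=1}^{n} \Lambda(j) [n/j]$.
c) Now give another proof of Theorem , that
\[ \log(n!) = \sum_{p \le n} \log p \sum_{j=1}^{\infty} \Bigl[\frac{n}{p^j}\Bigr]. \]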
Exercise . Prove that $\sum_p 1/p$ diverges.
Exercise . Find all n such that


a) n! is square;
b) n! + (n + 1)! + (n + 2)! is square.
Exercise . Determine whether $a^2 \equiv b^2 \pmod{n} \implies a \equiv b \pmod{n}$.
Exercise . Compute $\sum_{k=1}^{1001} k^{365} \pmod{5}$.
Exercise . $39 \mid 53^{103} + 103^{53}$.
Exercise . Solve $6^{n+2} + 7^{2n+1} \equiv x \pmod{43}$.
Exercise . Determine whether a ≡ b (mod n) =⇒ ca ≡ cb (mod n).
Exercise . Solve the system

x ≡ 1 (mod 17),

x ≡ 8 (mod 19),


x ≡ 16 (mod 21).

Exercise . The system


(
x ≡ a (mod n)
x ≡ b (mod m)

has a solution if and only if gcd(n, m) | b − a.


Exercise . In the proof of Theorem  (p. ), how do we use that n is even?
Exercise . Show that every even perfect number is a triangular number.
Exercise .
a) If a ≡ b (mod 6), show $2^a \equiv 2^b \pmod{9}$.
b) Show that n ≡ 1 (mod 9) for every even perfect number n other than 6.
(See Exercise .)
Exercise . Find every positive integer that is equal to the product of its
proper divisors. (See Exercise .)
Exercise . Compute $16^{200}$ modulo 19.
Exercise . If p ≠ q, and gcd(a, pq) = 1, and n = lcm(p − 1, q − 1), show

\[ a^n \equiv 1 \pmod{pq}. \]

Exercise . Prove Theorem  (p. ).

Exercise . Show $a^{13} \equiv a \pmod{70}$.
Exercise . Assuming gcd(a, p) = 1, and $0 \le n < p$, solve the congruence

\[ a^n x \equiv b \pmod{p}. \]

Exercise . Solve $2^{14} x \equiv 3 \pmod{23}$.

Exercise . Show $\sum_{k=1}^{p-1} k^p \equiv 0 \pmod{p}$.

Exercise . We can write the congruence $2^p \equiv 2 \pmod{p}$ as

\[ 2^p - 1 \equiv 1 \pmod{p}. \]

Show that, if $n \mid 2^p - 1$, then n ≡ 1 (mod p). (Suggestion: Do this first if n is a
prime q. Then $2^{q-1} \equiv 1 \pmod{q}$. If q ≢ 1 (mod p), then gcd(p, q − 1) = 1, so
pa + (q − 1)b = 1 for some a and b. Now look at $2^{pa} \cdot 2^{(q-1)b}$ modulo n.)

Exercise . Let $F_n = 2^{2^n} + 1$. (Then $F_0$, . . . , $F_4$ are primes.) Show

\[ 2^{F_n} \equiv 2 \pmod{F_n}. \]

Exercise . Show that 1105, 2821, and 15841 are Carmichael numbers. Solu-
tion: First, factorize: 1105 = 5 · 13 · 17, 2821 = 7 · 13 · 31, and 15841 = 7 · 31 · 73.
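The claim can also be confirmed by direct computation. The sketch below takes a Carmichael number to be a composite n such that aⁿ ≡ a (mod n) for every a; this formulation is an assumption and may be worded differently from the definition given earlier in the text. The function names are hypothetical.

```python
from math import isqrt

def is_composite(n):
    """True when n > 3 has a divisor d with 1 < d <= sqrt(n)."""
    return n > 3 and any(n % d == 0 for d in range(2, isqrt(n) + 1))

def is_carmichael(n):
    """True when n is composite and a^n ≡ a (mod n) for all residues a."""
    return is_composite(n) and all(pow(a, n, n) == a for a in range(n))
```

The brute-force loop over all residues is adequate for numbers of this size; the factorizations in the solution lead to a much shorter argument.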
Exercise . Assuming p is an odd prime:
a) (p − 1)! ≡ p − 1 (mod 1 + 2 + · · · + (p − 1));
b) $1 \cdot 3 \cdots (p - 2) \equiv (-1)^{(p-1)/2} \cdot (p - 1) \cdot (p - 3) \cdots 2 \pmod{p}$;
c) $1 \cdot 3 \cdots (p - 2) \equiv (-1)^{(p-1)/2} \cdot 2 \cdot 4 \cdots (p - 1) \pmod{p}$;
d) $1^2 \cdot 3^2 \cdots (p - 2)^2 \equiv (-1)^{(p+1)/2} \pmod{p}$.

Exercise . $\tau(n) \le 2\sqrt{n}$.

Exercise . $\prod_{d \mid n} d = n^{\tau(n)/2}$. (See Exercise .)

Exercise . τ(n) is odd if and only if n is square.

Exercise . Assuming n is odd: σ(n) is odd if and only if n is square.

Exercise . $\sum_{d \mid n} \frac{1}{d} = \frac{\sigma(n)}{n}$.

Exercise . {n : τ(n) = k} is infinite (when k > 1), but {n : σ(n) = k} is finite.
 Carmichael did this in  [].


Exercise . Let m ∈ Z. The number-theoretic function $n \mapsto n^m$ is multiplicative.
Exercise . Let ω(n) be the number of distinct prime divisors of n, and let m
be a non-zero integer. Then $n \mapsto m^{\omega(n)}$ is multiplicative.
Exercise . Prove the other half of the Möbius Inversion Theorem (Theorem 
on page ): if F and f are arithmetic functions such that
\[ f(n) = \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) \cdot F(d), \]
then
\[ F(n) = \sum_{d \mid n} f(d). \]

Exercise . Let Λ be the Mangoldt function defined in Exercise .
a) $\log n = \sum_{d \mid n} \Lambda(d)$.
b) $\Lambda(n) = \sum_{d \mid n} \mu(n/d) \log d$.
c) $\Lambda(n) = -\sum_{d \mid n} \mu(d) \log d$.
Exercise . $\prod_{p \mid n} (1 - p) = \sum_{d \mid n} \mu(d) \cdot d$.
Exercise . If f is multiplicative and non-zero, then
\[ \sum_{d \mid n} \mu(d) \cdot f(d) = \prod_{p \mid n} (1 - f(p)). \]

Exercise . If ω is as in Exercise , then $\sum_{d \mid n} \mu(d) \cdot \tau(d) = (-1)^{\omega(n)}$.

Exercise . f (568) = f (638) when f ∈ {τ, σ, ϕ}.


Exercise . Solve:
a) n = 2ϕ(n).
b) ϕ(n) = ϕ(2n).
c) ϕ(n) = 12. (Do this without a table. There are  solutions.)
Exercise . Find a sequence $(a_n : n \in N)$ of positive integers such that

\[ \lim_{n \to \infty} \frac{\varphi(a_n)}{a_n} = 0. \]

(If you assume that there is an answer to this problem, then it is not hard to
see what the answer must be. To actually prove that the answer is correct, recall
that, formally,
\[ \sum_n \frac{1}{n} = \prod_p \frac{1}{1 - \frac{1}{p}}, \]
so $\lim_{n \to \infty} \prod_{k=1}^{n} \frac{1}{1 - 1/p_k} = \infty$ if $(p_k : k \in N)$ is the list of primes.)
Exercise . Prove $\sum_{d \mid n} \mu(d) \varphi(d) = \prod_{p \mid n} (2 - p)$. (This is a special case of Exercise .)
Exercise . If n is squarefree, and k > 0, show
\[ \sum_{d \mid n} \sigma(d^k) \varphi(d) = n^{k+1}. \]

Exercise . $\sum_{d \mid n} \sigma(d) \varphi(n/d) = n \tau(n)$.
Exercise . $\sum_{d \mid n} \tau(d) \varphi(n/d) = \sigma(n)$.

Exercise . a) Show $a^{100} \equiv 1 \pmod{1000}$ if gcd(a, 1000) = 1.
b) Find n such that $n^{101} \not\equiv n \pmod{1000}$.
Exercise . a) Show $a^{24} \equiv 1 \pmod{35}$ if gcd(a, 35) = 1.
b) Show $a^{13} \equiv a \pmod{35}$ for all a.
c) Is there n such that $n^{25} \not\equiv n \pmod{35}$?
Exercise . If gcd(m, n) = 1, show mϕ(n) ≡ nϕ(m) (mod mn).
Exercise . If n is odd, and is not a prime power, and if gcd(a, n) = 1, show
$a^{\varphi(n)/2} \equiv 1 \pmod{n}$. (This generalizes Exercise (b).)
Exercise . Solve $5^{10000} x \equiv 1 \pmod{153}$.
Exercise . We have $(\pm 3)^2 \equiv 2 \pmod{7}$. Compute the orders of 2, 3, and −3,
modulo 7.
Exercise . Suppose $\mathrm{ord}_n(a) = k$, and $b^2 \equiv a \pmod{n}$.
a) Show that $\mathrm{ord}_n(b) \in \{k, 2k\}$.
b) Find an example for each possibility of $\mathrm{ord}_n(b)$.
c) Find a condition on k such that $\mathrm{ord}_n(b) = 2k$.


Exercise . This is about 23:
a) Find a primitive root of least absolute value.
b) How many primitive roots are there?
c) Find these primitive roots as powers of the root found in (a).
d) Find these primitive roots as elements of [−11, 11].
Exercise . Assuming $\mathrm{ord}_p(a) = 3$, show:
a) $a^2 + a + 1 \equiv 0 \pmod{p}$;
b) $(a + 1)^2 \equiv a \pmod{p}$;
c) $\mathrm{ord}_p(a + 1) = 6$.
Exercise . Find all elements of [−30, 30] having order 4 modulo 61.
Exercise . f(x) ≡ 0 (mod n) may have more than deg(f) solutions:
a) Find four solutions to $x^2 - 1 \equiv 0 \pmod{35}$.
b) Find conditions on a such that the congruence $x^2 - a^2 \equiv 0 \pmod{35}$ has
four distinct solutions, and find these solutions.
c) If p and q are odd primes, find conditions on a such that the congruence
$x^2 - a^2 \equiv 0 \pmod{pq}$ has four distinct solutions, and find these solutions.
Exercise . If $\mathrm{ord}_n(a) = n - 1$, then n is prime.
Exercise . If a > 1, show $n \mid \varphi(a^n - 1)$.
Exercise . If 2 ∤ p and $p \mid n^2 + 1$, show p ≡ 1 (mod 4).
Exercise .
a) Find conditions on p such that, if r is a primitive root of p, then so is −r.
b) If p does not meet these conditions, then what is $\mathrm{ord}_p(-r)$?
Exercise . For $(Z/(17))^{\times}$:
a) construct a table of logarithms using 5 as the base;
b) using this (or some other table, with a different base), solve:
(i) $x^{15} \equiv 14 \pmod{17}$;
(ii) $x^{4095} \equiv 14 \pmod{17}$;
(iii) $x^4 \equiv 4 \pmod{17}$;
(iv) $11x^4 \equiv 7 \pmod{17}$.
Exercise . If n has primitive roots r and s, and gcd(a, n) = 1, prove
\[ \log_s a \equiv \frac{\log_r a}{\log_r s} \pmod{\varphi(n)}. \]

Exercise . In (Z/(337))× , for any base, show

log(−a) ≡ log a + 168 (mod 336).

Exercise . Solve $4^x \equiv 13 \pmod{17}$.
Exercise . a) If $\mathrm{ord}_r(a)$ and $\mathrm{ord}_r(b)$ are relatively prime, show

\[ \mathrm{ord}_r(ab) = \mathrm{lcm}(\mathrm{ord}_r(a), \mathrm{ord}_r(b)). \]

b) Show that this may fail if $\mathrm{ord}_r(a)$ and $\mathrm{ord}_r(b)$ are not relatively prime.
Exercise . How many primitive roots has 22? Find them.
Exercise . Find a primitive root of 1250.
Exercise . Define the function λ by the rules
\[ \lambda(2^k) = \begin{cases} \varphi(2^k), & \text{if } 0 < k < 3; \\ \varphi(2^k)/2, & \text{if } k \ge 3; \end{cases} \]
\[ \lambda(2^k \cdot p_1^{\ell(1)} \cdots p_m^{\ell(m)}) = \mathrm{lcm}\bigl(\lambda(2^k), \varphi(p_1^{\ell(1)}), \dots, \varphi(p_m^{\ell(m)})\bigr), \]

where the $p_i$ are distinct odd primes.

a) Prove that, if gcd(a, n) = 1, then $a^{\lambda(n)} \equiv 1 \pmod{n}$.
b) Using this, show that, if n is not 2 or 4 or an odd prime power or twice an
odd prime power, then n has no primitive root.
Exercise . Solve the following quadratic congruences.
a) $8x^2 + 3x + 12 \equiv 0 \pmod{17}$;
b) $14x^2 + x - 7 \equiv 0 \pmod{29}$;
c) $x^2 - x - 17 \equiv 0 \pmod{23}$;
d) $x^2 - x + 17 \equiv 0 \pmod{23}$.
Exercise . The Law of Quadratic Reciprocity makes it easy to compute
many Legendre symbols, but this law is not always needed. Compute (n/17) and
(m/19) for as many n in {1, 2, . . . , 16} and m in {1, 2, . . . , 18} as you can, using
only that, whenever p is an odd prime, and a and b are prime to p, then:
• a ≡ b (mod p) ⟹ (a/p) = (b/p);
• (1/p) = 1;
• $(-1/p) = (-1)^{(p-1)/2}$;
• $(a^2/p) = 1$;
• $(2/p) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod{8}; \\ -1, & \text{if } p \equiv \pm 3 \pmod{8}. \end{cases}$
 Carmichael defined this function in  [].


Exercise . Compute all of the Legendre symbols (n/17) and (m/19) by
means of Gauss’s Lemma.
Exercise . Find all primes of the form $5 \cdot 2^n + 1$ that have 2 as a primitive
root.
Exercise . For every prime p, show that there is an integer n such that

\[ p \mid (3 - n^2)(7 - n^2)(21 - n^2). \]

Exercise .
a) If $a^n - 1$ is prime, show that a = 2 and n is prime.
b) Primes of the form $2^p - 1$ are called Mersenne primes. Examples are
3, 7, and 31. Show that, if p ≡ 3 (mod 4), and 2p + 1 is a prime q, then
$q \mid 2^p - 1$, and therefore $2^p - 1$ is not prime. (Hint: Compute (2/q).)
Exercise . Assuming p is an odd prime, and 2p + 1 is a prime q, show that
−4 is a primitive root of q. (Hint: Show $\mathrm{ord}_q(-4) \notin \{1, 2, p\}$.)
Exercise . Compute the Legendre symbols (91/167) and (111/941).
Exercise . Find (5/p) in terms of the class of p modulo 5.
Exercise . Find (7/p) in terms of the class of p modulo 28.
Exercise . The nth Fermat number, or $F_n$, is $2^{2^n} + 1$. A Fermat prime
is a Fermat number that is prime.
a) Show that every prime number of the form $2^m + 1$ is a Fermat prime.
b) Show $4^k \equiv 4 \pmod{12}$ for all positive k.
c) If p is a Fermat prime, show (3/p) = −1.
d) Show that 3 is a primitive root of every Fermat prime.
e) Find a prime p less than 100 such that (3/p) = −1, but 3 is not a primitive
root of p.
Exercise . Solve the congruence $x^2 \equiv 11 \pmod{35}$.
Exercise . We have so far defined the Legendre symbol (a/p) only when
p ∤ a; but if p | a, then we can define (a/p) = 0. We can now define (a/n)
for arbitrary a and arbitrary odd n: the result is the Jacobi symbol, and the
definition is
\[ \Bigl(\frac{a}{n}\Bigr) = \prod_p \Bigl(\frac{a}{p}\Bigr)^{k(p)}, \quad \text{where } n = \prod_p p^{k(p)}. \]

a) Prove that the function x ↦ (x/n) on Z is completely multiplicative
in the sense that (ab/n) = (a/n) · (b/n) for all a and b (not necessarily
co-prime).
b) If gcd(a, n) = 1, and the congruence $x^2 \equiv a \pmod{n}$ is soluble, show
(a/n) = 1.
c) Find an example where (a/n) = 1, and gcd(a, n) = 1, but $x^2 \equiv a \pmod{n}$
is insoluble.
d) If m and n are co-prime, show
\[ \Bigl(\frac{m}{n}\Bigr) \cdot \Bigl(\frac{n}{m}\Bigr) = (-1)^k, \quad \text{where } k = \frac{m - 1}{2} \cdot \frac{n - 1}{2}. \]


D. – examinations

In the following examinations, the set of natural numbers is {0, 1, 2, . . . } or ω,
while (as usual) N = ω ∖ {0} = {1, 2, 3, . . . }.

D.. In-term examination


The exam lasts  minutes. All answers must be justified to the reader.

Problem .. For all natural numbers k and integers n, prove

k! | n · (n + 1) · · · (n + k − 1).

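Solution.
\[ \frac{n \cdot (n + 1) \cdots (n + k - 1)}{k!} = \begin{cases} \dbinom{n + k - 1}{k}, & \text{if } n > 0; \\ 0, & \text{if } n \le 0 < n + k; \\ (-1)^k \cdot \dbinom{-n}{k}, & \text{if } n + k \le 0. \end{cases} \]

Remark. Every binomial coefficient $\binom{j}{i}$ is an integer for the reason implied by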
its name: it is one of the coefficients in the expansion of (x + y)j . (It is pretty
obvious that those coefficients in this expansion must be integers, but one can
prove it by induction on j.)
Remark. In the set {n, n + 1, . . . , n + k − 1}, one of the elements is divisible by
k, one by k − 1, one by k − 2, and so forth. This observation is not enough to
solve the problem, since for example, in the set {3, 4, 5}, one of the elements is
divisible by 4, one by 3, and one by 2, but 4! ∤ 3 · 4 · 5.
Remark. For similar reasons, proving the claim by induction is difficult. It is
therefore not recommended. However, one way to proceed is as follows. The claim
is trivially true (for all n) when k = 0, since 0! = 1, which divides everything.
(When k = 0, then the product n · (n + 1) · · · (n + k − 1) is the ‘empty product’, so
it should be understood as the neutral element for multiplication, namely 1.) As
a first inductive hypothesis, we suppose the claim is true (for all n) when k = ℓ.
We want to show
(ℓ + 1)! | n · (n + 1) · · · (n + ℓ) (∗)


for all n. We first prove it when n ≥ −ℓ by entering a second induction. The
relation (∗) is true when n = −ℓ, since then n · (n + 1) · · · (n + ℓ) = 0. As a second
inductive hypothesis, we suppose the relation is true when n = m, so that
(ℓ + 1)! | m · (m + 1) · · · (m + ℓ). (†)
By the first inductive hypothesis, we have
ℓ! | (m + 1) · · · (m + ℓ).
Since also ℓ + 1 | m + ℓ + 1 − m, we have
(ℓ + 1)! | (m + 1) · · · (m + ℓ)(m + ℓ + 1 − m).
Distributing, we have
(ℓ + 1)! | (m + 1) · · · (m + ℓ)(m + ℓ + 1) − m · (m + 1) · · · (m + ℓ).
By the second inductive hypothesis, (†), we conclude
(ℓ + 1)! | (m + 1) · · · (m + ℓ)(m + ℓ + 1).
So the second induction is complete, and (∗) holds when n ≥ −ℓ. It therefore
holds for all n, since
n · (n + 1) · · · (n + ℓ) = (−1)^(ℓ+1) · (−n − ℓ) · (−n − ℓ + 1) · · · (−n).
Hence the first induction is now complete.
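
The divisibility claim can also be spot-checked by machine. The following Python sketch is an illustration only, not a proof; the function names rising_product and claim_holds are introduced here and do not appear in the notes, and the ranges tested are an arbitrary choice.

```python
from math import factorial

def rising_product(n, k):
    """Return n * (n + 1) * ... * (n + k - 1); the empty product (k = 0) is 1."""
    result = 1
    for i in range(k):
        result *= n + i
    return result

def claim_holds(n, k):
    """Check whether k! divides n * (n + 1) * ... * (n + k - 1)."""
    return rising_product(n, k) % factorial(k) == 0

# Spot check over a small window of integers n and natural numbers k.
all_ok = all(claim_holds(n, k) for n in range(-30, 31) for k in range(9))
print(all_ok)
```

The same helper confirms the warning in the second remark: rising_product(3, 3) is 60, which 4! does not divide.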
Problem .. Find the least natural number x such that

x ≡ 1
 (mod 5),
x≡3 (mod 6),


x≡5 (mod 7).
Solution. We have
6 · 7 ≡ 1 · 2 ≡ 2 (mod 5),   2 · 3 ≡ 1 (mod 5);
5 · 7 ≡ −1 · 1 ≡ −1 (mod 6),   −1 · 5 ≡ 1 (mod 6);
5 · 6 ≡ −1 · (−2) ≡ 2 (mod 7),   2 · 4 ≡ 1 (mod 7).
Therefore, modulo 5 · 6 · 7 (which is 210), we conclude
x ≡ 1 · 6 · 7 · 3 + 3 · 5 · 7 · 5 + 5 · 5 · 6 · 4
≡ 126 + 525 + 600
≡ 1251
≡ 201.

Therefore x = 201 (since 0 ≤ 201 < 210).

D.. In-term examination 


Remark. Instead of solving the congruences

2x_1 ≡ 1 (mod 5),
−x_2 ≡ 1 (mod 6),
2x_3 ≡ 1 (mod 7),

(getting (x_1, x_2, x_3) = (3, 5, 4) as above), one may solve

2y_1 ≡ 1 (mod 5),
−y_2 ≡ 3 (mod 6),
2y_3 ≡ 5 (mod 7),

getting (y_1, y_2, y_3) = (3, 3, 6). But then

x ≡ 6 · 7 · 3 + 5 · 7 · 3 + 5 · 6 · 6

(that is, one doesn’t use as coefficients the numbers 1, 3, and 5 respectively,
because they are already incorporated in the y_i).
Remark. Some people noticed, in effect, that the original system is equivalent to

x + 9 ≡ 10 ≡ 0 (mod 5),

x + 9 ≡ 12 ≡ 0 (mod 6),


x + 9 ≡ 14 ≡ 0 (mod 7),

which in turn means x + 9 ≡ 0 (mod 210) and so yields the minimal positive
solution x = 201. But not every such problem will be so easy.
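
For systems that are not so easy, the algorithm used in the solution can be mechanized. The following Python sketch is one possible rendering, assuming pairwise co-prime moduli; the names egcd and crt are chosen here for illustration.

```python
def egcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = a*s + b*t, for natural numbers a, b."""
    if b == 0:
        return a, 1, 0
    g, s, t = egcd(b, a % b)
    return g, t, s - (a // b) * t

def crt(residues, moduli):
    """Least natural-number solution of x ≡ r_i (mod m_i); moduli pairwise co-prime."""
    M = 1
    for m in moduli:
        M *= m
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                    # product of the other moduli
        g, inv, _ = egcd(Mi % m, m)    # inv is an inverse of Mi modulo m
        assert g == 1, "moduli must be pairwise co-prime"
        x += r * Mi * inv
    return x % M

print(crt([1, 3, 5], [5, 6, 7]))
```

The inverses found by egcd may be negative, so the intermediate sum can differ from 1251; only its class modulo 210 matters.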

Problem .. Find all integers n such that n4 + 4 is prime.

Solution. We can factorize as follows:

n4 + 4 = n4 + 4n2 + 4 − 4n2
= (n2 + 2)2 − (2n)2
= (n2 + 2 + 2n) · (n2 + 2 − 2n)
= ((n + 1)2 + 1) · ((n − 1)2 + 1).

Both factors are positive. Moreover, one of the factors is 1 if and only if n = ±1.
So n4 + 4 is prime only if n = ±1. Moreover, if n = ±1, then n4 + 4 = 5, which
is prime. So the answer is, n = ±1.
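
As a sanity check (not a substitute for the argument), the factorization and the conclusion can be tested over a finite range. In this Python sketch the helper is_prime and the range −50..50 are choices made here.

```python
def is_prime(m):
    """Primality by trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

window = range(-50, 51)

# The factorization n^4 + 4 = ((n + 1)^2 + 1) * ((n - 1)^2 + 1).
identity_ok = all(
    n**4 + 4 == ((n + 1)**2 + 1) * ((n - 1)**2 + 1) for n in window
)

# Values of n in the window for which n^4 + 4 is prime.
prime_cases = [n for n in window if is_prime(n**4 + 4)]
print(identity_ok, prime_cases)
```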

Problem .. a) Find a solution to the equation 151x + 71y = 1.

 D. – examinations


b) Find integers s and t such that

gcd(a, b) = 1 =⇒ gcd(151a + 71b, sa + tb) = 1.

Solution. (a) We compute

151 = 71 · 2 + 9,
71 = 9 · 7 + 8,
9 = 8 · 1 + 1,

and hence

9 = 151 − 71 · 2,
8 = 71 − (151 − 71 · 2) · 7 = −151 · 7 + 71 · 15,
1 = 151 − 71 · 2 − (−151 · 7 + 71 · 15) = 151 · 8 − 71 · 17.

Thus, (8, −17) is a solution to 151x + 71y = 1.


(b) We want s and t such that, if a and b are co-prime, then so are 151a + 71b
and sa + tb. It is enough if we can obtain a and b as linear combinations of
151a + 71b and sa + tb. That is, it is enough if we can solve

(151a + 71b)x + (sa + tb)y = a

and (independently) (151a + 71b)x + (sa + tb)y = b. The first equation can be
rearranged as
(151x + sy)a + (71x + ty)b = a,
which is soluble if and only if the linear system
\[
\begin{cases}
151x + sy = 1,\\
71x + ty = 0
\end{cases}
\]
is soluble. Similarly, we want to be able to solve
\[
\begin{cases}
151x + sy = 0,\\
71x + ty = 1.
\end{cases}
\]
It is enough if the coefficient matrix $\begin{pmatrix} 151 & s \\ 71 & t \end{pmatrix}$ is invertible over the integers; this
means
\[
\pm 1 = \det \begin{pmatrix} 151 & s \\ 71 & t \end{pmatrix} = 151t - 71s
\]
(since ±1 are the only invertible integers). A solution to this equation is (s, t) = (17, 8).

D.. In-term examination 


Remark. Another method for (a) is to solve

151x ≡ 1 (mod 71),


9x ≡ 1 (mod 71),
x ≡ 8 (mod 71),

and then solve

151 · 8 + 71y = 1,
y = −1207/71 = −17.
But finding inverses may not always be so easy as finding the inverse of 9 modulo
71.
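
The back-substitution in part (a) is the extended Euclidean algorithm, and the choice of (s, t) in part (b) can be tested on sample pairs. The Python sketch below assumes (s, t) = (17, 8), the values for which 151t − 71s = 1; the name egcd and the test window are choices made here.

```python
from math import gcd

def egcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = a*s + b*t, for natural numbers a, b."""
    if b == 0:
        return a, 1, 0
    g, s, t = egcd(b, a % b)
    return g, t, s - (a // b) * t

# Part (a): a solution of 151x + 71y = 1.
g, x, y = egcd(151, 71)
print(g, x, y)

# Part (b): with determinant 151*t - 71*s = 1, co-primality is preserved.
s, t = 17, 8
part_b_ok = all(
    gcd(151 * a + 71 * b, s * a + t * b) == 1
    for a in range(-20, 21)
    for b in range(-20, 21)
    if gcd(a, b) == 1
)
print(part_b_ok)
```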

Problem .. Find the least positive x such that

19365 x ≡ 2007 (mod 17).

Solution. By applying the elementary-school division algorithm as necessary
[computations omitted here], we find

19 ≡ 2 (mod 17),
365 ≡ 13 (mod 16),
2007 ≡ 1 (mod 17),

which means our problem is equivalent to solving

2^13 · x ≡ 1 (mod 17),
x ≡ 2^3 (mod 17),
x ≡ 8 (mod 17);

so x = 8 (since 0 < 8 ≤ 17).

Remark. Some people failed to use that 2^16 ≡ 1 (mod 17) by Fermat’s Theorem.
Of these, some happened to notice an alternative simplification: 2^4 ≡ −1
(mod 17); but a simplification along these lines, unlike the Fermat Theorem, may
not always be available.
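
The reduction can be confirmed with Python's built-in three-argument pow, which computes a power modulo a number. The variable names in this sketch are chosen here; the search over 1..17 is by trial.

```python
p = 17

coefficient = pow(19, 365, p)   # 19^365 reduced modulo 17
target = 2007 % p               # 2007 reduced modulo 17

# By Fermat's Theorem the exponent may be reduced modulo 16.
assert coefficient == pow(2, 365 % 16, p)

# Trial search for the positive solutions not exceeding 17.
solutions = [x for x in range(1, p + 1) if coefficient * x % p == target]
print(solutions)
```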

Problem .. Prove a13 ≡ a (mod 210) for all a.

Solution. We have the prime factorization 210 = 2 · 3 · 5 · 7, along with the


following implications:

 D. – examinations


• If 2 ∤ a, then a ≡ 1 (mod 2), and hence a^12 ≡ 1 (mod 2);

• if 3 ∤ a, then a^2 ≡ 1 (mod 3), and hence a^12 ≡ 1 (mod 3);

• if 5 ∤ a, then a^4 ≡ 1 (mod 5), and hence a^12 ≡ 1 (mod 5);

• if 7 ∤ a, then a^6 ≡ 1 (mod 7), and hence a^12 ≡ 1 (mod 7).

This means that, for all a, we have

a^13 ≡ a (mod 2),
a^13 ≡ a (mod 3),
a^13 ≡ a (mod 5),
a^13 ≡ a (mod 7).

Therefore a^13 ≡ a (mod 210) for all a, since 210 = lcm(2, 3, 5, 7).

Remark. One should be clear about the restrictions on a, if any. The argument
here assumes that the reader is familiar with the equivalence between the two
forms of Fermat’s Theorem:
a) a^(p−1) ≡ 1 (mod p) when p ∤ a;
b) a^p ≡ a (mod p) for all a.
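
Since the congruence depends only on the class of a modulo 210, the claim can be checked exhaustively. The Python sketch below does so; the variable names are chosen here.

```python
# a^13 ≡ a (mod 210) for one representative of every congruence class.
holds_mod_210 = all(pow(a, 13, 210) == a % 210 for a in range(210))

# The same congruence modulo each prime factor, as in the solution.
holds_mod_primes = all(
    pow(a, 13, p) == a % p for p in (2, 3, 5, 7) for a in range(p)
)

# The second form of Fermat's Theorem, a^p ≡ a (mod p), for small primes.
fermat_ok = all(pow(a, p, p) == a % p for p in (2, 3, 5, 7) for a in range(p))
print(holds_mod_210, holds_mod_primes, fermat_ok)
```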

Problem .. On ω, we define the binary relation 6 so that a 6 b if and only if


the equation a + x = b is soluble. Prove the following for all natural numbers a,
b, and c. You may use the ‘Peano Axioms’ and the standard facts about addition
and multiplication that follow from them.
a) 0 6 a.
b) a 6 b ⇐⇒ a + c 6 b + c.
c) a 6 b ⇐⇒ a · (c + 1) 6 b · (c + 1).

Solution. (a) 0 + a = a.
(b) By the definition of 6, and the standard cancellation properties for addition,
we have

a 6 b ⇐⇒ a + d = b for some d
⇐⇒ a + c + d = b + c for some d
⇐⇒ a + c 6 b + c.

(c) We use induction on a. By part (a), the claim is trivial when a = 0.
Suppose it is true when a = d; we shall prove it is true when a = d + 1. Note
that, if d + 1 ≤ b, then d + e + 1 = b for some e, so b is a successor: b = e + 1 for
some e; in particular, b ≠ 0. Similarly, if (d + 1) · (c + 1) ≤ b · (c + 1), then b ≠ 0,
so b is a successor. So it is enough now to observe:

d + 1 ≤ e + 1 ⇐⇒ d ≤ e [by (b)]
⇐⇒ d · (c + 1) ≤ e · (c + 1) [by I.H.]
⇐⇒ d · (c + 1) + c + 1 ≤ e · (c + 1) + c + 1 [by (b)]
⇐⇒ (d + 1) · (c + 1) ≤ (e + 1) · (c + 1).

This completes the induction.


Remark. In (c), one may proceed as in (b):

a ≤ b =⇒ a + d = b for some d
=⇒ a · (c + 1) + d · (c + 1) = b · (c + 1)
=⇒ a · (c + 1) ≤ b · (c + 1).

Conversely, if a · (c + 1) ≤ b · (c + 1), then a · (c + 1) + d = b · (c + 1) for some d;
but then d must be a multiple of c + 1 (although this is not proved in my notes
on ‘Foundations of number-theory’, which are the source of this problem). So we
have

a · (c + 1) + e · (c + 1) = b · (c + 1),
(a + e) · (c + 1) = b · (c + 1),
a + e = b,
a ≤ b

by the standard cancellation properties of multiplication.

D.. In-term examination


The exam lasts  minutes. Answers must be justified. Solutions should follow a
reasonably efficient procedure.
Problem .. We define exponentiation on ω recursively by n0 = 1 and nm+1 =
nm · n. Prove that nm+k = nm · nk for all n, m, and k in ω.
Solution. Use induction on k. For the base step, that is, k = 0, we have

nm+0 = nm = nm · 1 = nm · n0 .

So the claim holds when k = 0. For the inductive step, suppose, as an inductive
hypothesis, that the claim holds when k = ℓ, so that

nm+ℓ = nm · nℓ .

 D. – examinations


Then

nm+(ℓ+1) = n(m+ℓ)+1
= nm+ℓ · n [by def’n of exponentiation]
m ℓ
= (n · n ) · n [by inductive hypothesis]
m ℓ
= n · (n · n)
= nm · nℓ+1 [by def’n of exponentiation].

Thus the claim holds when k = ℓ + 1. This completes the induction and the
proof.
Remark. Some people apparently forgot that, by the convention of this course,
the first element of ω is 0, so that the induction here must start with the case
k = 0. This convention can be inferred from the statement of the problem, since
the given recursive definition of exponentiation starts with n^0, not n^1.
Remark. The formal recursive definition of exponentiation is intended to make
precise the informal definition
\[ n^m = \underbrace{n \cdot n \cdots n}_{m}. \]
Likewise, mathematical induction makes precise the informal proof
\[ n^{m+k} = \underbrace{n \cdot n \cdots n}_{m+k} = \underbrace{n \cdot n \cdots n}_{m} \cdot \underbrace{n \cdot n \cdots n}_{k} = n^m \cdot n^k. \]
Everybody knows n^(m+k) = n^m · n^k; the point of the problem is to prove it precisely,
so the informal proof is not enough.
Problem .. Find some n such that 35 · ϕ(n) 6 8n.
ϕ(n) 8
Solution. We want 6 . We have
n 35
ϕ(n) Y p − 1
= .
n p
p|n

If we take enough primes, this product should get down to 8/35. As 35 = 5 · 7,


we might try the primes up to 7. Indeed,
1 2 4 6 2·4 8
· · · = = ;
2 3 5 7 5·7 35
so we may let n = 2 · 3 · 5 · 7 = 210.
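
The product formula for ϕ(n)/n can be evaluated exactly with rational arithmetic. In the Python sketch below, the helpers prime_divisors and phi_ratio are names introduced here; Fraction comes from the standard library.

```python
from fractions import Fraction

def prime_divisors(n):
    """Return the distinct primes dividing n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_ratio(n):
    """Return phi(n)/n as the product of (p - 1)/p over the primes p dividing n."""
    ratio = Fraction(1)
    for p in prime_divisors(n):
        ratio *= Fraction(p - 1, p)
    return ratio

ratio = phi_ratio(210)
print(ratio, ratio <= Fraction(8, 35))
```

For n = 210 the ratio is exactly 8/35, so the required inequality holds with equality.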

D.. In-term examination 


Problem .. Suppose f and g are multiplicative
X functions
n on N. Define h and
H by h(n) = f (n) · g(n) and H(n) = f (d) · g . Prove that these are
d
d|n
multiplicative.
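Solution. Suppose gcd(m, n) = 1. Then
h(mn) = f(mn) · g(mn)
= f(m) · f(n) · g(m) · g(n) [by multiplicativity of f and g]
= f(m) · g(m) · f(n) · g(n)
= h(m) · h(n),
so h is multiplicative. Also, since every divisor of mn can be factorized uniquely
as d · e, where d | m and e | n, we have
\begin{align*}
H(mn) &= \sum_{d \mid mn} f(d) \cdot g\left(\frac{mn}{d}\right)\\
&= \sum_{d \mid m} \sum_{e \mid n} f(de) \cdot g\left(\frac{mn}{de}\right)\\
&= \sum_{d \mid m} \sum_{e \mid n} f(d) \cdot f(e) \cdot g\left(\frac{m}{d}\right) \cdot g\left(\frac{n}{e}\right) && \text{[mult. of $f$, $g$]}\\
&= \sum_{d \mid m} f(d) \cdot g\left(\frac{m}{d}\right) \cdot \sum_{e \mid n} f(e) \cdot g\left(\frac{n}{e}\right) && \text{[distributivity]}\\
&= \Bigl(\sum_{d \mid m} f(d) \cdot g\left(\frac{m}{d}\right)\Bigr) \cdot \Bigl(\sum_{e \mid n} f(e) \cdot g\left(\frac{n}{e}\right)\Bigr) && \text{[distributivity]}\\
&= H(m) \cdot H(n),
\end{align*}
so H is multiplicative.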
Remark. The assumption that gcd(m, n) = 1 is essential here, because otherwise
we could not conclude, for example, f (mn) = f (m) · f (n); neither could we do
the trick with the divisors of mn.
Remark. Since f is multiplicative, we know for example that $\sum_{d \mid n} f(d)$ is a
multiplicative function of n. Hence $\sum_{d \mid n} f(n/d)$ is also multiplicative, since it is
the same function. Likewise, once we know that fg is multiplicative, then we
know that $\sum_{d \mid n} f(d)g(d)$ is multiplicative. But we cannot conclude so easily that
$\sum_{d \mid n} f(d)g(n/d)$ is multiplicative. It does not make sense to say g(n/d) is
multiplicative, since it has two variables. We do not have g(mn/d) = g(m/d) · g(n/d);
neither do we have g(n/de) = g(n/d) · g(n/e). What we have is g(mn/de) =
g(m/d)g(n/e), if d | m and e | n; but it takes some work to make use of this.
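
The multiplicativity of H can be observed on examples. The Python sketch below uses two sample multiplicative functions chosen here (the identity and the number-of-divisors function); the names divisors, convolve, and is_multiplicative_up_to are likewise introduced for illustration.

```python
from math import gcd

def divisors(n):
    """Return the positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def convolve(f, g):
    """Return H, where H(n) is the sum of f(d) * g(n // d) over the divisors d of n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def is_multiplicative_up_to(F, bound):
    """Check F(mn) = F(m) * F(n) for co-prime m and n not exceeding bound."""
    return all(
        F(m * n) == F(m) * F(n)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
        if gcd(m, n) == 1
    )

identity = lambda n: n                        # a multiplicative function
num_divisors = lambda n: len(divisors(n))     # another multiplicative function

H = convolve(identity, num_divisors)
print(is_multiplicative_up_to(H, 20))
print(H(4), H(2) * H(2))   # co-primality matters: these two values differ
```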

 D. – examinations


Problem .. Concerning 13:
a) Show that 2 is a primitive root.
b) Find all primitive roots as powers of 2.
c) Find all primitive roots as elements of [1, 12].
d) Find all elements of [1, 12] that have order 4 modulo 13.

Solution. (a) Modulo 13, we have

k     1  2  3  4  5   6   7  8  9  10  11  12
2^k   2  4  8  3  6  12  11  9  5  10   7   1

(b) 2^k, where gcd(k, 12) = 1; so 2, 2^5, 2^7, 2^11.
(c) From the table, 2, 6, 11, 7.
(d) 2^k, where 4 = 12/gcd(k, 12), that is, gcd(k, 12) = 3, so k is 3 or 9; so, again
from the table, 8, 5.
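
The answers can be cross-checked by computing orders directly. In this Python sketch the function order is a name introduced here; it assumes its first argument is prime to the modulus.

```python
from math import gcd

def order(a, n):
    """Multiplicative order of a modulo n; assumes gcd(a, n) = 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

# Parts (a) and (c): elements of [1, 12] of order 12 modulo 13.
primitive_roots = [a for a in range(1, 13) if order(a, 13) == 12]

# Part (b): exponents k with gcd(k, 12) = 1.
exponents = [k for k in range(1, 13) if gcd(k, 12) == 1]

# Part (d): elements of [1, 12] of order 4 modulo 13.
order_four = [a for a in range(1, 13) if order(a, 13) == 4]

print(primitive_roots, exponents, order_four)
```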
Problem . ( points). Prove
\[ \sum_{d \mid n} \mu(d) \cdot \sigma(d) = \prod_{p \mid n} (-p). \]

Solution. Each side of the equation is a multiplicative function of n, so it is
enough to check the claim when n is a prime power. Accordingly, we have
\[
\sum_{d \mid p^s} \mu(d) \cdot \sigma(d) = \sum_{k=0}^{s} \mu(p^k) \cdot \sigma(p^k)
= \mu(1) \cdot \sigma(1) + \mu(p) \cdot \sigma(p) = 1 - (1 + p) = -p = \prod_{q \mid p^s} (-q).
\]
This establishes the claim when n is a prime power, hence for all n.
Remark. It should be understood in the product $\prod_{p \mid n} (-p)$ that p is prime. This
product is a multiplicative function of n, because if gcd(m, n) = 1, and p | mn,
then p | m or p | n, but not both, so that $\prod_{p \mid mn} (-p) = \prod_{p \mid m} (-p) \cdot \prod_{p \mid n} (-p)$.
Remark. Using multiplicativity of functions to prove their equality is a powerful
technique. It works like magic. It is possible here to prove the desired equation
directly, for arbitrary n; but the proof is long and complicated. It is not enough to
write out part of the summation, detect a pattern, and claim (as some people did)
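that everything cancels but what is wanted: one must prove this claim precisely.
One way is as follows. Every positive integer n can be written as $\prod_{p \in A} p^{s(p)}$,
where A is a (finite) set of prime numbers, and each exponent s(p) is at least 1.
(Note the streamlined method of writing a product.) Then the only divisors d of
n for which µ(d) ≠ 0 are those divisors of the form $\prod_{p \in B} p$ for some subset B of
A. Moreover, each such number is a divisor of n. Hence
\begin{align*}
\sum_{d \mid n} \mu(d) \cdot \sigma(d)
&= \sum_{X \subseteq A} \mu\Bigl(\prod_{p \in X} p\Bigr) \cdot \sigma\Bigl(\prod_{p \in X} p\Bigr)\\
&= \sum_{X \subseteq A} (-1)^{|X|} \cdot \prod_{p \in X} (1 + p)\\
&= \sum_{X \subseteq A} (-1)^{|X|} \cdot \sum_{Y \subseteq X} \prod_{p \in Y} p\\
&= \sum_{Y \subseteq A} \prod_{p \in Y} p \cdot \sum_{Y \subseteq X \subseteq A} (-1)^{|X|}\\
&= \sum_{Y \subseteq A} \prod_{p \in Y} p \cdot (-1)^{|Y|} \cdot \sum_{Z \subseteq A \setminus Y} (-1)^{|Z|}\\
&= \sum_{Y \subseteq A} \prod_{p \in Y} p \cdot (-1)^{|Y|} \cdot \sum_{j=0}^{|A \setminus Y|} \binom{|A \setminus Y|}{j} (-1)^j\\
&= \sum_{Y \subseteq A} \prod_{p \in Y} p \cdot (-1)^{|Y|} \cdot (1 + (-1))^{|A \setminus Y|}\\
&= \prod_{p \in A} p \cdot (-1)^{|A|}\\
&= \prod_{p \in A} (-p).
\end{align*}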
This proves the desired equation; but it is probably easier just to use the multi-
plicativity of each side, as above.
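
The identity can also be tested numerically for small n. The Python sketch below is a check, not a proof; all function names are introduced here, and the bound 300 is arbitrary.

```python
def prime_factors(n):
    """Return {p: exponent} for the prime factorization of n."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    """Moebius function: 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    factors = prime_factors(n)
    if any(e > 1 for e in factors.values()):
        return 0
    return (-1) ** len(factors)

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def left_side(n):
    return sum(mobius(d) * sigma(d) for d in range(1, n + 1) if n % d == 0)

def right_side(n):
    product = 1
    for p in prime_factors(n):
        product *= -p
    return product

identity_ok = all(left_side(n) == right_side(n) for n in range(1, 301))
print(identity_ok, left_side(12), right_side(12))
```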

Problem .. Solve 63164 x ≡ 2 (mod 365).

Solution. 365 = 5 · 73, so ϕ(365) = ϕ(5) · ϕ(73) = 4 · 72 = 288. And 288 goes
into 3164 ten times, with remainder 284. Therefore, modulo 365, we have

63164 x ≡ 2 ⇐⇒ 6284 x ≡ 2
⇐⇒ x ≡ 2 · 64
≡ 2 · 362
≡ 2 · 1296
≡ 2 · 201
≡ 402
≡ 37.

 D. – examinations


Remark. One may note that, since 4 | 72, we have that a^72 ≡ 1 (mod 365)
whenever gcd(a, 365) = 1. Such an observation might make computations easier
in some problems, though perhaps not in this one.
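
Both the solution and the observation in the remark can be confirmed directly. The Python sketch below searches all residues modulo 365 by trial; the variable names are chosen here.

```python
from math import gcd

n = 365
coefficient = pow(6, 3164, n)   # 6^3164 reduced modulo 365

# Euler's Theorem allows the exponent to be reduced modulo phi(365) = 288.
assert coefficient == pow(6, 3164 % 288, n)

# Trial search over all residues modulo 365.
solutions = [x for x in range(n) if coefficient * x % n == 2]
print(solutions)

# The remark: a^72 ≡ 1 (mod 365) whenever a is prime to 365.
remark_ok = all(pow(a, 72, n) == 1 for a in range(1, n) if gcd(a, n) == 1)
print(remark_ok)
```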
Problem .. Show that the least positive primitive root of 41 is 6. (Try to
compute as few powers as possible.)
Solution. ϕ(41) = 40 = 2^3 · 5 = 8 · 5, so the proper divisors of ϕ(41) are divisors
of 8 or 20. So we want to show, modulo 41,
a) when ℓ ∈ {2, 3, 4, 5}, then either ℓ^8 or ℓ^20 is congruent to 1;
b) neither 6^8 nor 6^20 is congruent to 1.
To establish that ℓ^(2k) ≡ 1, it is enough to show ℓ^k ≡ ±1. To establish that ℓ^(2k) ≢ 1,
it is enough to show ℓ^k ≢ ±1. So we proceed:
a) 2^2 ≡ 4; 2^4 ≡ 4^2 ≡ 16; 2^8 ≡ 16^2 ≡ 256 ≡ 10; 2^10 ≡ 2^8 · 2^2 ≡ 10 · 4 ≡ 40 ≡ −1.
b) 3^2 ≡ 9; 3^4 ≡ 9^2 ≡ 81 ≡ −1.
c) 4^5 ≡ 2^10 ≡ −1.
d) 5^2 ≡ 25 ≡ −16; 5^4 ≡ 16^2 ≡ 256 ≡ 10 ≡ 2^8 ≡ 4^4; hence 5^20 ≡ 4^20 ≡ 1.
e) 6^2 ≡ 36 ≡ −5; 6^4 ≡ 25 ≡ −16; 6^8 ≡ 256 ≡ 10; 6^10 ≡ 10 · (−5) ≡ −50 ≡ −9;
6^20 ≡ 81 ≡ −1.
Remark. Another possible method is first to write out all of the powers of 6
(modulo 41), thus showing that 6 is a primitive root, and then to select from
among these the other primitive roots of 41, write them as positive numbers, and
note that 6 is the least. That is, one can start with
k      1    2    3    4    5    6    7    8    9   10
6^k    6   −5   11  −16  −14   −2  −12   10   19   −9
k     11   12   13   14   15   16   17   18   19   20
6^k  −13    4  −17  −20    3   18  −15   −8   −7   −1
k     21   22   23   24   25   26   27   28   29   30
6^k   −6    5  −11   16   14    2   12  −10  −19    9
k     31   32   33   34   35   36   37   38   39   40
6^k   13   −4   17   20   −3  −18   15    8    7    1
Then 6 is indeed a primitive root of 41, so every primitive root of 41 takes the
form 6^k, where gcd(k, 40) = 1. So the incongruent primitive roots are 6^k, where
k ∈ {1, 3, 7, 9, 11, 13, 17, 19, 21, 23, 27, 29, 31, 33, 37, 39}
(that is, k is an odd positive integer less than 40 and indivisible by 5). From the
table, if we convert these powers to congruent positive integers less than 41, we
get the list
6, 11, 29, 19, 28, 24, 26, 34, 35, 30, 12, 22, 13, 17, 15, 7
The least number on the list is 6.

D.. In-term examination 


Remark. Some people noted that 6 is the least element of the set {6^k : 0 < k ≤
40 & gcd(k, 40) = 1}. This is true, but it does not establish the claim that 6
is the least positive primitive root of 41, since some of the powers in the set may
be congruent modulo 41 to lesser positive numbers, which numbers will still be
primitive roots.
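
The list of primitive roots of 41 can be recomputed in two independent ways, by orders and by powers of 6. In this Python sketch the function order is a name introduced here.

```python
from math import gcd

def order(a, n):
    """Multiplicative order of a modulo n; assumes gcd(a, n) = 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

# Directly: the elements of [1, 40] whose order modulo 41 is 40.
by_order = [a for a in range(1, 41) if order(a, 41) == 40]

# Via the remark: least positive residues of 6^k, where gcd(k, 40) = 1.
by_powers = sorted(pow(6, k, 41) for k in range(1, 41) if gcd(k, 40) == 1)

print(by_order == by_powers, min(by_order))
```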

D.. In-term examination


The exam lasts  minutes. Several connected problems involve the prime num-
ber 23. As usual, answers must be reasonably justified to the reader.
Bracketed numbers (as []) refer to related homework exercises.
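Problem .. Compute the Legendre symbol (63/271). []
Solution.
\[
\left(\frac{63}{271}\right) = \left(\frac{7 \cdot 3^2}{271}\right) = \left(\frac{7}{271}\right) = -\left(\frac{271}{7}\right) = -\left(\frac{5}{7}\right) = -\left(\frac{7}{5}\right) = -\left(\frac{2}{5}\right) = -(-1) = 1.
\]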
Remark. The computation uses the following features of the Legendre symbol:
a) the complete multiplicativity of x ↦ (x/p);
b) that (a/p) = ±1;
c) the Law of Quadratic Reciprocity;
d) the dependence of (a/p) only on the class of a modulo p;
e) the rule for (2/p).
If (p/q) = −(q/p) by the Law of Quadratic Reciprocity, then also −(q/p) =
(−1/p)(q/p) = (−q/p), since p ≡ 3 (mod 4). So one could also argue (63/271) =
(7 · 3^2/271) = (7/271) = −(271/7) = (−271/7) = (2/7) = 1.
However, the equation (63/271) = −(271/63) is not available without explana-
tion and proof. Because 63 is not prime, (271/63) is not a Legendre symbol. It
is a Jacobi symbol, but these were defined only in [].
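
Euler's Criterion gives an independent check of each Legendre symbol in the computation. In the Python sketch below, the function legendre is a name introduced here; it assumes that p is an odd prime not dividing a.

```python
def legendre(a, p):
    """Legendre symbol (a/p) by Euler's Criterion; p an odd prime not dividing a."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value

# The symbols that occur in the chain of equations.
print(legendre(63, 271), legendre(7, 271), legendre(5, 7), legendre(2, 5))
```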

Problem . ( points). Find the Legendre symbol (a/29), given that []
n   o
ka
ka − 29 · : 1 6 k 6 14 = {1, 2, 5, 6, 7, 10, 11, 12, 15, 16, 20, 21, 25, 26}.
29

Solution. The given set has 6 elements greater than 29/2. Since ka − 29 · [ka/29]
is the remainder of ka after division by 29, by Gauss’s Lemma we have (a/29) =
(−a)6 = 1.

Problem . ( points). The numbers 1499 and 2999 are prime. Find a primi-
tive root of 2999. []

 D. – examinations


Solution. Since 2999 = 2 · 1499 + 1, it has the primitive root (−1)^((1499−1)/2) · 2,
that is, −2.
Remark. The number 1499 is a Germain prime. If p is a Germain prime, so that
2p + 1 is a prime q, then the number of (congruence classes of) primitive roots
of q is ϕ(ϕ(q)), which is p − 1 or (q − 3)/2. So almost half the numbers that are
prime to q are primitive roots of q. We showed (−1)^((p−1)/2) · 2 is a primitive root;
the cited homework exercise shows −4 is a primitive root. By the same method
of proof, if q ∤ r, then the following are equivalent:
a) r is a primitive root of q;
b) ord_q(r) ∉ {1, 2, p};
c) r ≢ ±1 (mod q) and (r/q) = −1.
In particular, to show r is a primitive root of q, it is not enough to show (r/q) = −1.
(One must also show r^2 ≢ 1 (mod q); and again, this is enough only in case
(q − 1)/2 is prime.)
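
When q = 2p + 1 with p prime, the only possible orders modulo q are 1, 2, p, and 2p, so a primitive-root test needs just two modular powers. The Python sketch below applies this to q = 2999, taking the primality of 1499 and 2999 as given in the problem; the function name is introduced here.

```python
p, q = 1499, 2999   # primes, as stated in the problem, with q = 2p + 1

def is_primitive_root(r):
    """Test r modulo q by ruling out the orders 1, 2, and p."""
    return pow(r, 2, q) != 1 and pow(r, p, q) != 1

print(is_primitive_root(-2), is_primitive_root(-4), is_primitive_root(-1))

# The number of primitive roots among 1, ..., q - 1 should be p - 1.
count = sum(1 for r in range(1, q) if is_primitive_root(r))
print(count == p - 1)
```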
Problem . ( points). Fill out the following table of logarithms. (It should be
clear what method you used.) [(a)]
k             1   2   3   4   5   6   7   8   9  10  11   (mod 23)
log_5 k                                                   (mod 22)
log_5 (−k)                                                (mod 22)
Solution. First compute powers of 5, then rearrange:
ℓ             0   1   2    3   4   5   6   7   8    9  10   (mod 22)
5^ℓ           1   5   2   10   4  −3   8  −6  −7   11   9   (mod 23)
5^(ℓ+11)     −1  −5  −2  −10  −4   3  −8   6   7  −11  −9   (mod 23)

k             1   2   3   4   5   6   7   8   9  10  11   (mod 23)
log_5 k       0   2  16   4   1  18  19   6  10   3   9   (mod 22)
log_5 (−k)   11  13   5  15  12   7   8  17  21  14  20   (mod 22)

Remark. Implicitly, 5 must be a primitive root of 23, which implies 5^11 ≡ −1
(mod 23). Hence log_5 (−1) ≡ 11 (mod 22), and more generally log_5 (−k) ≡
log_5 k ± 11 (mod 22). Thus the second row of the table can be obtained easily
from the first.
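
The table of logarithms amounts to inverting the table of powers, which is a one-line dictionary construction. In this Python sketch the names log5, row_k, and row_minus_k are chosen here.

```python
p, g = 23, 5

# Invert the table of powers: log5[5^e mod 23] = e, for 0 <= e < 22.
log5 = {pow(g, e, p): e for e in range(p - 1)}

row_k = [log5[k] for k in range(1, 12)]            # log_5 k
row_minus_k = [log5[p - k] for k in range(1, 12)]  # log_5 (-k)

print(row_k)
print(row_minus_k)
```

Because 5 is a primitive root of 23, the dictionary has 22 distinct keys, one for each nonzero class.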
Problem . ( points). Fill out the following table of Legendre symbols. (Again,
your method should be clear.)

a          1   2   3   4   5   6   7   8   9  10  11
(a/23)
(−a/23)

D.. In-term examination 


Solution. The quadratic residues of 23 are just the even powers of a primitive
root, such as 5. Those even powers are just the numbers whose logarithms are
even. So, in the logarithm table in Problem ., we can replace even numbers
with 1, and odd numbers with −1, obtaining

a          1   2   3   4   5   6   7   8   9  10  11
(a/23)     1   1   1   1  −1   1  −1   1   1  −1  −1
(−a/23)   −1  −1  −1  −1   1  −1   1  −1  −1   1   1

Remark. One can find the Legendre symbols by means of Euler’s Criterion and the
properties in the remark on Problem . (as in []), or by Gauss’s Lemma (as in
[]); but really, all of the necessary work has already been done in Problem ..

Problem . ( points). Solve the following congruences modulo 23. [(b)]

a) x^2 ≡ 8 b) x^369 ≡ 7

Solution. (a) From the solution to Problem ., we have 8 ≡ 5^6 ≡ (5^3)^2 ≡ 10^2,
so
x^2 ≡ 8 ⇐⇒ x ≡ ±10 ≡ 10, 13.

(b) Since 369 = 22 · 16 + 17, and by Problem ., we have

x^369 ≡ 7 (mod 23) ⇐⇒ x^17 ≡ 7 (mod 23)
⇐⇒ 17 log_5 x ≡ 19 (mod 22)
⇐⇒ log_5 x ≡ 19/17 ≡ −3/−5 ≡ 3/5 (mod 22)
⇐⇒ log_5 x ≡ 3 · 9 ≡ 27 ≡ 5 (mod 22)
⇐⇒ x ≡ 5^5 ≡ −3 (mod 23)
⇐⇒ x ≡ 20 (mod 23).

Remark. Some people seemed to overlook the information available from Prob-
lem .. In part (a), one may note from Problem . that there must be a
solution, since (8/23) = 1; but there is no need to do this, if one actually finds
the solutions.
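
Both parts can be verified by trial, and part (b) can also be replayed through logarithms. In the Python sketch below the variable names are chosen here, and the inverse of 17 modulo 22 is found by trial rather than by any special method.

```python
p = 23

# Trial search over all residues modulo 23.
part_a = [x for x in range(p) if x * x % p == 8]
part_b = [x for x in range(p) if pow(x, 369, p) == 7]
print(part_a, part_b)

# Part (b) again, through logarithms to the base 5.
log5 = {pow(5, e, p): e for e in range(p - 1)}
exponent = 369 % 22                                           # 17
inverse = next(t for t in range(22) if exponent * t % 22 == 1)
log_x = log5[7] * inverse % 22                                # log_5 x
print(log_x, pow(5, log_x, p))
```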

Problem . ( points). Solve the congruence x2 − x + 5 ≡ 0 (mod 23). []

 D. – examinations


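Solution. Complete the square:
\begin{align*}
x^2 - x + 5 \equiv 0 &\iff x^2 - x + \frac{1}{4} \equiv \frac{1}{4} - 5 \equiv \frac{-19}{4} \equiv 1\\
&\iff \left(x - \frac{1}{2}\right)^2 \equiv 1\\
&\iff x - \frac{1}{2} \equiv \pm 1\\
&\iff x \equiv \frac{1}{2} \pm 1 \equiv 12 \pm 1 \equiv 11, 13 \pmod{23}.
\end{align*}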

Remark. Although fractions with denominators prime to 23 are permissible here,
one may avoid them thus:

x^2 − x + 5 ≡ 0 ⇐⇒ x^2 + 22x + 5 ≡ 0
⇐⇒ x^2 + 22x + 121 ≡ 121 − 5 ≡ 116 ≡ 1
⇐⇒ (x + 11)^2 ≡ 1
⇐⇒ x + 11 ≡ ±1.

Alternatively, one may apply the identity

4a(ax^2 + bx + c) = (2ax + b)^2 − (b^2 − 4ac),

finding in the present case

x^2 − x + 5 ≡ 0 ⇐⇒ 4x^2 − 4x + 20 ≡ 0
⇐⇒ (2x − 1)^2 ≡ 1 − 20 ≡ −19 ≡ 4.

All approaches used so far can be used on any quadratic congruence (with odd
prime modulus). Nonetheless, many people chose to look for a factorization. Here
are some that were found:

x^2 − x + 5 ≡ x^2 − x − 110 ≡ (x − 11)(x + 10);

x^2 − x + 5 ≡ x^2 − x + 143 ≡ (x − 11)(x − 13);

x^2 − x + 5 ≡ 0
⇐⇒ −22x^2 + 22x − 18 ≡ 0
⇐⇒ −11x^2 + 11x − 9 ≡ 0
⇐⇒ 12x^2 − 12x + 14 ≡ 0
⇐⇒ 6x^2 − 6x + 7 ≡ 0
⇐⇒ 6x^2 + 17x + 7 ≡ 0
⇐⇒ (3x + 7)(2x + 1) ≡ 0;

x^2 − x + 5 ≡ 0
⇐⇒ −22x^2 + 22x − 18 ≡ 0
⇐⇒ −11x^2 + 11x − 9 ≡ 0
⇐⇒ 12x^2 + 11x − 9 ≡ 0
⇐⇒ 12x^2 − 12x − 9 ≡ 0
⇐⇒ 4x^2 − 4x − 3 ≡ 0
⇐⇒ (2x − 3)(2x + 1) ≡ 0;

x^2 − x + 5 ≡ 0
⇐⇒ 24x^2 + 22x + 28 ≡ 0
⇐⇒ 12x^2 + 11x + 14 ≡ 0
⇐⇒ 12x^2 + 34x + 14 ≡ 0
⇐⇒ (4x + 2)(3x + 7) ≡ 0;

x^2 − x + 5 ≡ 0
⇐⇒ 24x^2 + 22x + 5 ≡ 0
⇐⇒ (12x + 5)(2x + 1) ≡ 0.

But for such problems, it does not seem advisable to rely on one’s ingenuity to find
factorizations. How would one best solve a congruence like x^2 − 2987 + 2243 ≡ 0
(mod 2999)?
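
The identity 4a(ax^2 + bx + c) = (2ax + b)^2 − (b^2 − 4ac) turns any quadratic congruence with odd prime modulus into the extraction of a square root, which a machine can do without ingenuity. The Python sketch below is one minimal rendering; the function name solve_quadratic is introduced here, and the square roots are found by trial, which is practical only for moduli of modest size.

```python
def solve_quadratic(a, b, c, p):
    """Solve a*x^2 + b*x + c ≡ 0 (mod p), where p is an odd prime not dividing a."""
    disc = (b * b - 4 * a * c) % p
    roots = [r for r in range(p) if r * r % p == disc]   # square roots, by trial
    inv_2a = pow(2 * a, p - 2, p)                        # inverse of 2a, by Fermat
    # From (2ax + b)^2 ≡ disc we get x ≡ (r - b) / (2a) for each square root r.
    return sorted({(r - b) * inv_2a % p for r in roots})

print(solve_quadratic(1, -1, 5, 23))
```

An empty list is returned when the discriminant is not a quadratic residue of p.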

Problem . ( points). Explain briefly why exactly one element n of the set
{2661, 2662} has a primitive root. Give two numbers such that at least one of
them is a primitive root of n. []

Solution. The numbers with primitive roots are just 2, 4, odd prime powers,
and doubles of odd prime powers. Since 2661 = 3 · 887, and 3 ∤ 887, the number
2661 has no primitive root. However, 2662 = 2 · 1331 = 2 · 11 · 121 = 2 · 11^3, so
this has a primitive root.
By the computation

k     1  2   3   4   5   (mod 10)
2^k   2  4  −3  −6  −1   (mod 11)

we have that 2 is a primitive root of 11. Therefore 2 or 2 + 11 is a primitive root
of 121. Therefore 2 + 121 or 2 + 11, each of which is odd, is a primitive root of 121,
hence of 1331, hence of 2662.

Remark. This problem relies on the following propositions about odd primes p:

 D. – examinations


a) if r is a primitive root of p, then r or r + p is a primitive root of p^2;
b) every primitive root of p^2 is a primitive root of every higher power p^(2+k);
c) every odd primitive root of p^ℓ is a primitive root of 2 · p^ℓ.
One must also observe that being a primitive root is a property of the congruence
class of a number, so if r ≡ s (mod n), and r is a primitive root of n, then so
is s.
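
Which of the two candidates 2 + 121 and 2 + 11 actually works can be settled by computing orders modulo 2662. In this Python sketch the function order is a name introduced here, and ϕ(2662) is computed from the factorization 2662 = 2 · 11^3.

```python
def order(a, n):
    """Multiplicative order of a modulo n; assumes gcd(a, n) = 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

n = 2 * 11**3                    # 2662
phi_n = 1 * (11**3 - 11**2)      # phi(2) * phi(11^3) = 1210

candidates = [2 + 121, 2 + 11]
roots = [r for r in candidates if order(r, n) == phi_n]
print(roots)
```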

D.. Final Examination


You may take  minutes. Several connected problems involve the Fermat
prime 257. As usual, answers must be reasonably justified.
A table of powers of 3 modulo 257 was provided for use in several problems
[see Table D.].

Problem .. For positive integers n, let ω(n) = |{p : p | n}|, the number of
primes dividing n.
a) Show that the function n 7→ 2ω(n) is multiplicative.
b) DefineX the Möbius function µ in terms of ω.
c) Show |µ(d)| = 2ω(n) for all positive integers n.
d|n

Powers of 3 modulo 257:

Solution. a) If gcd(m, n) = 1, then ω(mn) = ω(m) + ω(n), so

2^ω(mn) = 2^(ω(m)+ω(n)) = 2^ω(m) · 2^ω(n).

b)
\[
\mu(n) =
\begin{cases}
0, & \text{if } p^2 \mid n \text{ for some } p;\\
(-1)^{\omega(n)}, & \text{otherwise.}
\end{cases}
\]
c) As µ is multiplicative, so are |µ| and $n \mapsto \sum_{d \mid n} |\mu(d)|$. Hence it is enough
to establish the equation when n is a prime power. We have
\[
\sum_{d \mid p^s} |\mu(d)| = \sum_{k=0}^{s} |\mu(p^k)| = |\mu(1)| + |\mu(p)| = 1 + 1 = 2 = 2^1 = 2^{\omega(p^s)}.
\]

Problem .. Fill out the following table of Legendre symbols:

a          1   2   3   5   7  11  13  17  19
(a/257)
Solution. By the table of powers, 3 must be a primitive root of 257. Hence
(a/257) = 1 if and only if a is an even power of 3 modulo 257. In particular,

D.. Final Examination 


D. – examinations
k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
3k 3 9 27 81 −14 −42 −126 −121 −106 −61 74 −35 −105 −58 83 −8
316+k −24 −72 41 123 112 79 −20 −60 77 −26 −78 23 69 −50 107 64
332+k −65 62 −71 44 −125 −118 −97 −34 −102 −49 110 73 −38 −114 −85 2
348+k 6 18 54 −95 −28 −84 5 15 45 −122 −109 −70 47 −116 −91 −16
364+k −48 113 82 −11 −33 −99 −40 −120 −103 −52 101 46 −119 −100 −43 128
380+k 127 124 115 88 7 21 63 −68 53 −98 −37 −111 −76 29 87 4
396+k 12 36 108 67 −56 89 10 30 90 13 39 117 94 25 75 −32
3112+k −96 −31 −93 −22 −66 59 −80 17 51 −104 −55 92 19 57 −86 −1

Table D.. Powers of 3 modulo 257


(−1/257) = 1, so (a/257) = (−a/257). So the table of powers yields the answers:

a          1   2   3   5   7  11  13  17  19
(a/257)    1   1  −1  −1  −1   1   1   1  −1
Remark. Many people preferred to find these Legendre symbols by means of the
Law of Quadratic Reciprocity. Possibly this method is faster than hunting for
numbers in the table of powers; but it may also provide more opportunity for
error.
Problem .. In the following table, in the box below each number a, write the
least positive integer n such that ord257 (n) = a.
1 2 4 8 16 32 64 128 256

Solution. If r is a primitive root of 257, then ord_257(r^(256/a)) = a. The primitive
roots of 257 are 3^s, where s is odd. So below a we want the least n such that
n ≡ 3^((256/a)·s) for some odd s. (In searching the table of powers, since 3^(k+128) ≡
−3^k, we can ignore signs, except when a ≤ 2. For example, when a = 4, then
3^((256/a)·s) = 3^(64s), so n can only be |3^64|. When a = 32, then 3^((256/a)·s) = 3^(8s), so n
will be the absolute value of an entry in the column of powers that is headed by
8.)
a   1    2   4   8  16  32  64  128  256
n   1  256  16   4   2  15  11    9    3
Remark. Another way to approach the problem is to note that

ord_257(3^k) = 256 / gcd(256, k).

Then one must look among those powers 3^k such that gcd(256, k) = 256/a. Some
explanation is necessary, though it need not be so elaborate as what I gave above.
Some people apparently misread the problem as asking for the orders of the
given numbers. Others provided numbers that had the desired orders; but they
weren’t the least positive such numbers.
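
The table of least elements of each order can be recomputed by scanning 1, 2, 3, . . . and recording the first number seen with each order. In the Python sketch below, the function order and the dictionary least are names introduced here.

```python
def order(a, n):
    """Multiplicative order of a modulo n; assumes gcd(a, n) = 1."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

# Scan upwards, so the first number recorded for each order is the least one.
least = {}
for n in range(1, 257):
    least.setdefault(order(n, 257), n)

for a in sorted(least):
    print(a, least[a])
```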
Problem .. Solve x2 + 36x + 229 ≡ 0 (mod 257).
Solution. Complete the square: (36/2)2 = (2·9)2 = 4·81 = 324, and 324−229 =
95, so (using the table of powers)
x2 + 36x + 229 ≡ 0 ⇐⇒ (x + 18)2 ≡ 95 ≡ 3128+52 ≡ 3180 ≡ (390 )2
⇐⇒ x + 18 ≡ ±390 ≡ ∓98
⇐⇒ x ≡ −116, 80
⇐⇒ x ≡ 141, 80 (mod 257).

D.. Final Examination 


Remark. There were a few unsuccessful attempts to factorize the polynomial
directly. See my remark on Problem  of Exam .

Problem .. Solve 197x ≡ 137 (mod 257).

Solution. From the table of powers of 3, we can obtain logarithms:

197x ≡ 137 (mod 257) ⇐⇒ (−60)x ≡ −120 (mod 257)


⇐⇒ x log3 (−60) ≡ log3 (−120) (mod 256)
⇐⇒ x · 24 ≡ 72 (mod 256)
⇐⇒ x · 8 ≡ 24 (mod 256)
⇐⇒ x ≡ 3 (mod 32)
⇐⇒ x ≡ 3, 35, 67, 99, 131, 163, 195, 227 (mod 256).

Remark. A number of people overlooked the change of modulus when passing
from x · 8 ≡ 24 to x ≡ 3. One need not use logarithms explicitly; one can observe
instead 197 ≡ −60 ≡ 3^24 and 137 ≡ −120 ≡ 3^72 (mod 257), so that

197^x ≡ 137 (mod 257) ⇐⇒ 3^(24x) ≡ 3^72 (mod 257)
⇐⇒ 24x ≡ 72 (mod 256),

and then proceed as above.
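
The eight solutions modulo 256 can be confirmed by trying every exponent. The Python sketch below does this with the built-in modular pow; the variable names are chosen here.

```python
# Exponents x in [0, 256) with 197^x ≡ 137 (mod 257).
solutions = [x for x in range(256) if pow(197, x, 257) == 137]
print(solutions)

# The two facts read off from the table of powers of 3.
print(pow(3, 24, 257), pow(3, 72, 257))
```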

Problem .. Solve 127x + 55y = 4.

Solution. Use the Euclidean algorithm:

127 = 55 · 2 + 17,   17 = 127 − 55 · 2,
55 = 17 · 3 + 4,     4 = 55 − (127 − 55 · 2) · 3 = 55 · 7 − 127 · 3,
17 = 4 · 4 + 1,      1 = 17 − 4 · 4 = 127 − 55 · 2 − (55 · 7 − 127 · 3) · 4
                       = 127 · 13 − 55 · 30.

Hence 4 = 127 · 52 − 55 · 120, and gcd(127, 55) = 1, so the original equation has
the general solution
(52, −120) + (55, −127) · t.

Remark. Some people omitted to find the general solution. In carrying out the
Euclidean algorithm here, one can save a step, as some people did, by noting that,
once we find 4 = 55 · 7 − 127 · 3, we need not find 1 as a linear combination of 127
and 55; we can pass immediately to the general solution (−3, 7) + (55, −127) · t.
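
The general solution can be checked by substitution for a range of values of the parameter. In this Python sketch, the function solution and the range of t are choices made here; note that the identity 4 = 55 · 7 − 127 · 3 corresponds to x = −3 and y = 7, which is the member of the family with t = −1.

```python
def solution(t):
    """Return the pair (x, y) = (52, -120) + (55, -127) * t."""
    return 52 + 55 * t, -120 - 127 * t

# Every member of the family should satisfy 127x + 55y = 4.
all_ok = all(127 * x + 55 * y == 4 for x, y in map(solution, range(-50, 51)))
print(all_ok, solution(0), solution(-1))
```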

Problem .. Solve x2 ≡ 59 (mod 85).

 D. – examinations


Solution. Since 85 = 5 · 17, we first solve x2 ≡ 59 modulo 5 and 17 separately:

x2 ≡ 59 (mod 5) x2 ≡ 59 (mod 17)


2 2
⇐⇒ x ≡ 4 (mod 5) ⇐⇒ x ≡ 8 (mod 17)
⇐⇒ x ≡ ±2 (mod 5); ⇐⇒ x2 ≡ 25 (mod 17)
⇐⇒ x ≡ ±5 (mod 17).

Now there are four systems to solve:


)
x ≡ ±2 (mod 5)
⇐⇒ x ≡ ±22 (mod 85),
x ≡ ±5 (mod 17)
)
x ≡ ±2 (mod 5)
⇐⇒ x ≡ ±12 (mod 85).
x ≡ ∓5 (mod 17)

(I solved these by trial.) So the original congruence is solved by

x ≡ ±22, ±12 (mod 85),

or x ≡ 12, 22, 63, 73 (mod 85).

Remark. One may, as some people did, use the algorithm associated with the
Chinese Remainder Theorem here. Even if we do not use the algorithm, we rely
on it to know that the solution we find to each pair of congruences is the only
solution. Some used a theoretical formulation of the solution, noting for example
that the system x ≡ 2 (mod 5), x ≡ 5 (mod 17) has the solution
x ≡ 2 · 17^ϕ(5) + 5 · 5^ϕ(17) (mod 85); but
this is not useful (the number is not between 0 and 85, or between −85/2 and
85/2).
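
With a modulus as small as 85, the four solutions can be found by trial, and the theoretical formula can be evaluated to see how large its value is before reduction. The variable names in this Python sketch are chosen here.

```python
# All residues x modulo 85 with x^2 ≡ 59.
solutions = [x for x in range(85) if x * x % 85 == 59]
print(solutions)

# The formula 2 * 17^phi(5) + 5 * 5^phi(17), with phi(5) = 4 and phi(17) = 16.
formula_value = 2 * 17**4 + 5 * 5**16
print(formula_value, formula_value % 85)
```

The unreduced value is far outside the interval from 0 to 85, which is the point of the remark; after reduction it agrees with one of the solutions found by trial.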

D.. Final Examination 


E. – examinations

E.. In-term examination


Problem .. Let ω = {0, 1, 2, . . . }. All variables in this problem range over ω.
Given a and b such that a ≠ 0, we define

rem(b, a) = r,

if b = ax + r for some x, and r < a.


a) Prove rem(a + b, n) = rem(rem(a, n) + rem(b, n), n).
b) Prove rem(ab, n) = rem(rem(a, n) · rem(b, n), n).
Solution. a) For rem(c, n), write c′ . Then for some x, y, and z in ω, we have

a = nx + a′ , b = ny + b′ , a′ + b′ = nz + (a′ + b′ )′ ,

hence a + b = n(x + y + z) + (a′ + b′ )′ . Since (a′ + b′ )′ < n, we have

(a + b)′ = (a′ + b′ )′

as desired.
b) With the same notation, for some w in ω we have

a′ · b′ = nw + (a′ · b′ )′ ,

so for some u in ω, we have ab = nu + a′ · b′ = n(w + u) + (a′ · b′ )′ , and therefore


(since (a′ · b′ )′ < n) we have

(ab)′ = (a′ · b′ )′

as desired.
Remark. Books VII, VIII, and IX of Euclid’s Elements develop some of the theory
of what we would call the positive integers. If we allow also a zero, but not
negative numbers, then we could define

a≡b (mod n) ⇐⇒ rem(a, n) = rem(b, n).

This problem then could be used to establish the basic facts about congruence.


Remark. A number of students used the arrow “⇒” in their proofs. Such usage
is a bad habit, albeit a common one, even among teachers. Indeed, I learned this
bad habit from somebody who was otherwise one of my best teachers. Later I
unlearned the habit.
In logic, the expression A ⇒ B means
If A is true, then B is true.
One rarely wants to say this in proofs. Rather, one wants to say things like
A is true, and therefore B is true.

If this is what you want to say, then you should just say it in words.
In the expression “A ⇒ B”, the arrow is a verb, usually read as “implies”. When
somebody writes the arrow in a proof, the intended meaning seems usually to be
that of “which implies” or “and this implies”. But the arrow should not be loaded
up with these extra meanings.
One student used the arrow in place of the equals sign “=”. This usage must
definitely be avoided.
Another practice that should be avoided is drawing arrows to direct the reader’s
eye. It should be possible to read a proof left to right, top to bottom, in the usual
fashion. If you need to refer to something that came before, then just say so.
It is true that, when I grade papers, I may use arrows. This is in part because,
when you see your paper, I am there to explain what I meant by the arrow, if
this is necessary. But what you write on an exam should make sense without need
for additional explanation by you.
If I ask you to prove a claim, I already know the claim is true. The point is not
to convince me that the claim is true, or even to convince me that you know the
claim is true. The point is to write a proof of the claim. The point is to write the
sort of thing that is found in research articles and books of mathematics, often
labelled with the word Proof.
Problem .. Find integers k and ℓ, both greater than 1, such that, for all
positive integers n,
k | 196510n + ℓ.

Solution. Since 196510n is odd, we can let ℓ = 3, k = 2.

Remark. This problem is based on Exercise . As it is stated, the problem has
many solutions.
(i) The solution given here is a special case of letting k be any number such
that 1965 ≡ 1 (mod k), and then letting ℓ = 2k − 1 (or k − 1 if k > 2).
(ii) We could also let ℓ be a factor of 1965, and then let k be a factor of ℓ.
(iii) Finally, since 11 ∤ 1965, we have by Fermat 1965^{10} ≡ 1 (mod 11), so we
could let k = 11 and ℓ = 10.
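The three families of solutions can be spot-checked numerically. The following is a minimal sketch in Python, not a proof: it tests only finitely many n, and the helper name divides_for_all_n and the bound n_max are assumptions introduced here for illustration.

```python
# Candidate pairs (k, l): (2, 3) and (4, 7) follow pattern (i),
# (3, 15) and (5, 5) follow pattern (ii), and (11, 10) is pattern (iii).
pairs = [(2, 3), (4, 7), (3, 15), (5, 5), (11, 10)]

def divides_for_all_n(k, l, n_max=50):
    """Check k | 1965**(10*n) + l for n = 1, ..., n_max (a finite check only)."""
    return all((pow(1965, 10 * n, k) + l) % k == 0 for n in range(1, n_max + 1))

results = {pair: divides_for_all_n(*pair) for pair in pairs}
print(results)
```

The three-argument form of pow reduces modulo k at every step, so the large powers are never computed in full.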

E.. In-term examination 


Problem .. Find two positive integers a and b such that, for all integers m
and n, the integer am − bn is a solution of the congruences

x≡m (mod 999), x ≡ n (mod 1001).

Solution. A solution of the congruences takes the form

x ≡ m · 1001s + n · 999t (mod 999 · 1001),

where 1001s ≡ 1 (mod 999) and 999t ≡ 1 (mod 1001). So we want

2s ≡ 1, s ≡ 500 (mod 999), −2t ≡ 1, t ≡ 500 (mod 1001).

Then the solution to the original congruences is

x ≡ m · 1001 · 500 + n · 999 · 500 ≡ 1001 · 500m − 999 · 501n (mod 999 · 1001).

So we can let a = 1001 · 500, b = 999 · 501.

Remark. This is just a Chinese Remainder Theorem problem with letters instead
of numbers.
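As a sanity check on the arithmetic, one can test the values a = 1001 · 500 and b = 999 · 501 on randomly chosen m and n. This is a sketch in Python; the function name solves and the sampling range are assumptions made for the illustration.

```python
import random

a = 1001 * 500
b = 999 * 501

def solves(m, n):
    """Does x = a*m - b*n satisfy x = m (mod 999) and x = n (mod 1001)?"""
    x = a * m - b * n
    return x % 999 == m % 999 and x % 1001 == n % 1001

random.seed(0)
samples = [(random.randint(-10**6, 10**6), random.randint(-10**6, 10**6))
           for _ in range(1000)]
all_ok = all(solves(m, n) for m, n in samples)
print(all_ok)
```

The check works because a ≡ 1 (mod 999), a ≡ 0 (mod 1001), b ≡ 0 (mod 999), and b ≡ −1 (mod 1001).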
Problem .. Letting n = Σ_{j=1}^{408} j, find an integer k such that 0 ≤ k < 409 and

408! ≡ k (mod n).

Solution. We have n = 409 · 408/2; also 409 is prime, so by Wilson’s Theorem
408! ≡ −1 (mod 409). Then 408! ≡ 408 modulo both 409 and 408 (the latter because
408 divides 408!), hence modulo any divisor of the least common multiple of these.
But n is such a divisor. Thus we can let k = 408.
Remark. This problem is based on Exercise . A number of people argued as
follows.
Since 408! ≡ −1 (409), we must have k ≡ −1 (409). Since it is required that
0 ≤ k < 409, it must be that k = 408.

But this argument does not prove 408! ≡ 408 (n). Maybe I made a mistake, and
there is no k meeting the stated conditions.
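Since 408! is small enough to compute exactly, the claim 408! ≡ 408 (mod n) can also be confirmed directly. A short Python sketch, offered only as a check on the argument above:

```python
from math import factorial

n = sum(range(1, 409))      # 1 + 2 + ... + 408 = 408 * 409 / 2
k = factorial(408) % n      # remainder of 408! on division by n
print(n, k)
```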
Problem .. With justification, find an integer n, greater than 1, such that,
for all integers a,
an ≡ a (mod 1155).
Solution. We have 1155 = 3 · 5 · 7 · 11, and gcd(3 − 1, 5 − 1, 7 − 1, 11 − 1) =
gcd(2, 4, 6, 10) = 60. Then we can let n = 61. Indeed, by Fermat,

 E. – examinations


• If 3 ∤ a, then a^2 ≡ 1 (3), so a^{60} ≡ 1 (3).
• If 5 ∤ a, then a^4 ≡ 1 (5), so a^{60} ≡ 1 (5).
• If 7 ∤ a, then a^6 ≡ 1 (7), so a^{60} ≡ 1 (7).
• If 11 ∤ a, then a^{10} ≡ 1 (11), so a^{60} ≡ 1 (11).
Therefore, for all a, we have a^{61} ≡ a modulo any of 3, 5, 7, and 11, hence modulo
their least common multiple, which is 1155.
Remark. This problem is related to Exercise  and our discussion of absolute
pseudoprimes.
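Because the congruence depends only on a modulo 1155, it can be checked exhaustively. The Python sketch below assumes Python 3.9 or later for math.lcm; the variable names are chosen here for illustration.

```python
from math import lcm

exponent = lcm(3 - 1, 5 - 1, 7 - 1, 11 - 1) + 1   # 60 + 1
holds = all(pow(a, exponent, 1155) == a for a in range(1155))

# For contrast: the exponent 3 is not enough, since 2**3 = 8 is not 2 modulo 5.
fails_for_3 = any(pow(a, 3, 1155) != a for a in range(1155))
print(exponent, holds, fails_for_3)
```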
Problem .. Let N = {1, 2, 3, . . . }. Suppose all we know about this set is:
(i) proofs by induction are possible;
(ii) addition can be defined on N, and it satisfies
x + y = y + x, x + (y + z) = (x + y) + z;
(iii) multiplication can be defined by
x · 1 = x, x · (y + 1) = x · y + x.
Prove
x · y = y · x.
Solution. We use induction on y. As the base step, we show x · 1 = 1 · x for
all x. We do this by induction: Trivially, 1 · 1 = 1 · 1. Suppose, as an inductive
hypothesis, x · 1 = 1 · x for some x. Then
1 · (x + 1) = 1 · x + 1 [by definition of multiplication]
=x·1+1 [by inductive hypothesis]
=x+1 [by definition of multiplication]
= (x + 1) · 1. [by definition of multiplication]
By induction then, x · 1 = 1 · x.
Next we assume x · y = y · x for all x, for some y, and we prove x · (y + 1) =
(y + 1) · x. We do this by induction on x. By what we have already shown,
1 · (y + 1) = (y + 1) · 1. Suppose, as an inductive hypothesis, x · (y + 1) = (y + 1) · x
for some x. Then
(x + 1) · (y + 1) = (x + 1) · y + x + 1 [by definition of multiplication]
= y · (x + 1) + x + 1 [by the first inductive hypothesis]
=y·x+y+x+1 [by definition of multiplication]
=x·y+x+y+1 [by the first inductive hypothesis]
= x · (y + 1) + y + 1 [by definition of multiplication]
= (y + 1) · x + y + 1 [by the second inductive hypothesis]
= (y + 1) · (x + 1). [by definition of multiplication]

E.. In-term examination 


This completes the proof that x · (y + 1) = (y + 1) · x for all x. This completes
the proof that x · y = y · x for all x and y.
Remark. This is part of Exercise . I tried to write out a “first generation” proof:
one you might write without thinking of how to break it into parts. A proof that
is easier to follow is perhaps the “second generation” proof that goes as follows
(see Lemma A. and Theorem A.): First show

x·1=1·x (∗)

by induction on x, then show

(y + 1) · x = y · x + x (†)

by induction on x, and finally show x · y = y · x by induction on x. In fact, almost
all students just assumed that (∗) and (†) were known; but they were not among
the propositions that the problem allowed you to use.
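The recursive definition of multiplication in the problem can be transcribed directly, which makes it easy to test commutativity on small cases. This Python sketch is an illustration only: a finite test is not a substitute for the induction, and the function name mul is an assumption introduced here.

```python
def mul(x, y):
    """x * y computed only from the rules x * 1 = x and x * (y + 1) = x * y + x."""
    if y == 1:
        return x
    return mul(x, y - 1) + x

# Test x * y = y * x for 1 <= x, y < 40.
commutes = all(mul(x, y) == mul(y, x)
               for x in range(1, 40) for y in range(1, 40))
print(commutes)
```

Note that mul(x, y) and mul(y, x) unwind differently (one recurses y − 1 times, the other x − 1 times), which is why their agreement is something to be proved rather than assumed.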

E.. In-term examination


Problem .. Exactly one of 1458 and 1536 has a primitive root. Which one,
and why? Find a primitive root of the number that has one.
Solution. 1458 = 2 · 729 = 2 · 3^6 and 1536 = 3 · 512 = 3 · 2^9.
The numbers with primitive roots are just 2, 4, p^k, and 2 · p^k, where p is an
odd prime.
Therefore 1458, but not 1536, has a primitive root.
φ(9) = 6, and
k            1   2   3   4   5   6
5^k mod 9    5  −2  −1   4   2   1
so 5 is a primitive root of 9.
Then 5 is a primitive root of 3^6.
Since 5 is odd, it is a primitive root of 1458.
Remark. . A number of people computed φ(1458) and φ(1536), but this is of
no practical use in this problem.
. Some people pointed out that if a is a primitive root of n, then aφ(n) ≡ 1
(mod n). This is logically correct, but useless, since by Euler’s Theorem we have
aφ(n) ≡ 1 (mod n) whenever gcd(a, n) = 1 (not just when a is a primitive root).
. Our sequence of theorems about primitive roots of composite numbers is
the following. Throughout, p is an odd prime.
(i) If r is a primitive root of p, then r or r + p is a primitive root of p2 .
(ii) If r is a primitive root of p2 , then r is a primitive root of ps whenever s > 2.

 E. – examinations


(iii) If r is a primitive root of p^s (where s > 2), then r or r + p^s (whichever is
odd) is a primitive root of 2p^s.
Some people misremembered this sequence, or wrongly combined two of its theorems.
For example, some wrote ‘If r is a primitive root of p, then r or r + p^s
(whichever is odd) is a primitive root of 2p^s.’ This assertion is false. It would
be correct to say for example, ‘If r is a primitive root of p^2, then r or r + p^2
(whichever is odd) is a primitive root of 2p^s.’ Using this, one might observe that
2 is a primitive root of 9, and therefore 11 is a primitive root of 1458.
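The conclusions about 1458 and 1536 can be confirmed by brute force, since both moduli are small. The following Python sketch computes multiplicative orders directly; the helper names phi, order and has_primitive_root are assumptions chosen for this illustration, and the method is far slower than the theorems above.

```python
from math import gcd

def phi(n):
    """Euler's phi-function, by counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def order(a, n):
    """Multiplicative order of a modulo n, assuming gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = power * a % n
        k += 1
    return k

def has_primitive_root(n):
    target = phi(n)
    return any(order(a, n) == target
               for a in range(1, n) if gcd(a, n) == 1)

print(order(5, 1458), order(11, 1458), phi(1458))
print(has_primitive_root(1458), has_primitive_root(1536))
```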

Problem .. Remembering that p is always prime, define the arithmetic func-
tion ω by
X
ω(n) = 1.
p|n

a) Define µ, preferably using ω.

b) Prove that, if m and n are co-prime, then ω(mn) = ω(m) + ω(n).

c) Prove that
X
τ(d) · µ(d) = (−1)ω(n) .
d|n

d) Find a simple description of the function f given by


X n
f (n) = ω(d) · µ .
d
d|n

Solution. a) µ(n) is given by

µ(n) = 0,             if p^2 | n for some p,
µ(n) = (−1)^{ω(n)},   if p^2 | n for no p.

b) Assume m and n are co-prime. If p | mn, then

p | m ⇐⇒ p ∤ n.

Therefore

ω(mn) = Σ_{p|mn} 1 = Σ_{p|m} 1 + Σ_{p|n} 1 = ω(m) + ω(n).

c) Each side of the equation is multiplicative, and

Σ_{d|p^k} τ(d) · µ(d) = τ(1) · µ(1) + τ(p) · µ(p) = 1 − 2 = (−1)^{ω(p^k)}.

E.. In-term examination 


d) By Möbius inversion,

ω(n) = Σ_{d|n} f(d).

Since also ω(n) = Σ_{p|n} 1, we have

f(n) = 1, if n is prime,
f(n) = 0, if n is not prime.

Remark. . In my solution to part a, the condition ‘p2 | n for no p’ is equivalent


to ‘p2 ∤ n for all p’. Similarly in part d.
. For part a, some people wrote (as part of their answer) ‘µ(n) = (−1)s if
n = p1 · · · ps ’. Strictly, one must specify that the pi are all distinct. The best
way that I know to do this is to say p1 < · · · < ps .
. As an alternative solution to part b, one can write (as some people did)
that, since m and n are co-prime, we have

m = p1 m(1) · · · ps m(s) , n = q1 n(1) · · · qt n(t) ,

where the exponents are positive, p1 < · · · < ps , q1 < · · · < qt , and pi 6= qj in
each case, and therefore

ω(mn) = s + t = ω(m) + ω(n).

This may be a clearer argument than the one I wrote above. I don’t know a good
way to make the argument just with the Σ-notation. Some people wrote

‘ω(mn) = Σ_{pq|mn} 1’,

which doesn’t make sense. (If it means anything, it means ω(mn) is the number
of factors d that mn has, where d is the product of two primes, possibly not
distinct. This is not what ω(mn) is.) Others wrote

‘ω(mn) = Σ_{p|m} Σ_{q|n} 1’;

this is meaningful, but false, since it makes ω(mn) equal to the product ω(m) ·
ω(n).
4. In part c, it doesn’t hurt to say why the two sides are multiplicative. The
left-hand side is multiplicative because the product of two multiplicative functions
is multiplicative (we didn’t prove this, but it’s fairly obvious), and if g is
multiplicative, so is n ↦ Σ_{d|n} g(d) (we did prove this). The right-hand side is
multiplicative by part b.

 E. – examinations


5. In notation introduced in class, the function f in part d is given by f = ω ∗ µ,
and therefore ω = f ∗ 1 by Möbius inversion. It may not be immediately obvious
that f must be as in the solution above. But if f is that function, then indeed
ω = f ∗ 1, and therefore f = ω ∗ µ, as required. So f must be as given in the
solution.
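Parts c and d lend themselves to a numerical check over a range of n. The Python sketch below uses trial division; all the function names (prime_factors, omega, mu, tau, divisors, f) are assumptions introduced for the illustration, and the range 1 ≤ n < 300 is arbitrary.

```python
def prime_factors(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def omega(n):
    return len(prime_factors(n))

def mu(n):
    exps = prime_factors(n).values()
    return 0 if any(e > 1 for e in exps) else (-1) ** omega(n)

def tau(n):
    count = 1
    for e in prime_factors(n).values():
        count *= e + 1
    return count

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def f(n):
    """The convolution of omega with mu, as in part d."""
    return sum(omega(d) * mu(n // d) for d in divisors(n))

part_c = all(sum(tau(d) * mu(d) for d in divisors(n)) == (-1) ** omega(n)
             for n in range(1, 300))
part_d = all(f(n) == (1 if prime_factors(n) == {n: 1} else 0)
             for n in range(1, 300))
print(part_c, part_d)
```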

Problem .. Find the least positive x such that

11^{5117} x ≡ 57 (mod 600).

Solution. 600 = 2^3 · 3 · 5^2, so φ(600) = 4 · 2 · 20 = 160. We compute

        31
160 ) 5117
      480
       317
       160
       157

Hence
5117 ≡ 157 ≡ −3 (mod 160).

Therefore

11^{5117} x ≡ 5 (mod 600)
⇐⇒ 11^{−3} x ≡ 5 (mod 600)
⇐⇒ x ≡ 5 · 11^3 (mod 600).

But

11^3 = 121 · 11 = 1331 ≡ 131 (mod 600),
5 · 131 = 655 ≡ 55 (mod 600),

so the least positive solution is 55.
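The computation can be replayed with modular exponentiation. This Python sketch checks the congruence as it is worked in the solution, that is, with right-hand side 5; the variable names are assumptions for the illustration.

```python
exponent = 5117 % 160               # remainder from the long division
cube = pow(11, 3, 600)              # 11^3 reduced modulo 600
x = 5 * cube % 600                  # candidate solution 5 * 11^3 modulo 600
check = pow(11, 5117, 600) * x % 600
print(exponent, cube, x, check)
```

Since 11 is invertible modulo 600, the solution is unique modulo 600, so the value of x found this way is the least positive one.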

Remark. Not too many problems here. I’m guessing this is the sort of problem
that the dershane prepares one for. According to the Wikipedia article ‘Long
division’, my notation for long division is what is used in Anglophone countries;
the notation I see on papers, Francophone. But the symbolism b ) a (used in the
former notation) for a/b is traced to Michael Stifel of the University of Jena in
Germany in  (see the Wikipedia article ‘Division (mathematics)’).

Problem .. a) Since 2 is a primitive root of 29, the function x 7→ log2 x


from Z29× to Z28 is defined. Considering this as a function from {−14, . . . , −1, 1, . . . 14}

E.. In-term examination 


to {−14, . . . , 14}, fill out the table below.

m 1 2 3 4 5 6 7 8 9 10 11 12 13 14

log2 m

log2 (−m)

b) With respect to the modulus 29, exactly one of the two congruences

x400 ≡ 13, x400 ≡ −13

has a solution. Find all of its solutions ( modulo 29), and explain why the
other congruence has no solutions.

Solution. a)
m             1   2   3   4   5   6   7   8   9  10  11  12  13  14
log_2 m       0   1   5   2  −6   6  12   3  10  −5  −3   7 −10  13
log_2 (−m)   14 −13  −9 −12   8  −8  −2 −11  −4   9  11  −7   4  −1
b) For the first congruence, we have

x^{400} ≡ 13 (mod 29)
⇐⇒ 400 log x ≡ −10 (mod 28)
⇐⇒ 200 log x ≡ −5 (mod 14);

the congruence has no solution since gcd(200, 14) = 2, and 2 ∤ −5. For the second
congruence:

x^{400} ≡ −13 (mod 29)
⇐⇒ 400 log x ≡ 4 (mod 28)
⇐⇒ 100 log x ≡ 1 (mod 7)
⇐⇒ 2 log x ≡ 1 (mod 7)
⇐⇒ log x ≡ 4 (mod 7)
⇐⇒ log x ≡ 4, 11, −10, −3 (mod 28)
⇐⇒ x ≡ −13, −11, 13, 11 (mod 29).

Remark. The quickest way I know to fill out the table is, keeping in mind

log_2 m ≡ k mod 28 ⇐⇒ 2^k ≡ m mod 29,

to start out as follows,
m             1   2   3   4   5   6   7   8   9  10  11  12  13  14
log_2 m       0   1       2               3
log_2 (−m)                                                    4
continuing to get
m             1   2   3   4   5   6   7   8   9  10  11  12  13  14
log_2 m       0   1   5   2       6  12   3  10           7      13
log_2 (−m)   14               8                   9  11       4
then filling in the remaining spaces by using
log m − log(−m) ≡ log(−1) ≡ ±14 (mod 28).
Some people may have done something like this, but they put the logarithms
into the set {0, . . . , 27} rather than {−14, . . . , 14} as requested (this set could
have been {−13, . . . , 14}). Other people gave negative logarithms, but they were
off by 1, as if the modulus had been taken as 29 rather than 28. In solving the
congruences, there were various confusions about modulus.
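Both the table and the answer to part b can be regenerated mechanically. The Python sketch below records the exponent k for each power 2^k modulo 29, moving exponents above 14 into the negative range; the dictionary name log2 and the list names are assumptions made for this illustration.

```python
# Discrete logarithms to base 2 modulo 29, with values in {-13, ..., 14}.
log2 = {}
for k in range(28):
    log2[pow(2, k, 29)] = k if k <= 14 else k - 28

row_pos = [log2[m] for m in range(1, 15)]          # log_2 m
row_neg = [log2[29 - m] for m in range(1, 15)]     # log_2 (-m)

# Part b by brute force: x^400 = 13 and x^400 = -13 (that is, 16) modulo 29.
sols_plus = [x for x in range(1, 29) if pow(x, 400, 29) == 13]
sols_minus = [x for x in range(1, 29) if pow(x, 400, 29) == 29 - 13]
print(row_pos, row_neg, sols_plus, sols_minus, sep="\n")
```

The residues 16 and 18 in the second list are −13 and −11 written as least positive residues.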

E.. Final examination


Problem .. Find all solutions of the congruence
x2821 ≡ x (mod 2821).
Solution. Every integer would be a solution if 2821 were a prime or a Carmichael
number. It factorizes as 7 · 403, hence as 7 · 13 · 31, which is squarefree. Also
2820 = 10 · 2 · 141 = 22 · 3 · 5 · 47, so it is divisible by 6, 12, and 30. Therefore
2821 is a Carmichael number, and all integers solve the given congruence.
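The conclusion can be verified exhaustively, since only the residues modulo 2821 matter. A brief Python sketch, with variable names chosen here for illustration:

```python
# Korselt's conditions: 2821 = 7 * 13 * 31 is squarefree and p - 1 | 2820 for each p.
korselt = 7 * 13 * 31 == 2821 and all(2820 % (p - 1) == 0 for p in (7, 13, 31))

# Direct check of x^2821 = x modulo 2821 for every residue x.
every_residue_works = all(pow(x, 2821, 2821) == x for x in range(2821))
print(korselt, every_residue_works)
```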

Problem .. Find all solutions to the congruence x2 ≡ 23 (mod 133).


Solution. 133 = 7 · 19, so we solve simultaneously
x2 ≡ 23 (mod 7), x2 ≡ 23 (mod 19).
For the first, x2 ≡ 2 ≡ 9, so x ≡ ±3 (7); for the second, x2 ≡ 4, so x ≡ ±2 (19).
Now we have some Chinese remainder problems: First,
x ≡ ±3 (mod 7), x ≡ ±2 (mod 19),
that is, x ≡ ±(3 · 19 · 3 + 2 · 7 · −8) ≡ ±59 (133), since 19 ≡ −2 (7) and −2 · 3 ≡ 1
(7), while 7 · −8 ≡ 1 (19). Second,
x ≡ ±3 (mod 7), x ≡ ∓2 (mod 19),

E.. Final examination 


that is, x ≡ ±(3 · 19 · 3 − 2 · 7 · −8) ≡ ±283 ≡ ±17 (133). The solutions to the
original problem are therefore x ≡ ±59, ±17 (mod 133) .
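With a modulus this small, the four solutions can be confirmed by trying every residue. A minimal Python sketch (the list name roots is an assumption for the illustration):

```python
# All residues x modulo 133 with x^2 = 23 (mod 133).
roots = sorted(x for x in range(133) if x * x % 133 == 23)

# The solutions found above, written as least non-negative residues.
expected = sorted(r % 133 for r in (59, -59, 17, -17))
print(roots, expected)
```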

Problem .. Is the following congruence soluble? Explain. (It is given that
2999 is prime.)
x^2 − 2987x + 2243 ≡ 0 (mod 2999).

Solution. By completing the square, the congruence is equivalently

x^2 + 12x ≡ 756,
(x + 6)^2 ≡ 792.

Also, 792 = 2^3 · 3^2 · 11, so

(792/2999) = (2/2999) · (11/2999)
           = (11/2999)       [since 2999 ≡ −1 (8)]
           = −(2999/11)      [since 11 ≡ 3 ≡ 2999 (4)]
           = −(−4/11)
           = −(−1/11)
           = 1;              [since 11 ≡ 3 (4)]

therefore there must be a solution.
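The value of the Legendre symbol can be cross-checked with Euler's Criterion, and the solubility of the original congruence by a direct search. The Python sketch below takes the primality of 2999 as given, as the problem does; the variable names are assumptions for the illustration.

```python
p = 2999

# Euler's Criterion: 792^((p - 1)/2) is 1 modulo p exactly when 792 is a
# nonzero quadratic residue of the prime p.
euler = pow(792, (p - 1) // 2, p)

# Brute-force search for roots of the original congruence.
roots = [x for x in range(p) if (x * x - 2987 * x + 2243) % p == 0]
print(euler, len(roots))
```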

Problem ..

a) Find an arithmetic function that is not multiplicative.

b) Prove that, for all positive integers n,

Σ_{d|n} Σ_{e|(n/d)} φ(e) = Σ_{d|n} d.

Solution. a) n ↦ 2.

b) By a theorem of Gauss,

Σ_{d|n} Σ_{e|(n/d)} φ(e) = Σ_{d|n} n/d = Σ_{d|n} d.

 E. – examinations


Remark. Various approaches are possible. One may, for example, write the desired
equation as 1 ∗ φ ∗ 1 = 1 ∗ id, and this follows from Gauss’s theorem,
expressed as 1 ∗ φ = id. If one does not remember Gauss’s Theorem, one may let
f(n) = Σ_{d|n} φ(d), so that

Σ_{d|n} Σ_{e|(n/d)} φ(e) = Σ_{d|n} f(n/d) = Σ_{d|n} f(d).

Then it is enough to prove f(n) = n; but each side of this equation is a multiplicative
function of n, and f(p^k) = Σ_{j=0}^{k} φ(p^j) = 1 + Σ_{j=1}^{k} (p^j − p^{j−1}) = p^k.
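Both the double-sum identity and Gauss's theorem can be tested numerically for small n. A Python sketch, using naive definitions of φ and of the divisor list; the function names are assumptions for the illustration.

```python
from math import gcd

def phi(n):
    """Euler's phi-function, by counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def double_sum(n):
    """Sum of phi(e) over all d | n and all e | n/d."""
    return sum(phi(e) for d in divisors(n) for e in divisors(n // d))

identity_holds = all(double_sum(n) == sum(divisors(n)) for n in range(1, 200))
gauss_holds = all(sum(phi(d) for d in divisors(n)) == n for n in range(1, 200))
print(identity_holds, gauss_holds)
```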

Problem .. Describe, as well as possible, the set of primes q such that 2 is a
primitive root of q and q = 2n · p + 1 for some prime p. (In particular, first find
the possibilities for n, and then p.)

Solution. If n = 0, then p can only be 2, and then q = 3, which is in the desired


set.
Now suppose q is as desired, but not 3, so n > 1. A primitive root cannot be
a square, so we must have (2/q) = −1, that is, q ≡ ±3 (8), and therefore n 6 2.
If n = 1, then for the same reason, we must have 2p+ 1 ≡ 3, 5 (8), equivalently,
2p ≡ 2, 4 (8), that is, p ≡ 1, 2 (4). If p ≡ 2 (4), then p = 2, so q = 5; of this, 2 is
a primitive root, so 5 is in the desired set.
Suppose conversely p ≡ 1 (4), so q > 11 and q ≡ 3 (8). By Euler’s Criterion,
−1 = (2/q) ≡ 2p (q), so ordq (2) 6= p. But this order can only be 1, 2, p or q − 1,
and it is not 1 or 2 (since q > 11), so it must be q − 1. Therefore 2 is a primitive
root of a prime number 2p + 1 if and only if p ≡ 1 (4).
Now suppose n = 2. Then (2/q) = −1 if and only if 4p ≡ 2, 4 (8), that is,
p ≡ 1 (2), which is always the case (since p is odd). So we have ordq (2) ∤ 2p.
Also the order is not 4 when p = 3 or when p > 5—that is, ever. Therefore, 2 is
a primitive root of every prime number 4p + 1.
In sum, the desired set consists of:
• 3;
• primes 2p + 1, where p ≡ 1 (mod 4);
• primes 4p + 1.
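The description of the set can be compared with a direct computation for small primes. The Python sketch below is a finite check only, over primes below 3000 (an arbitrary bound); the function names are assumptions introduced here, and predicted encodes the summary above together with the prime 5 from the case n = 1, p = 2.

```python
def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def two_is_primitive_root(q):
    """True when the order of 2 modulo the odd prime q is q - 1."""
    k, power = 1, 2 % q
    while power != 1:
        power = power * 2 % q
        k += 1
    return k == q - 1

def has_form(q):
    """True when q = 2**n * p + 1 for some n >= 0 and some prime p."""
    m = q - 1
    while m >= 2:
        if is_prime(m):
            return True
        if m % 2:
            return False
        m //= 2
    return False

def predicted(q):
    """Membership in the set described by the solution, for an odd prime q."""
    if q in (3, 5):
        return True
    if is_prime((q - 1) // 2) and ((q - 1) // 2) % 4 == 1:
        return True
    return (q - 1) % 4 == 0 and is_prime((q - 1) // 4)

primes = [q for q in range(3, 3000) if is_prime(q)]
agrees = all((has_form(q) and two_is_primitive_root(q)) == predicted(q)
             for q in primes)
print(agrees)
```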

E.. Final examination 



